Qdrant Vector Database Guide: Open-Source AI Search for Australia
Master Qdrant for building powerful vector search applications. Complete guide to self-hosted and cloud deployment, filtering, and production optimisation for Australian developers.
Qdrant is a high-performance, open-source vector database that's rapidly becoming the preferred choice for developers who need full control over their AI infrastructure. For Australian businesses with data sovereignty requirements or those seeking cost-effective scaling, Qdrant offers a compelling combination of performance, flexibility, and deployment options.
This comprehensive guide covers everything from local development to production deployment of Qdrant, with practical examples and Australian business context throughout. Whether you're building a RAG system, semantic search engine, or recommendation platform, understanding Qdrant empowers you to build AI applications with complete infrastructure control.
What You'll Learn
- Qdrant architecture and key advantages
- Local and Docker deployment options
- Collection configuration and optimisation
- Advanced filtering and payload management
- Qdrant Cloud for managed deployment
- Production scaling and performance tuning
Key Takeaways
- Qdrant is a high-performance, open-source vector database built in Rust with both self-hosted and cloud options
- Self-hosting on AWS Sydney (ap-southeast-2) keeps data in Australia, which supports data sovereignty requirements for Australian businesses
- Advanced filtering capabilities support complex hybrid queries combining semantic search with attribute filters
- Payload indexing significantly improves filtered query performance - index fields you filter on frequently
- Payload-based multi-tenancy (a tenant ID stored on each point and filtered on every query) isolates tenant data without separate collections
- Qdrant Cloud offers managed deployment with Australian region availability for reduced operational burden
- Production deployments should implement API authentication, TLS encryption, and regular snapshot backups
Why Choose Qdrant?
Qdrant distinguishes itself through exceptional performance, rich filtering capabilities, and deployment flexibility. Built in Rust for speed and reliability, it handles production workloads while remaining accessible for development.
Qdrant vs Other Vector Databases
| Feature | Qdrant | Pinecone | Weaviate | Chroma |
|---|---|---|---|---|
| Open Source | ✓ Apache 2.0 | ✗ Managed only | ✓ BSD-3 | ✓ Apache 2.0 |
| Self-Hosted | ✓ Full support | ✗ No | ✓ Full support | ✓ Full support |
| Cloud Option | ✓ Qdrant Cloud | ✓ Primary | ✓ Weaviate Cloud | ✗ Limited |
| Performance | Excellent (Rust) | Excellent | Good (Go) | Moderate |
| Filtering | Advanced | Good | Advanced | Basic |
| Australian DC | ✓ Self-host/AWS Sydney | ✗ US/EU only | ✓ Self-host | ✓ Self-host |
Key Advantages for Australian Businesses
Data Sovereignty
Self-host in Australian data centres or on AWS Sydney. Full control over where your data resides for Privacy Act compliance.
Cost Control
No per-query pricing. Pay only for infrastructure. Significant savings at scale compared to managed services.
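Because self-hosted cost is mostly infrastructure, a rough memory estimate helps when choosing an instance size. The sketch below assumes float32 vectors held fully in RAM and a 1.5x overhead factor; that factor is a commonly quoted rule of thumb rather than a guarantee, and the function names are illustrative.

```python
def estimate_vector_memory_bytes(
    num_vectors: int,
    dimensions: int,
    bytes_per_value: int = 4,      # float32
    overhead_factor: float = 1.5,  # assumed allowance for index and metadata
) -> int:
    """Rough RAM estimate for vectors kept in memory."""
    return int(num_vectors * dimensions * bytes_per_value * overhead_factor)


def to_gib(num_bytes: int) -> float:
    """Convert bytes to gibibytes."""
    return num_bytes / (1024 ** 3)


if __name__ == "__main__":
    needed = estimate_vector_memory_bytes(1_000_000, 1536)
    print(f"1M x 1536-dim vectors: about {to_gib(needed):.1f} GiB")
```

Payload data and payload indexes add to this figure, so treat the result as a lower bound when sizing a server.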
Low Latency
Deploy close to your users. Australian-hosted Qdrant means sub-50ms queries from local applications.
Full Customisation
Tune HNSW, optimiser, and storage parameters. Choose from the built-in distance metrics (cosine, dot product, Euclidean, Manhattan). Integrate with existing infrastructure seamlessly.
Getting Started with Qdrant
Qdrant offers multiple deployment options from local development to production clusters. Let's explore each approach.
Option 1: Docker (Recommended for Development)
# Pull and run Qdrant container
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage:z \
    qdrant/qdrant
# Qdrant is now available at:
# REST API: http://localhost:6333
# gRPC: localhost:6334
# Dashboard: http://localhost:6333/dashboard
Option 2: Docker Compose (Production-Ready)
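# docker-compose.yml
version: '3.8'
services:
  qdrant:
    # Pin a specific release tag in production rather than relying on latest
    image: qdrant/qdrant:latest
    restart: always
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage
      - ./qdrant_config:/qdrant/config
    environment:
      - QDRANT__SERVICE__API_KEY=your-secure-api-key
      - QDRANT__SERVICE__ENABLE_TLS=true
      # TLS also needs a certificate and key; these paths are placeholders
      - QDRANT__TLS__CERT=/qdrant/config/tls/cert.pem
      - QDRANT__TLS__KEY=/qdrant/config/tls/key.pem
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 2G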
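The `your-secure-api-key` placeholder in the Compose file should be replaced with a long random value. A minimal sketch using Python's standard library; the function name is illustrative, and how the key is stored (environment file, secrets manager) is left to your deployment.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string suitable for use as an API key."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print once and store it in your secrets manager, not in version control
    print(generate_api_key())
```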
Option 3: Local Binary (Quick Testing)
# Download latest release
wget https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xzf qdrant-x86_64-unknown-linux-gnu.tar.gz
# Run with default config
./qdrant
Install Python Client
# Install Qdrant client
pip install qdrant-client
# Optional: include FastEmbed for local embedding generation
pip install "qdrant-client[fastembed]"
Connect and Verify
from qdrant_client import QdrantClient
# Connect to local instance
client = QdrantClient(host="localhost", port=6333)
# Or connect with API key
client = QdrantClient(
    host="localhost",
    port=6333,
    api_key="your-api-key"
)
# Verify connection
print(client.get_collections())
Collections and Data Management
Collections in Qdrant are similar to tables in traditional databases. They store vectors along with associated payload data and support sophisticated filtering.
Creating a Collection
from qdrant_client import QdrantClient, models
client = QdrantClient(host="localhost", port=6333)
# Create collection with explicit index and optimiser settings
client.create_collection(
    collection_name="australian_documents",
    vectors_config=models.VectorParams(
        size=1536,  # OpenAI embedding dimension
        distance=models.Distance.COSINE
    ),
    # Optimiser thresholds are segment sizes in kilobytes
    optimizers_config=models.OptimizersConfigDiff(
        indexing_threshold=20000,  # Build the HNSW index once a segment exceeds this size
        memmap_threshold=50000  # Store larger segments as memory-mapped files on disk
    ),
    # HNSW index parameters
    hnsw_config=models.HnswConfigDiff(
        m=16,  # Connections per node
        ef_construct=100  # Build-time accuracy
    )
)
Understanding Distance Metrics
Cosine
Best for normalised embeddings. Most common choice for text embeddings from OpenAI, Cohere, etc.
Range: -1 to 1
Euclidean
Direct distance between points. Good for image embeddings and when magnitude matters.
Range: 0 to ∞
Dot Product
Fastest computation. Use when vectors are already normalised or for recommendation systems.
Range: -∞ to ∞
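To make the three metrics concrete, here is a small pure-Python sketch. It is not how Qdrant computes distances internally; it only illustrates the definitions, and shows that dot product and cosine similarity agree once vectors are normalised. The function names are illustrative.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


def normalise(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot_product(v, v))
    return [x / length for x in v]


if __name__ == "__main__":
    a, b = [3.0, 4.0], [4.0, 3.0]
    print("cosine:", cosine_similarity(a, b))
    print("dot of normalised vectors:", dot_product(normalise(a), normalise(b)))
    print("euclidean:", euclidean_distance(a, b))
```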
Inserting Vectors with Payloads
from qdrant_client import models
import uuid
# Single point insertion
client.upsert(
    collection_name="australian_documents",
    points=[
        models.PointStruct(
            id=str(uuid.uuid4()),
            vector=embedding,  # Your 1536-dim vector
            payload={
                "title": "Privacy Act 1988 Overview",
                "category": "compliance",
                "state": "federal",
                "date": "2024-01-15",
                "word_count": 2500,
                "tags": ["privacy", "legislation", "data-protection"]
            }
        )
    ]
)
# Batch insertion (much faster)
points = [
    models.PointStruct(
        id=str(uuid.uuid4()),
        vector=doc["embedding"],
        payload={
            "title": doc["title"],
            "content": doc["text"][:1000],
            "category": doc["category"],
            "source": doc["source"]
        }
    )
    for doc in documents
]
# Upload in batches
batch_size = 100
for i in range(0, len(points), batch_size):
    batch = points[i:i + batch_size]
    client.upsert(
        collection_name="australian_documents",
        points=batch
    )
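The examples above generate a random `uuid4` for every point, so re-running an ingestion job creates duplicates. Since `upsert` overwrites a point that has the same ID, one option is to derive the ID deterministically from the document. The sketch below assumes that a document's source and title together identify it; the namespace URL and function name are hypothetical.

```python
import uuid

# Hypothetical fixed namespace for this application; any constant UUID works
DOC_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/australian_documents")


def stable_point_id(source: str, title: str) -> str:
    """Derive the same UUID string every time for the same document."""
    return str(uuid.uuid5(DOC_NAMESPACE, f"{source}|{title}"))


if __name__ == "__main__":
    first = stable_point_id("OAIC", "Privacy Act 1988 Overview")
    second = stable_point_id("OAIC", "Privacy Act 1988 Overview")
    print(first == second)
```

Passing the result as `id=` in `PointStruct` means a repeated ingestion updates the existing point instead of adding a new one.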
Payload Indexing for Fast Filtering
# Create payload indexes for fields you'll filter on
client.create_payload_index(
    collection_name="australian_documents",
    field_name="category",
    field_schema=models.PayloadSchemaType.KEYWORD
)
client.create_payload_index(
    collection_name="australian_documents",
    field_name="date",
    field_schema=models.PayloadSchemaType.DATETIME
)
client.create_payload_index(
    collection_name="australian_documents",
    field_name="word_count",
    field_schema=models.PayloadSchemaType.INTEGER
)
Advanced Querying and Filtering
Qdrant's filtering capabilities are among the most powerful of any vector database, enabling complex hybrid search queries that combine semantic similarity with precise attribute filtering.
Basic Similarity Search
# Simple vector search
results = client.search(
    collection_name="australian_documents",
    query_vector=query_embedding,
    limit=10
)
for result in results:
    print(f"Score: {result.score:.4f}")
    print(f"Title: {result.payload.get('title')}")
    # Default to an empty string so points without content do not raise
    print(f"Content: {result.payload.get('content', '')[:200]}...")
    print("---")
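Conceptually, a similarity search scores every stored vector against the query and returns the best matches. The brute-force sketch below shows that idea in plain Python; it is an illustration only, since Qdrant uses an HNSW index to avoid scanning every point. The function names and the sample data are made up.

```python
import heapq
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def exact_top_k(query, points, k=3):
    """Score every point against the query and return the k best (id, score) pairs.

    `points` maps a point ID to its vector.
    """
    scored = ((_cosine(query, vector), point_id) for point_id, vector in points.items())
    return [(point_id, score) for score, point_id in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    points = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    for point_id, score in exact_top_k([1.0, 0.0], points, k=2):
        print(point_id, round(score, 4))
```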
Filtered Search (Hybrid Queries)
from datetime import datetime
# Filter by category
results = client.search(
    collection_name="australian_documents",
    query_vector=query_embedding,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="category",
                match=models.MatchValue(value="compliance")
            )
        ]
    ),
    limit=10
)
# Multiple conditions with AND
results = client.search(
    collection_name="australian_documents",
    query_vector=query_embedding,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="category",
                match=models.MatchValue(value="compliance")
            ),
            models.FieldCondition(
                key="state",
                match=models.MatchAny(any=["NSW", "VIC", "federal"])
            ),
            models.FieldCondition(
                key="date",
                # Datetime payloads need DatetimeRange; Range is for numeric values
                range=models.DatetimeRange(gte=datetime(2024, 1, 1))
            )
        ]
    ),
    limit=10
)
Complex Filter Combinations
# OR conditions with should
results = client.search(
    collection_name="australian_documents",
    query_vector=query_embedding,
    query_filter=models.Filter(
        should=[
            models.FieldCondition(
                key="category",
                match=models.MatchValue(value="privacy")
            ),
            models.FieldCondition(
                key="tags",
                match=models.MatchAny(any=["data-protection", "GDPR"])
            )
        ]
    ),
    limit=10
)
# Nested AND/OR
results = client.search(
    collection_name="australian_documents",
    query_vector=query_embedding,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="word_count",
                range=models.Range(gte=1000, lte=5000)
            )
        ],
        should=[
            models.Filter(
                must=[
                    models.FieldCondition(
                        key="category",
                        match=models.MatchValue(value="compliance")
                    ),
                    models.FieldCondition(
                        key="state",
                        match=models.MatchValue(value="federal")
                    )
                ]
            ),
            models.Filter(
                must=[
                    models.FieldCondition(
                        key="category",
                        match=models.MatchValue(value="legal")
                    )
                ]
            )
        ]
    ),
    limit=10
)
Text Matching Filters
# Full-text search within payload (requires text index)
client.create_payload_index(
    collection_name="australian_documents",
    field_name="content",
    field_schema=models.TextIndexParams(
        type="text",
        tokenizer=models.TokenizerType.WORD,
        min_token_len=2,
        max_token_len=15,
        lowercase=True
    )
)
# Search with text filter
results = client.search(
    collection_name="australian_documents",
    query_vector=query_embedding,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="content",
                match=models.MatchText(text="privacy breach")
            )
        ]
    ),
    limit=10
)
| Filter Type | Use Case | Example |
|---|---|---|
| MatchValue | Exact match | category = "compliance" |
| MatchAny | Multiple values (OR) | state IN ["NSW", "VIC"] |
| MatchExcept | Exclusion | status NOT IN ["draft"] |
| Range / DatetimeRange | Numeric/date ranges | date >= "2024-01-01" |
| MatchText | Full-text search | content CONTAINS "privacy" |
| IsEmpty | Empty or missing field checks | tags IS EMPTY |
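To show how `must` and `should` clauses combine, the sketch below evaluates a filter written as a plain dictionary (the same shape as the JSON filter body) against a payload. It is a simplified model for reasoning about filters, not Qdrant's implementation, and it only covers exact match, match-any, and numeric range conditions.

```python
def check_condition(payload: dict, condition: dict) -> bool:
    """Evaluate one condition, or a nested filter, against a payload."""
    if any(key in condition for key in ("must", "should", "must_not")):
        return matches(payload, condition)  # nested filter
    value = payload.get(condition["key"])
    if "match" in condition:
        values = value if isinstance(value, list) else [value]
        match = condition["match"]
        if "value" in match:
            return match["value"] in values
        if "any" in match:
            return any(v in match["any"] for v in values)
    if "range" in condition:
        if not isinstance(value, (int, float)):
            return False
        bounds = condition["range"]
        return (
            ("gte" not in bounds or value >= bounds["gte"])
            and ("lte" not in bounds or value <= bounds["lte"])
        )
    raise ValueError(f"Unsupported condition: {condition}")


def matches(payload: dict, flt: dict) -> bool:
    """must = all true, should = at least one true, must_not = none true."""
    must = flt.get("must", [])
    should = flt.get("should", [])
    must_not = flt.get("must_not", [])
    if not all(check_condition(payload, c) for c in must):
        return False
    if should and not any(check_condition(payload, c) for c in should):
        return False
    return not any(check_condition(payload, c) for c in must_not)


if __name__ == "__main__":
    flt = {
        "must": [{"key": "word_count", "range": {"gte": 1000, "lte": 5000}}],
        "should": [
            {"must": [
                {"key": "category", "match": {"value": "compliance"}},
                {"key": "state", "match": {"value": "federal"}},
            ]},
            {"must": [{"key": "category", "match": {"value": "legal"}}]},
        ],
    }
    print(matches({"category": "legal", "state": "NSW", "word_count": 2000}, flt))
```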
Building a RAG System with Qdrant
Let's build a complete retrieval-augmented generation system using Qdrant, designed for Australian business knowledge management.
from qdrant_client import QdrantClient, models
from openai import OpenAI
from typing import Optional
import uuid
class AustralianKnowledgeRAG:
    """RAG system using Qdrant for Australian business knowledge."""
    def __init__(
        self,
        qdrant_host: str = "localhost",
        qdrant_port: int = 6333,
        qdrant_api_key: Optional[str] = None,
        openai_api_key: Optional[str] = None,
        collection_name: str = "au_knowledge"
    ):
        # Initialise clients
        self.qdrant = QdrantClient(
            host=qdrant_host,
            port=qdrant_port,
            api_key=qdrant_api_key
        )
        self.openai = OpenAI(api_key=openai_api_key)
        self.collection_name = collection_name
        self.embedding_model = "text-embedding-3-small"
        # Ensure collection exists
        self._init_collection()
    def _init_collection(self):
        """Create collection if it doesn't exist."""
        collections = self.qdrant.get_collections().collections
        exists = any(c.name == self.collection_name for c in collections)
        if not exists:
            self.qdrant.create_collection(
                collection_name=self.collection_name,
                vectors_config=models.VectorParams(
                    size=1536,
                    distance=models.Distance.COSINE
                )
            )
            # Create payload indexes
            for field, schema in [
                ("category", models.PayloadSchemaType.KEYWORD),
                ("source", models.PayloadSchemaType.KEYWORD),
                ("date", models.PayloadSchemaType.DATETIME)
            ]:
                self.qdrant.create_payload_index(
                    collection_name=self.collection_name,
                    field_name=field,
                    field_schema=schema
                )
    def get_embedding(self, text: str) -> list[float]:
        """Generate embedding for text."""
        response = self.openai.embeddings.create(
            input=text,
            model=self.embedding_model
        )
        return response.data[0].embedding
    def add_documents(self, documents: list[dict]) -> int:
        """Add documents to the knowledge base."""
        points = []
        for doc in documents:
            embedding = self.get_embedding(doc["content"])
            points.append(
                models.PointStruct(
                    id=str(uuid.uuid4()),
                    vector=embedding,
                    payload={
                        "title": doc.get("title", ""),
                        "content": doc["content"],
                        "category": doc.get("category", "general"),
                        "source": doc.get("source", ""),
                        "date": doc.get("date", ""),
                        "metadata": doc.get("metadata", {})
                    }
                )
            )
        # Batch upload
        self.qdrant.upsert(
            collection_name=self.collection_name,
            points=points
        )
        return len(points)
    def search(
        self,
        query: str,
        limit: int = 5,
        category: Optional[str] = None,
        min_score: float = 0.7
    ) -> list[dict]:
        """Search for relevant documents."""
        query_embedding = self.get_embedding(query)
        # Build filter
        filter_conditions = []
        if category:
            filter_conditions.append(
                models.FieldCondition(
                    key="category",
                    match=models.MatchValue(value=category)
                )
            )
        query_filter = models.Filter(must=filter_conditions) if filter_conditions else None
        results = self.qdrant.search(
            collection_name=self.collection_name,
            query_vector=query_embedding,
            query_filter=query_filter,
            limit=limit,
            score_threshold=min_score
        )
        return [
            {
                "title": r.payload.get("title", ""),
                "content": r.payload.get("content", ""),
                "category": r.payload.get("category", ""),
                "source": r.payload.get("source", ""),
                "score": r.score
            }
            for r in results
        ]
    def generate_response(
        self,
        query: str,
        context_docs: list[dict]
    ) -> str:
        """Generate response using retrieved context."""
        # Join retrieved documents, separated by a divider line
        context = "\n---\n".join([
            f"Source: {doc['title']} ({doc['source']})\n{doc['content']}"
            for doc in context_docs
        ])
        system_prompt = """You are an expert assistant for Australian businesses.
Use Australian English spelling (organisation, colour, centre).
Answer based on the provided context. Cite sources where relevant.
If the context doesn't contain the answer, acknowledge this."""
        response = self.openai.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}
            ],
            temperature=0.3
        )
        return response.choices[0].message.content
    def ask(
        self,
        query: str,
        category: Optional[str] = None,
        limit: int = 5
    ) -> dict:
        """Complete RAG pipeline."""
        docs = self.search(query, limit=limit, category=category)
        if not docs:
            return {
                "query": query,
                "response": "I couldn't find relevant information for your query.",
                "sources": []
            }
        response = self.generate_response(query, docs)
        return {
            "query": query,
            "response": response,
            "sources": [{"title": d["title"], "source": d["source"]} for d in docs]
        }
# Usage example
rag = AustralianKnowledgeRAG(
    openai_api_key="your-key",
    collection_name="au_business_knowledge"
)
# Add documents
rag.add_documents([
    {
        "title": "Privacy Act Overview",
        "content": "The Privacy Act 1988 regulates...",
        "category": "compliance",
        "source": "OAIC"
    }
])
# Query
result = rag.ask(
    "What are the key requirements under the Privacy Act?",
    category="compliance"
)
Qdrant Cloud: Managed Deployment
For teams that want Qdrant's capabilities without infrastructure management, Qdrant Cloud provides a fully managed service with global deployment options.
Getting Started with Qdrant Cloud
- Sign up at cloud.qdrant.io
- Create a cluster (free tier available)
- Choose your region (AWS Sydney available for Australian latency)
- Generate an API key
Connecting to Qdrant Cloud
from qdrant_client import QdrantClient, models
# Connect to Qdrant Cloud
client = QdrantClient(
    url="https://your-cluster-id.aws.cloud.qdrant.io:6333",
    api_key="your-api-key"
)
# All operations work the same as self-hosted
client.create_collection(
    collection_name="my_collection",
    vectors_config=models.VectorParams(
        size=1536,
        distance=models.Distance.COSINE
    )
)
Qdrant Cloud Pricing
| Tier | Vectors | Monthly Cost (USD) | Best For |
|---|---|---|---|
| Free | 1M vectors | $0 | Development, testing |
| Starter | 5M vectors | ~$25 | Small production apps |
| Standard | 25M vectors | ~$100 | Medium workloads |
| Enterprise | Custom | Custom | Large scale, compliance |
Self-Hosted vs Cloud Decision Matrix
| Factor | Self-Hosted | Qdrant Cloud |
|---|---|---|
| Setup Time | Hours to days | Minutes |
| Maintenance | Your responsibility | Managed |
| Data Sovereignty | Full control | Region selection |
| Cost at Scale | Infrastructure only | Premium pricing |
| Compliance | Your audit | SOC 2 available |
Production Deployment and Scaling
Deploying Qdrant in production requires attention to performance tuning, high availability, and monitoring.
Performance Tuning
# Optimise collection for performance
client.update_collection(
    collection_name="production_docs",
    optimizers_config=models.OptimizersConfigDiff(
        # Segment size (KB) above which the HNSW index is built; lower values index sooner
        indexing_threshold=10000,
        # Segment size (KB) above which vectors are memory-mapped from disk
        memmap_threshold=50000
    ),
    hnsw_config=models.HnswConfigDiff(
        # Higher m = better recall, more memory
        m=16,
        # Higher ef_construct = better index quality, slower build
        ef_construct=100
    )
)
# Set search-time parameters
results = client.search(
    collection_name="production_docs",
    query_vector=query_embedding,
    limit=10,
    search_params=models.SearchParams(
        # Higher ef = better recall, slower search
        hnsw_ef=128,
        # Set exact=True for brute-force search (slower but 100% recall)
        exact=False
    )
)
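When tuning `hnsw_ef`, it helps to measure how much recall the approximate search gives up. One approach is to run the same query twice, once with `exact=True` and once with `exact=False`, then compare the returned IDs. The helper below is a minimal sketch of that comparison; the function name is illustrative and the ID lists are assumed to be ordered best match first.

```python
def recall_at_k(approx_ids: list, exact_ids: list, k: int) -> float:
    """Fraction of the exact top-k results that the approximate search also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 1.0  # nothing to find, so nothing was missed
    hits = sum(1 for point_id in approx_ids[:k] if point_id in truth)
    return hits / len(truth)


if __name__ == "__main__":
    exact = [11, 42, 7, 19, 3]
    approx = [11, 7, 42, 88, 3]
    print(recall_at_k(approx, exact, k=5))
```

Averaging this value over a sample of real queries gives a more reliable picture than a single query.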
High Availability with Distributed Mode
# docker-compose-cluster.yml
version: '3.8'
services:
  qdrant-node1:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
    volumes:
      - ./node1_storage:/qdrant/storage
    environment:
      - QDRANT__CLUSTER__ENABLED=true
      - QDRANT__CLUSTER__P2P__PORT=6335
    # The first node advertises its own peer URI
    command: ./qdrant --uri http://qdrant-node1:6335
  qdrant-node2:
    image: qdrant/qdrant:latest
    ports:
      - "6334:6333"
    volumes:
      - ./node2_storage:/qdrant/storage
    environment:
      - QDRANT__CLUSTER__ENABLED=true
      - QDRANT__CLUSTER__P2P__PORT=6335
    # Additional nodes join through an existing peer via the --bootstrap argument
    command: ./qdrant --bootstrap http://qdrant-node1:6335
Monitoring and Health Checks
# Health check endpoint
import requests
def check_qdrant_health(host: str, port: int) -> dict:
    """Check Qdrant health status."""
    # /healthz returns plain text rather than JSON, so report the HTTP status
    response = requests.get(f"http://{host}:{port}/healthz", timeout=5)
    return {"healthy": response.ok, "status_code": response.status_code}
# Get telemetry data
def get_telemetry(host: str, port: int) -> dict:
    """Get Qdrant metrics."""
    response = requests.get(f"http://{host}:{port}/telemetry", timeout=5)
    response.raise_for_status()
    return response.json()
# Collection statistics
def get_collection_stats(client, collection_name: str) -> dict:
    """Get detailed collection statistics."""
    info = client.get_collection(collection_name)
    return {
        "points_count": info.points_count,
        "indexed_vectors_count": info.indexed_vectors_count,
        "segments_count": info.segments_count,
        "status": info.status,
        "optimizer_status": info.optimizer_status
    }
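Health checks are often wrapped in a retry loop, for example when waiting for a container to finish starting. The sketch below takes any zero-argument probe that returns `True` when the service is healthy; the function name and parameters are illustrative, and the probe itself is left to the caller.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, pausing between tries.

    Exceptions from the probe (for example a refused connection) count as
    an unhealthy result rather than stopping the loop.
    """
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except Exception:
            pass  # treat errors as "not healthy yet"
        if attempt < attempts:
            sleep(delay_seconds)
    return False


if __name__ == "__main__":
    # Stand-in probe that succeeds on the third call
    answers = iter([False, False, True])
    print(wait_until_healthy(lambda: next(answers), delay_seconds=0.01))
```

In practice the probe could wrap an HTTP request to the `/healthz` endpoint shown earlier and return whether the response status was successful.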
Backup and Recovery
# Create snapshot
snapshot_info = client.create_snapshot(collection_name="production_docs")
print(f"Snapshot created: {snapshot_info.name}")
# List snapshots
snapshots = client.list_snapshots(collection_name="production_docs")
# Recover from snapshot (here, the one created above)
client.recover_snapshot(
    collection_name="production_docs",
    location=f"http://localhost:6333/collections/production_docs/snapshots/{snapshot_info.name}"
)
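Regular snapshots accumulate on disk, so automated backups usually need a retention rule. The sketch below picks which snapshots to remove when keeping only the newest few. It works on plain `(name, created_at)` tuples, which is an assumed shape for illustration; mapping from the snapshot listing and the deletion call itself are left out.

```python
from datetime import datetime


def snapshots_to_delete(
    snapshots: list[tuple[str, datetime]],
    keep: int = 7,
) -> list[str]:
    """Return the names of all snapshots except the `keep` most recent."""
    if keep < 0:
        raise ValueError("keep must be zero or greater")
    newest_first = sorted(snapshots, key=lambda item: item[1], reverse=True)
    return [name for name, _ in newest_first[keep:]]


if __name__ == "__main__":
    existing = [
        ("docs-2024-01-01.snapshot", datetime(2024, 1, 1)),
        ("docs-2024-01-03.snapshot", datetime(2024, 1, 3)),
        ("docs-2024-01-02.snapshot", datetime(2024, 1, 2)),
    ]
    print(snapshots_to_delete(existing, keep=2))
```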
Production Checklist
- ✓ Enable API key authentication
- ✓ Configure TLS for encrypted connections
- ✓ Set up regular automated snapshots
- ✓ Monitor memory and disk usage
- ✓ Implement health check endpoints
- ✓ Use payload indexes for filtered queries
- ✓ Test recovery procedures
- ✓ Document collection schemas
Australian Deployment Case Study
A Sydney-based legal tech startup deployed Qdrant to power their contract analysis platform, demonstrating self-hosted vector database deployment for Australian data sovereignty requirements.
Case Study: Legal Document Search Platform
Challenge
The startup needed semantic search across millions of legal documents while maintaining strict data sovereignty requirements. Documents contained sensitive client information that couldn't leave Australian jurisdiction.
Solution
- Infrastructure: Self-hosted Qdrant on AWS Sydney (ap-southeast-2)
- Architecture: 3-node cluster for high availability
- Security: TLS encryption, API key auth, VPC isolation
- Integration: Custom embedding pipeline with local model option
Technical Implementation
from qdrant_client import QdrantClient, models
# Multi-tenant strategy: upsert has no namespace argument, so the tenant key is stored in the payload
def get_client_namespace(client_id: str) -> str:
    return f"client_{client_id}"
# Secure document ingestion
def ingest_client_document(
    client_id: str,
    document: dict,
    qdrant_client: QdrantClient
):
    namespace = get_client_namespace(client_id)
    # Embed with local model for sensitive docs
    # (local_embedding_model and openai_embed are assumed to be defined elsewhere)
    if document.get("sensitivity") == "high":
        embedding = local_embedding_model.encode(document["text"])
    else:
        embedding = openai_embed(document["text"])
    qdrant_client.upsert(
        collection_name="legal_docs",
        points=[
            models.PointStruct(
                id=document["id"],
                vector=embedding,
                payload={
                    # Tenant isolation: every query must filter on this field
                    "tenant": namespace,
                    "title": document["title"],
                    "type": document["doc_type"],
                    "date": document["date"],
                    "parties": document["parties"],
                    "jurisdiction": document["jurisdiction"]
                }
            )
        ]
    )
Results
Key Learnings
- Self-hosting provides full control for compliance requirements
- AWS Sydney region offers excellent latency for Australian users
- Payload-based multi-tenancy (a tenant field filtered on every query) enables client isolation within a shared collection
- Hybrid embedding approach balances performance and sensitivity
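A common way to isolate tenants in a shared collection is to add a tenant condition to every query filter, so that no code path can search without it. The sketch below does this for a filter expressed as a plain dictionary (the JSON filter shape); the `tenant_id` field name and the function name are assumptions for illustration.

```python
from copy import deepcopy
from typing import Optional


def with_tenant_filter(
    tenant_id: str,
    user_filter: Optional[dict] = None,
    tenant_key: str = "tenant_id",
) -> dict:
    """Return a copy of user_filter that also requires the tenant to match."""
    scoped = deepcopy(user_filter) if user_filter else {}
    must = list(scoped.get("must", []))
    must.append({"key": tenant_key, "match": {"value": tenant_id}})
    scoped["must"] = must
    return scoped


if __name__ == "__main__":
    user_filter = {"should": [{"key": "type", "match": {"value": "contract"}}]}
    print(with_tenant_filter("client_42", user_filter))
```

Routing every search through a helper like this, rather than building filters ad hoc, makes it harder to forget the tenant condition. Indexing the tenant field as a keyword payload index keeps these filtered queries fast.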
Conclusion
Qdrant offers Australian businesses a powerful, flexible vector database solution with full control over data and infrastructure. Whether you choose self-hosted deployment for data sovereignty or Qdrant Cloud for convenience, the platform delivers the performance and features needed for production AI applications.
The combination of open-source transparency, Rust-powered performance, and sophisticated filtering capabilities makes Qdrant particularly well-suited for building semantic search, RAG systems, and recommendation engines. For organisations with compliance requirements or cost-sensitive workloads, self-hosting in Australian data centres provides complete control while keeping query latency low for local applications.
Start with Docker for development, validate your use case with real data, and scale confidently knowing that Qdrant's architecture supports everything from single-node deployments to distributed clusters handling billions of vectors.
Frequently Asked Questions
Is Qdrant suitable for production use?
How does Qdrant compare to Pinecone for Australian businesses?
What are the infrastructure requirements for self-hosting?
How do I handle multi-tenancy in Qdrant?
What embedding models work best with Qdrant?
How do I migrate from another vector database?
What is the query latency for Qdrant?
How do I secure my Qdrant deployment?
