Master Qdrant for building powerful vector search applications. Complete guide to self-hosted and cloud deployment, filtering, and production optimisation for Australian developers.
Qdrant is a high-performance, open-source vector database that's rapidly becoming the preferred choice for developers who need full control over their AI infrastructure. For Australian businesses with data sovereignty requirements or those seeking cost-effective scaling, Qdrant offers a compelling combination of performance, flexibility, and deployment options.
This comprehensive guide covers everything from local development to production deployment of Qdrant, with practical examples and Australian business context throughout. Whether you're building a RAG system, semantic search engine, or recommendation platform, understanding Qdrant empowers you to build AI applications with complete infrastructure control.
Qdrant distinguishes itself through exceptional performance, rich filtering capabilities, and deployment flexibility. Built in Rust for speed and reliability, it handles production workloads while remaining accessible for development.
| Feature | Qdrant | Pinecone | Weaviate | Chroma |
|---|---|---|---|---|
| Open Source | ✓ Apache 2.0 | ✗ Managed only | ✓ BSD-3 | ✓ Apache 2.0 |
| Self-Hosted | ✓ Full support | ✗ No | ✓ Full support | ✓ Full support |
| Cloud Option | ✓ Qdrant Cloud | ✓ Primary | ✓ Weaviate Cloud | ✗ Limited |
| Performance | Excellent (Rust) | Excellent | Good (Go) | Moderate |
| Filtering | Advanced | Good | Advanced | Basic |
| Australian DC | ✓ Self-host/AWS Sydney | ✗ US/EU only | ✓ Self-host | ✓ Self-host |
Self-host in Australian data centres or on AWS Sydney. Full control over where your data resides for Privacy Act compliance.
No per-query pricing. Pay only for infrastructure. Significant savings at scale compared to managed services.
Deploy close to your users. Australian-hosted Qdrant means sub-50ms queries from local applications.
Tune every parameter. Choose from the built-in distance metrics. Integrate with existing infrastructure seamlessly.
Qdrant offers multiple deployment options from local development to production clusters. Let's explore each approach.
# Pull and run Qdrant container
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage:z \
qdrant/qdrant
# Qdrant is now available at:
# REST API: http://localhost:6333
# gRPC: localhost:6334
# Dashboard: http://localhost:6333/dashboard
# docker-compose.yml
version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant:latest
    restart: always
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage
      - ./qdrant_config:/qdrant/config
    environment:
      - QDRANT__SERVICE__API_KEY=your-secure-api-key
      # Enabling TLS also requires a certificate and key in the tls section of the config
      - QDRANT__SERVICE__ENABLE_TLS=true
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 2G
# Download latest release
wget https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xzf qdrant-x86_64-unknown-linux-gnu.tar.gz
# Run with default config
./qdrant
# Install Qdrant client
pip install qdrant-client
# Optional: include FastEmbed for local embedding generation
pip install "qdrant-client[fastembed]"
from qdrant_client import QdrantClient
# Connect to local instance
client = QdrantClient(host="localhost", port=6333)
# Or connect with API key
client = QdrantClient(
host="localhost",
port=6333,
api_key="your-api-key"
)
# Verify connection
print(client.get_collections())
Collections in Qdrant are similar to tables in traditional databases. They store vectors along with associated payload data and support sophisticated filtering.
from qdrant_client import QdrantClient, models
client = QdrantClient(host="localhost", port=6333)
# Create collection with optimal settings
client.create_collection(
collection_name="australian_documents",
vectors_config=models.VectorParams(
size=1536, # OpenAI embedding dimension
distance=models.Distance.COSINE
),
# Optimise for different workloads
optimizers_config=models.OptimizersConfigDiff(
indexing_threshold=20000, # When to build index
memmap_threshold=50000 # When to use disk
),
# HNSW index parameters
hnsw_config=models.HnswConfigDiff(
m=16, # Connections per node
ef_construct=100 # Build-time accuracy
)
)
Cosine similarity: Best for normalised embeddings. Most common choice for text embeddings from OpenAI, Cohere, etc.
Range: -1 to 1
Euclidean distance: Direct distance between points. Good for image embeddings and when magnitude matters.
Range: 0 to ∞
Dot product: Fastest computation. Use when vectors are already normalised or for recommendation systems.
Range: -∞ to ∞
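To make the difference between the three metrics concrete, here is a minimal pure-Python sketch. It is only an illustration of the maths, not how Qdrant computes distances internally, and the helper names (`dot`, `cosine`, `euclidean`, `normalise`) are made up for this example.

```python
import math


def dot(a, b):
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: direction only, always between -1 and 1."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    """Euclidean distance: straight-line distance, 0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalise(a):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(a, a))
    return [x / length for x in a]


# Two vectors pointing the same way, with different magnitudes
a = [3.0, 4.0]
b = [6.0, 8.0]

print("cosine:", cosine(a, b))        # direction is identical
print("euclidean:", euclidean(a, b))  # magnitudes differ, so distance is non-zero
print("dot:", dot(a, b))              # grows with magnitude

# After normalisation, the dot product agrees with cosine similarity
print("dot (normalised):", dot(normalise(a), normalise(b)))
```

The last line shows why dot product is a reasonable choice when embeddings are already unit length: it ranks results the same way as cosine similarity without the extra division.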
from qdrant_client import models
import uuid
# Single point insertion
client.upsert(
collection_name="australian_documents",
points=[
models.PointStruct(
id=str(uuid.uuid4()),
vector=embedding, # Your 1536-dim vector
payload={
"title": "Privacy Act 1988 Overview",
"category": "compliance",
"state": "federal",
"date": "2024-01-15",
"word_count": 2500,
"tags": ["privacy", "legislation", "data-protection"]
}
)
]
)
# Batch insertion (much faster)
points = [
models.PointStruct(
id=str(uuid.uuid4()),
vector=doc["embedding"],
payload={
"title": doc["title"],
"content": doc["text"][:1000],
"category": doc["category"],
"source": doc["source"]
}
)
for doc in documents
]
# Upload in batches
batch_size = 100
for i in range(0, len(points), batch_size):
    batch = points[i:i + batch_size]
    client.upsert(
        collection_name="australian_documents",
        points=batch
    )
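The slicing loop above needs every point in memory first. For larger corpora, one option is to batch from a generator instead. The following is a small sketch under that assumption; `chunked` is a hypothetical helper, not part of `qdrant-client`, and the dictionaries stand in for `PointStruct` objects.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Yield lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a lazily produced stream of points
fake_points = ({"id": i} for i in range(250))
batch_sizes = [len(batch) for batch in chunked(fake_points, 100)]
print(batch_sizes)  # [100, 100, 50]
```

Each yielded batch could then be passed to `client.upsert(...)` exactly as in the loop above, so only one batch of points is held in memory at a time.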
# Create payload indexes for fields you'll filter on
client.create_payload_index(
collection_name="australian_documents",
field_name="category",
field_schema=models.PayloadSchemaType.KEYWORD
)
client.create_payload_index(
collection_name="australian_documents",
field_name="date",
field_schema=models.PayloadSchemaType.DATETIME
)
client.create_payload_index(
collection_name="australian_documents",
field_name="word_count",
field_schema=models.PayloadSchemaType.INTEGER
)
Qdrant's filtering capabilities are among the most powerful of any vector database, enabling complex hybrid search queries that combine semantic similarity with precise attribute filtering.
# Simple vector search
results = client.search(
collection_name="australian_documents",
query_vector=query_embedding,
limit=10
)
for result in results:
    payload = result.payload or {}
    print(f"Score: {result.score:.4f}")
    print(f"Title: {payload.get('title')}")
    # Not every point stores a "content" field, so default to an empty string
    print(f"Content: {payload.get('content', '')[:200]}...")
    print("---")
# Filter by category
results = client.search(
collection_name="australian_documents",
query_vector=query_embedding,
query_filter=models.Filter(
must=[
models.FieldCondition(
key="category",
match=models.MatchValue(value="compliance")
)
]
),
limit=10
)
# Multiple conditions with AND
results = client.search(
collection_name="australian_documents",
query_vector=query_embedding,
query_filter=models.Filter(
must=[
models.FieldCondition(
key="category",
match=models.MatchValue(value="compliance")
),
models.FieldCondition(
key="state",
match=models.MatchAny(any=["NSW", "VIC", "federal"])
),
models.FieldCondition(
key="date",
range=models.DatetimeRange(gte="2024-01-01T00:00:00Z")  # Range is numeric only; datetime fields use DatetimeRange
)
]
),
limit=10
)
# OR conditions with should
results = client.search(
collection_name="australian_documents",
query_vector=query_embedding,
query_filter=models.Filter(
should=[
models.FieldCondition(
key="category",
match=models.MatchValue(value="privacy")
),
models.FieldCondition(
key="tags",
match=models.MatchAny(any=["data-protection", "GDPR"])
)
]
),
limit=10
)
# Nested AND/OR
results = client.search(
collection_name="australian_documents",
query_vector=query_embedding,
query_filter=models.Filter(
must=[
models.FieldCondition(
key="word_count",
range=models.Range(gte=1000, lte=5000)
)
],
should=[
models.Filter(
must=[
models.FieldCondition(
key="category",
match=models.MatchValue(value="compliance")
),
models.FieldCondition(
key="state",
match=models.MatchValue(value="federal")
)
]
),
models.Filter(
must=[
models.FieldCondition(
key="category",
match=models.MatchValue(value="legal")
)
]
)
]
),
limit=10
)
# Full-text search within payload (requires text index)
client.create_payload_index(
collection_name="australian_documents",
field_name="content",
field_schema=models.TextIndexParams(
type="text",
tokenizer=models.TokenizerType.WORD,
min_token_len=2,
max_token_len=15,
lowercase=True
)
)
# Search with text filter
results = client.search(
collection_name="australian_documents",
query_vector=query_embedding,
query_filter=models.Filter(
must=[
models.FieldCondition(
key="content",
match=models.MatchText(text="privacy breach")
)
]
),
limit=10
)
| Filter Type | Use Case | Example |
|---|---|---|
| MatchValue | Exact match | category = "compliance" |
| MatchAny | Multiple values (OR) | state IN ["NSW", "VIC"] |
| MatchExcept | Exclusion | status NOT IN ["draft"] |
| Range / DatetimeRange | Numeric/date ranges | date >= "2024-01-01" |
| MatchText | Full-text search | content CONTAINS "privacy" |
| IsEmpty | Empty or missing field checks | tags IS EMPTY |
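It can help to see how the `must`, `should`, and `must_not` clauses combine before writing nested filters. The sketch below is a toy evaluator over plain dictionaries, not Qdrant's implementation: it assumes that all `must` conditions have to hold, at least one `should` condition has to hold when any are given, and no `must_not` condition may hold. The helpers `matches`, `field_equals`, and `field_between` are hypothetical names for this illustration.

```python
def field_equals(key, value):
    """Condition: payload[key] equals value."""
    return lambda payload: payload.get(key) == value


def field_between(key, low, high):
    """Condition: payload[key] lies in the inclusive range [low, high]."""
    return lambda payload: key in payload and low <= payload[key] <= high


def matches(payload: dict, flt: dict) -> bool:
    """Evaluate a toy filter made of must / should / must_not clauses."""
    def check(cond):
        # A nested filter is a dict; a leaf condition is a callable
        if isinstance(cond, dict):
            return matches(payload, cond)
        return cond(payload)

    must = flt.get("must", [])
    should = flt.get("should", [])
    must_not = flt.get("must_not", [])
    return (
        all(check(c) for c in must)
        and (not should or any(check(c) for c in should))
        and not any(check(c) for c in must_not)
    )


# word_count in range AND ((compliance AND federal) OR legal)
nested = {
    "must": [field_between("word_count", 1000, 5000)],
    "should": [
        {"must": [field_equals("category", "compliance"),
                  field_equals("state", "federal")]},
        {"must": [field_equals("category", "legal")]},
    ],
}

doc_a = {"category": "compliance", "state": "federal", "word_count": 2500}
doc_b = {"category": "compliance", "state": "NSW", "word_count": 2500}
doc_c = {"category": "legal", "state": "NSW", "word_count": 800}

for name, doc in [("a", doc_a), ("b", doc_b), ("c", doc_c)]:
    print(name, matches(doc, nested))
```

Here `doc_a` passes, `doc_b` fails because neither `should` branch matches, and `doc_c` fails the `must` range even though its category matches.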
Let's build a complete retrieval-augmented generation system using Qdrant, designed for Australian business knowledge management.
from qdrant_client import QdrantClient, models
from openai import OpenAI
from typing import Optional
import uuid
class AustralianKnowledgeRAG:
    """RAG system using Qdrant for Australian business knowledge."""

    def __init__(
        self,
        qdrant_host: str = "localhost",
        qdrant_port: int = 6333,
        qdrant_api_key: Optional[str] = None,
        openai_api_key: Optional[str] = None,
        collection_name: str = "au_knowledge"
    ):
        # Initialize clients
        self.qdrant = QdrantClient(
            host=qdrant_host,
            port=qdrant_port,
            api_key=qdrant_api_key
        )
        self.openai = OpenAI(api_key=openai_api_key)
        self.collection_name = collection_name
        self.embedding_model = "text-embedding-3-small"
        # Ensure collection exists
        self._init_collection()

    def _init_collection(self):
        """Create collection if it doesn't exist."""
        collections = self.qdrant.get_collections().collections
        exists = any(c.name == self.collection_name for c in collections)
        if not exists:
            self.qdrant.create_collection(
                collection_name=self.collection_name,
                vectors_config=models.VectorParams(
                    size=1536,
                    distance=models.Distance.COSINE
                )
            )
            # Create payload indexes
            for field, schema in [
                ("category", models.PayloadSchemaType.KEYWORD),
                ("source", models.PayloadSchemaType.KEYWORD),
                ("date", models.PayloadSchemaType.DATETIME)
            ]:
                self.qdrant.create_payload_index(
                    collection_name=self.collection_name,
                    field_name=field,
                    field_schema=schema
                )

    def get_embedding(self, text: str) -> list[float]:
        """Generate embedding for text."""
        response = self.openai.embeddings.create(
            input=text,
            model=self.embedding_model
        )
        return response.data[0].embedding
    def add_documents(self, documents: list[dict]) -> int:
        """Add documents to the knowledge base."""
        points = []
        for doc in documents:
            embedding = self.get_embedding(doc["content"])
            points.append(
                models.PointStruct(
                    id=str(uuid.uuid4()),
                    vector=embedding,
                    payload={
                        "title": doc.get("title", ""),
                        "content": doc["content"],
                        "category": doc.get("category", "general"),
                        "source": doc.get("source", ""),
                        "date": doc.get("date", ""),
                        "metadata": doc.get("metadata", {})
                    }
                )
            )
        # Batch upload
        self.qdrant.upsert(
            collection_name=self.collection_name,
            points=points
        )
        return len(points)

    def search(
        self,
        query: str,
        limit: int = 5,
        category: Optional[str] = None,
        min_score: float = 0.7
    ) -> list[dict]:
        """Search for relevant documents."""
        query_embedding = self.get_embedding(query)
        # Build filter
        filter_conditions = []
        if category:
            filter_conditions.append(
                models.FieldCondition(
                    key="category",
                    match=models.MatchValue(value=category)
                )
            )
        query_filter = models.Filter(must=filter_conditions) if filter_conditions else None
        results = self.qdrant.search(
            collection_name=self.collection_name,
            query_vector=query_embedding,
            query_filter=query_filter,
            limit=limit,
            score_threshold=min_score
        )
        return [
            {
                "title": r.payload.get("title", ""),
                "content": r.payload.get("content", ""),
                "category": r.payload.get("category", ""),
                "source": r.payload.get("source", ""),
                "score": r.score
            }
            for r in results
        ]
    def generate_response(
        self,
        query: str,
        context_docs: list[dict]
    ) -> str:
        """Generate response using retrieved context."""
        # Join retrieved documents with a separator line between them
        context = "\n---\n".join([
            f"Source: {doc['title']} ({doc['source']})\n{doc['content']}"
            for doc in context_docs
        ])
        system_prompt = """You are an expert assistant for Australian businesses.
Use Australian English spelling (organisation, colour, centre).
Answer based on the provided context. Cite sources where relevant.
If the context doesn't contain the answer, acknowledge this."""
        response = self.openai.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}
            ],
            temperature=0.3
        )
        return response.choices[0].message.content
    def ask(
        self,
        query: str,
        category: Optional[str] = None,
        limit: int = 5
    ) -> dict:
        """Complete RAG pipeline."""
        docs = self.search(query, limit=limit, category=category)
        if not docs:
            return {
                "query": query,
                "response": "I couldn't find relevant information for your query.",
                "sources": []
            }
        response = self.generate_response(query, docs)
        return {
            "query": query,
            "response": response,
            "sources": [{"title": d["title"], "source": d["source"]} for d in docs]
        }
# Usage example
rag = AustralianKnowledgeRAG(
openai_api_key="your-key",
collection_name="au_business_knowledge"
)
# Add documents
rag.add_documents([
{
"title": "Privacy Act Overview",
"content": "The Privacy Act 1988 regulates...",
"category": "compliance",
"source": "OAIC"
}
])
# Query
result = rag.ask(
"What are the key requirements under the Privacy Act?",
category="compliance"
)
For teams that want Qdrant's capabilities without infrastructure management, Qdrant Cloud provides a fully managed service with global deployment options.
from qdrant_client import QdrantClient, models
# Connect to Qdrant Cloud
client = QdrantClient(
url="https://your-cluster-id.aws.cloud.qdrant.io:6333",
api_key="your-api-key"
)
# All operations work the same as self-hosted
client.create_collection(
collection_name="my_collection",
vectors_config=models.VectorParams(
size=1536,
distance=models.Distance.COSINE
)
)
| Tier | Vectors | Monthly Cost (USD) | Best For |
|---|---|---|---|
| Free | 1M vectors | $0 | Development, testing |
| Starter | 5M vectors | ~$25 | Small production apps |
| Standard | 25M vectors | ~$100 | Medium workloads |
| Enterprise | Custom | Custom | Large scale, compliance |
| Factor | Self-Hosted | Qdrant Cloud |
|---|---|---|
| Setup Time | Hours to days | Minutes |
| Maintenance | Your responsibility | Managed |
| Data Sovereignty | Full control | Region selection |
| Cost at Scale | Infrastructure only | Premium pricing |
| Compliance | Your audit | SOC 2 available |
Deploying Qdrant in production requires attention to performance tuning, high availability, and monitoring.
# Optimise collection for performance
client.update_collection(
collection_name="production_docs",
optimizers_config=models.OptimizersConfigDiff(
# Segment size (in KB) above which the HNSW index is built
indexing_threshold=10000,
# Segment size (in KB) above which vectors are memory-mapped from disk
memmap_threshold=50000
),
hnsw_config=models.HnswConfigDiff(
# Higher m = better recall, more memory
m=16,
# Higher ef_construct = better index quality, slower build
ef_construct=100
)
)
# Set search-time parameters
results = client.search(
collection_name="production_docs",
query_vector=query_embedding,
limit=10,
search_params=models.SearchParams(
# Higher ef = better recall, slower search
hnsw_ef=128,
# Exact search (slower but 100% recall)
exact=False
)
)
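When choosing thresholds and instance sizes, a back-of-the-envelope memory estimate is a useful starting point. The sketch below uses a rule of thumb commonly quoted for Qdrant capacity planning (vectors × dimensions × 4 bytes × 1.5 overhead). Treat both the overhead factor and the function name `estimate_vector_memory_bytes` as assumptions for illustration; payloads and payload indexes are not included, so measure against real data before committing to hardware.

```python
def estimate_vector_memory_bytes(
    num_vectors: int,
    dimensions: int,
    bytes_per_value: int = 4,   # float32 components
    overhead: float = 1.5,      # assumed allowance for index and metadata
) -> int:
    """Rough RAM estimate for keeping vectors and their index in memory."""
    return int(num_vectors * dimensions * bytes_per_value * overhead)


# One million 1536-dimensional embeddings, as used in the examples above
estimate = estimate_vector_memory_bytes(1_000_000, 1536)
print(f"{estimate / 2**30:.1f} GiB")  # about 8.6 GiB
```

If the estimate exceeds available RAM, a lower `memmap_threshold` moves more segments to disk at the cost of slower queries.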
# docker-compose-cluster.yml
version: '3.8'
services:
  qdrant-node1:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
    volumes:
      - ./node1_storage:/qdrant/storage
    environment:
      - QDRANT__CLUSTER__ENABLED=true
      - QDRANT__CLUSTER__P2P__PORT=6335
    # The first peer advertises its own address to the cluster
    command: ./qdrant --uri http://qdrant-node1:6335
  qdrant-node2:
    image: qdrant/qdrant:latest
    ports:
      - "6334:6333"
    volumes:
      - ./node2_storage:/qdrant/storage
    environment:
      - QDRANT__CLUSTER__ENABLED=true
      - QDRANT__CLUSTER__P2P__PORT=6335
    # Bootstrap is passed as a command-line flag, not an environment variable
    command: ./qdrant --bootstrap http://qdrant-node1:6335
# Health check endpoint
import requests

def check_qdrant_health(host: str, port: int) -> dict:
    """Check Qdrant health status."""
    # /healthz answers with a plain-text message rather than JSON
    response = requests.get(f"http://{host}:{port}/healthz", timeout=5)
    return {"healthy": response.ok, "detail": response.text.strip()}

# Get telemetry data
def get_telemetry(host: str, port: int) -> dict:
    """Get Qdrant metrics."""
    response = requests.get(f"http://{host}:{port}/telemetry", timeout=5)
    return response.json()

# Collection statistics
def get_collection_stats(client, collection_name: str) -> dict:
    """Get detailed collection statistics."""
    info = client.get_collection(collection_name)
    return {
        "points_count": info.points_count,
        "indexed_vectors_count": info.indexed_vectors_count,
        # CollectionInfo exposes a segment count, not a list of segments
        "segments_count": info.segments_count,
        "status": info.status,
        "optimizer_status": info.optimizer_status
    }
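A health endpoint is most useful when something polls it, for example before a deployment script starts sending traffic. The sketch below shows one simple way to wait for a node to report healthy. `wait_until_healthy` is a hypothetical helper, and the health probe is passed in as a callable so the retry logic can be exercised without a running server.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        try:
            if check():
                return True
        except Exception:
            # Connection errors while the node starts are treated as "not yet"
            pass
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Simulated probe: fails twice (once by raising), then succeeds
outcomes = iter([False, ConnectionError("refused"), True])


def fake_probe() -> bool:
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


became_healthy = wait_until_healthy(fake_probe, attempts=5, sleep=lambda _: None)
print(became_healthy)  # True
```

Against a real node, the probe could be something like `lambda: requests.get("http://localhost:6333/healthz", timeout=2).ok`.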
# Create snapshot
snapshot_info = client.create_snapshot(collection_name="production_docs")
print(f"Snapshot created: {snapshot_info.name}")
# List snapshots
snapshots = client.list_snapshots(collection_name="production_docs")
# Recover from snapshot
client.recover_snapshot(
collection_name="production_docs",
location=f"http://localhost:6333/collections/production_docs/snapshots/{snapshot_info.name}"
)
A Sydney-based legal tech startup deployed Qdrant to power their contract analysis platform, demonstrating self-hosted vector database deployment for Australian data sovereignty requirements.
The startup needed semantic search across millions of legal documents while maintaining strict data sovereignty requirements. Documents contained sensitive client information that couldn't leave Australian jurisdiction.
# Multi-tenant strategy: tag every point with its tenant and filter on that field
def get_client_namespace(client_id: str) -> str:
    return f"client_{client_id}"

# Secure document ingestion
# local_embedding_model and openai_embed are assumed to be defined elsewhere
def ingest_client_document(
    client_id: str,
    document: dict,
    qdrant_client: QdrantClient
):
    namespace = get_client_namespace(client_id)
    # Embed with local model for sensitive docs
    if document.get("sensitivity") == "high":
        embedding = local_embedding_model.encode(document["text"])
    else:
        embedding = openai_embed(document["text"])
    qdrant_client.upsert(
        collection_name="legal_docs",
        points=[
            models.PointStruct(
                id=document["id"],
                vector=embedding,
                payload={
                    # Tenant isolation: upsert() has no namespace argument,
                    # so the tenant key is stored in the payload and every
                    # search must filter on it
                    "tenant": namespace,
                    "title": document["title"],
                    "type": document["doc_type"],
                    "date": document["date"],
                    "parties": document["parties"],
                    "jurisdiction": document["jurisdiction"]
                }
            )
        ]
    )
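Tenant isolation only holds if every query is restricted to the caller's data. One common approach in Qdrant is a tenant field in the payload combined with a mandatory filter. The sketch below builds such a filter in the JSON shape used by Qdrant's REST search body. The `tenant` payload key and the `tenant_filter` helper are assumptions for illustration, not details of the startup's implementation.

```python
import json
from typing import Optional


def tenant_filter(client_id: str, extra_must: Optional[list] = None) -> dict:
    """Build a filter that always restricts results to one tenant."""
    conditions = [
        # The tenant condition comes first and cannot be omitted by callers
        {"key": "tenant", "match": {"value": f"client_{client_id}"}}
    ]
    conditions.extend(extra_must or [])
    return {"must": conditions}


# Restrict a search to one client's contracts
flt = tenant_filter(
    "acme",
    extra_must=[{"key": "type", "match": {"value": "contract"}}],
)
print(json.dumps(flt, indent=2))
```

Routing every search through a helper like this, rather than letting callers assemble filters by hand, reduces the chance that a query runs without the tenant condition. Creating a keyword payload index on the tenant field keeps these filtered searches fast.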
Qdrant offers Australian businesses a powerful, flexible vector database solution with full control over data and infrastructure. Whether you choose self-hosted deployment for data sovereignty or Qdrant Cloud for convenience, the platform delivers the performance and features needed for production AI applications.
The combination of open-source transparency, Rust-powered performance, and sophisticated filtering capabilities makes Qdrant particularly well-suited for building semantic search, RAG systems, and recommendation engines. For organisations with compliance requirements or cost-sensitive workloads, self-hosting in Australian data centres provides complete control while keeping query latency low for local applications.
Start with Docker for development, validate your use case with real data, and scale confidently knowing that Qdrant's architecture supports everything from single-node deployments to distributed clusters handling billions of vectors.