PGVector Hybrid Search
Production-grade semantic + keyword search using PostgreSQL
Overview
Architecture:
Query | [Generate embedding] --> Vector Search (PGVector) --> Top 30 results | [Generate ts_query] --> Keyword Search (BM25) --> Top 30 results | [Reciprocal Rank Fusion (RRF)] --> Merge & re-rank --> Top 10 final results
When to use this skill:
-
Building semantic search (RAG, knowledge bases, recommendations)
-
Implementing hybrid retrieval (vector + keyword)
-
Optimizing PGVector performance
-
Working with large document collections (1M+ chunks)
Quick Reference
Search Type Comparison
Aspect Semantic (Vector) Keyword (BM25)
Query Embedding similarity Exact word matches
Strengths Synonyms, concepts Exact phrases, rare terms
Weaknesses Exact matches, technical terms No semantic understanding
Index HNSW (pgvector) GIN (tsvector)
Index Comparison
Metric IVFFlat HNSW
Query speed 50ms 3ms (17x faster)
Index time 2 min 20 min
Best for < 100k vectors 100k+ vectors
Recall@10 0.85-0.95 0.95-0.99
Recommendation: Use HNSW for production (scales to millions).
RRF Formula
rrf_score = 1/(k + vector_rank) + 1/(k + keyword_rank) # k=60 (standard)
Database Schema
CREATE TABLE chunks ( id UUID PRIMARY KEY, document_id UUID REFERENCES documents(id), content TEXT NOT NULL, embedding vector(1024), -- PGVector content_tsvector tsvector GENERATED ALWAYS AS ( to_tsvector('english', content) ) STORED, section_title TEXT, content_type TEXT, created_at TIMESTAMP DEFAULT NOW() );
-- Indexes CREATE INDEX idx_chunks_embedding ON chunks USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
CREATE INDEX idx_chunks_content_tsvector ON chunks USING gin (content_tsvector);
Hybrid Search Query (SQLAlchemy)
async def hybrid_search( query: str, query_embedding: list[float], top_k: int = 10 ) -> list[Chunk]: FETCH_MULTIPLIER = 3 # Fetch 30 for better RRF coverage K = 60 # RRF smoothing constant
# Vector search subquery
vector_subq = (
select(Chunk.id,
func.row_number().over(
order_by=Chunk.embedding.cosine_distance(query_embedding)
).label("vector_rank"))
.limit(top_k * FETCH_MULTIPLIER)
.subquery()
)
# Keyword search subquery
ts_query = func.plainto_tsquery("english", query)
keyword_subq = (
select(Chunk.id,
func.row_number().over(
order_by=func.ts_rank_cd(Chunk.content_tsvector, ts_query).desc()
).label("keyword_rank"))
.where(Chunk.content_tsvector.op("@@")(ts_query))
.limit(top_k * FETCH_MULTIPLIER)
.subquery()
)
# RRF fusion with FULL OUTER JOIN
rrf_subq = (
select(
func.coalesce(vector_subq.c.id, keyword_subq.c.id).label("chunk_id"),
(func.coalesce(1.0 / (K + vector_subq.c.vector_rank), 0.0) +
func.coalesce(1.0 / (K + keyword_subq.c.keyword_rank), 0.0)
).label("rrf_score"))
.select_from(vector_subq.outerjoin(keyword_subq, ..., full=True))
.order_by("rrf_score DESC")
.limit(top_k)
.subquery()
)
return await session.execute(
select(Chunk).join(rrf_subq, Chunk.id == rrf_subq.c.chunk_id)
)
Common Patterns
Filtered Search
results = await hybrid_search( query="binary search", query_embedding=embedding, content_type_filter=["code_block"] )
Similarity Threshold
results = await hybrid_search(query, embedding, top_k=50) filtered = [r for r in results if (1 - r.vector_distance) >= 0.75][:10]
Multi-Query Retrieval
queries = ["machine learning", "ML algorithms", "neural networks"] all_results = [await hybrid_search(q, embed(q)) for q in queries] final = deduplicate_and_rerank(all_results)
Performance Tips
-
Pre-compute tsvector - 5-10x faster than to_tsvector() at query time
-
Use HNSW index - 17x faster queries than IVFFlat
-
3x fetch multiplier - Better RRF coverage (30 results per search for top 10)
-
Iterative scan for filtered queries - Set hnsw.iterative_scan = 'relaxed_order'
-
Metadata boosting - +6% MRR with title/path matching
Redis 8 FT.HYBRID Alternative
Redis 8.4 introduced FT.HYBRID with native hybrid search support. Consider this alternative when latency is critical.
FT.HYBRID Command Syntax
FT.HYBRID index query VECTOR $embedding AS vec_score TEXT "search terms" AS text_score COMBINE RRF LIMIT 0 10 RETURN 3 content title score
Comparison: pgvector vs Redis 8 FT.HYBRID
Aspect pgvector Redis 8 FT.HYBRID
Setup complexity Medium (extensions, indexes) Low (single FT.CREATE)
RRF implementation Manual SQL with FULL OUTER JOIN Native COMBINE RRF
Keyword scoring ts_rank_cd (TF-IDF variant) BM25STD (configurable)
Typical latency 5-20ms 2-5ms
Persistence ACID transactions AOF/RDB (configurable)
Max dataset Billions (with partitioning) Memory-bound (~100M vectors)
Joins/relations Full SQL support Limited (hash/JSON docs)
Operational maturity Battle-tested New (8.4, 2024)
When to Choose pgvector
-
Already using PostgreSQL - No new infrastructure, single source of truth
-
Need ACID transactions - Atomic updates across search index and relational data
-
Complex joins with relational data - Foreign keys, aggregations, reporting
-
Large datasets - Billions of vectors with table partitioning
-
Compliance requirements - Established PostgreSQL security/audit tooling
When to Choose Redis 8
-
Need sub-5ms latency - Real-time autocomplete, typeahead, live recommendations
-
Caching layer with search - Already using Redis for cache, add search capability
-
Simpler deployment - Single binary, no extensions or complex configuration
-
Ephemeral search indexes - Rebuild from source of truth is acceptable
-
Memory budget allows - Dataset fits comfortably in RAM
References
Detailed Implementation Guides
Reference Description Use When
index-strategies.md HNSW vs IVFFlat, tuning, iterative scans Choosing/optimizing indexes
hybrid-search-rrf.md RRF algorithm, SQL implementation, debugging Implementing hybrid search
metadata-filtering.md Pre/post filtering, score boosting Improving relevance
External Resources
-
PGVector GitHub
-
HNSW Index Guide
Related Skills
-
ai-native-development
-
Embeddings and vector concepts
-
database-schema-designer
-
Schema design for vector search
Version: 1.2.0 | Status: Production-ready | Updated: pgvector 0.8.1
Capability Details
hybrid-search-rrf
Keywords: hybrid search, rrf, reciprocal rank fusion, vector bm25, semantic keyword search Solves:
-
How do I combine vector and keyword search?
-
Implement hybrid retrieval with RRF
-
Merge semantic and BM25 results
semantic-search
Keywords: semantic search, vector similarity, embedding, nearest neighbor, cosine distance Solves:
-
How does semantic search work?
-
When to use semantic vs keyword search
-
Semantic search strengths and weaknesses
keyword-search-bm25
Keywords: bm25, full-text search, tsvector, tsquery, keyword search Solves:
-
How does BM25 keyword search work?
-
Implement PostgreSQL full-text search
-
BM25 vs semantic search trade-offs
rrf-algorithm
Keywords: rrf, reciprocal rank fusion, rank-based fusion, score normalization Solves:
-
How does Reciprocal Rank Fusion work?
-
Why use rank instead of scores?
-
RRF smoothing constant (k parameter)
database-schema
Keywords: pgvector schema, chunk table, embedding column, tsvector, generated column Solves:
-
How do I design schema for hybrid search?
-
Store embeddings with vector(1024)
-
Pre-compute tsvector for performance
search-query-implementation
Keywords: hybrid search query, sqlalchemy, vector distance, ts_rank_cd, full outer join Solves:
-
How do I write hybrid search SQL?
-
Implement RRF in SQLAlchemy
-
Use fetch multiplier for better coverage
indexing-strategies
Keywords: pgvector index, hnsw, ivfflat, vector index performance, index tuning Solves:
-
HNSW vs IVFFlat comparison
-
Optimize vector search speed
-
Scale to millions of vectors
pre-computed-tsvector
Keywords: tsvector, gin index, full-text index, pre-computed column, generated column Solves:
-
Optimize keyword search performance
-
5-10x speedup with indexed tsvector
metadata-filtering
Keywords: metadata filter, faceted search, content type filter, score boosting Solves:
-
Filter search by metadata
-
Boost results by section title
-
Pre-filter by content type
common-patterns
Keywords: filtered search, similarity threshold, multi-query retrieval, search patterns Solves:
-
Filter search by content type
-
Set minimum similarity threshold
-
Implement multi-query retrieval
golden-dataset-testing
Keywords: golden dataset, search evaluation, pass rate, mrr, retrieval testing Solves:
-
Test hybrid search quality
-
Evaluate search with golden queries
-
Calculate pass rate and MRR metrics