hybrid-retrieval

Hybrid Retrieval for RAG

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "hybrid-retrieval" with this command: npx skills add latestaiagents/agent-skills/latestaiagents-agent-skills-hybrid-retrieval

Combine dense (semantic) and sparse (keyword) retrieval so that queries can match on both meaning and exact terms.

When to Use

  • Vector search misses exact keyword matches

  • Domain-specific terminology needs exact matching

  • Users search with both natural language and specific terms

  • Results need to balance semantic understanding with exact-match precision

The Problem with Vector-Only Search

Query: "Error code E-4521 troubleshooting"

Vector search returns:

  • "Common error handling patterns" (semantically similar)
  • "Debugging techniques for applications" (related topic)

Missing:

  • "E-4521: Database connection timeout" (exact match needed!)
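
The gap above can be reproduced with a small keyword scorer. The following is a minimal sketch of BM25 in pure Python, not a production implementation. The tokenizer, the `k1` and `b` defaults, and the fourth document ("Error logging best practices") are assumptions added for illustration. The extra document makes "error" a common term, so the rare token "e-4521" carries more weight.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Keep hyphenated codes such as "e-4521" together as one token.
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Common error handling patterns",
    "Debugging techniques for applications",
    "E-4521: Database connection timeout",
    "Error logging best practices",  # hypothetical extra document
]
scores = bm25_scores("Error code E-4521 troubleshooting", docs)
best = docs[scores.index(max(scores))]
print(best)
```

Because "e-4521" occurs in only one document, its inverse document frequency is higher than that of "error", and the exact-match document ranks first on the sparse side.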

Hybrid Architecture

┌─────────────────────────────────────────────────┐
│                   User Query                    │
└─────────────────────┬───────────────────────────┘
                      │
         ┌────────────┴────────────┐
         │                         │
         ▼                         ▼
┌─────────────────┐       ┌─────────────────┐
│  Dense Search   │       │  Sparse Search  │
│  (Embeddings)   │       │  (BM25/TF-IDF)  │
└────────┬────────┘       └────────┬────────┘
         │                         │
         └────────────┬────────────┘
                      │
                      ▼
              ┌───────────────┐
              │    Fusion     │
              │ (RRF/Linear)  │
              └───────┬───────┘
                      │
                      ▼
              ┌───────────────┐
              │   Reranker    │
              │  (Optional)   │
              └───────┬───────┘
                      │
                      ▼
              ┌───────────────┐
              │ Final Results │
              └───────────────┘

Implementation

Basic Hybrid with LangChain

from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma

# Assumes `docs` (a list of Document objects) and `embeddings`
# (an embedding model) are already defined.

# Dense retriever (vector search)
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 10

# Combine with ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[dense_retriever, bm25_retriever],
    weights=[0.5, 0.5]  # Adjust based on your data
)

results = hybrid_retriever.invoke("Error code E-4521")

Reciprocal Rank Fusion (RRF)

def reciprocal_rank_fusion(results_list: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k=60 is the standard constant from the original paper.
    """
    fused_scores = {}

    for results in results_list:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", hash(doc.page_content))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    reranked = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )
    return [item["doc"] for item in reranked]

# Usage
dense_results = dense_retriever.invoke(query)
sparse_results = bm25_retriever.invoke(query)
final_results = reciprocal_rank_fusion([dense_results, sparse_results])
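
To make the RRF arithmetic concrete, here is a small worked sketch that applies the same 1 / (k + rank + 1) formula to plain document IDs instead of Document objects. The two rankings are hypothetical and `rrf_scores` is an illustrative helper, not part of any library.

```python
def rrf_scores(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Return the summed RRF score for each ID across several ranked lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):  # rank is 0-based
            scores[doc_id] = scores.get(doc_id, 0.0) + 1 / (k + rank + 1)
    return scores


dense_ids = ["A", "B", "C"]   # hypothetical dense ranking
sparse_ids = ["C", "A", "D"]  # hypothetical sparse ranking

scores = rrf_scores([dense_ids, sparse_ids])
fused_order = sorted(scores, key=scores.get, reverse=True)
print(fused_order)  # ['A', 'C', 'B', 'D']
```

Document A scores 1/61 + 1/62 and C scores 1/63 + 1/61, so both outrank B (1/62) and D (1/63), which each appear in only one list. RRF rewards agreement between retrievers and uses only ranks, so the raw dense and sparse scores never need to be on the same scale.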

With Pinecone (Native Hybrid)

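from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder

# Assumes `corpus` (a list of strings) and `embeddings`
# (an embedding model) are already defined.

# Initialize
pc = Pinecone(api_key="...")
index = pc.Index("hybrid-index")

# Sparse encoder
bm25 = BM25Encoder()
bm25.fit(corpus)

# Query with both dense and sparse
def hybrid_query(query: str, alpha: float = 0.5):
    # Dense vector
    dense_vec = embeddings.embed_query(query)

    # Sparse vector: {"indices": [...], "values": [...]}
    sparse_vec = bm25.encode_queries([query])[0]

    # index.query() does not take an alpha argument, so the weighting
    # is applied client-side: 0 = sparse only, 1 = dense only
    scaled_dense = [v * alpha for v in dense_vec]
    scaled_sparse = {
        "indices": sparse_vec["indices"],
        "values": [v * (1 - alpha) for v in sparse_vec["values"]],
    }

    # Hybrid search
    results = index.query(
        vector=scaled_dense,
        sparse_vector=scaled_sparse,
        top_k=10,
        include_metadata=True
    )
    return results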

With Weaviate (Native Hybrid)

import weaviate

# Uses the v3 Python client API (weaviate.Client)
client = weaviate.Client("http://localhost:8080")

result = (
    client.query.get("Document", ["content", "title"])
    .with_hybrid(
        query="Error code E-4521",
        alpha=0.5,  # Balance between vector and keyword: 0 = keyword only, 1 = vector only
        fusion_type="rankedFusion"
    )
    .with_limit(10)
    .do()
)

Adding a Reranker

from sentence_transformers import CrossEncoder

# Load reranker model
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

def rerank_results(query: str, docs: list, top_k: int = 5) -> list:
    """Rerank documents using cross-encoder."""
    pairs = [[query, doc.page_content] for doc in docs]
    scores = reranker.predict(pairs)

    # Sort by reranker scores
    scored_docs = list(zip(docs, scores))
    scored_docs.sort(key=lambda x: x[1], reverse=True)

    return [doc for doc, score in scored_docs[:top_k]]

# Full pipeline
hybrid_results = hybrid_retriever.invoke(query)  # Get 20 results
final_results = rerank_results(query, hybrid_results, top_k=5)  # Rerank to top 5

Weight Tuning Guidelines

Data Type        Dense Weight    Sparse Weight    Notes
General text     0.5             0.5              Balanced default
Technical docs   0.4             0.6              Keywords matter more
Conversational   0.7             0.3              Semantic matters more
Code/APIs        0.3             0.7              Exact matches critical
Legal/Medical    0.4             0.6              Terminology precision
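
The weights above can also be applied directly in a linear (weighted score) fusion, the alternative to RRF named in the architecture diagram. The sketch below is one possible approach, assuming min-max normalization to bring cosine similarities and BM25 scores onto a common 0 to 1 scale. The helper names, document IDs, and score values are all hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def linear_fusion(
    dense: dict[str, float],
    sparse: dict[str, float],
    dense_weight: float = 0.5,
    sparse_weight: float = 0.5,
) -> list[tuple[str, float]]:
    """Weighted sum of normalized scores; a missing score counts as 0."""
    dense_n, sparse_n = min_max(dense), min_max(sparse)
    fused = {
        doc_id: dense_weight * dense_n.get(doc_id, 0.0)
        + sparse_weight * sparse_n.get(doc_id, 0.0)
        for doc_id in set(dense_n) | set(sparse_n)
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


dense_scores = {"guide": 0.82, "faq": 0.80, "e4521": 0.60}  # hypothetical cosine similarities
sparse_scores = {"e4521": 12.0, "faq": 3.0}                 # hypothetical BM25 scores

technical = linear_fusion(dense_scores, sparse_scores, dense_weight=0.4, sparse_weight=0.6)
conversational = linear_fusion(dense_scores, sparse_scores, dense_weight=0.7, sparse_weight=0.3)
print(technical[0][0], conversational[0][0])
```

With the technical-docs weights the exact-match document "e4521" ranks first, and with the conversational weights the semantically closest "guide" does. This is the trade-off the table describes.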

Evaluation

def evaluate_retrieval(queries: list, ground_truth: dict, retriever) -> dict:
    """Calculate retrieval metrics."""
    metrics = {"mrr": 0, "recall@5": 0, "precision@5": 0}

    for query in queries:
        results = retriever.invoke(query)
        result_ids = [doc.metadata["id"] for doc in results[:5]]
        relevant_ids = ground_truth[query]

        # MRR
        for i, rid in enumerate(result_ids):
            if rid in relevant_ids:
                metrics["mrr"] += 1 / (i + 1)
                break

        # Recall & Precision
        hits = len(set(result_ids) & set(relevant_ids))
        metrics["recall@5"] += hits / len(relevant_ids)
        metrics["precision@5"] += hits / 5

    # Average
    n = len(queries)
    return {k: v/n for k, v in metrics.items()}
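
As a worked check of the metric definitions, the sketch below computes the same three quantities for a single query using plain ID lists. `metrics_at_k` and the IDs are hypothetical, for illustration only.

```python
def metrics_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int = 5) -> dict[str, float]:
    """Reciprocal rank, recall@k and precision@k for one query."""
    top_k = ranked_ids[:k]

    # Reciprocal rank of the first relevant hit, or 0.0 if none is in the top k
    reciprocal_rank = 0.0
    for position, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1 / position
            break

    hits = len(set(top_k) & relevant_ids)
    return {
        "reciprocal_rank": reciprocal_rank,
        "recall": hits / len(relevant_ids) if relevant_ids else 0.0,
        "precision": hits / k,
    }


result = metrics_at_k(["d7", "d2", "d9", "d4", "d1"], {"d2", "d4", "d8"})
print(result)
```

The first relevant document ("d2") sits at position 2, so the reciprocal rank is 0.5. Two of the three relevant documents appear in the top 5, giving a recall of 2/3 and a precision of 2/5. Averaging these per-query values over all queries gives the figures returned by `evaluate_retrieval`.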

Best Practices

  • Start with 50/50 weights - then tune based on evaluation

  • Add a reranker where latency allows - a cross-encoder pass over the fused candidates usually gives a significant quality improvement

  • Index sparse vectors - BM25 on raw text, not chunks

  • Use native hybrid - when available (Pinecone, Weaviate, Qdrant)

  • Monitor both paths - log which retriever contributed to final results
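
One way to monitor both paths is to record, during fusion, which retriever returned each document. The sketch below is a minimal illustration built on the RRF formula, working with plain document IDs. The function name, logger name, and example rankings are assumptions, not part of any library.

```python
import logging
from collections import defaultdict

logger = logging.getLogger("hybrid_retrieval")  # hypothetical logger name


def fuse_with_provenance(
    rankings: dict[str, list[str]], k: int = 60, top_n: int = 5
) -> list[dict]:
    """RRF over named rankings, keeping track of which retriever found each ID."""
    scores: dict[str, float] = defaultdict(float)
    sources: dict[str, list[str]] = defaultdict(list)

    for retriever_name, ranking in rankings.items():
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1 / (k + rank + 1)
            sources[doc_id].append(retriever_name)

    ordered = sorted(scores, key=scores.get, reverse=True)[:top_n]
    results = [
        {"id": doc_id, "score": scores[doc_id], "sources": sources[doc_id]}
        for doc_id in ordered
    ]
    for item in results:
        logger.info(
            "doc=%s score=%.4f sources=%s",
            item["id"], item["score"], ",".join(item["sources"]),
        )
    return results


results = fuse_with_provenance(
    {"dense": ["A", "B", "C"], "sparse": ["C", "A", "D"]}, top_n=3
)
print([(r["id"], r["sources"]) for r in results])
```

If the logs show that one path rarely contributes to the final results, that is a signal to revisit the weights or to check that retriever's index.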

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
