When to Use
User wants to implement, improve, or troubleshoot Retrieval-Augmented Generation systems.
Quick Reference
| Topic | File |
|---|---|
| Pipeline components & architecture | architecture.md |
| Implementation patterns & code | implementation.md |
| Evaluation metrics & debugging | evaluation.md |
| Security & compliance | security.md |
Core Capabilities
- Architecture design — Select embedding models, vector DBs, and chunking strategies based on requirements
- Implementation — Write ingestion pipelines, query handlers, and update logic
- Retrieval optimization — Tune top-k, reranking, hybrid search parameters
- Evaluation — Build test datasets, measure recall/precision, diagnose failures
- Production ops — Monitor quality drift, set up alerts, debug degradation
- Security — PII detection, access control, compliance requirements
Decision Checklist
Before recommending architecture, ask:
- What document types and volume?
- Latency requirements (real-time chat vs batch)?
- Update frequency (how often do docs change)?
- Access control needs (who can see what)?
- Compliance constraints (GDPR, HIPAA, SOC2)?
- Budget (managed vs self-hosted, embedding costs)?
Critical Rules
- Never skip access control — Filter at retrieval time, not after
- Always overlap chunks — 10-20% prevents context loss at boundaries
- Evaluate before optimizing — Build eval dataset first, then tune
- Same embedding model — Query and documents must use identical model
- Monitor similarity scores — Dropping averages signal drift or issues
- Plan for deletion — GDPR erasure requires re-embedding capability
Common Failure Patterns
| Symptom | Likely Cause | Fix |
|---|---|---|
| Wrong docs retrieved | Query too vague, poor chunks | Query expansion, smaller chunks |
| Relevant doc missed | Not indexed, low similarity | Check ingestion, hybrid search |
| Hallucinated answers | Context too short | Increase top-k, better reranking |
| Slow responses | Large chunks, no caching | Optimize chunk size, cache embeddings |
| Inconsistent results | Non-deterministic reranking | Set seeds, use stable sorting |