vector-db

Vector Database Expert

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "vector-db" with this command: npx skills add rightnow-ai/openfang/rightnow-ai-openfang-vector-db

Vector Database Expert

A retrieval systems specialist with deep expertise in embedding models, vector indexing algorithms, and Retrieval-Augmented Generation (RAG) architectures. This skill provides guidance for designing and operating vector search systems that power semantic search, recommendation engines, and LLM knowledge augmentation, covering embedding selection, indexing strategies, chunking, hybrid search, and production deployment.

Key Principles

  • Choose the embedding model based on your domain and retrieval task; general-purpose models work well for broad use cases, but domain-specific fine-tuned embeddings significantly improve recall for specialized content

  • Select the distance metric that matches your embedding model's training objective: cosine similarity for normalized embeddings, dot product for magnitude-aware comparisons, and L2 (Euclidean) for spatial distance

  • Chunk documents thoughtfully; chunk size directly impacts retrieval quality because too-large chunks dilute relevance while too-small chunks lose context

  • Index choice determines the trade-off between search speed, memory usage, and recall accuracy; understand HNSW, IVF, and flat index characteristics before choosing

  • Combine dense vector search with sparse keyword search (hybrid retrieval) for production systems; neither approach alone handles all query types optimally

Techniques

  • Generate embeddings with models like OpenAI text-embedding-3-small, Cohere embed-v3, or open-source sentence-transformers (all-MiniLM-L6-v2, BGE, E5) depending on cost and quality requirements

  • Configure HNSW indexes with appropriate M (connections per node, typically 16-64) and efConstruction (build quality, typically 100-200) parameters; higher values improve recall at the cost of memory and build time

  • Implement chunking strategies: fixed-size with overlap (e.g., 512 tokens with 50-token overlap), semantic chunking at paragraph or section boundaries, or recursive splitting that respects document structure

  • Build hybrid search by executing both vector similarity and BM25/keyword queries, then combining results with Reciprocal Rank Fusion (RRF) or a learned reranker like Cohere Rerank or cross-encoder models

  • Filter results using metadata (date ranges, categories, access permissions) at query time; most vector databases support pre-filtering or post-filtering with different performance characteristics

  • Design the RAG pipeline: query embedding, retrieval (top-k candidates), optional reranking, context assembly with source citations, and LLM generation with the retrieved context in the prompt

Common Patterns

  • Parent-Child Retrieval: Embed small chunks for precise matching but return the larger parent document or section as context to the LLM, preserving surrounding information

  • Multi-vector Representation: Generate multiple embeddings per document (title, summary, full text) and search across all representations to improve recall for different query styles

  • Contextual Retrieval: Prepend a document-level summary or metadata to each chunk before embedding so that the vector captures both local content and global context

  • Evaluation Pipeline: Measure retrieval quality with precision@k, recall@k, and NDCG using a labeled relevance dataset; track these metrics as embedding models and chunking strategies change

Pitfalls to Avoid

  • Do not use a single embedding model for all use cases without benchmarking; embedding quality varies dramatically across domains, languages, and query types

  • Do not index documents without preprocessing: remove boilerplate, normalize whitespace, and handle tables and code blocks as structured content rather than raw text

  • Do not skip reranking in production RAG systems; initial vector retrieval optimizes for speed, but a cross-encoder reranker significantly improves precision in the final results

  • Do not store only vectors without the original text and metadata; you need the source content for LLM context assembly, debugging, and auditing retrieval results

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

ansible

No summary provided by upstream source.

Repository SourceNeeds Review
General

linux-networking

No summary provided by upstream source.

Repository SourceNeeds Review
General

sysadmin

No summary provided by upstream source.

Repository SourceNeeds Review
General

pdf-reader

No summary provided by upstream source.

Repository SourceNeeds Review