rag-systems

Building Retrieval-Augmented Generation systems.



RAG Architecture

```
INDEXING (Offline):
Documents → Chunking → Embedding → Vector DB

QUERYING (Online):
Query → Embed → Search → Retrieved Docs
                              ↓
Response ← LLM ← Context + Query
```
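The two paths above can be sketched end to end with a toy in-memory store; word-overlap scoring stands in here for a real embedding model, just to make the indexing/querying split concrete.

```python
def build_index(documents):
    # Offline: "index" each document as a set of lowercase tokens.
    return [set(doc.lower().split()) for doc in documents]

def retrieve(query, documents, index, k=2):
    # Online: score each document by token overlap with the query
    # (a stand-in for real embedding similarity), return top-k.
    q = set(query.lower().split())
    scores = [len(q & doc_tokens) for doc_tokens in index]
    ranked = sorted(range(len(documents)), key=lambda i: -scores[i])
    return [documents[i] for i in ranked[:k]]

docs = [
    "RAG combines retrieval with generation",
    "Vector databases store embeddings",
    "Bananas are rich in potassium",
]
idx = build_index(docs)
context = retrieve("how does retrieval augmented generation work", docs, idx, k=1)
# The retrieved context is then concatenated with the query and sent to the LLM.
```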

Retrieval Algorithms

Term-Based (BM25)

```python
from rank_bm25 import BM25Okapi

tokenized_docs = [doc.split() for doc in documents]
bm25 = BM25Okapi(tokenized_docs)
scores = bm25.get_scores(query.split())
```
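`rank_bm25` implements the Okapi BM25 formula; a minimal pure-Python version (with the common defaults k1=1.5, b=0.75) makes the scoring explicit:

```python
import math

def bm25_scores(query_tokens, tokenized_docs, k1=1.5, b=0.75):
    N = len(tokenized_docs)
    avgdl = sum(len(d) for d in tokenized_docs) / N
    # document frequency per term
    df = {}
    for doc in tokenized_docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    scores = []
    for doc in tokenized_docs:
        score = 0.0
        for term in query_tokens:
            f = doc.count(term)
            if f == 0:
                continue
            # smoothed IDF, and term frequency dampened by document length
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores

docs = [["the", "cat", "sat"], ["dogs", "bark"], ["the", "cat", "and", "cat"]]
scores = bm25_scores(["cat"], docs)
```

Documents that never mention a query term score zero; repeated mentions raise the score, but sub-linearly and discounted by document length.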

Embedding-Based

```python
from sentence_transformers import SentenceTransformer
import faiss

model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(documents)

index = faiss.IndexFlatIP(embeddings.shape[1])
faiss.normalize_L2(embeddings)  # normalize so inner product == cosine similarity
index.add(embeddings)
```

Query

```python
query_emb = model.encode([query])
faiss.normalize_L2(query_emb)
distances, indices = index.search(query_emb, 5)  # top-5 nearest documents
```
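`IndexFlatIP` over L2-normalized vectors is exact cosine-similarity search; the same computation can be written directly, which is useful for small corpora or for sanity-checking a faiss setup (the toy 2-D vectors here are illustrative only):

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def search(query_vec, doc_vecs, k=5):
    # Inner product of L2-normalized vectors == cosine similarity,
    # i.e. what IndexFlatIP computes after faiss.normalize_L2.
    q = normalize(query_vec)
    sims = [sum(a * b for a, b in zip(q, normalize(d))) for d in doc_vecs]
    order = sorted(range(len(doc_vecs)), key=lambda i: -sims[i])
    return [(i, sims[i]) for i in order[:k]]

doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = search([2.0, 0.1], doc_vecs, k=2)  # nearest: doc 0, then doc 2
```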

Hybrid Retrieval

```python
def hybrid_retrieve(query, k=5, alpha=0.5):
    # normalize() rescales scores (e.g. min-max) so the two are comparable
    bm25_scores = normalize(bm25.get_scores(query.split()))
    distances, indices = index.search(embed(query), len(docs))
    dense_scores = np.empty(len(docs))
    dense_scores[indices[0]] = distances[0]  # map scores back to document order
    dense_scores = normalize(dense_scores)

    hybrid = alpha * bm25_scores + (1 - alpha) * dense_scores
    return [docs[i] for i in np.argsort(hybrid)[::-1][:k]]
```
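The `normalize` helper above is assumed to be min-max scaling, so the two score distributions land on a common [0, 1] range before mixing. A self-contained sketch of that fusion step on toy scores:

```python
def minmax(scores):
    lo, hi = min(scores), max(scores)
    if hi == lo:                       # avoid division by zero on flat scores
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse(bm25_scores, dense_scores, alpha=0.5):
    # weighted sum of normalized sparse and dense scores, per document
    b, d = minmax(bm25_scores), minmax(dense_scores)
    return [alpha * bs + (1 - alpha) * ds for bs, ds in zip(b, d)]

# BM25 favors doc 0, dense retrieval favors doc 2; fusion balances both.
hybrid = fuse([7.0, 1.0, 3.0], [0.2, 0.3, 0.9], alpha=0.5)
top = max(range(len(hybrid)), key=lambda i: hybrid[i])
```

Raising `alpha` shifts the ranking toward exact term matches; lowering it favors semantic similarity.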

Chunking Strategies

Fixed Size

```python
def fixed_chunk(text, size=500, overlap=50):
    chunks = []
    for i in range(0, len(text), size - overlap):
        chunks.append(text[i:i + size])
    return chunks
```
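For example, with `size=10` and `overlap=3` the stride is 7, so each chunk repeats the last 3 characters of the previous one:

```python
def fixed_chunk(text, size=500, overlap=50):
    chunks = []
    # step by (size - overlap) so consecutive chunks share `overlap` characters
    for i in range(0, len(text), size - overlap):
        chunks.append(text[i:i + size])
    return chunks

chunks = fixed_chunk("abcdefghijklmnopqrst", size=10, overlap=3)
# chunks[0] == "abcdefghij"; chunks[1] starts at index 7: "hijklmnopq"
```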

Semantic Chunking

```python
from nltk.tokenize import sent_tokenize

def semantic_chunk(text, model, threshold=0.5):
    # similarity() is a helper that scores adjacent sentences with the model
    sentences = sent_tokenize(text)
    chunks, current = [], []

    for sent in sentences:
        current.append(sent)
        if len(current) > 1:
            sim = similarity(current[-2], current[-1], model)
            if sim < threshold:
                chunks.append(" ".join(current[:-1]))
                current = [sent]

    if current:
        chunks.append(" ".join(current))
    return chunks
```
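The `similarity` helper can be any sentence-level score; with a crude Jaccard word overlap standing in for an embedding model, the split behavior looks like this:

```python
def jaccard(a, b):
    # word-overlap similarity: |A ∩ B| / |A ∪ B|
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def semantic_chunk(sentences, threshold=0.2):
    # Start a new chunk whenever consecutive sentences stop being similar.
    chunks, current = [], []
    for sent in sentences:
        if current and jaccard(current[-1], sent) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    if current:
        chunks.append(" ".join(current))
    return chunks

sents = [
    "Cats are small mammals",
    "Cats are popular pets",
    "Python is a programming language",
]
chunks = semantic_chunk(sents)  # topic shift splits off the last sentence
```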

Retrieval Optimization

Query Expansion

```python
def expand_query(query, model):
    prompt = f"Generate 3 alternative phrasings:\n{query}"
    return [query] + model.generate(prompt).split("\n")
```

HyDE (Hypothetical Document)

```python
def hyde(query, model):
    prompt = f"Write a paragraph answering:\n{query}"
    return model.generate(prompt)  # embed this hypothetical answer for retrieval
```

Reranking

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

def rerank(query, docs, k=5):
    pairs = [(query, doc) for doc in docs]
    scores = reranker.predict(pairs)
    return sorted(zip(docs, scores), key=lambda x: -x[1])[:k]
```
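The score-then-sort pattern is independent of the model; with a stub scorer in place of `CrossEncoder.predict` (here a hypothetical word-overlap count), the same function behaves like:

```python
def rerank(query, docs, score_fn, k=5):
    # score_fn stands in for reranker.predict over (query, doc) pairs
    pairs = [(query, doc) for doc in docs]
    scores = score_fn(pairs)
    return sorted(zip(docs, scores), key=lambda x: -x[1])[:k]

def overlap_score(pairs):
    # hypothetical stand-in: count words shared between query and doc
    return [len(set(q.split()) & set(d.split())) for q, d in pairs]

docs = ["red apples", "green pears", "apples and pears"]
top = rerank("fresh apples", docs, overlap_score, k=2)
```

In practice the candidate list fed to the reranker comes from a cheap first-stage retriever; the cross-encoder only scores those few dozen pairs.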

RAG Evaluation

```python
def rag_metrics(response, context, retrieved, relevant, ground_truth):
    # precision/recall compare retrieved vs. relevant document sets;
    # similarity and check_hallucination are scoring helpers
    return {
        "retrieval_precision": precision(retrieved, relevant),
        "retrieval_recall": recall(retrieved, relevant),
        "answer_relevance": similarity(response, ground_truth),
        "faithfulness": check_hallucination(response, context),
    }
```
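The two retrieval metrics reduce to set arithmetic over retrieved vs. relevant document IDs; a minimal implementation:

```python
def precision(retrieved, relevant):
    # fraction of retrieved docs that are actually relevant
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

def recall(retrieved, relevant):
    # fraction of relevant docs that were retrieved
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

p = precision([1, 2, 3, 4], [2, 4, 7])  # 2 of 4 retrieved are relevant -> 0.5
r = recall([1, 2, 3, 4], [2, 4, 7])     # 2 of 3 relevant were found -> 2/3
```

Answer relevance and faithfulness have no such closed form; they are typically scored with an embedding similarity or an LLM judge.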

Best Practices

  • Use hybrid retrieval (BM25 + dense)
  • Add reranking for quality
  • Chunk with overlap (10-20%)
  • Experiment with chunk sizes (200-1000 tokens)
  • Evaluate retrieval separately from generation
