Stop Your AI From Making Things Up
Guide the user through making their AI factually grounded. The core principle: never trust a bare LM output — always verify against sources.
Why AI hallucinates
LMs generate plausible-sounding text, not verified facts. Hallucination happens when:
- The model has no source material to ground its answer
- The prompt doesn't enforce citations or evidence
- There's no verification step after generation
- Temperature is too high for factual tasks
The fix isn't better prompting — it's programmatic constraints that force grounding.
Step 1: Understand the grounding situation
Ask the user:
- Do you have source documents? (knowledge base, docs, database) → use retrieval-grounded answers
- Is it general knowledge? (no docs, just the model's knowledge) → use self-consistency checks
- How bad is a hallucination? (annoying vs. dangerous) → determines how strict the checks should be
Step 2: Citation enforcement
Force the AI to cite sources for every claim. This pattern uses `dspy.Assert` to reject answers without citations.
```python
import dspy
import re


class CitedAnswer(dspy.Signature):
    """Answer the question using the provided sources. Cite every claim with [1], [2], etc."""

    context: list[str] = dspy.InputField(desc="Numbered source documents")
    question: str = dspy.InputField()
    answer: str = dspy.OutputField(desc="Answer with inline citations like [1], [2]")


class CitationEnforcer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.answer = dspy.ChainOfThought(CitedAnswer)

    def forward(self, context, question):
        result = self.answer(context=context, question=question)

        # Every sentence or two should carry a citation
        sentences = [s.strip() for s in result.answer.split(".") if s.strip()]
        citations_found = [bool(re.search(r"\[\d+\]", s)) for s in sentences]

        # Require that at least half the sentences have citations
        citation_ratio = sum(citations_found) / max(len(sentences), 1)
        dspy.Assert(
            citation_ratio >= 0.5,
            "Answer must cite sources. Use [1], [2], etc. after claims. "
            f"Only {citation_ratio:.0%} of sentences have citations."
        )

        # Check that cited numbers actually exist in the context
        cited_nums = set(int(n) for n in re.findall(r"\[(\d+)\]", result.answer))
        valid_nums = set(range(1, len(context) + 1))
        invalid = cited_nums - valid_nums
        dspy.Assert(
            len(invalid) == 0,
            f"Citations {invalid} don't match any source. Valid sources: [1] to [{len(context)}]."
        )
        return result
```
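The two regex checks inside `forward` are plain Python, so they can be sanity-checked without an LM. A minimal sketch (the sample answer and source count are made up for illustration):

```python
import re

answer = "The deadline is Friday [1]. The budget is $50k [3]."
num_sources = 2  # only sources [1] and [2] exist

# Coverage: fraction of sentences that carry a citation
sentences = [s.strip() for s in answer.split(".") if s.strip()]
coverage = sum(bool(re.search(r"\[\d+\]", s)) for s in sentences) / max(len(sentences), 1)

# Validity: cited numbers that don't map to any real source
cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
invalid = cited - set(range(1, num_sources + 1))

print(coverage)  # 1.0 — every sentence carries a citation
print(invalid)   # {3} — cites a source that doesn't exist
```

The first assert would pass here (coverage is above 0.5) while the second would fire, which is exactly the hallucinated-citation case the enforcer is built to catch.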
Step 3: Faithfulness verification
After generating an answer, use a second LM call to check if it's actually supported by the sources.
```python
class CheckFaithfulness(dspy.Signature):
    """Check if every claim in the answer is supported by the context."""

    context: list[str] = dspy.InputField(desc="Source documents")
    answer: str = dspy.InputField(desc="Generated answer to verify")
    is_faithful: bool = dspy.OutputField(desc="Is every claim supported by the context?")
    unsupported_claims: list[str] = dspy.OutputField(desc="Claims not found in context")


class FaithfulResponder(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=5)
        self.answer = dspy.ChainOfThought(CitedAnswer)
        self.verify = dspy.Predict(CheckFaithfulness)

    def forward(self, question):
        context = self.retrieve(question).passages
        result = self.answer(context=context, question=question)
        check = self.verify(context=context, answer=result.answer)
        dspy.Assert(
            check.is_faithful,
            f"Answer contains unsupported claims: {check.unsupported_claims}. "
            "Rewrite using only information from the provided sources."
        )
        return result
```
When `dspy.Assert` fails, DSPy automatically retries the LM call, feeding back the error message so the model can self-correct. This retry loop (called backtracking) runs up to `max_backtrack_attempts` times (default: 2).
Step 4: Self-check pattern
Generate an answer, then ask the model to verify its own claims against the sources. Lightweight and good for most cases.
```python
class SelfCheckedAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.answer = dspy.ChainOfThought("context, question -> answer")
        self.check = dspy.ChainOfThought(CheckFaithfulness)

    def forward(self, context, question):
        result = self.answer(context=context, question=question)
        verification = self.check(context=context, answer=result.answer)
        dspy.Suggest(
            verification.is_faithful,
            f"Some claims may not be supported: {verification.unsupported_claims}. "
            "Consider revising to stick closer to the sources."
        )
        return dspy.Prediction(
            answer=result.answer,
            is_verified=verification.is_faithful,
            unsupported=verification.unsupported_claims,
        )
```
Use `dspy.Suggest` (soft) instead of `dspy.Assert` (hard) when you want to flag issues without blocking the response.
Step 5: Cross-check pattern
Generate the answer twice independently, then compare. If two independent generations disagree, something is probably made up.
```python
class CompareAnswers(dspy.Signature):
    """Check if two independently generated answers agree on the facts."""

    answer_a: str = dspy.InputField()
    answer_b: str = dspy.InputField()
    agree: bool = dspy.OutputField(desc="Do they agree on all factual claims?")
    discrepancy: str = dspy.OutputField(desc="What they disagree on, if anything")


class CrossChecked(dspy.Module):
    def __init__(self):
        super().__init__()
        self.gen_a = dspy.ChainOfThought("context, question -> answer")
        self.gen_b = dspy.ChainOfThought("context, question -> answer")
        self.compare = dspy.Predict(CompareAnswers)

    def forward(self, context, question):
        a = self.gen_a(context=context, question=question)
        b = self.gen_b(context=context, question=question)
        check = self.compare(answer_a=a.answer, answer_b=b.answer)
        dspy.Assert(
            check.agree,
            f"Two independent answers disagree: {check.discrepancy}. "
            "This suggests hallucination. Regenerate with closer attention to sources."
        )
        return a
```
Best for high-stakes outputs where a hallucination would be costly. Triples your LM calls (two generations plus a comparison) but catches inconsistencies.
Step 6: Confidence thresholds
Flag low-confidence outputs for human review instead of showing them to users.
```python
class ConfidenceGated(dspy.Signature):
    """Answer the question and rate your confidence."""

    context: list[str] = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()
    confidence: float = dspy.OutputField(desc="0.0 to 1.0, how confident are you?")
    reasoning: str = dspy.OutputField(desc="Why this confidence level?")


class GatedResponder(dspy.Module):
    def __init__(self, threshold=0.7):
        super().__init__()
        self.respond = dspy.ChainOfThought(ConfidenceGated)
        self.threshold = threshold

    def forward(self, context, question):
        result = self.respond(context=context, question=question)
        if result.confidence < self.threshold:
            return dspy.Prediction(
                answer=result.answer,
                needs_review=True,
                confidence=result.confidence,
                reason=result.reasoning,
            )
        return dspy.Prediction(
            answer=result.answer,
            needs_review=False,
            confidence=result.confidence,
        )
```
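Downstream, the `needs_review` flag can drive a simple routing split between auto-sent answers and a human review queue. A hypothetical sketch with made-up predictions (in practice these come back from `GatedResponder`):

```python
# Made-up predictions standing in for GatedResponder outputs
predictions = [
    {"answer": "March 5", "confidence": 0.92},
    {"answer": "Maybe Q3?", "confidence": 0.40},
]
threshold = 0.7

# Route by confidence: above the threshold goes out, below goes to a human
auto_send = [p for p in predictions if p["confidence"] >= threshold]
review_queue = [p for p in predictions if p["confidence"] < threshold]

print(len(auto_send), len(review_queue))  # 1 1
```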
Step 7: Loading source data for verification
Anti-hallucination patterns need source documents. Here's how to load common formats:
From transcript files
```python
import json
import re


def load_vtt(path):
    """Extract text from a VTT transcript, stripping timestamps and cues."""
    with open(path) as f:
        text = f.read()
    lines = [line.strip() for line in text.split("\n")
             if line.strip() and not line.startswith("WEBVTT")
             and not re.match(r"\d{2}:\d{2}", line)
             and not line.strip().isdigit()]
    return " ".join(lines)


def load_livekit_transcript(path):
    """Extract text from a LiveKit transcript JSON export."""
    with open(path) as f:
        data = json.load(f)
    segments = data.get("segments", data.get("results", []))
    return " ".join(seg.get("text", "") for seg in segments)


def load_recall_transcript(transcript_data):
    """Extract text from a Recall.ai transcript response."""
    # Each entry's "words" is a list of word objects, not a string
    return " ".join(
        word["text"]
        for entry in transcript_data
        for word in entry.get("words", [])
    )
```
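The VTT stripping logic can be exercised on an inline sample. A condensed copy of `load_vtt`'s filtering is included so the sketch runs on its own (the sample transcript is invented):

```python
import re

SAMPLE_VTT = """WEBVTT

1
00:00:01.000 --> 00:00:03.000
Hello team.

2
00:00:03.500 --> 00:00:05.000
The deadline is Friday.
"""

def strip_vtt(text):
    # Same filtering as load_vtt above: drop the WEBVTT header,
    # bare cue numbers, and timestamp lines
    lines = [line.strip() for line in text.split("\n")
             if line.strip() and not line.startswith("WEBVTT")
             and not re.match(r"\d{2}:\d{2}", line)
             and not line.strip().isdigit()]
    return " ".join(lines)

print(strip_vtt(SAMPLE_VTT))  # Hello team. The deadline is Friday.
```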
From Langfuse traces
```python
from langfuse import Langfuse


def load_langfuse_generations(trace_id):
    """Load LM generations from a Langfuse trace for verification."""
    langfuse = Langfuse()
    trace = langfuse.get_trace(trace_id)
    generations = []
    for obs in trace.observations:
        if obs.type == "GENERATION" and obs.output:
            generations.append({
                "input": obs.input,
                "output": obs.output,
                "model": obs.model,
            })
    return generations
```
Breaking source documents into numbered passages
Most patterns here expect `context: list[str]`, a list of numbered source passages. Split long documents into chunks so citations are meaningful:
```python
def chunk_document(text, max_chars=500):
    """Split a document into numbered passages for citation."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = ""
    for para in paragraphs:
        if len(current) + len(para) > max_chars and current:
            chunks.append(current.strip())
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current.strip())
    return [f"[{i+1}] {chunk}" for i, chunk in enumerate(chunks)]


# Use with any source
citation_enforcer = CitationEnforcer()
transcript_text = load_vtt("meeting.vtt")
context = chunk_document(transcript_text)
result = citation_enforcer(context=context, question="What was decided about the timeline?")
```
Step 8: Evaluating anti-hallucination quality
You need metrics to know if your verification actually works. The key question: does the system catch hallucinations and produce faithful answers?
Faithfulness metric
```python
def faithfulness_metric(example, prediction, trace=None):
    """Score: does the answer stick to the sources?"""
    verifier = dspy.Predict(CheckFaithfulness)
    check = verifier(context=example.context, answer=prediction.answer)

    # Binary: is it faithful?
    if not check.is_faithful:
        return 0.0

    # Bonus: does it actually answer the question?
    relevance = dspy.Predict("question, answer -> is_relevant: bool")
    rel = relevance(question=example.question, answer=prediction.answer)
    return 1.0 if rel.is_relevant else 0.5


evaluator = dspy.Evaluate(devset=devset, metric=faithfulness_metric, num_threads=4)
score = evaluator(my_grounded_qa)
```
Citation coverage metric
```python
def citation_metric(example, prediction, trace=None):
    """Score citation quality: coverage + validity."""
    answer = prediction.answer
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    cited = [bool(re.search(r"\[\d+\]", s)) for s in sentences]
    coverage = sum(cited) / max(len(sentences), 1)

    # Check all cited sources exist; any invalid citation zeroes the score
    cited_nums = set(int(n) for n in re.findall(r"\[(\d+)\]", answer))
    valid_nums = set(range(1, len(example.context) + 1))
    if not cited_nums.issubset(valid_nums):
        return 0.0
    return coverage  # 0.0 to 1.0
```
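The scoring logic is pure Python, so it can be checked by hand. A sketch with an invented answer and a three-source context (no `dspy.Example` needed for the arithmetic):

```python
import re

# Made-up answer and context for illustration
answer = "The meeting is on March 5 [1]. The budget is $50k [2]. We adjourned early."
context = ["[1] ...", "[2] ...", "[3] ..."]

# Coverage: two of three sentences carry a citation
sentences = [s.strip() for s in answer.split(".") if s.strip()]
cited = [bool(re.search(r"\[\d+\]", s)) for s in sentences]
coverage = sum(cited) / max(len(sentences), 1)

# Validity: all cited numbers must fall within the context range
cited_nums = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
score = coverage if cited_nums <= set(range(1, len(context) + 1)) else 0.0

print(round(score, 2))  # 0.67
```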
Optimizing the verification pipeline
```python
# Create training data: questions with source context and gold answers
trainset = [
    dspy.Example(
        context=["[1] The meeting is on March 5.", "[2] Budget is $50k."],
        question="When is the meeting?",
        answer="The meeting is on March 5 [1].",
    ).with_inputs("context", "question"),
    # ... more examples
]

# Optimize the citation enforcer
optimizer = dspy.BootstrapFewShot(metric=faithfulness_metric, max_bootstrapped_demos=4)
optimized = optimizer.compile(CitationEnforcer(), trainset=trainset)
optimized.save("optimized_citation_enforcer.json")

# Load later
enforcer = CitationEnforcer()
enforcer.load("optimized_citation_enforcer.json")
```
Using a cheap LM for verification
The verification step doesn't need an expensive model — a smaller model checking claims against sources works well and cuts costs:
```python
class CostEfficientVerifier(dspy.Module):
    def __init__(self):
        super().__init__()
        self.answer = dspy.ChainOfThought(CitedAnswer)
        self.verify = dspy.Predict(CheckFaithfulness)
        # Use a cheaper model for the verification step
        cheap_lm = dspy.LM("openai/gpt-4o-mini")
        self.verify.set_lm(cheap_lm)

    def forward(self, context, question):
        result = self.answer(context=context, question=question)
        check = self.verify(context=context, answer=result.answer)
        dspy.Assert(
            check.is_faithful,
            f"Unsupported claims: {check.unsupported_claims}. "
            "Only use information from the provided sources."
        )
        return result
```
Step 9: Batch verification
When you need to verify many responses at once (e.g., auditing a transcript Q&A system):
```python
import json


def verify_batch(qa_pairs, context, output_path="verification_results.json"):
    """Verify a batch of question-answer pairs against source context."""
    verifier = dspy.Predict(CheckFaithfulness)
    results = []
    for qa in qa_pairs:
        check = verifier(context=context, answer=qa["answer"])
        results.append({
            "question": qa["question"],
            "answer": qa["answer"],
            "is_faithful": check.is_faithful,
            "unsupported_claims": check.unsupported_claims,
        })

    # Summary (guard against an empty batch)
    total = max(len(results), 1)
    faithful_count = sum(1 for r in results if r["is_faithful"])
    print(f"Faithful: {faithful_count}/{len(results)} "
          f"({faithful_count/total:.0%})")

    with open(output_path, "w") as f:
        json.dump(results, f, indent=2)
    return results
```
How backtracking works
When `dspy.Assert` fails:
- DSPy catches the assertion failure
- The error message is fed back to the LM as additional context
- The LM retries generation with the feedback ("your answer had unsupported claims X, Y")
- This repeats up to `max_backtrack_attempts` times
- If all retries fail, the assertion raises an error
This is why good error messages matter — they're literally the feedback the model uses to improve.
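Conceptually, the retry loop looks like the sketch below. This is a simplification for intuition, not DSPy's actual implementation; `flaky_generate` is a hypothetical stand-in for an LM call that succeeds only once it receives feedback:

```python
def flaky_generate(feedback=None):
    # Stand-in for an LM call: produces a valid answer only when given feedback
    return "cited answer [1]" if feedback else "uncited answer"

def passes_check(answer):
    # Stand-in for the assertion's condition
    return "[1]" in answer

def generate_with_backtracking(max_backtrack_attempts=2):
    answer = flaky_generate()
    for _ in range(max_backtrack_attempts):
        if passes_check(answer):
            return answer
        # The assertion's error message becomes the feedback for the retry
        feedback = "Answer must cite sources. Use [1], [2], etc."
        answer = flaky_generate(feedback)
    if not passes_check(answer):
        raise AssertionError("Assertion still failing after retries")
    return answer

print(generate_with_backtracking())  # cited answer [1]
```

The feedback string is the only thing the retry sees beyond the original prompt, which is why the quality of the assertion message directly determines whether the retry can fix the problem.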
Choosing the right pattern
| Pattern | Cost | Latency | Best for |
|---|---|---|---|
| Citation enforcement | 1 LM call | Low | When you have numbered sources |
| Faithfulness verification | 2 LM calls | Medium | RAG systems, doc Q&A |
| Self-check | 2 LM calls | Medium | General fact-checking |
| Cross-check | 3 LM calls | High | High-stakes, critical outputs |
| Confidence gating | 1 LM call | Low | Human-in-the-loop systems |
| Cheap verifier | 1 expensive + 1 cheap | Low-Medium | Cost-sensitive production |
Key principles
- Grounding beats prompting. Giving the AI sources to cite is more effective than asking it to "be accurate."
- Assert for critical facts. Use `dspy.Assert` when hallucination is unacceptable (medical, legal, financial).
- Suggest for nice-to-haves. Use `dspy.Suggest` when you want to flag but not block.
- Layer your defenses. Combine retrieval + citation + verification for the strongest protection.
- Good error messages help. The Assert message becomes the model's self-correction prompt.
- Measure faithfulness. Use the evaluation patterns above to quantify how well your verification works; don't guess.
Additional resources
- Use `/ai-searching-docs` for retrieval-augmented generation (RAG) setup
- Use `/ai-checking-outputs` for general output validation (format, safety, quality)
- Use `/ai-following-rules` for enforcing business rules and content policies
- See `examples.md` for complete worked examples