agent-qa-gates

Output validation gates for AI agent systems. Prevents hallucinated data, leaked internal context, wrong formats, duplicate sends, and post-compaction drift. Use when building or operating an agent that delivers output to humans or external systems. Provides a tiered gate system (internal → user-facing → external → code), protocol gates for recurring failure modes, severity classification, and a feedback loop for gate evolution. Triggers on phrases like "QA gates", "validation", "output quality", "prevent hallucination", "delivery checklist", "agent QA".

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "agent-qa-gates" with this command: npx skills add zurbrick/agent-qa-gates

Agent QA Gates

A field-tested validation system for AI agent output. Born from production failures, not theory.

Quick Start

Before any agent delivers output, run the Pre-Ship Checklist:

  1. Accurate? — every number/date/metric has a source. Unsourced → prefix "estimated"
  2. Complete? — no missing pieces, no "I'll do that next"
  3. Actionable? — ends with clear next step or decision point
  4. Fits the channel? — check character limits for your delivery surface
  5. No leaks? — no internal context, private data, or secrets
  6. Not a duplicate? — verify no recent identical send
  7. Would the human be embarrassed? — if yes, don't ship
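Items 4 to 6 of the checklist are mechanical enough to automate. The sketch below is one possible way to do that, not the skill's own implementation: `CHANNEL_LIMITS`, `SECRET_PATTERNS`, `pre_ship_failures`, and the one-hour dedup window are all hypothetical names and values chosen for illustration. Items 1 to 3 and 7 still need a human or model judgement.

```python
import hashlib
import re
import time

# Hypothetical channel limits; set these for your own delivery surfaces.
CHANNEL_LIMITS = {"sms": 160, "chat": 2000}

# Illustrative secret patterns only; a real deployment needs a fuller list.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS-style access key id
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),    # PEM private key header
    re.compile(r"(?i)\b(api[_-]?key|password|token)\s*[:=]\s*\S+"),
]


def pre_ship_failures(text, channel, recent_sends, now=None, dedup_window_s=3600):
    """Return the names of failed checklist items (empty list means no failures).

    recent_sends maps the SHA-256 hex digest of a sent message to its send time
    (seconds since the epoch). Only checklist items 4, 5, and 6 are covered.
    """
    now = time.time() if now is None else now
    failures = []

    # Item 4: fits the channel?
    limit = CHANNEL_LIMITS.get(channel)
    if limit is not None and len(text) > limit:
        failures.append("fits_channel")

    # Item 5: no leaks? (pattern match only, so this can miss secrets)
    if any(p.search(text) for p in SECRET_PATTERNS):
        failures.append("no_leaks")

    # Item 6: not a duplicate of a recent identical send?
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    sent_at = recent_sends.get(digest)
    if sent_at is not None and now - sent_at < dedup_window_s:
        failures.append("not_duplicate")

    return failures
```

A caller would record the digest and timestamp in `recent_sends` after each successful delivery, so that the next identical message inside the window is caught.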

Gate Tiers

Four ascending tiers by risk level:

| Gate | Scope | Key Checks |
|---|---|---|
| Gate 0 | Internal (files, config, memory) | Mechanism changed (not just text), no placeholders, file exists |
| Gate 1 | Human-facing (briefings, summaries) | Key info in first 2 lines, ≤3-line paragraphs, channel length limits |
| Gate 2 | External (email, public content, client materials) | No internal context leaked, recipient-appropriate tone, dedup check |
| Gate 3 | Code & technical | Builds clean, no secrets in code, error handling, tests pass |

See references/gates-detail.md for full gate checklists.
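As an illustration of how a tier's checks can be made mechanical, here is a minimal sketch of two of the Gate 0 checks ("no placeholders" and "file exists"). The function name `gate0_problems` and the placeholder pattern are assumptions for this example; the full checklist lives in references/gates-detail.md and may differ. The "mechanism changed" check is not covered because it depends on what the change was meant to do.

```python
import re
from pathlib import Path

# Hypothetical placeholder markers; extend to match your own drafts.
PLACEHOLDER_RE = re.compile(r"\b(TODO|TBD|FIXME)\b|lorem ipsum", re.IGNORECASE)


def gate0_problems(path):
    """Return a list of Gate 0 problems for an internal file (empty if none found)."""
    p = Path(path)
    if not p.is_file():
        return ["file does not exist"]
    text = p.read_text(encoding="utf-8")
    if PLACEHOLDER_RE.search(text):
        return ["placeholder text found"]
    return []
```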

Severity Classification

Not all failures are equal:

  • 🔴 BLOCK — cannot ship (secrets, privacy, hallucinated data, wrong recipient)
  • 🟡 FIX — fix before shipping, <2 min (formatting, too long, missing citation)
  • 🟢 NOTE — log and ship (style preference, minor optimization)
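The three levels can be modelled as an ordered enum so that the worst finding decides the outcome. This is a sketch under assumptions: the finding names in `SEVERITY_BY_FINDING` are hypothetical labels derived from the examples above, and treating unknown findings as FIX is a cautious default chosen here, not something the skill specifies.

```python
from enum import IntEnum


class Severity(IntEnum):
    NOTE = 0   # log and ship
    FIX = 1    # fix before shipping
    BLOCK = 2  # cannot ship


# Hypothetical finding names, mapped according to the examples in the list above.
SEVERITY_BY_FINDING = {
    "secret": Severity.BLOCK,
    "privacy": Severity.BLOCK,
    "hallucinated_data": Severity.BLOCK,
    "wrong_recipient": Severity.BLOCK,
    "formatting": Severity.FIX,
    "too_long": Severity.FIX,
    "missing_citation": Severity.FIX,
    "style_preference": Severity.NOTE,
    "minor_optimization": Severity.NOTE,
}

ACTIONS = {Severity.NOTE: "ship", Severity.FIX: "fix", Severity.BLOCK: "block"}


def decide(findings):
    """Return 'ship', 'fix', or 'block' based on the most severe finding."""
    # Assumption: findings with no mapping are treated as FIX rather than ignored.
    worst = max(
        (SEVERITY_BY_FINDING.get(f, Severity.FIX) for f in findings),
        default=Severity.NOTE,
    )
    return ACTIONS[worst]
```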

Protocol Gates

Recurring failure modes need dedicated gates. These are the most common:

Heartbeat / Periodic Check Output

  • Binary output: alert text ONLY or status-OK ONLY. Never mixed.
  • Every data point verified by current-session tool call. No hallucinated metrics.
  • No stale data from previous cycles or pre-compaction sessions.
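The binary-output and current-session rules above can be checked mechanically. The sketch below assumes a sentinel string `STATUS_OK` and a per-metric `session_id` field; both are hypothetical conventions for this example, so substitute whatever your system actually emits.

```python
OK_TOKEN = "STATUS_OK"  # hypothetical sentinel; use the one your system expects


def classify_heartbeat(output):
    """Return 'ok' or 'alert'; raise ValueError for empty or mixed output."""
    text = output.strip()
    if text == OK_TOKEN:
        return "ok"
    if not text:
        raise ValueError("empty heartbeat output")
    if OK_TOKEN in text:
        raise ValueError("mixed output: alert text combined with status-OK")
    return "alert"


def stale_metrics(metrics, current_session_id):
    """Return names of metrics not produced by a tool call in the current session.

    metrics maps a metric name to a dict carrying a 'session_id' key (an assumed
    convention: each tool call stamps its result with the session that made it).
    """
    return [
        name
        for name, record in metrics.items()
        if record.get("session_id") != current_session_id
    ]
```

Any name returned by `stale_metrics` should be re-fetched with a fresh tool call before the heartbeat is sent.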

Post-Compaction / Context Reset

  • Do not trust facts from the pre-reset session — verify from files and tools.
  • Rerun pending checks from scratch.
  • Zero carryover for periodic checks.

Scheduled Job / Cron Changes

  • Explicit timeout set
  • Explicit model set
  • Verify schedule after creation
  • Output fits destination channel limits
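The four items above map onto a simple validation function. In this sketch the job is a plain dict with hypothetical keys (`timeout_s`, `model`, `schedule`, `channel`, `max_output_chars`), and `stored_schedule` is assumed to be the value read back from the scheduler after the job was created.

```python
def cron_job_problems(job, stored_schedule, channel_limits):
    """Return a list of problems with a scheduled job (empty if none found)."""
    problems = []

    # Explicit timeout set
    timeout = job.get("timeout_s")
    if not isinstance(timeout, (int, float)) or timeout <= 0:
        problems.append("timeout not set")

    # Explicit model set
    if not job.get("model"):
        problems.append("model not set")

    # Verify schedule after creation: compare what was requested with what was stored
    if stored_schedule != job.get("schedule"):
        problems.append("stored schedule differs from requested")

    # Output fits destination channel limits
    limit = channel_limits.get(job.get("channel"))
    max_chars = job.get("max_output_chars")
    if limit is None or max_chars is None or max_chars > limit:
        problems.append("output not bounded by channel limit")

    return problems
```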

Sub-Agent Output Review

  • Does output match the brief's success criteria?
  • Any uncertainty flags unresolved?
  • Is the reasoning (not just the conclusion) sound?

Gate Evolution

Gates should evolve based on real failures, not imagination:

  1. When a failure occurs → log it with root cause
  2. Same failure class occurs 2+ times → add a gate item
  3. Monthly: prune gates that haven't caught anything in 60 days
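The add and prune rules above can be sketched as two small functions. The log format (a list of dicts with a `failure_class` key) and the `last_caught` mapping are assumptions for this example; `last_caught` is taken to hold the date each gate last caught a failure, or its creation date if it has never fired, so that new gates are not pruned immediately.

```python
from collections import Counter
from datetime import date, timedelta


def gates_to_add(failure_log, existing_gates, threshold=2):
    """Return failure classes seen at least `threshold` times that have no gate yet."""
    counts = Counter(entry["failure_class"] for entry in failure_log)
    return sorted(
        cls for cls, n in counts.items()
        if n >= threshold and cls not in existing_gates
    )


def gates_to_prune(last_caught, today, max_idle_days=60):
    """Return gates whose last catch is more than `max_idle_days` before `today`."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(gate for gate, when in last_caught.items() if when < cutoff)
```

Running `gates_to_prune` once a month with `today=date.today()` matches the monthly review in step 3.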

Anti-Patterns

  • Gates that sound good but never catch anything → kill them
  • Per-agent checklists that duplicate general gates → merge or reference
  • "ADHD-friendly" or "high-quality" as gate items → not testable, replace with mechanical checks
  • Aspirational gates nobody runs → either automate or cut

Adapting to Your System

This skill provides the pattern. Adapt it:

  1. Start with the Pre-Ship Checklist — it works for any agent system
  2. Add Protocol Gates for your top 3 recurring failure modes
  3. Set channel limits for your delivery surfaces
  4. Map real failures to gates — if a failure isn't gated, add the gate
  5. Kill gates that never fire — a shorter, sharper checklist wins

For the full reference implementation, see references/gates-detail.md. For automation scripts, see scripts/qa-check.sh.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
