evidence-verification

Evidence Verification

Ensures all claims are backed by executable proof: test results, coverage metrics, build success, and deployment verification.

Key Principle: Show, don't tell. No task is complete without verifiable evidence.

Overview

Auto-Activate Triggers

Completing code implementation
Finishing code review
Marking tasks complete in Squad mode
Before agent handoff
Production deployment verification

Manual Activation

When user requests "verify this works"
Before creating pull requests
During quality assurance reviews
When troubleshooting failures

Evidence Verification

Core Concepts

Evidence Types

Test Evidence

Exit code (must be 0 for success)
Test suite results (passed/failed/skipped)
Coverage percentage (if available)
Test duration

Build Evidence

Build exit code (0 = success)
Compilation errors/warnings
Build artifacts created
Build duration

Deployment Evidence

Deployment status (success/failed)
Environment deployed to
Health check results
Rollback capability verified

Code Quality Evidence

Linter results (errors/warnings)
Type checker results
Security scan results
Accessibility audit results

Evidence Collection Protocol

Evidence Collection Steps

Identify Verification Points
- What needs to be proven?
- What could go wrong?
- What does "complete" mean?
Execute Verification
- Run tests
- Run build
- Run linters
- Check deployments
Capture Results
- Record exit codes
- Save output snippets
- Note timestamps
- Document environment
Store Evidence
- Add to shared context
- Reference in task completion
- Link to artifacts
Verification Standards

Minimum Evidence Requirements:

✅ At least ONE verification type executed
✅ Exit code captured (0 = pass, non-zero = fail)
✅ Timestamp recorded
✅ Evidence stored in context

Production-Grade Requirements:

✅ Tests run with exit code 0
✅ Coverage >70% (or project standard)
✅ Build succeeds with exit code 0
✅ No critical linter errors
✅ Security scan passes

Evidence Verification

Evidence Collection Templates

Template 1: Test Evidence

Use this template when running tests:

Test Evidence

Command: npm test (or equivalent) Exit Code: 0 ✅ / non-zero ❌ Duration: X seconds Results:

Tests passed: X
Tests failed: X
Tests skipped: X
Coverage: X%

Output Snippet:

[First 10 lines of test output]

Timestamp: YYYY-MM-DD HH:MM:SS Environment: Node vX.X.X, OS, etc.

Template 2: Build Evidence

Use this template when building:

Build Evidence

Command: npm run build (or equivalent) Exit Code: 0 ✅ / non-zero ❌ Duration: X seconds Artifacts Created:

dist/bundle.js (245 KB)
dist/styles.css (18 KB)

Errors: X Warnings: X

Output Snippet:

[First 10 lines of build output]

Timestamp: YYYY-MM-DD HH:MM:SS

Template 3: Code Quality Evidence

Use this template for linting and type checking:

Code Quality Evidence

Linter: ESLint / Ruff / etc. Command: npm run lint Exit Code: 0 ✅ / non-zero ❌ Errors: X Warnings: X

Type Checker: TypeScript / mypy / etc. Command: npm run typecheck Exit Code: 0 ✅ / non-zero ❌ Type Errors: X

Timestamp: YYYY-MM-DD HH:MM:SS

Evidence Verification

Quality Standards

Minimum Acceptable

✅ Tests executed with captured exit code ✅ Timestamp recorded ✅ Evidence stored in context

Production-Grade

✅ Tests pass (exit code 0) ✅ Coverage ≥70% (or project standard) ✅ Build succeeds (exit code 0) ✅ No critical linter errors ✅ Type checker passes ✅ Security scan shows no critical issues

Gold Standard

✅ All production-grade requirements ✅ Coverage ≥80% ✅ No linter warnings ✅ Performance benchmarks within thresholds ✅ Accessibility audit passes (WCAG 2.1 AA) ✅ Integration tests pass ✅ Deployment verification complete

Evidence Verification

Common Pitfalls

❌ Don't Skip Evidence Collection

Bad:

"I've implemented the login feature. It should work correctly."

Good:

"I've implemented the login feature. Evidence:

Tests: Exit code 0, 12 tests passed, 0 failed
Build: Exit code 0, no errors
Coverage: 89% Task complete with verification."

❌ Don't Fake Evidence

Bad:

"Tests passed" (without actually running them)

Good:

"Tests passed. Exit code: 0 Command: npm test Output: Test Suites: 3 passed, 3 total Timestamp: 2025-11-02 10:30:15"

❌ Don't Ignore Failed Evidence

Bad:

"Build failed with exit code 1, but the code looks correct so marking complete."

Good:

"Build failed with exit code 1. Errors:

TypeError: Cannot read property 'id' of undefined (line 42) Fixing the error now before marking complete."

Evidence Verification

Quick Reference

Evidence Collection Checklist

Before marking task complete:

Tests executed
Test exit code captured (0 = pass)
Build executed (if applicable)
Build exit code captured (0 = pass)
Code quality checks run (linter, types)
Evidence documented with timestamp
Evidence added to shared context
Evidence summary in completion message

Common Commands by Language/Framework

JavaScript/TypeScript:

npm test # Run tests npm run build # Build project npm run lint # Run ESLint npm run typecheck # Run TypeScript compiler

Python:

pytest # Run tests pytest --cov # Run tests with coverage ruff check . # Run linter mypy . # Run type checker

Evidence Verification

Remember: Evidence-first development prevents hallucinations, ensures production quality, and builds confidence. When in doubt, collect more evidence, not less.

Related Skills

unit-testing
Unit test patterns for generating test evidence
integration-testing
Integration test patterns for component verification
security-scanning
Security scan evidence collection (npm audit, pip-audit)
test-standards-enforcer
Enforce evidence collection standards

Key Decisions

Decision Choice Rationale

Minimum Coverage 70% Industry standard for production-grade code

Exit Code Requirement 0 = pass Unix standard for success/failure indication

Gold Standard Coverage 80% Higher bar for critical paths

Retry Before Block 2 attempts Allow fix attempts before escalation

Capability Details

exit-code-validation

Keywords: exit code, return code, success, failure, status, $?, exit 0, non-zero Solves:

How do I verify command succeeded?
Check exit codes for evidence (0 = pass)
Validate build/test success with exit codes
Capture command exit status in evidence

test-evidence

Keywords: test results, test output, coverage report, test evidence, jest, pytest, test suite, passed, failed Solves:

How do I capture test evidence?
Record test results in session state
Prove tests passed with exit code 0
Document test coverage percentage
Capture passed/failed/skipped counts

build-evidence

Keywords: build log, build output, compile, bundle, webpack, vite, cargo build, npm build Solves:

How do I capture build evidence?
Record build success with exit code
Verify compilation without errors
Document build artifacts created
Track build duration and warnings

code-quality-evidence

Keywords: linter, lint, eslint, ruff, type check, mypy, typescript, code quality, warnings, errors Solves:

How do I capture code quality evidence?
Run linter and capture results
Execute type checker and record errors
Document linter errors and warnings count
Prove code quality checks passed

deployment-evidence

Keywords: deployment, deploy, production, staging, health check, rollback, deployment status Solves:

How do I verify deployment succeeded?
Check health endpoints after deploy
Verify application started successfully
Document deployment status and environment
Confirm rollback capability exists

security-scan-evidence

Keywords: security, vulnerability, npm audit, pip-audit, security scan, cve, critical vulnerabilities Solves:

How do I capture security scan results?
Run npm audit or pip-audit
Document critical vulnerabilities found
Record security scan exit code
Prove no critical security issues

evidence-storage

Keywords: session state, state.json, evidence storage, record evidence, save results, quality_evidence, context 2.0 Solves:

How do I store evidence in context?
Update session/state.json with results
Structure evidence data properly
Add timestamp to evidence records
Link to evidence log files

combined-evidence-report

Keywords: evidence report, task completion, verification summary, proof of completion, comprehensive evidence Solves:

How do I create complete evidence report?
Combine test, build, and quality evidence
Create task completion evidence summary
Document all verification checks run
Provide comprehensive proof of completion

evidence-collection-workflow

Keywords: evidence workflow, verification steps, evidence protocol, collection process, verification checklist Solves:

What steps to collect evidence?
Follow evidence collection protocol
Run all necessary verification checks
Complete evidence checklist before marking done
Ensure minimum evidence requirements met

quality-standards

Keywords: quality standards, minimum requirements, production-grade, gold standard, evidence thresholds Solves:

What evidence is required to pass?
Understand minimum vs production-grade standards
Meet gold standard evidence requirements
Know when evidence is sufficient
Validate evidence meets project standards

evidence-pitfalls

Keywords: evidence mistakes, common errors, skip evidence, fake evidence, ignore failures Solves:

What evidence mistakes to avoid?
Never skip evidence collection
Don't fake evidence results
Don't ignore failed evidence
Always re-collect after changes