Evidence Verification
Ensures all claims are backed by executable proof: test results, coverage metrics, build success, and deployment verification.
Key Principle: Show, don't tell. No task is complete without verifiable evidence.
Overview
Auto-Activate Triggers
- Completing code implementation
- Finishing code review
- Marking tasks complete in Squad mode
- Before agent handoff
- Production deployment verification
Manual Activation
- When user requests "verify this works"
- Before creating pull requests
- During quality assurance reviews
- When troubleshooting failures
Core Concepts
Evidence Types
Test Evidence
- Exit code (must be 0 for success)
- Test suite results (passed/failed/skipped)
- Coverage percentage (if available)
- Test duration
Build Evidence
- Build exit code (0 = success)
- Compilation errors/warnings
- Build artifacts created
- Build duration
Deployment Evidence
- Deployment status (success/failed)
- Environment deployed to
- Health check results
- Rollback capability verified
Code Quality Evidence
- Linter results (errors/warnings)
- Type checker results
- Security scan results
- Accessibility audit results
Evidence Collection Protocol
Evidence Collection Steps
1. Identify Verification Points
   - What needs to be proven?
   - What could go wrong?
   - What does "complete" mean?
2. Execute Verification
   - Run tests
   - Run build
   - Run linters
   - Check deployments
3. Capture Results
   - Record exit codes
   - Save output snippets
   - Note timestamps
   - Document environment
4. Store Evidence
   - Add to shared context
   - Reference in task completion
   - Link to artifacts
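The collection steps above can be sketched as a small helper that runs a verification command and captures the minimum evidence: exit code, an output snippet, a timestamp, and the environment. This is a minimal sketch, not a prescribed implementation; `collect_evidence` is a hypothetical name, and the demo command just prints a fixed line in place of a real test runner.

```python
import platform
import subprocess
import sys
from datetime import datetime

def collect_evidence(cmd):
    """Run a verification command and capture minimum evidence:
    exit code, output snippet, timestamp, and environment."""
    start = datetime.now()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "command": " ".join(cmd),
        "exit_code": proc.returncode,  # 0 = pass, non-zero = fail
        "passed": proc.returncode == 0,
        # Keep only the first 10 lines of output, per the templates below
        "output_snippet": "\n".join(proc.stdout.splitlines()[:10]),
        "timestamp": start.strftime("%Y-%m-%d %H:%M:%S"),
        "environment": f"Python {platform.python_version()}, {platform.system()}",
    }

# Demo: a stand-in command instead of an actual test suite
evidence = collect_evidence([sys.executable, "-c", "print('42 tests passed')"])
```

The same helper works for tests, builds, and linters: only the command changes, while the captured evidence shape stays uniform.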
Verification Standards
Minimum Evidence Requirements:
- ✅ At least ONE verification type executed
- ✅ Exit code captured (0 = pass, non-zero = fail)
- ✅ Timestamp recorded
- ✅ Evidence stored in context
Production-Grade Requirements:
- ✅ Tests run with exit code 0
- ✅ Coverage >70% (or project standard)
- ✅ Build succeeds with exit code 0
- ✅ No critical linter errors
- ✅ Security scan passes
Evidence Collection Templates
Template 1: Test Evidence
Use this template when running tests:
Test Evidence
Command: npm test (or equivalent)
Exit Code: 0 ✅ / non-zero ❌
Duration: X seconds
Results:
- Tests passed: X
- Tests failed: X
- Tests skipped: X
- Coverage: X%
Output Snippet:
[First 10 lines of test output]
Timestamp: YYYY-MM-DD HH:MM:SS
Environment: Node vX.X.X, OS, etc.
Template 2: Build Evidence
Use this template when building:
Build Evidence
Command: npm run build (or equivalent)
Exit Code: 0 ✅ / non-zero ❌
Duration: X seconds
Artifacts Created:
- dist/bundle.js (245 KB)
- dist/styles.css (18 KB)
Errors: X
Warnings: X
Output Snippet:
[First 10 lines of build output]
Timestamp: YYYY-MM-DD HH:MM:SS
Template 3: Code Quality Evidence
Use this template for linting and type checking:
Code Quality Evidence
Linter: ESLint / Ruff / etc.
Command: npm run lint
Exit Code: 0 ✅ / non-zero ❌
Errors: X
Warnings: X
Type Checker: TypeScript / mypy / etc.
Command: npm run typecheck
Exit Code: 0 ✅ / non-zero ❌
Type Errors: X
Timestamp: YYYY-MM-DD HH:MM:SS
Quality Standards
Minimum Acceptable
- ✅ Tests executed with captured exit code
- ✅ Timestamp recorded
- ✅ Evidence stored in context
Production-Grade
- ✅ Tests pass (exit code 0)
- ✅ Coverage ≥70% (or project standard)
- ✅ Build succeeds (exit code 0)
- ✅ No critical linter errors
- ✅ Type checker passes
- ✅ Security scan shows no critical issues
Gold Standard
- ✅ All production-grade requirements
- ✅ Coverage ≥80%
- ✅ No linter warnings
- ✅ Performance benchmarks within thresholds
- ✅ Accessibility audit passes (WCAG 2.1 AA)
- ✅ Integration tests pass
- ✅ Deployment verification complete
Common Pitfalls
❌ Don't Skip Evidence Collection
Bad:
"I've implemented the login feature. It should work correctly."
Good:
"I've implemented the login feature. Evidence:
- Tests: Exit code 0, 12 tests passed, 0 failed
- Build: Exit code 0, no errors
- Coverage: 89%
Task complete with verification."
❌ Don't Fake Evidence
Bad:
"Tests passed" (without actually running them)
Good:
"Tests passed.
Exit code: 0
Command: npm test
Output: Test Suites: 3 passed, 3 total
Timestamp: 2025-11-02 10:30:15"
❌ Don't Ignore Failed Evidence
Bad:
"Build failed with exit code 1, but the code looks correct so marking complete."
Good:
"Build failed with exit code 1. Errors:
- TypeError: Cannot read property 'id' of undefined (line 42)
Fixing the error now before marking complete."
Quick Reference
Evidence Collection Checklist
Before marking task complete:
- Tests executed
- Test exit code captured (0 = pass)
- Build executed (if applicable)
- Build exit code captured (0 = pass)
- Code quality checks run (linter, types)
- Evidence documented with timestamp
- Evidence added to shared context
- Evidence summary in completion message
Common Commands by Language/Framework
JavaScript/TypeScript:
npm test           # Run tests
npm run build      # Build project
npm run lint       # Run ESLint
npm run typecheck  # Run TypeScript compiler
Python:
pytest             # Run tests
pytest --cov       # Run tests with coverage
ruff check .       # Run linter
mypy .             # Run type checker
Remember: Evidence-first development prevents hallucinations, ensures production quality, and builds confidence. When in doubt, collect more evidence, not less.
Related Skills
- unit-testing: Unit test patterns for generating test evidence
- integration-testing: Integration test patterns for component verification
- security-scanning: Security scan evidence collection (npm audit, pip-audit)
- test-standards-enforcer: Enforce evidence collection standards
Key Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| Minimum Coverage | 70% | Industry standard for production-grade code |
| Exit Code Requirement | 0 = pass | Unix standard for success/failure indication |
| Gold Standard Coverage | 80% | Higher bar for critical paths |
| Retry Before Block | 2 attempts | Allow fix attempts before escalation |
Capability Details
exit-code-validation
Keywords: exit code, return code, success, failure, status, $?, exit 0, non-zero
Solves:
- How do I verify command succeeded?
- Check exit codes for evidence (0 = pass)
- Validate build/test success with exit codes
- Capture command exit status in evidence
test-evidence
Keywords: test results, test output, coverage report, test evidence, jest, pytest, test suite, passed, failed
Solves:
- How do I capture test evidence?
- Record test results in session state
- Prove tests passed with exit code 0
- Document test coverage percentage
- Capture passed/failed/skipped counts
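As a sketch of capturing the coverage percentage, the helper below pulls the TOTAL line out of `pytest --cov` terminal output. It assumes the standard coverage.py table format; `parse_coverage` and the sample output are illustrative, not part of any real API.

```python
import re

def parse_coverage(cov_output):
    """Extract the TOTAL coverage percentage from pytest --cov
    terminal output (assumes the standard coverage.py table)."""
    m = re.search(r"^TOTAL\s+\d+\s+\d+\s+(\d+)%", cov_output, re.MULTILINE)
    return int(m.group(1)) if m else None

# Sample coverage.py report, shortened for illustration
sample = """Name        Stmts   Miss  Cover
-----------------------------------
app.py         50      5    90%
-----------------------------------
TOTAL          50      5    90%"""

coverage = parse_coverage(sample)
```

Returning `None` when no TOTAL line is found keeps "coverage unavailable" distinct from "coverage is 0%", matching the "Coverage percentage (if available)" evidence item above.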
build-evidence
Keywords: build log, build output, compile, bundle, webpack, vite, cargo build, npm build
Solves:
- How do I capture build evidence?
- Record build success with exit code
- Verify compilation without errors
- Document build artifacts created
- Track build duration and warnings
code-quality-evidence
Keywords: linter, lint, eslint, ruff, type check, mypy, typescript, code quality, warnings, errors
Solves:
- How do I capture code quality evidence?
- Run linter and capture results
- Execute type checker and record errors
- Document linter errors and warnings count
- Prove code quality checks passed
deployment-evidence
Keywords: deployment, deploy, production, staging, health check, rollback, deployment status
Solves:
- How do I verify deployment succeeded?
- Check health endpoints after deploy
- Verify application started successfully
- Document deployment status and environment
- Confirm rollback capability exists
security-scan-evidence
Keywords: security, vulnerability, npm audit, pip-audit, security scan, cve, critical vulnerabilities
Solves:
- How do I capture security scan results?
- Run npm audit or pip-audit
- Document critical vulnerabilities found
- Record security scan exit code
- Prove no critical security issues
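A minimal sketch of reading a scan result: `npm audit --json` reports per-severity counts in its output, and those counts can be checked for critical findings. The `critical_vulnerabilities` helper and the inline sample are illustrative, and the code assumes the summary lives under `metadata.vulnerabilities`.

```python
import json

def critical_vulnerabilities(audit_json):
    """Count critical findings from `npm audit --json` output
    (assumes the metadata.vulnerabilities summary shape)."""
    counts = json.loads(audit_json).get("metadata", {}).get("vulnerabilities", {})
    return counts.get("critical", 0)

# Sample audit output, trimmed to the summary counts
sample = '{"metadata": {"vulnerabilities": {"low": 2, "moderate": 1, "critical": 0}}}'
crit = critical_vulnerabilities(sample)
```

Evidence then records both the exit code of the scan and the critical count, so "scan ran" and "scan passed" are proven separately.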
evidence-storage
Keywords: session state, state.json, evidence storage, record evidence, save results, quality_evidence, context 2.0
Solves:
- How do I store evidence in context?
- Update session/state.json with results
- Structure evidence data properly
- Add timestamp to evidence records
- Link to evidence log files
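A minimal sketch of updating session state, assuming a JSON file at session/state.json with a quality_evidence map; the exact schema is project-specific, and `store_evidence` is a hypothetical helper. The demo writes to a temporary file rather than the real session path.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

def store_evidence(state_file, kind, evidence):
    """Merge one evidence record into the session state file under
    a quality_evidence key, stamping it with the collection time."""
    state_file = Path(state_file)
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    record = dict(evidence, recorded_at=datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
    state.setdefault("quality_evidence", {})[kind] = record
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(state, indent=2))
    return state

# Demo against a temp file; real usage would target session/state.json
demo_path = Path(tempfile.mkdtemp()) / "state.json"
state = store_evidence(demo_path, "tests", {"exit_code": 0, "passed": True})
```

Read-merge-write keeps earlier evidence (build, lint, deploy) intact while each new check adds its own timestamped record.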
combined-evidence-report
Keywords: evidence report, task completion, verification summary, proof of completion, comprehensive evidence
Solves:
- How do I create complete evidence report?
- Combine test, build, and quality evidence
- Create task completion evidence summary
- Document all verification checks run
- Provide comprehensive proof of completion
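The combined report can be sketched as a fold over individual evidence records: the task counts as complete only when every check exited 0. `combined_report` and the record shape are illustrative assumptions, not a fixed format.

```python
def combined_report(task, evidence_by_check):
    """Summarize all verification checks into one completion report.
    Complete only if every check has exit code 0."""
    all_passed = all(e["exit_code"] == 0 for e in evidence_by_check.values())
    lines = [f"Evidence report: {task}"]
    for name, e in evidence_by_check.items():
        mark = "✅" if e["exit_code"] == 0 else "❌"
        lines.append(f"- {name}: exit code {e['exit_code']} {mark}")
    lines.append("Status: COMPLETE" if all_passed else "Status: BLOCKED (fix failures first)")
    return "\n".join(lines), all_passed

# One failing check is enough to block completion
report, ok = combined_report("login feature", {
    "tests": {"exit_code": 0},
    "build": {"exit_code": 0},
    "lint": {"exit_code": 1},
})
```

This mirrors the pitfalls section: a failed check produces a BLOCKED status instead of being marked complete anyway.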
evidence-collection-workflow
Keywords: evidence workflow, verification steps, evidence protocol, collection process, verification checklist
Solves:
- What steps to collect evidence?
- Follow evidence collection protocol
- Run all necessary verification checks
- Complete evidence checklist before marking done
- Ensure minimum evidence requirements met
quality-standards
Keywords: quality standards, minimum requirements, production-grade, gold standard, evidence thresholds
Solves:
- What evidence is required to pass?
- Understand minimum vs production-grade standards
- Meet gold standard evidence requirements
- Know when evidence is sufficient
- Validate evidence meets project standards
evidence-pitfalls
Keywords: evidence mistakes, common errors, skip evidence, fake evidence, ignore failures
Solves:
- What evidence mistakes to avoid?
- Never skip evidence collection
- Don't fake evidence results
- Don't ignore failed evidence
- Always re-collect after changes