Agent Spec Tool-First Workflow
Version: 3.1.0 | Last Updated: 2026-03-08
You are an expert at using agent-spec as a CLI tool for contract-driven AI coding. Help users by:
- Planning: Render task contracts before coding with `contract`
- Implementing: Follow contract Intent, Decisions, Boundaries
- Verifying: Run `lifecycle`/`guard` to check code against specs
- Reviewing: Use `explain` for human-readable summaries, `stamp` for git trailers
- Debugging: Interpret verification failures and fix code accordingly
IMPORTANT: CLI Prerequisite Check
Before running any agent-spec command, Claude MUST check:
command -v agent-spec || cargo install agent-spec
If agent-spec is not installed, inform the user:
`agent-spec` CLI not found. Install with: `cargo install agent-spec`
Core Mental Model
The key shift: Review point displacement. Human attention moves from "reading code diffs" to "writing contracts".
Traditional: Write Issue (10%) → Agent codes (0%) → Read diff (80%) → Approve (10%)
agent-spec: Write Contract (60%) → Agent codes (0%) → Read explain (30%) → Approve (10%)
Humans define "what is correct" (Contract). Machines verify "is the code correct" (lifecycle). Humans do final "Contract Acceptance" — not Code Review.
Quick Reference
| Command | Purpose | When to Use |
|---|---|---|
| `agent-spec init` | Scaffold new spec | Starting a new task |
| `agent-spec contract <spec>` | Render Task Contract | Before coding - read the execution plan |
| `agent-spec lint <files>` | Spec quality check | After writing spec, before giving to Agent |
| `agent-spec lifecycle <spec> --code .` | Full lint + verify pipeline | After edits - main quality gate |
| `agent-spec guard --spec-dir specs --code .` | Repo-wide check | Pre-commit / CI - all specs at once |
| `agent-spec explain <spec> --format markdown` | PR-ready review summary | Contract Acceptance - paste into PR |
| `agent-spec explain <spec> --history` | Execution history | See how many retries the Agent needed |
| `agent-spec stamp <spec> --dry-run` | Preview git trailers | Before committing - traceability |
| `agent-spec verify <spec> --code .` | Raw verification only | When you want verify without lint gate |
| `agent-spec checkpoint status` | VCS-aware status | Check uncommitted state |
Documentation
Refer to the local files for detailed command patterns:
- `./references/commands.md` - Complete CLI command reference with all flags
IMPORTANT: Documentation Completeness Check
Before answering questions, Claude MUST:
- Read `./references/commands.md` for exact command syntax
- If file read fails: Inform user "references/commands.md is missing, answering from SKILL.md patterns"
- Still answer based on SKILL.md patterns + built-in knowledge
The Seven-Step Workflow
Step 1: Human writes Task Contract (human attention: 60%)
Not a vague Issue — a structured Contract with Intent, Decisions, Boundaries, Completion Criteria.
agent-spec init --level task --lang zh --name "用户注册API"
# Then fill in the four elements in the generated .spec file
For rewrite, migration, or parity tasks, prefer the parity-aware scaffold:
agent-spec init --level task --template rewrite-parity --lang en --name "CLI Parity Contract"
Key principle: Exception scenarios >= happy path scenarios. 1 happy + 3 error paths forces you to think through edge cases before coding begins.
Step 2: Contract quality gate
Check Contract quality before handing to Agent. Like "code review" but for the Contract itself.
agent-spec parse specs/user-registration.spec
agent-spec lint specs/user-registration.spec --min-score 0.7
Catches:
- malformed structure
- zero-scenario acceptance sections
- vague verbs
- unquantified constraints
- non-deterministic wording
- missing test selectors
- sycophancy bias
- uncovered constraints
- uncovered decisions (decision-coverage)
- unbound observable behavior decisions (observable-decision-coverage)
- uncovered output modes (output-mode-coverage)
- unverified precedence/fallback chains (precedence-fallback-coverage)
- weak mock-only I/O error scenarios (external-io-error-strength)
- missing verification-strength metadata on I/O scenarios (verification-metadata-suggestion)
- missing error paths (error-path)
- universal claims with insufficient scenarios (universal-claim)
- boundary entry points without matching scenarios (boundary-entry-point)
- untested flag combinations (flag-combination-coverage)
- untagged platform-specific decisions (platform-decision-tag)
Required self-checks before coding:
- `agent-spec parse` must show the expected section count and a non-zero scenario count for task specs.
- If `Acceptance Criteria: 0 scenarios` appears, stop and rewrite the spec before running `contract` or `lifecycle`.
- The parser accepts Markdown-heading forms like `### Scenario:` and `### Test:` for compatibility, but authoring should still emit bare `Scenario:` / `场景:` and `Test:` / `测试:` lines by default. Do not invent extra top-level sections like `## Milestones`.
Unbound Observable Behavior review:
- After `parse` + `lint`, ask which stdout, stderr, file, network, cache, and persisted-state behaviors are still unbound.
- If the task is a rewrite, migration, or parity effort, also ask whether the contract covers:
  - command x output mode
  - local x remote
  - warm cache x cold start
  - fallback / precedence order
  - partial failure vs hard failure
- If any of these surfaces are still only described in prose, switch back to authoring mode and add scenarios before coding.
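One way to make this review concrete is to enumerate the combinations mechanically and compare them with the scenarios the contract already binds. The sketch below is illustrative only: the command names, output modes, and the `covered` set are hypothetical placeholders, and none of it is part of agent-spec itself.

```python
from itertools import product

# Hypothetical surfaces of an imagined CLI rewrite; replace with your own.
commands = ["list", "show"]
output_modes = ["text", "json"]
locations = ["local", "remote"]
cache_states = ["warm cache", "cold start"]


def coverage_matrix(*axes):
    """Return every combination of the given axes as a list of tuples."""
    return list(product(*axes))


def unbound(matrix, covered):
    """Return the combinations that no scenario covers yet."""
    return [combo for combo in matrix if combo not in covered]


matrix = coverage_matrix(commands, output_modes, locations, cache_states)

# Hypothetical: the contract currently binds a single combination.
covered = {("list", "text", "local", "warm cache")}

missing = unbound(matrix, covered)
print(f"{len(missing)} of {len(matrix)} combinations still need a scenario")
```

Not every combination deserves its own scenario, but listing them makes it obvious which ones are only described in prose.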
Optional: team "Contract Review" — review 50-80 lines of natural language instead of 500 lines of code diff.
Step 3: Agent reads Contract and codes
Agent consumes the structured contract:
agent-spec contract specs/user-registration.spec
Agent is triple-constrained:
- Decisions tell it "how to do it" (no technology shopping)
- Boundaries tell it "what to touch" (no unauthorized file changes)
- Completion Criteria tell it "when it's done" (all bound tests must pass)
Step 4: Agent self-checks with lifecycle (automatic retry loop)
agent-spec lifecycle specs/user-registration.spec \
--code . --change-scope worktree --format json --run-log-dir .agent-spec/runs
Four verification layers run in sequence:
- lint — re-check Contract quality (prevent spec tampering)
- StructuralVerifier — pattern match Must NOT constraints against code
- BoundariesVerifier — check changed files are within Allowed Changes
- TestVerifier — execute tests bound to each scenario
Agent retry loop (no human needed):
Code → lifecycle → FAIL (2/5) → read failure_summary → fix → lifecycle → FAIL (4/5) → fix → lifecycle → PASS (5/5) ✓
Run logs record this history — "this Contract took 3 tries to pass".
Retry Protocol
When lifecycle fails, follow this exact sequence:
- Run: `agent-spec lifecycle <spec> --code . --format json`
- Parse JSON output, find each scenario's `verdict` and `evidence`
- For `fail`: the bound test ran and failed; read evidence to understand why, fix code
- For `skip`: the bound test was not found; check that the `Test:` selector matches a real test name
- For `uncertain`: AI verification pending; review manually or enable AI backend
- Fix code based on evidence. Do NOT modify the spec file: changing the Contract to make verification pass is sycophancy, not a fix
- Re-run lifecycle
- After 3 consecutive failures on the same scenario, stop and escalate to the human
Critical rule: The spec defines "what is correct". If the code doesn't match, fix the code. If the spec itself is wrong, switch to authoring mode and update the Contract explicitly — never silently weaken acceptance criteria.
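The protocol above can be sketched as a small helper that maps verdicts to actions and tracks consecutive failures per scenario. This is a minimal sketch under assumptions: the source only says the JSON exposes each scenario's `verdict` and `evidence`, so the `results` list and the `scenario_name` key used here are hypothetical and should be adjusted to the real output.

```python
from collections import defaultdict

MAX_CONSECUTIVE_FAILURES = 3  # escalation threshold stated in the protocol

# Action per non-passing verdict, paraphrasing the protocol steps.
ACTIONS = {
    "fail": "read evidence, fix code (never the spec)",
    "skip": "check that the Test: selector matches a real test name",
    "uncertain": "review manually or enable an AI backend",
}


def plan_fixes(report):
    """Return (scenario, action, evidence) for every non-passing scenario.

    ASSUMPTION: report["results"] is a list of dicts with "scenario_name",
    "verdict", and "evidence" keys. The real JSON layout may differ.
    """
    return [
        (r["scenario_name"], ACTIONS[r["verdict"]], r.get("evidence", ""))
        for r in report["results"]
        if r["verdict"] != "pass"
    ]


class RetryTracker:
    """Counts consecutive `fail` verdicts per scenario across lifecycle runs."""

    def __init__(self):
        self.streaks = defaultdict(int)

    def record(self, report):
        """Update streaks from one run; return scenarios that need escalation."""
        for r in report["results"]:
            if r["verdict"] == "fail":
                self.streaks[r["scenario_name"]] += 1
            else:
                self.streaks[r["scenario_name"]] = 0  # any other verdict breaks the streak
        return sorted(
            name for name, count in self.streaks.items()
            if count >= MAX_CONSECUTIVE_FAILURES
        )


# Hypothetical report, shaped per the assumption above.
run = {"results": [
    {"scenario_name": "rejects duplicate email", "verdict": "fail",
     "evidence": "expected 409, got 500"},
    {"scenario_name": "registers new user", "verdict": "pass", "evidence": ""},
]}

tracker = RetryTracker()
for _ in range(3):
    escalate = tracker.record(run)

print(plan_fixes(run))
print("escalate to human:", escalate)
```

Keeping the streak per scenario, rather than per run, matches the rule that escalation is triggered by the same scenario failing three times in a row.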
Step 5: Guard gate (pre-commit / CI)
# Pre-commit hook
agent-spec guard --spec-dir specs --code . --change-scope staged
# CI (GitHub Actions)
agent-spec guard --spec-dir specs --code . --change-scope worktree
Runs lint + verify on ALL specs against current changes. Blocks commit/PR if any spec fails.
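If the hook or CI step is driven from a script rather than a shell one-liner, the two invocations above can be built from one helper. The sketch below uses only the flags shown in this step; the `context` names and the idea of forwarding the CLI's exit code to block the commit are assumptions, not documented behavior.

```python
import shutil
import subprocess

# Change scope per context, as shown in the two commands above.
SCOPES = {"pre-commit": "staged", "ci": "worktree"}


def guard_argv(context, spec_dir="specs", code_dir="."):
    """Build the guard command line for a given context ("pre-commit" or "ci")."""
    if context not in SCOPES:
        raise ValueError(f"unknown context: {context!r}")
    return [
        "agent-spec", "guard",
        "--spec-dir", spec_dir,
        "--code", code_dir,
        "--change-scope", SCOPES[context],
    ]


def run_guard(context):
    """Run guard and return its exit code.

    ASSUMPTION: a non-zero exit code means at least one spec failed.
    """
    if shutil.which("agent-spec") is None:
        raise RuntimeError(
            "agent-spec CLI not found. Install with: cargo install agent-spec"
        )
    return subprocess.run(guard_argv(context), check=False).returncode


print(" ".join(guard_argv("pre-commit")))
```

A pre-commit hook written this way would end with something like `raise SystemExit(run_guard("pre-commit"))`.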
Step 6: Contract Acceptance replaces Code Review (human attention: 30%)
Human reviews a Contract-level summary, not a code diff:
agent-spec explain specs/user-registration.spec --code . --format markdown
Reviewer judges two questions:
- Is the Contract definition correct? (Intent, Decisions, Boundaries make sense?)
- Did all verifications pass? (4/4 pass including error paths?)
If both "yes" → approve. This is 10x faster than reading code diffs.
Check retry history if needed:
agent-spec explain specs/user-registration.spec --code . --history
Assisting Contract Acceptance
When helping a human review a completed task:
- Run `agent-spec explain <spec> --code . --format markdown` and present the output
- If human asks about retry history: run with `--history` flag
- If human asks about specific failures: run `agent-spec lifecycle <spec> --code . --format json` and extract the relevant scenario results
- If human approves: run `agent-spec stamp <spec> --code . --dry-run` and present the trailers
Step 7: Stamp and archive
agent-spec stamp specs/user-registration.spec --dry-run
# Output: Spec-Name: 用户注册API
# Spec-Passing: true
# Spec-Summary: 4/4 passed, 0 failed, 0 skipped, 0 uncertain
Establishes Contract → Commit traceability chain.
Verdict Interpretation
| Verdict | Meaning | Action |
|---|---|---|
| `pass` | Scenario verified | No action needed |
| `fail` | Scenario failed verification | Read evidence, fix code |
| `skip` | Test not found or not run | Add missing test or fix selector |
| `uncertain` | AI stub / manual review needed | Review manually or enable AI backend |
Key rule: skip != pass. All four verdicts are distinct.
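To illustrate why the four verdicts must stay distinct, here is a minimal sketch that tallies verdicts into a summary line in the same style as the `Spec-Summary` trailer. It is not how agent-spec computes its result. Treating an empty verdict list as not passing is an assumption, chosen to match the zero-scenario rule.

```python
from collections import Counter

VERDICTS = ("pass", "fail", "skip", "uncertain")


def summarize(verdicts):
    """Return (passing, summary) for a list of scenario verdicts.

    `passing` is True only when there is at least one scenario and every
    verdict is "pass"; skip and uncertain never count as pass.
    """
    verdicts = list(verdicts)
    unknown = set(verdicts) - set(VERDICTS)
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    counts = Counter(verdicts)
    total = len(verdicts)
    passing = total > 0 and counts["pass"] == total
    summary = (
        f"{counts['pass']}/{total} passed, {counts['fail']} failed, "
        f"{counts['skip']} skipped, {counts['uncertain']} uncertain"
    )
    return passing, summary


print(summarize(["pass", "pass", "pass", "skip"]))
```

A run with three passes and one skip is reported as not passing, which is the behavior the key rule asks for.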
VCS Awareness
agent-spec auto-detects the VCS from the project root. Behavior differs between git and jj:
| Condition | Behavior |
|---|---|
| `.jj/` exists (even with `.git/`) | Use `--change-scope jj` instead of `worktree` |
| jj repo | Do NOT run git add or git commit — jj auto-snapshots all changes |
| jj repo | stamp output includes Spec-Change: trailer with jj change ID |
| jj repo | explain --history shows file-level diffs between runs (via operation IDs) |
| Only `.git/` | Use standard git commands (`--change-scope staged` or `worktree`) |
| Neither | Change scope detection unavailable; use --change <path> explicitly |
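The table can be read as a simple decision procedure. The sketch below mirrors it from the caller's side, for an agent deciding which `--change-scope` value to pass. The function name is hypothetical, and it makes no claim about how agent-spec's own auto-detection is implemented.

```python
from pathlib import Path


def pick_change_scope(root, git_scope="worktree"):
    """Choose a --change-scope value for the repository at `root`.

    Returns "jj" when .jj/ exists (even alongside .git/), the requested git
    scope when only .git exists, and None when neither is present, in which
    case the caller should pass --change <path> explicitly.
    """
    root = Path(root)
    if (root / ".jj").exists():
        return "jj"
    if (root / ".git").exists():
        if git_scope not in ("staged", "worktree"):
            raise ValueError(f"unsupported git scope: {git_scope!r}")
        return git_scope
    return None


print(pick_change_scope("."))
```

Checking `.jj/` first encodes the first table row: a colocated jj and git repository is treated as a jj repository.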
Change Set Options
| Flag | Behavior | Default |
|---|---|---|
| `--change <path>` | Explicit file/dir for boundary checking | (none) |
| `--change-scope staged` | Git staged files | guard default |
| `--change-scope worktree` | All git working tree changes | (none) |
| `--change-scope jj` | Jujutsu VCS changes | (none) |
| `--change-scope none` | No change detection | lifecycle/verify default |
Advanced Features
Verification Layers
# Run only specific layers
agent-spec lifecycle specs/task.spec --code . --layers lint,boundary,test
# Available: lint, boundary, test, ai
Run Logging
agent-spec lifecycle specs/task.spec --code . --run-log-dir .agent-spec/runs
agent-spec explain specs/task.spec --history
AI Mode
agent-spec verify specs/task.spec --code . --ai-mode off # default - no AI
agent-spec verify specs/task.spec --code . --ai-mode stub # testing only
agent-spec lifecycle specs/task.spec --code . --ai-mode caller # agent-as-verifier
AI Verification: Caller Mode
When --ai-mode caller is used, the calling Agent acts as the AI verifier. This is a two-step protocol:
Step 1: Emit AI requests
agent-spec lifecycle specs/task.spec --code . --ai-mode caller --format json
If any scenarios are skipped (no mechanical verifier covered them), the output JSON includes:
- `"ai_pending": true`
- `"ai_requests_file": ".agent-spec/pending-ai-requests.json"`
The pending requests file contains AiRequest objects with scenario context, code paths, contract intent, and constraints.
Step 2: Resolve with external decisions
The Agent reads the pending requests, analyzes each scenario, then writes decisions:
[
  {
    "scenario_name": "场景名称",
    "model": "claude-agent",
    "confidence": 0.92,
    "verdict": "pass",
    "reasoning": "All steps verified by code analysis"
  }
]
Then merges them back:
agent-spec resolve-ai specs/task.spec --code . --decisions decisions.json
This produces a final merged report where Skip verdicts are replaced with the Agent's AI decisions.
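A minimal sketch of producing the decisions file programmatically follows. The five field names come from the example above; the helper names, the set of verdicts accepted for a decision, and the 0 to 1 confidence range are assumptions made for illustration.

```python
import json
from pathlib import Path

# ASSUMPTION: a decision may carry any of the four verdicts from the verdict table.
VERDICTS = {"pass", "fail", "skip", "uncertain"}


def make_decision(scenario_name, verdict, confidence, reasoning, model="claude-agent"):
    """Build one decision object with the five fields shown above."""
    if verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    # ASSUMPTION: confidence is a probability-like value in [0, 1].
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return {
        "scenario_name": scenario_name,
        "model": model,
        "confidence": confidence,
        "verdict": verdict,
        "reasoning": reasoning,
    }


def write_decisions(decisions, path="decisions.json"):
    """Write decisions as a UTF-8 JSON array, keeping non-ASCII names readable."""
    Path(path).write_text(
        json.dumps(decisions, ensure_ascii=False, indent=2) + "\n",
        encoding="utf-8",
    )


decisions = [
    make_decision("场景名称", "pass", 0.92, "All steps verified by code analysis"),
]
print(json.dumps(decisions, ensure_ascii=False, indent=2))
```

Calling `write_decisions(decisions)` writes `decisions.json` in the current directory, ready to be passed to `resolve-ai` via `--decisions`.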
When to use caller mode:
- When the calling Agent (Claude, Codex, etc.) can read and reason about code
- For scenarios that can't be verified by tests alone (design intent, code quality)
- When you want the Agent to be both implementor and verifier
When to Use / When NOT to Use
| Scenario | Use agent-spec? | Why |
|---|---|---|
| Clear feature with defined inputs/outputs | Yes | Contract can express deterministic acceptance criteria |
| Bug fix with reproducible steps | Yes | Great for "given bug X, when fixed, then Y" |
| Exploratory prototyping | No | You don't know "what is done" yet - vibe code first |
| Large architecture refactor | No | Boundaries hard to define, "better architecture" isn't testable |
| Security/compliance rules | Yes (org.spec) | Encode rules once, enforce mechanically everywhere |
Gradual Adoption
Week 1-2: Pick 2-3 clear bug fixes, write Contracts for them
Week 3-4: Expand to new feature development
Week 5-8: Create project.spec with team coding standards
Month 3+: Consider org.spec for cross-project governance
Common Errors
| Error | Cause | Solution |
|---|---|---|
| Guard reports N specs failing | Specs have lint or verify issues | Run lifecycle on each failing spec individually |
| `skip` verdict on scenario | Test selector doesn't match any test | Check Test: / Package: / Filter: in spec |
| Quality score below threshold | Too many lint warnings | Fix vague verbs, add quantifiers, improve testability |
| Boundary violation detected | Changed file outside allowed paths | Either update Boundaries or revert the change |
| `uncertain` on all AI scenarios | Using --ai-mode stub or no backend | Expected; review manually |
| Agent keeps failing lifecycle | Contract criteria too vague or too strict | Improve Completion Criteria specificity |
Command Priority
| Preference | Use | Instead of |
|---|---|---|
| `contract` | Render task contract | brief (legacy alias) |
| `lifecycle` | Full pipeline | verify alone (misses lint) |
| `guard` | Repo-wide | Multiple individual lifecycle calls |
| `--change` | Explicit paths known | --change-scope when paths are known |
| CLI commands | Tool-first approach | spec-gateway library API |
When to Switch to Authoring Mode
During implementation, if you discover:
- A missing exception path that should be in Completion Criteria
- A Boundary that's too restrictive (need to modify more files than allowed)
- A Decision that needs to change (technology choice was wrong)
Switch to agent-spec-authoring skill, update the Contract FIRST, re-run agent-spec lint to validate the change, then resume implementation. Do NOT silently work outside the Contract's boundaries.
Escalation
Switch to library integration only when:
- Embedding `agent-spec` into another Rust agent runtime
- Testing `spec-gateway` internals
- Injecting a host `AiBackend` via `verify_with_backend(Arc<dyn AiBackend>)`