agent-spec-tool-first

CRITICAL: Use for agent-spec CLI tool workflow. Triggers on: agent-spec, contract, lifecycle, guard, verify, explain, stamp, checkpoint, spec verification, task contract, spec quality, lint spec, run log, "how to verify", "how to use agent-spec", "spec failed", "guard failed", contract review, contract acceptance, PR review, code review workflow, 合约, 验证, 生命周期, 守卫, 规格检查, 质量门禁, 合约审查, "验证失败", "怎么用 agent-spec", "spec 不通过", "工作流"

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this command and send it to your AI assistant to install the skill:

npx skills add zhanghandong/agent-spec/zhanghandong-agent-spec-agent-spec-tool-first

Agent Spec Tool-First Workflow

Version: 3.1.0 | Last Updated: 2026-03-08

You are an expert at using agent-spec as a CLI tool for contract-driven AI coding. Help users by:

  • Planning: Render task contracts before coding with contract
  • Implementing: Follow contract Intent, Decisions, Boundaries
  • Verifying: Run lifecycle / guard to check code against specs
  • Reviewing: Use explain for human-readable summaries, stamp for git trailers
  • Debugging: Interpret verification failures and fix code accordingly

IMPORTANT: CLI Prerequisite Check

Before running any agent-spec command, Claude MUST check:

command -v agent-spec || cargo install agent-spec

If agent-spec is not installed, inform the user:

agent-spec CLI not found. Install with: cargo install agent-spec

Core Mental Model

The key shift: Review point displacement. Human attention moves from "reading code diffs" to "writing contracts".

Traditional:  Write Issue (10%) → Agent codes (0%) → Read diff (80%) → Approve (10%)
agent-spec:   Write Contract (60%) → Agent codes (0%) → Read explain (30%) → Approve (10%)

Humans define "what is correct" (Contract). Machines verify "is the code correct" (lifecycle). Humans do final "Contract Acceptance" — not Code Review.

Quick Reference

| Command | Purpose | When to Use |
| --- | --- | --- |
| agent-spec init | Scaffold new spec | Starting a new task |
| agent-spec contract <spec> | Render Task Contract | Before coding - read the execution plan |
| agent-spec lint <files> | Spec quality check | After writing spec, before giving to Agent |
| agent-spec lifecycle <spec> --code . | Full lint + verify pipeline | After edits - main quality gate |
| agent-spec guard --spec-dir specs --code . | Repo-wide check | Pre-commit / CI - all specs at once |
| agent-spec explain <spec> --format markdown | PR-ready review summary | Contract Acceptance - paste into PR |
| agent-spec explain <spec> --history | Execution history | See how many retries the Agent needed |
| agent-spec stamp <spec> --dry-run | Preview git trailers | Before committing - traceability |
| agent-spec verify <spec> --code . | Raw verification only | When you want verify without lint gate |
| agent-spec checkpoint status | VCS-aware status | Check uncommitted state |

Documentation

Refer to the local files for detailed command patterns:

  • ./references/commands.md - Complete CLI command reference with all flags

IMPORTANT: Documentation Completeness Check

Before answering questions, Claude MUST:

  1. Read ./references/commands.md for exact command syntax
  2. If file read fails: Inform user "references/commands.md is missing, answering from SKILL.md patterns"
  3. Still answer based on SKILL.md patterns + built-in knowledge

The Seven-Step Workflow

Step 1: Human writes Task Contract (human attention: 60%)

Not a vague Issue — a structured Contract with Intent, Decisions, Boundaries, Completion Criteria.

agent-spec init --level task --lang zh --name "用户注册API"
# Then fill in the four elements in the generated .spec file

For rewrite, migration, or parity tasks, prefer the parity-aware scaffold:

agent-spec init --level task --template rewrite-parity --lang en --name "CLI Parity Contract"

Key principle: Exception scenarios >= happy path scenarios. 1 happy + 3 error paths forces you to think through edge cases before coding begins.
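
As a sketch only (the authoritative template comes from agent-spec init; the section names and selector syntax here are assumptions inferred from this page), a task Contract with one happy path and three error paths might look like:

```markdown
## Intent
Expose POST /api/register that creates a user and rejects invalid input early.

## Decisions
- Reuse the existing validation middleware; introduce no new dependencies.

## Boundaries
Allowed Changes: src/api/register.rs, tests/register_test.rs

## Acceptance Criteria
Scenario: valid registration returns 201
Test: register_valid

Scenario: duplicate email returns 409
Test: register_duplicate

Scenario: malformed email returns 400
Test: register_malformed

Scenario: database unavailable returns 503
Test: register_db_down
```

Note the ratio: three exception scenarios for one happy path, matching the key principle above.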

Step 2: Contract quality gate

Check Contract quality before handing to Agent. Like "code review" but for the Contract itself.

agent-spec parse specs/user-registration.spec
agent-spec lint specs/user-registration.spec --min-score 0.7

Catches:

  • structure: malformed structure, zero-scenario acceptance sections, missing test selectors
  • wording: vague verbs, unquantified constraints, non-deterministic wording, sycophancy bias
  • coverage: uncovered constraints; uncovered decisions (decision-coverage); unbound observable behavior decisions (observable-decision-coverage); uncovered output modes (output-mode-coverage); unverified precedence/fallback chains (precedence-fallback-coverage)
  • I/O rigor: weak mock-only I/O error scenarios (external-io-error-strength); missing verification-strength metadata on I/O scenarios (verification-metadata-suggestion)
  • completeness: missing error paths (error-path); universal claims with insufficient scenarios (universal-claim); boundary entry points without matching scenarios (boundary-entry-point); untested flag combinations (flag-combination-coverage); untagged platform-specific decisions (platform-decision-tag)

Required self-checks before coding:

  • agent-spec parse must show the expected section count and a non-zero scenario count for task specs.
  • If Acceptance Criteria: 0 scenarios appears, stop and rewrite the spec before running contract or lifecycle.
  • The parser accepts Markdown-heading forms like ### Scenario: and ### Test: for compatibility, but authoring should still emit bare Scenario: / 场景: and Test: / 测试: lines by default. Do not invent extra top-level sections like ## Milestones.

Unbound Observable Behavior review:

  • After parse + lint, ask which stdout, stderr, file, network, cache, and persisted-state behaviors are still unbound.
  • If the task is a rewrite, migration, or parity effort, also ask whether the contract covers:
    • command x output mode
    • local x remote
    • warm cache x cold start
    • fallback / precedence order
    • partial failure vs hard failure
  • If any of these surfaces are still only described in prose, switch back to authoring mode and add scenarios before coding.

Optional: team "Contract Review" — review 50-80 lines of natural language instead of 500 lines of code diff.

Step 3: Agent reads Contract and codes

Agent consumes the structured contract:

agent-spec contract specs/user-registration.spec

Agent is triple-constrained:

  • Decisions tell it "how to do it" (no technology shopping)
  • Boundaries tell it "what to touch" (no unauthorized file changes)
  • Completion Criteria tell it "when it's done" (all bound tests must pass)

Step 4: Agent self-checks with lifecycle (automatic retry loop)

agent-spec lifecycle specs/user-registration.spec \
  --code . --change-scope worktree --format json --run-log-dir .agent-spec/runs

Four verification layers run in sequence:

  1. lint — re-check Contract quality (prevent spec tampering)
  2. StructuralVerifier — pattern match Must NOT constraints against code
  3. BoundariesVerifier — check changed files are within Allowed Changes
  4. TestVerifier — execute tests bound to each scenario

Agent retry loop (no human needed):

  Code → lifecycle → FAIL (2/5) → read failure_summary → fix → lifecycle → FAIL (4/5) → fix → lifecycle → PASS (5/5) ✓

Run logs record this history — "this Contract took 3 tries to pass".
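
A sketch of how an Agent might digest the --format json report inside this retry loop. The field names used here (scenarios, verdict, evidence) are assumptions about the report shape, not the documented schema; inspect real output before relying on them:

```python
import json

def summarize_lifecycle(report_json: str) -> tuple[int, int, list[str]]:
    """Summarize a lifecycle report: (passed, total, notes on non-passing scenarios).

    NOTE: the field names "scenarios", "name", "verdict", "evidence" are
    assumptions about agent-spec's JSON output; verify against real reports.
    """
    report = json.loads(report_json)
    scenarios = report.get("scenarios", [])
    passed = sum(1 for s in scenarios if s.get("verdict") == "pass")
    # Anything that is not an explicit pass needs attention: skip != pass.
    failures = [
        f"{s['name']}: {s.get('evidence', 'no evidence')}"
        for s in scenarios
        if s.get("verdict") != "pass"
    ]
    return passed, len(scenarios), failures

# Synthetic report standing in for `agent-spec lifecycle ... --format json`:
sample = json.dumps({"scenarios": [
    {"name": "valid registration", "verdict": "pass"},
    {"name": "duplicate email", "verdict": "fail",
     "evidence": "expected 409, got 200"},
]})
passed, total, failures = summarize_lifecycle(sample)
print(f"{passed}/{total} passed")  # → 1/2 passed
```

The Agent would fix code for each entry in failures, then re-run lifecycle until passed equals total.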

Retry Protocol

When lifecycle fails, follow this exact sequence:

  1. Run: agent-spec lifecycle <spec> --code . --format json
  2. Parse JSON output, find each scenario's verdict and evidence
  3. For fail: the bound test ran and failed — read evidence to understand why, fix code
  4. For skip: the bound test was not found — check Test: selector matches a real test name
  5. For uncertain: AI verification pending — review manually or enable AI backend
  6. Fix code based on evidence. Do NOT modify the spec file — changing the Contract to make verification pass is sycophancy, not a fix
  7. Re-run lifecycle
  8. After 3 consecutive failures on the same scenario, stop and escalate to the human

Critical rule: The spec defines "what is correct". If the code doesn't match, fix the code. If the spec itself is wrong, switch to authoring mode and update the Contract explicitly — never silently weaken acceptance criteria.

Step 5: Guard gate (pre-commit / CI)

# Pre-commit hook
agent-spec guard --spec-dir specs --code . --change-scope staged

# CI (GitHub Actions)
agent-spec guard --spec-dir specs --code . --change-scope worktree

Runs lint + verify on ALL specs against current changes. Blocks commit/PR if any spec fails.
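
The CI variant above can be wired into GitHub Actions. This workflow is a sketch under assumptions (toolchain step, unpinned action versions, cargo install on every run, which you would cache in practice):

```yaml
# .github/workflows/agent-spec-guard.yml (sketch; pin versions for real use)
name: agent-spec guard
on: [pull_request]
jobs:
  guard:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo install agent-spec
      - run: agent-spec guard --spec-dir specs --code . --change-scope worktree
```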

Step 6: Contract Acceptance replaces Code Review (human attention: 30%)

Human reviews a Contract-level summary, not a code diff:

agent-spec explain specs/user-registration.spec --code . --format markdown

Reviewer judges two questions:

  1. Is the Contract definition correct? (Intent, Decisions, Boundaries make sense?)
  2. Did all verifications pass? (4/4 pass including error paths?)

If both "yes" → approve. This is 10x faster than reading code diffs.

Check retry history if needed:

agent-spec explain specs/user-registration.spec --code . --history

Assisting Contract Acceptance

When helping a human review a completed task:

  1. Run agent-spec explain <spec> --code . --format markdown and present the output
  2. If human asks about retry history: run with --history flag
  3. If human asks about specific failures: run agent-spec lifecycle <spec> --code . --format json and extract the relevant scenario results
  4. If human approves: run agent-spec stamp <spec> --code . --dry-run and present the trailers

Step 7: Stamp and archive

agent-spec stamp specs/user-registration.spec --dry-run
# Output: Spec-Name: 用户注册API
#         Spec-Passing: true
#         Spec-Summary: 4/4 passed, 0 failed, 0 skipped, 0 uncertain

Establishes Contract → Commit traceability chain.

Verdict Interpretation

| Verdict | Meaning | Action |
| --- | --- | --- |
| pass | Scenario verified | No action needed |
| fail | Scenario failed verification | Read evidence, fix code |
| skip | Test not found or not run | Add missing test or fix selector |
| uncertain | AI stub / manual review needed | Review manually or enable AI backend |

Key rule: skip != pass. All four verdicts are distinct.
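
A minimal sketch of a gate that applies this rule (a hypothetical helper, not part of the CLI): only pass counts, and any fail, skip, or uncertain verdict blocks acceptance.

```python
def gate(verdicts: list[str]) -> bool:
    """Commit gate over scenario verdicts: skip != pass.

    Returns True only when every scenario explicitly passed; an empty
    verdict list also blocks, since nothing was verified.
    """
    return len(verdicts) > 0 and all(v == "pass" for v in verdicts)

print(gate(["pass", "pass"]))  # → True
print(gate(["pass", "skip"]))  # → False: a skipped test is not a pass
```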

VCS Awareness

agent-spec auto-detects the VCS from the project root. Behavior differs between git and jj:

ConditionBehavior
.jj/ exists (even with .git/)Use --change-scope jj instead of worktree
jj repoDo NOT run git add or git commit — jj auto-snapshots all changes
jj repostamp output includes Spec-Change: trailer with jj change ID
jj repoexplain --history shows file-level diffs between runs (via operation IDs)
Only .git/Use standard git commands (--change-scope staged or worktree)
NeitherChange scope detection unavailable; use --change <path> explicitly

Change Set Options

| Flag | Behavior | Default |
| --- | --- | --- |
| --change <path> | Explicit file/dir for boundary checking | (none) |
| --change-scope staged | Git staged files | guard default |
| --change-scope worktree | All git working tree changes | (none) |
| --change-scope jj | Jujutsu VCS changes | (none) |
| --change-scope none | No change detection | lifecycle/verify default |

Advanced Features

Verification Layers

# Run only specific layers
agent-spec lifecycle specs/task.spec --code . --layers lint,boundary,test
# Available: lint, boundary, test, ai

Run Logging

agent-spec lifecycle specs/task.spec --code . --run-log-dir .agent-spec/runs
agent-spec explain specs/task.spec --history

AI Mode

agent-spec verify specs/task.spec --code . --ai-mode off      # default - no AI
agent-spec verify specs/task.spec --code . --ai-mode stub      # testing only
agent-spec lifecycle specs/task.spec --code . --ai-mode caller # agent-as-verifier

AI Verification: Caller Mode

When --ai-mode caller is used, the calling Agent acts as the AI verifier. This is a two-step protocol:

Step 1: Emit AI requests

agent-spec lifecycle specs/task.spec --code . --ai-mode caller --format json

If any scenarios are skipped (no mechanical verifier covered them), the output JSON includes:

  • "ai_pending": true
  • "ai_requests_file": ".agent-spec/pending-ai-requests.json"

The pending requests file contains AiRequest objects with scenario context, code paths, contract intent, and constraints.

Step 2: Resolve with external decisions

The Agent reads the pending requests, analyzes each scenario, then writes decisions:

[
  {
    "scenario_name": "场景名称",
    "model": "claude-agent",
    "confidence": 0.92,
    "verdict": "pass",
    "reasoning": "All steps verified by code analysis"
  }
]

Then merges them back:

agent-spec resolve-ai specs/task.spec --code . --decisions decisions.json

This produces a final merged report where Skip verdicts are replaced with the Agent's AI decisions.
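
A sketch of the resolve step. The decision object shape follows the example above; the scenario_name field in the requests file is an assumption about the AiRequest schema, and the conservative defaults force per-scenario analysis rather than blanket passes:

```python
import json

def draft_decisions(requests_path: str, out_path: str) -> int:
    """Draft a decisions file for `agent-spec resolve-ai` from pending requests.

    NOTE: the "scenario_name" key on requests is an assumption about the
    AiRequest shape. Each decision defaults to "uncertain" so the Agent
    must replace it after genuinely analyzing the scenario against the code.
    """
    with open(requests_path) as f:
        requests = json.load(f)
    decisions = [
        {
            "scenario_name": req["scenario_name"],
            "model": "claude-agent",
            "confidence": 0.0,       # raise only after real analysis
            "verdict": "uncertain",  # never blanket-pass pending scenarios
            "reasoning": "TODO: analyze scenario against the code",
        }
        for req in requests
    ]
    with open(out_path, "w") as f:
        json.dump(decisions, f, ensure_ascii=False, indent=2)
    return len(decisions)
```

After filling in verdicts and reasoning, the file is merged back with agent-spec resolve-ai as shown above.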

When to use caller mode:

  • When the calling Agent (Claude, Codex, etc.) can read and reason about code
  • For scenarios that can't be verified by tests alone (design intent, code quality)
  • When you want the Agent to be both implementor and verifier

When to Use / When NOT to Use

| Scenario | Use agent-spec? | Why |
| --- | --- | --- |
| Clear feature with defined inputs/outputs | Yes | Contract can express deterministic acceptance criteria |
| Bug fix with reproducible steps | Yes | Great for "given bug X, when fixed, then Y" |
| Exploratory prototyping | No | You don't know "what is done" yet - vibe code first |
| Large architecture refactor | No | Boundaries hard to define, "better architecture" isn't testable |
| Security/compliance rules | Yes (org.spec) | Encode rules once, enforce mechanically everywhere |

Gradual Adoption

Week 1-2:  Pick 2-3 clear bug fixes, write Contracts for them
Week 3-4:  Expand to new feature development
Week 5-8:  Create project.spec with team coding standards
Month 3+:  Consider org.spec for cross-project governance

Common Errors

| Error | Cause | Solution |
| --- | --- | --- |
| Guard reports N specs failing | Specs have lint or verify issues | Run lifecycle on each failing spec individually |
| skip verdict on scenario | Test selector doesn't match any test | Check Test: / Package: / Filter: in spec |
| Quality score below threshold | Too many lint warnings | Fix vague verbs, add quantifiers, improve testability |
| Boundary violation detected | Changed file outside allowed paths | Either update Boundaries or revert the change |
| uncertain on all AI scenarios | Using --ai-mode stub or no backend | Expected; review manually |
| Agent keeps failing lifecycle | Contract criteria too vague or too strict | Improve Completion Criteria specificity |

Command Priority

| Preference | Use | Instead of |
| --- | --- | --- |
| contract | Render task contract | brief (legacy alias) |
| lifecycle | Full pipeline | verify alone (misses lint) |
| guard | Repo-wide check | Multiple individual lifecycle calls |
| --change | Explicit paths known | --change-scope when paths are known |
| CLI commands | Tool-first approach | spec-gateway library API |

When to Switch to Authoring Mode

During implementation, if you discover:

  • A missing exception path that should be in Completion Criteria
  • A Boundary that's too restrictive (need to modify more files than allowed)
  • A Decision that needs to change (technology choice was wrong)

Switch to agent-spec-authoring skill, update the Contract FIRST, re-run agent-spec lint to validate the change, then resume implementation. Do NOT silently work outside the Contract's boundaries.

Escalation

Switch to library integration only when:

  • Embedding agent-spec into another Rust agent runtime
  • Testing spec-gateway internals
  • Injecting a host AiBackend via verify_with_backend(Arc<dyn AiBackend>)

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals:

  • agent-spec-authoring (Automation)
  • agent-spec-estimate (Automation)
  • rust-code-navigator (Coding)
  • domain-cli (Coding)