Spec Smith
Turn ephemeral plans into structured, persistent specs built through deep
research and iterative interviews. Specs have phases, tasks, resume context,
and a decision log. They live in .specs/ at the project root and work
with any AI coding tool that can read markdown.
Whether .specs/ is committed is repository policy. Respect .gitignore
and the user's preference for tracked vs local-only spec state.
Critical Invariants
- Single-file policy: Keep this workflow in one SKILL.md file.
- Canonical paths:
  - Registry: .specs/registry.md
  - Per-spec files: .specs/<id>/SPEC.md, .specs/<id>/research-*.md, .specs/<id>/interview-*.md
- Authority rule: SPEC.md frontmatter is authoritative. Registry is a denormalized index for quick lookup.
- Active-spec rule: Target exactly one active spec at a time.
- Parser policy: Use best-effort parsing with clear warnings and repair guidance instead of hard failure on malformed rows.
Claude Code Plugin
If running as a Claude Code plugin, slash commands like /specsmith:forge,
/specsmith:resume, /specsmith:pause etc. are available. See the
plugin's commands/ directory for the full set. The /forge command
replaces plan mode with deep research, iterative interviews, and spec
writing.
Session Start
If active-spec context was injected by host tooling, use it directly instead of reading files. Otherwise, fall back to reading files manually:
- Read .specs/registry.md to check for a spec with active status
- If one exists, briefly mention it: "You have an active spec: User Auth System (5/12 tasks, Phase 2). Say 'resume' to pick up where you left off."
- Don't force it — the user might want to do something else first
Deterministic Edge Cases (Best-Effort)
| Situation | Required behavior |
|---|---|
| .specs/registry.md missing | If .specs/ exists, report "No registry yet" and offer to initialize it. If .specs/ is missing, report "No specs yet" and continue normally. |
| Malformed registry row | Skip malformed row, emit warning with row text, continue parsing remaining rows. |
| Multiple active rows | Warn user. Pick the row with the newest Updated date (or first active row if dates are unavailable) for this run. On next write, normalize to a single active spec. |
| Registry row exists but .specs/<id>/SPEC.md missing | Warn and continue. Keep row visible in list/status with (SPEC.md missing). |
| Registry and SPEC conflict | Trust SPEC.md, then repair registry values on next write. |
| No active spec | List available specs and ask which to activate or resume. |
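The best-effort behavior in the table above could be sketched roughly as follows. This is a minimal illustration, not part of the workflow itself: the names parse_registry, pick_active, and RegistryRow are hypothetical, and the sketch assumes the six-column registry layout and ISO dates (YYYY-MM-DD) in the Updated column.

```python
import re
from dataclasses import dataclass

# Assumed column order of .specs/registry.md.
COLUMNS = ["id", "title", "status", "priority", "progress", "updated"]


@dataclass
class RegistryRow:
    id: str
    title: str
    status: str
    priority: str
    progress: str
    updated: str


def parse_registry(text):
    """Return (rows, warnings). Malformed rows are skipped, never fatal."""
    rows, warnings = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # heading, blank line, or prose
        cells = [c.strip() for c in line.strip("|").split("|")]
        if cells[0].lower() == "id":
            continue  # header row
        if all(re.fullmatch(r":?-+:?", c) for c in cells):
            continue  # separator row
        if len(cells) != len(COLUMNS) or not cells[0]:
            warnings.append(f"Skipping malformed registry row: {line}")
            continue
        rows.append(RegistryRow(*cells))
    return rows, warnings


def pick_active(rows):
    """Return (row_or_None, warnings) for the spec to treat as active."""
    active = [r for r in rows if r.status == "active"]
    if len(active) <= 1:
        return (active[0] if active else None), []
    warnings = ["Multiple active specs found; normalize to one on next write."]
    # ISO dates compare correctly as plain strings.
    dated = [r for r in active if re.fullmatch(r"\d{4}-\d{2}-\d{2}", r.updated)]
    chosen = max(dated, key=lambda r: r.updated) if dated else active[0]
    return chosen, warnings
```

Returning warnings alongside the result, rather than raising, keeps one bad row from blocking the rest of the session.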
Working on a Spec
Resuming
When the user says "resume", "what was I working on", or similar:
- Read .specs/registry.md and find the spec with active status. If none, list specs and ask which to resume
- Load .specs/<id>/SPEC.md
- Parse progress:
  - Count completed [x] vs total tasks per phase
  - Find current phase (first [in-progress] phase)
  - Find current task (← current marker, or first unchecked in current phase)
- Read the Resume Context section
- Present a compact summary:
  Resuming: User Auth System
  Progress: 5/12 tasks (Phase 2: OAuth Integration)
  Current: Implement Google OAuth callback handler
  Context: Token exchange is working. Need to handle the callback URL parsing and store refresh tokens in the user model.
  Next file: src/auth/oauth/google.ts
- Begin working on the current task; don't wait for permission
Implementing
After completing each task, immediately edit the SPEC.md file to record progress. Do not wait until the end of a session or until asked — update the spec as you go:
- Check off the completed task: - [ ] -> - [x]
- Move ← current to the next unchecked task
- When all tasks in a phase are done:
  - Phase status: [in-progress] -> [completed]
  - Next phase: [pending] -> [in-progress]
- Update the updated date in YAML frontmatter
- Update progress (X/Y) and updated date in .specs/registry.md
Update transaction (required order):
- Update SPEC.md first (status/task/phase/resume context).
- Recompute progress directly from SPEC.md checkboxes.
- Update the matching registry row (status/progress/updated).
- Re-read both files to verify consistency.
- If registry update fails, keep SPEC.md as source of truth and emit a warning with exact repair action for .specs/registry.md.
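A rough sketch of the registry half of this transaction follows. It recomputes progress from the SPEC.md checkboxes and rewrites only the matching row. The helper names count_progress and update_registry_row are hypothetical, and the six-column row layout is assumed. On failure it returns the registry unchanged plus a warning instead of raising, in keeping with the best-effort policy.

```python
import re
from datetime import date


def count_progress(spec_text):
    """Count (done, total) checkboxes directly from SPEC.md text."""
    boxes = re.findall(r"^[ \t]*- \[([ xX])\]", spec_text, flags=re.MULTILINE)
    done = sum(1 for b in boxes if b in "xX")
    return done, len(boxes)


def update_registry_row(registry_text, spec_id, status, spec_text, today=None):
    """Return (new_registry_text, warning_or_None)."""
    done, total = count_progress(spec_text)
    today = today or date.today().isoformat()
    out, found = [], False
    for line in registry_text.splitlines():
        stripped = line.strip()
        cells = [c.strip() for c in stripped.strip("|").split("|")]
        if stripped.startswith("|") and len(cells) == 6 and cells[0] == spec_id:
            # Columns: ID, Title, Status, Priority, Progress, Updated
            cells[2], cells[4], cells[5] = status, f"{done}/{total}", today
            line = "| " + " | ".join(cells) + " |"
            found = True
        out.append(line)
    if not found:
        warning = (
            f"No registry row for '{spec_id}'. SPEC.md remains the source of truth; "
            f"add a row with status={status}, progress={done}/{total}, updated={today}."
        )
        return registry_text, warning
    return "\n".join(out) + "\n", None
```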
Also:
- If a task is more complex than expected, split it into subtasks
- Update resume context at natural pauses
- Log non-obvious technical decisions to the Decision Log
- If implementation diverges from the spec (errors found, better approach discovered, assumptions proved wrong), log it in the Deviations section
Pausing
When the user says "pause", switches specs, or a session is ending:
- If there is no active spec, report that there is nothing to pause and stop.
- Capture what was happening:
  - Which task was in progress
  - What files were being modified (paths, function names)
  - Key decisions made this session
  - Any blockers or open questions
- Write this to the Resume Context section in SPEC.md
- Update checkboxes to reflect actual progress
- Move ← current marker to the right task
- Add any session decisions to the Decision Log
- Update status: paused in frontmatter
- Update the updated date
Resume Context is the most important part of pausing. Write it as if
briefing a colleague who will pick up tomorrow. Include specific file paths,
function names, and the exact next step. Vague context like "was working on
auth" is useless — write "implementing verifyRefreshToken() in
src/auth/tokens.ts, the JWT verification works but refresh rotation isn't
hooked up to the /auth/refresh endpoint yet."
Switching Between Specs
- Validate the target spec ID first. If missing, list available specs.
- Confirm .specs/<target-id>/SPEC.md exists. If not, stop with an error.
- If target is already active, report and stop.
- Pause the current active spec if one exists (full pause workflow).
- Set target status to active in frontmatter and in .specs/registry.md.
- Resume the target spec (full resume workflow).
Command Ownership Map
- SKILL.md: global invariants, lifecycle rules, state authority, and conflict handling, plus cross-tool OpenAPI behavior.
- commands/*.md: command-specific entrypoints, prompts, and output shapes.
- If there is a conflict, preserve Critical Invariants from this file and apply command-specific behavior only where it does not violate invariants.
Spec Format
Frontmatter
YAML frontmatter with: id, title, status, created, updated,
optional priority and tags.
Status values: active, paused, completed, archived
Phase Markers
[pending], [in-progress], [completed], [blocked]
Task Markers
- Checkbox syntax: - [ ] [CODE-01] is unchecked, - [x] [CODE-01] is done
- Task codes: <PREFIX>-<NN>, where the prefix is a short (2-4 letter) uppercase abbreviation of the spec (e.g., user-auth-system → AUTH). Numbers auto-increment across all phases starting at 01
- ← current after the task text marks the active task
- [NEEDS CLARIFICATION] after the task code on unclear tasks
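To make the task-line conventions concrete, here is a small sketch that parses one task line and computes the next auto-incremented code. The function names parse_task and next_task_code are hypothetical, and the sketch assumes the markers appear exactly as written above.

```python
import re

# One task line: checkbox, bracketed code, then free text with optional markers.
TASK_RE = re.compile(
    r"^\s*- \[(?P<mark>[ xX])\] \[(?P<code>[A-Z]{2,4}-\d{2,})\](?P<rest>.*)$"
)


def parse_task(line):
    """Return a dict describing a task line, or None if it is not a task."""
    m = TASK_RE.match(line)
    if not m:
        return None
    rest = m["rest"]
    text = rest.replace("[NEEDS CLARIFICATION]", "").replace("← current", "")
    return {
        "code": m["code"],
        "done": m["mark"].lower() == "x",
        "needs_clarification": "[NEEDS CLARIFICATION]" in rest,
        "current": "← current" in rest,
        "text": text.strip(),
    }


def next_task_code(spec_text, prefix):
    """Next code for this prefix; numbering runs across all phases from 01."""
    pattern = re.compile(rf"\[{re.escape(prefix)}-(\d+)\]")
    numbers = [int(n) for n in pattern.findall(spec_text)]
    return f"{prefix}-{max(numbers, default=0) + 1:02d}"
```

Scanning the whole file for the highest existing number, rather than counting tasks, keeps codes unique even after tasks are deleted or reordered.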
Resume Context
Blockquote section with specific file paths, function names, and exact next step. This is what makes cross-session continuity work.
Decision Log
Markdown table with date, decision, and rationale columns. Log non-obvious technical choices (library selection, architecture pattern, API design).
Deviations
Markdown table tracking where implementation diverged from the spec: task, what the spec said, what was actually done, and why. Only log changes that would surprise someone comparing the spec to the code.
See references/spec-format.md for the full SPEC.md template.
Forging Specs
When asked to forge, plan, spec out, or "write a spec for X", follow the full forge workflow: setup, research deeply, interview the user, iterate until clear, then write the spec.
If the environment is in read-only plan mode, do not run forge in that mode.
Ask the user to exit plan mode (Shift+Tab) and rerun /specsmith:forge.
The forge workflow never produces application code. Its outputs are only
.specs/ files: research notes, interview notes, and the SPEC.md. If the
user says "write a spec", that means write a SPEC.md — not implement the
feature. Implementation happens separately, after the user reviews and
approves the spec.
Step 1: Setup
- Generate a spec ID from the title (lowercase, hyphenated): "User Auth System" -> user-auth-system
- Collision check: If .specs/<id>/SPEC.md already exists or the ID appears in .specs/registry.md, warn the user and ask:
  - Resume the existing spec
  - Rename the new spec (suggest <id>-v2 or ask for a new title)
  - Archive the old spec and create a new one in its place
  Do not proceed until the user chooses.
- Initialize directories: mkdir -p .specs/<id>
- If .specs/registry.md doesn't exist, initialize it:
  # Spec Registry
  | ID | Title | Status | Priority | Progress | Updated |
  |----|-------|--------|----------|----------|---------|
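The ID generation and collision check in the setup steps could look roughly like this. It is a sketch only: spec_id_from_title and has_collision are hypothetical names, and treating every run of non-alphanumeric characters as one hyphen is an assumption beyond the "lowercase, hyphenated" rule.

```python
import re
from pathlib import Path


def spec_id_from_title(title):
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def has_collision(project_root, spec_id):
    """True if the spec directory has a SPEC.md or the registry lists the ID."""
    specs = Path(project_root) / ".specs"
    if (specs / spec_id / "SPEC.md").exists():
        return True
    registry = specs / "registry.md"
    if registry.exists():
        for line in registry.read_text(encoding="utf-8").splitlines():
            cells = [c.strip() for c in line.strip().strip("|").split("|")]
            if cells[0] == spec_id:
                return True
    return False
```

When has_collision returns True, the workflow still has to stop and ask the user which of the three options to take. The sketch only detects the condition.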
Step 2: Deep Research
Research is the foundation of a good spec. Be exhaustive — use every available resource. The goal is to gather enough context that the spec won't need revision mid-build.
Research runs on two parallel tracks to maximize thoroughness and speed:
Track A: Spawn the Researcher Agent
Always spawn the specsmith:researcher agent for codebase + internet
research. Don't skip this — the researcher is purpose-built for exhaustive
multi-source analysis and runs in parallel so it doesn't slow down the
workflow.
Spawn it with the Task tool, providing:
- The user's request (what they want to build/change)
- The spec ID and output path: .specs/<id>/research-01.md
- Any Context7 findings you've already gathered (Track B)
- Specific areas to focus on, if known
The researcher will:
- Map the full project architecture (read manifests, lock files, directory tree)
- Read 15-30 relevant code files and trace dependency chains
- Run 3+ web searches for best practices and current patterns
- Compare 2-4 library candidates for every choice point
- Assess security risks and performance implications
- Produce a structured research document with a completeness checklist
Track B: Context7 & Cross-Skill Research (in parallel)
While the researcher runs, do these yourself — they use MCP tools that the researcher agent doesn't have access to:
- Context7: If available (resolve-library-id / query-docs tools), pull up-to-date documentation for every key library involved. Check API changes, deprecated features, and recommended patterns for the specific versions in use. Do this for 2-5 key libraries — the ones central to the feature being built.
- Cross-skill loading: Load other available skills when relevant:
  - frontend-design: for UI-heavy specs (creative, professional design)
  - datasmith-pg: for database specs (schema design, migrations, indexing)
  - webapp-testing: for testing strategy (Playwright patterns)
  - vercel-react-best-practices: for Next.js/React performance
  - Any other relevant skill that's available
- UI research (if applicable): Take screenshots, map component hierarchy, research modern UI patterns, note accessibility requirements
Merging Research
When the researcher agent completes, read its output at
.specs/<id>/research-01.md. Merge your Context7 and cross-skill findings
into the research notes — either append to the file or keep them in mind
for the interview. The combined research should cover:
architecture, relevant code, tech stack, library comparisons, internet
research, Context7 docs, UI research (if applicable), risk assessment,
and open questions.
Step 3: Interview Round 1
Present your research findings and ask targeted questions. Your research should inform specific questions, not generic ones.
- Summarize findings (2-3 paragraphs — not a wall of text)
- State assumptions — "Based on the codebase, I'm assuming we'll use X pattern because that's what similar features use. Correct?"
- Ask 3-6 targeted questions that research couldn't answer:
  - Architecture decisions ("New module or extend existing one?")
  - Scope boundaries ("Should this handle X edge case?")
  - Technical choices ("Stick with Library A or try Library B?")
  - User-facing behavior ("What should happen when X fails?")
- Propose a rough approach and ask for reactions
STOP after presenting questions. Wait for the user to answer before proceeding. Do not answer your own questions, do not assume answers, and do not continue to Step 4 or Step 5 until the user has responded. The interview is a conversation — the user's answers shape the spec. If you skip this, the spec will be based on guesses instead of decisions.
Save to .specs/<id>/interview-01.md with: questions asked, user answers,
key decisions, and any new research needed.
Step 4: Deeper Research + Interview Loop
Based on the user's answers, do another round of research — explore the
specific paths they chose, check feasibility, find potential issues. Save
to .specs/<id>/research-02.md.
Then present deeper findings and ask about trade-offs, edge cases,
implementation sequence, and scope refinement. Save each interview round
to interview-02.md, interview-03.md, etc.
Repeat research-then-interview until:
- You have enough clarity to write a spec with no ambiguous tasks
- The user is satisfied with the direction
- Every task can be described concretely (not "figure out X")
Two rounds is typical. Don't rush it — but don't drag it out either.
Step 5: Write the Spec
Synthesize all research notes, interview answers, and decisions into a
comprehensive SPEC.md. See references/spec-format.md for the full template.
The spec should be thorough and detailed — someone reading it should be able to implement the feature without guessing. Include:
- YAML frontmatter (id, title, status, created, updated, priority, tags)
- Overview (2-4 sentences — what's being built and why)
- Architecture Diagram — ASCII art or Mermaid diagram showing the system architecture, data flow, or component relationships. Every non-trivial spec should have at least one diagram. Use ASCII for simple flows, Mermaid for complex relationships.
- Library Choices — Table comparing evaluated libraries with the selected pick and rationale. Include version numbers.
- Phases with status markers (3-6 phases is typical)
- Tasks as markdown checkboxes with task codes ([PREFIX-NN]). Be specific: include file paths, function names, and expected behavior
- Testing Strategy: a comprehensive testing plan covering unit tests, integration tests, e2e tests, and edge case tests. Specify which testing frameworks to use and what test files to create. Every feature task should have a corresponding test task.
- Resume Context section (blockquote)
- Decision Log with non-obvious technical choices from the interviews
- Deviations table (empty — filled during implementation)
Diagram guidelines:
- Use ASCII art for simple request flows and data pipelines:
  Client → API Gateway → Auth Middleware → Route Handler → Database
                                                 ↓
                                            Cache Layer
- Use Mermaid for complex architecture, state machines, and ER diagrams:
  graph TD
    A[Client] --> B[API Gateway]
    B --> C{Auth?}
    C -->|Yes| D[Handler]
    C -->|No| E[401]
- Include at least one diagram per spec (architecture, data flow, or state)
Solution quality standards:
- Proposed solutions should be simple, maintainable, and professional
- Prefer clean, modern patterns over clever hacks
- Choose the best available libraries — compare options, pick the most mature and well-maintained
- UI designs should be creative, sleek, and professional — not generic
- Code architecture should be innovative where appropriate but always clean
Coherence and logic review (mandatory before presenting):
- Read through the entire spec as a whole — does it tell a coherent story?
- Check that phases are in logical dependency order — no phase requires work from a later phase
- Verify every task is concrete and actionable (file paths, function names)
- Confirm the architecture diagram matches the task descriptions
- Check that the testing strategy covers all feature tasks
- Verify library choices are consistent throughout (no conflicting picks)
- Ensure the overview accurately summarizes what the phases will deliver
- Look for gaps — is there anything the implementation would need that isn't covered by a task?
Save to .specs/<id>/SPEC.md. Update .specs/registry.md — set
status to active.
Present the spec and wait for approval. Show the user the complete spec and ask: "Does this look right? Want to adjust anything before we start?" Do not begin implementing until the user explicitly approves. The forge workflow produces only spec files (SPEC.md, research-*.md, interview-*.md), never application code. Implementation starts only after the user approves the spec and says to proceed.
Phase/task guidelines:
- Mark Phase 1 as [in-progress], the rest as [pending]
- Mark the first unchecked task with ← current
Implementing a Spec
When the user says "implement the spec", "implement phase N", "implement all phases", or similar:
Scope Detection
Parse the user's request to determine scope:
- "implement the spec" or "implement" → Start from the current task (the ← current marker) and work forward
- "implement phase N" or "implement phase <name>" → Implement all tasks in that specific phase
- "implement all phases" or "implement everything" → Implement all remaining unchecked tasks across all phases, in order
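The scope rules above amount to a small classifier. The sketch below is one possible reading of them; detect_scope and its return shape are hypothetical, and real requests will be phrased more loosely than these patterns allow, so treat it as a starting point rather than a parser to rely on.

```python
import re


def detect_scope(request):
    """Map a user request to ("all" | "phase" | "current" | None, target)."""
    text = request.strip().lower()
    if re.search(r"\bimplement (all phases|everything)\b", text):
        return ("all", None)
    m = re.search(r"\bimplement phase (.+)$", text)
    if m:
        target = m.group(1).strip()
        # A bare number selects the phase by index; anything else is a name.
        return ("phase", int(target) if target.isdigit() else target)
    if re.search(r"\bimplement\b", text):
        return ("current", None)
    return (None, None)
```

Note that phase names come back lowercased, so matching them against SPEC.md headings should be case-insensitive.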
Implementation Flow
- Read .specs/registry.md to find the active spec
- Load .specs/<id>/SPEC.md and parse phases/tasks
- Identify the target tasks based on scope
- For each task, in order:
  a. Mark it with ← current
  b. Implement it: write the actual code
  c. Check it off: - [ ] → - [x]
  d. Remove the ← current marker
  e. When all tasks in a phase complete:
     - Phase status: [in-progress] → [completed]
     - Next phase: [pending] → [in-progress]
  f. Update updated date in frontmatter
  g. Update progress and date in .specs/registry.md
- After each task completion, update Resume Context with current state
- Log any new decisions to the Decision Log
- If implementation diverges from the spec, log it in the Deviations section
- If blocked on a task:
  - Keep the task unchecked and record blocker details in Resume Context
  - Set phase marker [blocked] only when the whole phase is blocked
  - Continue with another unblocked task only if sequencing allows it
Testing During Implementation
When implementing, follow the testing strategy from the spec:
- Write tests as specified in the testing tasks
- Run tests after each task to verify correctness
- If a test task exists for the feature task you just completed, implement the test task immediately after
Completion
When all tasks are done:
- Set all phases to [completed]
- Set spec status to completed in frontmatter and registry
- Update the updated date
- Present a summary of what was implemented
- Suggest next spec to activate if any are paused
Generating OpenAPI Docs
When the user says "generate openapi", "update api docs", or similar:
- Scan the codebase for API routes/handlers/controllers and request/response schemas.
- Infer auth/security schemes and endpoint grouping (tags).
- Write .openapi/openapi.yaml (OpenAPI 3.1.1) with:
  - operationId for every operation
  - Reusable components/schemas and $ref usage
  - Accurate parameters, request bodies, responses, and security
- Write one endpoint doc per route under .openapi/endpoints/ using {method}-{path-slug}.md names (e.g., get-api-users-id.md).
- Preserve manual additions in existing .openapi/ files when updating.
.openapi/files when updating. - Report totals: endpoints, schemas, security schemes, and manual-review candidates.
Before Session Ends
If the session is ending:
- Pause the active spec (run full pause workflow)
- Write detailed resume context
- Confirm to the user that context was saved
Directory Layout
All state lives in .specs/ at the project root:
.specs/
├── registry.md # Denormalized index for status/progress lookups
└── <spec-id>/
    ├── SPEC.md          # The spec document
    ├── research-01.md   # Deep research findings
    ├── interview-01.md  # Interview notes
    └── ...
Registry Format
.specs/registry.md is a simple markdown table:
# Spec Registry
| ID | Title | Status | Priority | Progress | Updated |
|----|-------|--------|----------|----------|---------|
| user-auth-system | User Auth System | active | high | 5/12 | 2026-02-10 |
| api-refactor | API Refactoring | paused | medium | 2/8 | 2026-02-09 |
SPEC.md frontmatter is authoritative. The registry is a denormalized index for quick lookups. Always update both together — when you change status, progress, or dates in SPEC.md, immediately mirror those changes in the registry. If they ever conflict, SPEC.md wins.
Listing Specs
Read .specs/registry.md and present specs grouped by status:
Active:
-> user-auth-system: User Auth System (5/12 tasks, Phase 2)
Paused:
|| api-refactor: API Refactoring (2/8 tasks, Phase 1)
Completed:
ok ci-pipeline: CI Pipeline Setup (8/8 tasks)
Canonical Output Templates
Use these concise formats consistently:
Resume
Resuming: <Title> (<id>)
Progress: <done>/<total> tasks
Phase: <phase name>
Current: <task text>
Context: <one to three lines from Resume Context>
List
Active:
-> <id>: <Title> (<done>/<total>, <phase>) [<priority>]
Paused:
|| <id>: <Title> (<done>/<total>, <phase>) [<priority>]
Completed:
ok <id>: <Title> (<done>/<total>) [<priority>]
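A minimal sketch of rendering the List template follows. It assumes each spec is a dict carrying the registry columns plus an optional phase value; the phase is not stored in the registry, so it would have to come from SPEC.md. The name render_list and the dict keys are hypothetical.

```python
# (status value, group heading, line prefix) in display order.
GROUPS = [
    ("active", "Active", "->"),
    ("paused", "Paused", "||"),
    ("completed", "Completed", "ok"),
]


def render_list(specs):
    """Render specs grouped by status, skipping empty groups."""
    out = []
    for status, heading, icon in GROUPS:
        members = [s for s in specs if s["status"] == status]
        if not members:
            continue
        out.append(f"{heading}:")
        for s in members:
            detail = s["progress"]
            # Completed specs omit the phase, per the template.
            if status != "completed" and s.get("phase"):
                detail += f", {s['phase']}"
            out.append(f"{icon} {s['id']}: {s['title']} ({detail}) [{s['priority']}]")
    return "\n".join(out)
```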
Status
<Title> [<status>, <priority>]
Created: <date> | Updated: <date>
Phase <n>: <name> [<marker>]
Progress: <done>/<total> (<pct>%)
Current: <task text or none>
Completing a Spec
- Verify all tasks are checked (warn if not, but allow override)
- Set status to completed in frontmatter and registry
- Update the updated date in both
- Suggest next spec to activate if any are paused
Archiving a Spec
Archive completed specs to keep the registry clean:
- Set status to archived in frontmatter and registry
- Research files (research-*.md, interview-*.md) in .specs/<id>/ can optionally be deleted (the SPEC.md has all the decisions and context)
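Specs can be archived from completed or paused status. To reactivate
an archived spec, set its status back to active in both frontmatter and
registry, pausing any currently active spec first so that only one spec
is active at a time.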
Deleting a Spec
To remove a spec entirely:
- Delete .specs/<id>/ (contains SPEC.md, research notes, interviews)
- Remove the row from .specs/registry.md
This is irreversible — consider archiving instead if you might need it later.
Cross-Tool Compatibility
The spec format is pure markdown with YAML frontmatter. Any tool that can read and write files can use these specs:
- Claude Code: Full plugin support or skill via npx skills add
- Codex: Snippet in AGENTS.md or skill via npx skills add
- Cursor / Windsurf / Cline: Snippet in rules file
- Gemini CLI: Snippet in GEMINI.md
- Humans: Readable and editable in any text editor
- Git: Diffs cleanly, easy to track in version control
To configure another tool, run npx skills add ngvoicu/specsmith -a <tool>.
Behavioral Notes
Be proactive about spec management. If you notice the user has been working for a while and made progress, update the spec without being asked. If a session is ending, offer to pause and save context.
Specs should evolve. It's fine to add tasks, reorder phases, or split a phase into two as understanding deepens. Specs aren't contracts — they're living documents that adapt as you learn more about the problem.
The Decision Log matters. When the user makes a non-obvious technical choice (library selection, architecture pattern, API design), log it with the rationale. Future-you resuming this spec will thank present-you.
Don't over-structure. A spec with 3 phases and 15 tasks is useful. A spec with 12 phases and 80 tasks is a project plan, not a coding spec. Keep it lean enough to parse and act on in one read.
Respect the user's flow. Don't interrupt deep coding work to update the spec. Batch updates for natural pauses — task completion, phase transitions, or session boundaries.