fix

Make risky or unclear code safe with the smallest sound, validated change, then keep listening to the final self-review until no actionable self-review change remains. When the validated changeset survives the post-self-review rerun, $fix stops at that boundary; broader architecture, product, or roadmap analysis belongs to another skill.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "fix" with this command: npx skills add tkersey/dotfiles/tkersey-dotfiles-fix

Fix

Intent

Make risky or unclear code safe with the smallest sound, validated change, then keep listening to the final self-review until no actionable self-review change remains. When the validated changeset survives the post-self-review rerun, $fix stops at that boundary; broader architecture, product, or roadmap analysis belongs to another skill.

Double Diamond fit

fix spans the full Double Diamond, but keeps divergence mostly internal to stay autonomous:

  • Discover: reproduce/characterize + collect counterexamples.

  • Define: state the contract and invariants (before/after).

  • Develop: (internal) consider 2-3 plausible fix strategies; pick the smallest sound one per default policy.

  • Deliver: implement + prove with a real validation signal.

If a fix requires a product-sensitive choice (cannot be derived or characterized safely), stop and ask (or invoke $creative-problem-solver when multiple viable strategies exist).

Skill composition (required)

  • $invariant-ace : default invariant/protocol engine for fix ; run it before invariant-affecting edits.

  • $complexity-mitigator : default complexity analysis engine; use it for risk-driven complexity judgments before reshaping code.

  • $refine : default path when the task is to improve fix itself (or other skills); apply $refine workflow and quick_validate .

  • Delegation order when both are needed: $invariant-ace first, then $complexity-mitigator .

Delegation triggers (deterministic):

  • Run $invariant-ace when the slice touches stateful boundaries (parse/construct/API/DB/lock/txn/retry ordering), ownership/lifetime, or any "should never happen" claim.

  • Run $complexity-mitigator when auditability is at risk (for example: branch soup, deep nesting, cross-file reasoning, or unclear ownership/control flow).

  • Run $refine when the requested target is a skill artifact (for example SKILL.md , agents/openai.yaml , skill scripts/references/assets).

  • If both invariant and complexity triggers fire, run $invariant-ace first, then $complexity-mitigator .

Inputs

  • User request text.

  • Repo state (code + tests + scripts).

  • Validation signal (failing test/repro/log) OR a proof hook you create.

Outputs (chat)

  • Emit the exact sections in Deliverable format (chat) .

  • Use section headings verbatim (for example, Findings (severity order) , not Findings ).

  • During execution, emit one-line pass progress updates at pass start and pass end using: Pass <n>/<total_planned>: <name> — <start|done>; edits=<yes|no|n/a>; signal=<cmd|n/a>; result=<ok|fail|n/a> .

  • In Validation , include a machine-checkable JSON object with keys: baseline_cmd , baseline_result , proof_hook , final_cmd , final_result .

  • Keep self-review transcript shapes aligned with references/self_review_loop_examples.md when editing this contract.

Embedded mode (when $fix is invoked inside another skill)

  • You may emit a compact Fix Record instead of the full deliverable.

  • If another skill requires a primary artifact (for example patch-only diff), append Fix Record immediately after that artifact in the same assistant message.

  • Do not mention “Using $fix” without emitting either the full deliverable or a Fix Record.

Hard rules (MUST / MUST NOT)

  • MUST default to review + implement (unless review-only is requested).

  • MUST triage and present findings in severity order: security > crash > corruption > logic.

  • MUST NOT claim done without a passing validation signal.

  • MUST NOT edit when no local signal/proof hook can be found or created under Validation signal selection ; fail fast and report blocker.

  • MUST NOT do product/feature work.

  • MUST NOT do intentional semantic changes without clarifying, except correctness tightening.

  • MUST resolve trade-offs using Default policy (non-interactive) .

  • MUST emit pass progress updates while running the multi-pass loop.

  • MUST include a final Pass trace section in the deliverable/Fix Record with executed pass count and per-pass outcomes.

  • MUST include machine-checkable validation evidence keys in Validation : baseline_cmd , baseline_result , proof_hook , final_cmd , final_result .

  • MUST use the exact heading names from Deliverable format (chat) / Fix Record ; do not alias or shorten heading labels.

  • MUST produce a complete finding record for every acted-on issue:

  • counterexample

  • invariant_before

  • invariant_after

  • fix

  • proof

  • proof_strength

  • compatibility_impact

  • MUST follow the delegation contract in Skill composition (required) (including order: $invariant-ace -> $complexity-mitigator ).

  • MUST route skill-self edits (for example codex/skills/fix ) through $refine and run quick_validate .

  • MUST NOT put fixable items in Residual risks / open questions ; if it is fixable under the autonomy gate + guardrails, treat it as a finding and fix it.

  • MUST include a final Self-review loop trace section in the deliverable/Fix Record.

  • MUST run the final agent-directed self-review loop only after at least one change is applied and validation is passing.

  • MUST run the self-review loop against the final validated changeset only.

  • MUST invalidate and rerun the self-review loop if any non-self-review edit occurs after a self-review round.

  • MUST ask internally exactly: If you could change one thing about this changeset what would you change?

  • MUST treat the final agent-directed self-review phase as current-worktree scoped once the first validated changeset exists; do not reject a self-review suggestion solely because it broadens the diff.

  • MUST treat a self-review answer as actionable whenever it identifies a concrete compatible/provable improvement on the current validated changeset, even if the improvement is ergonomic, structural, or API-shaping rather than a baseline bug fix.

  • MUST answer that question internally, apply at most one new actionable self-review change per self-round, re-run validation, and repeat until a self-round yields no new actionable self-review change or only blocked changes.

  • MUST NOT reject a self-review suggestion solely because the baseline is already green, the concern sounds architectural, or the change would reshape a public/API seam; if it can stay backward-compatible and be revalidated locally, implement it.

  • MUST, when a self-review critique is broader than the smallest bug repair, apply the narrowest compatible/provable slice that materially addresses the critique before considering broader follow-up skills.

  • MUST rerun the non-self-review $fix passes once after the self-review loop reaches no_new_actionable_changes or blocked .

  • MUST NOT use scope_guardrail as the reason to reject a self-review suggestion once the self-review phase has started.

  • MUST NOT report a self-review answer that was already applied before the final self-review round; record finding=none only when the current final validated changeset yields no concrete compatible/provable self-review change.

  • MUST NOT emit If you could change one thing about this changeset what would you change? as a user-facing terminal line during normal successful completion.

  • MUST stop $fix once no new actionable self-review change remains and the post-self-review rerun is clean; do not continue under $fix into broader architecture, product, roadmap, or conceptual analysis.

  • MUST, if the user asks for broader or bolder analysis after a clean or closed $fix pass, close the $fix deliverable first and recommend the next skill explicitly ($grill-me , $parse , $plan , or $creative-problem-solver ) instead of continuing under $fix .

  • When paired with $tk in wave execution, MUST treat $fix as the final mutating pass before artifactization:

  • commit_first : hand off immediately to $commit after passing validation.

  • patch_first : hand off immediately to $patch after passing validation.

Default policy (non-interactive)

Use these defaults to maximize autonomy (avoid asking).

Priority order (highest first):

  • correctness + data safety + security

  • compatibility for PROVEN_USED behavior

  • performance

Definitions:

  • PROVEN_USED = behavior with evidence (see checklist below).

  • NON_PROVEN_USED = everything else.

PROVEN_USED evidence checklist (deterministic)

Goal: classify behavior as PROVEN_USED vs NON_PROVEN_USED without asking.

Algorithm:

  • Enumerate affected behavior tokens:

  • API symbols (functions/types/methods).

  • CLI flags/commands.

  • config keys / env vars.

  • file/wire formats (field names, JSON keys).

  • error codes/messages consumed by callers.

  • Collect evidence in this order (stop at first match):

  • User repro/signal references the token.

  • Tests assert on the token/behavior.

  • Docs/README/examples/CHANGELOG/config examples describe or demonstrate the token/behavior.

  • Repo callsites use the token (non-test, non-doc).

  • IF any evidence exists, THEN mark PROVEN_USED; ELSE mark NON_PROVEN_USED.

Evidence scan procedure (repo-local):

  • Tests: search tests/ , test/ , tests/ , spec/ , and files matching test / .spec. .

  • Docs/examples: search README* , docs/ , examples/ , CHANGELOG* , config.example* , and *.md .

  • Callsites: search remaining source files.

  • CI/config surfaces: also scan .github/workflows/ for env vars, flags, and script entrypoints.

Tooling hint: prefer rg -n "<token>" with --glob filters. Deterministic scan command template (run in this order):

  • rg -n "<token>" tests test tests spec --glob 'test' --glob '.spec.'

  • rg -n "<token>" README* docs examples CHANGELOG* config.example* --glob '*.md'

  • rg -n "<token>" .github/workflows

  • rg -n "<token>" .

Externally-used surface checklist (deterministic)

Treat a surface as external if ANY is true:

  • Token appears in docs/README/examples.

  • Token is a CLI flag/command, config key, env var, or file format field.

  • Token is exported/public API (examples: Rust pub , TS/JS export , Python in all , Go exported identifier).

If external use is plausible but uncertain:

  • Prefer an additive compatibility path (wrapper/alias/adapter) if small.

  • Ask only if a breaking change is unavoidable.

Compatibility rules:

  • MUST preserve PROVEN_USED behavior unless it is unsafe (crash/corruption/security). If unsafe, tighten and return a clear error.

  • For NON_PROVEN_USED behavior, treat it as undefined; tightening is allowed.

API/migration rules:

  • IF a fix touches an externally-used surface (use Externally-used surface checklist ), THEN prefer additive/backward-compatible changes (new option/new function/adapter) over breaking changes.

  • IF a breaking change is unavoidable, THEN stop and ask.

Performance rules:

  • MUST avoid obvious asymptotic regressions in hot paths.

  • IF performance impact is plausible but unmeasurable locally, THEN choose the smallest correctness-first fix and record the risk in Residual risks / open questions (do not ask).

Autonomy gate (conviction)

Proceed without asking only if ALL are true:

  • SIGNAL: you have (or can create) a local repro/signal without product ambiguity.

  • CONTRACT: you can derive contract from repo evidence OR characterize current behavior without product ambiguity.

  • INVARIANT: you can state invariant_before and invariant_after.

  • DIFF: the change is localized and reviewable for the chosen strategy.

  • PROOF: at least one validation signal passes after changes.

If ANY gate fails:

  • Apply Defaults + contract derivation.

  • Ask only if still blocked.

Correctness tightening (default)

Definition: tightening = rejecting invalid/ambiguous states earlier, removing silent failure paths, or making undefined behavior explicit.

Rule:

  • IF tightening prevents crash/corruption/security issue, THEN apply it (even if PROVEN_USED), return a clear error, and record the compatibility impact.

  • ELSE IF tightening affects only NON_PROVEN_USED inputs/states, THEN apply it without asking.

  • ELSE (tightening might change PROVEN_USED behavior):

  • prefer a backward-compatible shape (adapter/additive API), OR

  • lock current behavior with a characterization test,

  • ask only if compatibility cannot be preserved.

Allowed tightening moves (examples only):

  • Parse/coerce: implicit coercion/partial parse -> explicit parse + error (JS implicit string->number; Rust parse().ok() fallback).

  • Fallbacks: warn+fallback/default-without-signal -> explicit error OR explicit "unknown"/"invalid".

  • Lossy conversion: truncation/clamp/wrap -> checked conversion + error (Rust as ; C casts; Go int conversions).

  • Partial updates: multi-step mutation -> atomic/transactional boundary OR explicit partial result.

  • Ignored errors: ignore -> propagate with context; if intentionally ignored -> compensating signal (assert/test/metric/log).

  • Sentinels: -1 /0 /"" /None -> richer returns.

Defaults (use only when semantics-preserving for valid inputs)

Ownership / lifetimes

  • MUST inventory in-slice resources: allocations, handles, sockets, locks, txns, temp files.

  • For each resource, record: acquire_site, owner, release_action.

  • MUST ensure every exit releases exactly once.

  • MUST free/release ONLY via the proven owner/allocator/handle.

  • Never assume a "free" is safe without proving allocator/ownership.

  • Arena/bump allocators: per-object free is safe only if explicitly documented as permitted/no-op.

Validation signal selection

Algorithm:

  • IF the user provided a command, use it.

  • ELSE choose the cheapest local signal (no network) in this order:

  • README/QUICKSTART

  • scripts/check / scripts/test

  • Makefile / justfile / Taskfile.yml

  • ELSE create a proof hook in this order:

  • focused regression/characterization test

  • boundary assertion that fails loudly

  • scoped + rate-limited diagnostic log tied to ONE invariant (never log secrets/PII)

No-signal fail-fast (deterministic):

  • If no signal/proof hook is available after executing the selection algorithm once, stop before editing.

  • Return one blocker with blocked_by=no_repro_or_proof and include the exact commands you attempted.

PR/diff scope guardrail (default in review mode)

  • Default slice = changed lines/paths (git diff/PR diff) + at most one boundary seam (parse/construct/API edge) required to make the fix sound.

  • Do not fix pre-existing issues outside the slice unless:

  • severity is security/crash/corruption AND

  • the fix is localized and provable without widening the slice.

  • Otherwise: record as Residual risks / open questions .

Scope widening trigger (deterministic):

  • Widen beyond the diff only when ALL are true:

  • severity is security , crash , or corruption

  • you can show a concrete causal chain from a diff token to the out-of-slice location

  • the widening is limited to one adjacent seam (at most one additional file/module boundary)

  • one local validation signal can prove the widened fix

  • If any check fails, keep scope fixed and record blocked_by=scope_guardrail .

  • This guardrail applies to the core review passes.

  • Once the final agent-directed self-review loop starts, scope expands to the current repo/worktree and scope_guardrail is not a valid reason to reject the self-review suggestion.

Generated / third-party code guardrail

  • If a file appears generated or is under third-party/build output, do not edit it directly.

  • Prefer to locate the source-of-truth + regeneration step; apply fixes there and regenerate as needed.

  • Common no-edit zones (examples): dist/ , build/ , vendor/ , node_modules/ .

Proof discipline for passing baselines

When the baseline signal is ok:

  • For every acted-on finding, prefer a proof hook that fails before the fix (focused regression/characterization test).

  • If you cannot produce a failing proof hook, do not edit; attempt to create one first by:

  • turning the counterexample into a focused regression/characterization test

  • enforcing the invariant at a single boundary seam (parse/refine once) and testing the new error

  • reducing effects to a pure helper and unit-testing it

  • using an existing fuzzer/property test harness if present

  • Only if you still cannot create a proof hook without product ambiguity, record the blocker as residual risk.

  • Self-review exception: when the self-review finding is a concrete compatible simplification or auditability improvement on an already-green changeset, a fresh failing hook is preferred but not mandatory; the primary validation bundle may serve as the proof hook if it still proves the improved invariant after the change.

Residual risks / open questions policy (last resort)

Residual risks are a record of what you could not safely fix (not a to-do list).

Rules:

  • Before emitting a residual item, attempt to convert it into an actionable finding:

  • produce a concrete counterexample

  • attach a proof hook (test/assert) that fails before the fix

  • apply the smallest localized fix within guardrails

  • re-run the validation signal

  • Only emit residual items that are truly blocked by one of these blockers:

  • product_ambiguity (semantics cannot be derived/characterized safely)

  • breaking_change (no additive path; fix would be breaking)

  • no_repro_or_proof (cannot create a repro/proof hook locally)

  • scope_guardrail (outside diff slice and not severe enough to widen)

  • generated_output (generated/third-party output; need source-of-truth + regen)

  • external_dependency (needs network/creds/services/hardware)

  • perf_unmeasurable (impact plausible; no local measurement)

  • For self-review-originated changes, do not emit blocked_by=scope_guardrail ; if scope is the only blocker, widen and continue.

  • Every residual bullet MUST include: a location or token, blocked_by=<product_ambiguity|breaking_change|no_repro_or_proof|scope_guardrail|generated_output|external_dependency|perf_unmeasurable> , and next=<one action> .

  • If there are no residual items, output - None .

Multi-pass loop (default)

Goal: reduce missed issues in PR/diff reviews without widening scope.

Run 3 core passes (mandatory). Run 2 additional delta passes only if pass 3 edits code.

Pass 1) Safety (highest severity)

  • Scope: diff-driven slice + required boundary seams.

  • Focus: security/crash/corruption hazards, unsafe tightening, missing error propagation.

  • Change budget: smallest sound fix only.

Pass 2) Surface (compat + misuse)

  • Scope: externally-used surfaces touched by the diff (exports/CLI/config/format/docs).

  • Focus: additive compatibility, defuse top footguns, clearer errors.

  • Change budget: additive/wrapper/adapter preferred; breaking change => stop and ask.

Pass 3) Audit (invariants + ownership + proof quality)

  • Scope: final diff slice.

  • Focus: invariants enforced at strongest cheap boundary; ownership release-on-all-paths; proof strength.

  • Delegation: run $invariant-ace first for invariant framing, then $complexity-mitigator for complexity verdicts that affect auditability.

  • Change budget: no refactors unless they directly reduce risk/auditability of invariants.

Early exit (stop after pass 3):

  • If pass 3 applies no edits, stop (after running/confirming the primary validation signal).

Delta passes (only if pass 3 applied edits)

Pass 4) Safety delta rescan

  • Scope: ONLY lines/paths changed by passes 1-3 and their immediate boundaries.

  • Focus: new security/crash/corruption hazards introduced by the fixes.

Pass 5) Surface + proof delta rescan

  • Re-enumerate behavior tokens from the FINAL diff (including newly introduced errors/flags/exports/config keys).

  • Re-run PROVEN_USED/external-surface checks for newly introduced tokens.

  • Ensure proof is still strong for the final diff.

Rules:

  • After any pass that edits code, run a local signal before continuing.

  • Do not edit on suspicion: every edit must have a concrete counterexample, invariant_before/after, and a proof hook.

  • During the self-review phase, the counterexample may be an auditability or responsibility-split failure in the current validated changeset; if the change is compatible and locally validated, that is actionable under $fix even when the baseline was already green.

  • Merge/de-duplicate findings across passes; final output format stays unchanged.

  • Emit pass progress updates in real time at pass start/end; these updates are required and do not replace the final Pass trace section.

Clarify before changes

Stop and ask ONLY if any is true:

  • Behavior is contradictory/product-sensitive AND cannot be derived from tests/docs/callsites AND cannot be characterized safely.

  • A breaking API change or irreversible migration is unavoidable (no backward-compatible path).

  • No local signal/proof hook can be found or created without product ambiguity.

Finding record schema (internal)

For every issue you act on, construct this record before editing:

  • id: F<number>

  • location: file:line

  • severity: security|crash|corruption|logic

  • issue:

  • tokens: <affected behavior tokens; empty if none>

  • proven_used: <yes/no + evidence if yes>

  • external_surface: <yes/no + why>

  • diff_touch: <yes/no>

  • counterexample: <input/timeline>

  • invariant_before: <what is allowed/assumed today>

  • invariant_after: <what becomes guaranteed/rejected>

  • fix:

  • proof: <test/assert/log + validation command + result>

  • proof_strength: characterization|targeted_regression|property_or_fuzz

  • compatibility_impact: none|tightening|additive|breaking

Canonical finding example (fully-filled)

Use this as the reference shape when writing findings.

F1 src/config_loader.py:88 — crash — untrusted config type causes uncaught attribute access

  • Surface: Tokens=config.path; PROVEN_USED=yes (tests/config/test_loader.py::test_reads_path); External=yes; Diff_touch=yes
  • Counterexample: {"path":null} reaches cfg.path.strip() and raises AttributeError
  • Invariant (before): loader assumes path is a non-empty string
  • Invariant (after): loader accepts only non-empty string path; invalid type returns explicit error
  • Fix: add boundary validation in parse_config() and reject invalid path before use
  • Proof: uv run pytest tests/config/test_loader.py::test_rejects_null_path -> ok
  • Proof strength: targeted_regression
  • Compatibility impact: tightening

Residual risks / open questions

  • vendor/generated/config_schema.py — blocked_by=generated_output — next=edit source schema and regenerate artifact

Workflow (algorithm)

  1. Preflight
  • Determine mode: review-only vs fix.

  • If the requested target is this skill (or another skill), route through $refine :

  • follow $refine Discover -> Define -> Develop -> Deliver,

  • apply minimal skill diffs,

  • run uv run --with pyyaml -- python3 codex/skills/.system/skill-creator/scripts/quick_validate.py codex/skills/<skill-name> .

  • if codex/skills/fix/SKILL.md was touched, also run uv run python codex/skills/fix/scripts/lint_fix_skill_contract.py codex/skills/fix/SKILL.md .

  • Return to the rest of fix only if the request also includes code/runtime defects.

  • Define slice:

  • If in a git repo and there is a diff, derive slice/tokens from the diff (paths, changed symbols/strings).

  • Otherwise: entrypoint, inputs, outputs, state.

  • If the request is PR-scoped (for example "$fix this PR ", "$fix current branch ", CI failure on a PR), anchor the slice to PR diff/base and keep the work PR-local unless severity widening is required.

  • Apply PR/diff scope guardrail for review mode.

  • Apply Generated / third-party code guardrail before editing.

  • Select validation signal (or create proof hook).

  • If signal/proof creation fails, stop before editing and return blocked_by=no_repro_or_proof with attempted commands.

  1. Contract + baseline
  • Determine PROVEN_USED behavior:

  • Apply PROVEN_USED evidence checklist to the behavior tokens affected by the slice.

  • Derive contract without asking (in this order):

  • tests that exercise the slice

  • callsites in the repo

  • docs/README/examples

  • if none: add a characterization test for current behavior

  • Write contract (1 sentence): "Working means …".

  • Run baseline signal once; record result.

  • If baseline signal is ok, apply Proof discipline for passing baselines .

  1. Create initial findings
  • Enumerate candidate failure modes for the slice.

  • Rank security > crash > corruption > logic.

  • For each issue you will act on, create a finding record.

  1. Mandatory scans (multi-pass)

Run the Multi-pass loop (default) for the touched slice.

3a) Unsoundness scan

For each hazard class that applies:

  • Identify the first failure point.

  • Provide a concrete counterexample (input/timeline).

  • Specify the smallest sound fix that removes the bug-class.

Hazard classes:

  • Security boundaries: authz/authn, injection, path traversal, SSRF, unsafe deserialization, secrets/PII handling.

  • Nullability/uninitialized state; sentinel values.

  • Ownership/lifetime: leaks, double-free, use-after-close, lock not released.

  • Concurrency/order/time: races, lock ordering, reentrancy, timeouts/retries, TOCTOU.

  • Bounds/arithmetic: indexing/slicing, overflow/underflow, signedness.

  • Persistence/atomicity: partial writes, torn updates, non-transactional multi-step updates.

  • Encoding/units: bytes vs chars, timezone/locale, unit mismatches, lossy normalization.

  • Error handling: swallowed errors, missing context, silent fallback.

Language cues (examples only):

  • Rust: unsafe , unwrap/expect , as casts.

  • JS/TS: any , implicit coercions, undefined flows.

  • Python: bare except , truthiness-based branching.

  • C/C++: raw pointers, unchecked casts, manual malloc/free .

3b) Invariant strengthening scan

Delegate to $invariant-ace and import its artifacts into fix .

Default execution:

  • Run $invariant-ace Compact Mode first.

  • Capture at minimum:

  • Counterexample

  • Invariants

  • Owner and Scope

  • Enforcement Boundary

  • Seam (Before -> After)

  • Verification

  • Map these to fix finding fields:

  • invariant_before from broken trace + prior scope,

  • invariant_after from chosen predicate(s) + holds scope,

  • fix from seam,

  • proof from verification signal.

  • Escalate to full $invariant-ace protocol if Compact Mode does not yield an inductive predicate.

3c) Footgun scan + defusal

Trigger: you touched an API surface OR a caller can plausibly misuse the code.

For top-ranked misuse paths:

  • Provide minimal misuse snippet + surprising behavior.

  • Defuse by changing the surface:

  • options structs / named params

  • split booleans into enums or separate functions

  • explicit units/encodings

  • richer returns (no sentinels)

  • separate pure computation from effects

  • Lock with a regression test or boundary assertion.

3d) Complexity scan (risk-driven)

Delegate to $complexity-mitigator for analysis; implement only via fix .

Rule: reshape only when it reduces risk and improves auditability of invariants/ownership.

Algorithm:

  • Run $complexity-mitigator on the touched slice (heat read + essential/incidental verdict + ranked options + TRACE).

  • Select the smallest viable cut that directly supports a finding (safety/surface/invariant ownership/proof quality).

  • If simplification depends on an unstated invariant, run/refresh $invariant-ace first.

  • Keep complexity-only cleanup out of scope unless it closes a concrete risk.

  1. Implement fixes (per finding)

For findings in severity order:

  • Implement the smallest sound fix that removes the bug-class.

  • Apply correctness tightening when allowed.

  • Ensure invariants + ownership hold on all paths.

  • Keep diff reviewable; avoid drive-by refactors.

  1. Close the loop (required)
  • Run the chosen validation signal.

  • IF it fails:

  • update findings with the new counterexample,

  • apply the smallest additional fix,

  • re-run the SAME signal.

  • Repeat until the signal passes.

  • If you cannot make progress after 3 repair cycles, stop and ask with:

  • the last failing output (key lines/stack),

  • what you tried,

  • the smallest remaining decision that blocks you.

  1. Residual risk sweep (required)
  • Re-check Residual risks / open questions policy .

  • Any item without a valid blocker becomes a finding and is fixed (with proof) or dropped.

  1. Agent-directed self-review loop (required final step when a changeset exists)
  • Precondition gate: run this step only when Changes applied is not None and the latest validation signal result is ok .

  • Freeze the self-review baseline as the latest validated changeset.

  • Ask internally exactly: If you could change one thing about this changeset what would you change?

  • Summarize the self-round in Self-review loop trace with one delta row.

  • If the answer yields one new actionable self-review change:

  • convert it into one concrete finding,

  • treat compatible ergonomic/structural/API-shaping improvements as actionable when they materially improve the current validated changeset and can be proven locally,

  • widen anywhere in the current repo/worktree as needed,

  • apply the smallest sound change that materially addresses the critique,

  • re-run the chosen validation signal and update findings/proof,

  • record the (validated_changeset_fingerprint, normalized_answer_summary) pair so repeated suggestions cannot loop forever,

  • repeat from step 2.

  • If the answer yields only blocked changes, record stop_reason=blocked , carry blockers to Residual risks / open questions , and continue to step 8.

  • If the answer yields no new actionable self-review change, record stop_reason=no_new_actionable_changes and continue to step 8.

  • This check is against the current final validated changeset, not an earlier pre-delta review state.

  • already green , architectural , not fix-shaped , or no failing proof hook are not sufficient reasons by themselves when a concrete compatible/provable improvement still exists.

  • Run the non-self-review $fix passes once against the full resulting changeset.

  • If that rerun edits code, discard the stale self-review state, revalidate, and restart from step 2 against the new final changeset.

  • If that rerun applies no edits, exit the loop and proceed to output.

  • Skip gate: if Changes applied is None or the run is blocked before edits, output - None (skip_gate) in Self-review loop trace and proceed to output.

7a) Post-fix boundary (required)

  • If a clean or closed $fix pass surfaces broader non-fix opportunities, do not continue exploring them under $fix .

  • If the user explicitly asks for broader or deeper follow-up after closure, recommend the next skill explicitly:

  • $parse for architecture or purpose analysis

  • $grill-me for interrogation, pressure-testing, or narrowing the next move

  • $plan for a decision-complete roadmap

  • $creative-problem-solver for wider option generation

  • Do not place those broader opportunities in Residual risks / open questions unless a valid blocker from the allowed set applies.

  1. Output lock (required)

Before sending the final message, verify all are true:

  • Heading set is exact and complete (Contract , Findings (severity order) , Changes applied , Pass trace , Validation , Self-review loop trace , Residual risks / open questions ).

  • Pass trace includes planned/executed counts and P1/P2/P3 lines (plus P4/P5 when executed).

  • Runtime pass updates (Pass <n>/<total_planned>: ... ) were emitted during execution.

  • If embedded mode was used, include Fix Record in the same assistant message (after any required artifact).

  • Validation includes machine-checkable keys: baseline_cmd , baseline_result , proof_hook , final_cmd , final_result .

  • Every acted-on finding includes Proof strength and Compatibility impact using the allowed enums.

  • Every residual blocked_by uses only: product_ambiguity|breaking_change|no_repro_or_proof|scope_guardrail|generated_output|external_dependency|perf_unmeasurable .

  • If Changes applied is not None AND the latest validation result is ok , Self-review loop trace includes a terminal row with stop_reason=<no_new_actionable_changes|blocked> and the user-facing final line is not the question.

Deliverable format (chat)

Output exactly these sections in this order.

If no findings:

  • Findings (severity order): None .

  • Changes applied: None .

  • Self-review loop trace: - None (skip_gate) .

  • Residual risks / open questions: - None .

  • Still include Pass trace and Validation.

Contract

Findings (severity order) For each finding:

  • F# <file:line> — <security|crash|corruption|logic> —

  • Surface: Tokens=<...>; PROVEN_USED=<yes/no + evidence>; External=<yes/no>; Diff_touch=<yes/no>

  • Counterexample: <input/timeline>

  • Invariant (before): <what was assumed/allowed>

  • Invariant (after): <what is now guaranteed/rejected>

  • Fix:

  • Proof: <test/assert/log + validation command> -> <ok/fail>

  • Proof strength: <characterization|targeted_regression|property_or_fuzz>

  • Compatibility impact: <none|tightening|additive|breaking>

Changes applied

Pass trace

  • Core passes planned: 3 ; core passes executed: <3>

  • Delta passes planned: <0|2> ; delta passes executed: <0|2>

  • Total passes executed: <3|5>

  • P1 Safety -> <done> ; edits=<yes|no> ; signal=<cmd|n/a> ; result=<ok|fail|n/a>

  • P2 Surface -> <done> ; edits=<yes|no> ; signal=<cmd|n/a> ; result=<ok|fail|n/a>

  • P3 Audit -> <done> ; edits=<yes|no> ; signal=<cmd|n/a> ; result=<ok|fail|n/a>

  • If delta passes executed, also include P4 and P5 lines in the same format.

Validation

  • -> <ok/fail>

  • {"baseline_cmd":"<cmd|n/a>","baseline_result":"<ok|fail|n/a>","proof_hook":"<test/assert/log|n/a>","final_cmd":"<cmd>","final_result":"<ok|fail>"} (single-line JSON)

Self-review loop trace

  • If none: - None (skip_gate)

  • Otherwise: S# prompt=If you could change one thing about this changeset what would you change? ; answer_summary=<...>; finding=<F#|none> ; change_applied=<yes|no> ; proof=<cmd|n/a> ; result=<ok|fail|n/a> ; stop_reason=<continue|no_new_actionable_changes|blocked>

Residual risks / open questions

  • If none: - None

  • Otherwise: - <file:line or token> — blocked_by=<product_ambiguity|breaking_change|no_repro_or_proof|scope_guardrail|generated_output|external_dependency|perf_unmeasurable> — next=<one action>

Fix Record (embedded mode only)

Use only when $fix is invoked inside another skill.

Findings (severity order) For each finding:

  • F# <file:line> — <security|crash|corruption|logic> —

  • Surface: Tokens=<...>; PROVEN_USED=<yes/no + evidence>; External=<yes/no>; Diff_touch=<yes/no>

  • Counterexample: <input/timeline>

  • Invariant (before): <what was assumed/allowed>

  • Invariant (after): <what is now guaranteed/rejected>

  • Fix:

  • Proof: <test/assert/log + validation command> -> <ok/fail>

  • Proof strength: <characterization|targeted_regression|property_or_fuzz>

  • Compatibility impact: <none|tightening|additive|breaking>

Changes applied

Pass trace

  • Core passes planned: 3 ; core passes executed: <3>

  • Delta passes planned: <0|2> ; delta passes executed: <0|2>

  • Total passes executed: <3|5>

  • P1 Safety -> <done> ; edits=<yes|no> ; signal=<cmd|n/a> ; result=<ok|fail|n/a>

  • P2 Surface -> <done> ; edits=<yes|no> ; signal=<cmd|n/a> ; result=<ok|fail|n/a>

  • P3 Audit -> <done> ; edits=<yes|no> ; signal=<cmd|n/a> ; result=<ok|fail|n/a>

  • If delta passes executed, also include P4 and P5 lines in the same format.

Validation

  • -> <ok/fail>

  • {"baseline_cmd":"<cmd|n/a>","baseline_result":"<ok|fail|n/a>","proof_hook":"<test/assert/log|n/a>","final_cmd":"<cmd>","final_result":"<ok|fail>"} (single-line JSON)

Self-review loop trace

  • If none: - None (skip_gate)

  • Otherwise: S# prompt=If you could change one thing about this changeset what would you change? ; answer_summary=<...>; finding=<F#|none> ; change_applied=<yes|no> ; proof=<cmd|n/a> ; result=<ok|fail|n/a> ; stop_reason=<continue|no_new_actionable_changes|blocked>

Residual risks / open questions

  • If none: - None

  • Otherwise: - <file:line or token> — blocked_by=<product_ambiguity|breaking_change|no_repro_or_proof|scope_guardrail|generated_output|external_dependency|perf_unmeasurable> — next=<one action>

Pitfalls

  • Vague advice without locations.

  • "Nice-to-have" refactors that don’t reduce risk.

  • Feature creep.

  • Continuing broad repo critique or planning under a completed $fix label.

  • Asking for risk tolerance instead of applying the default policy.

  • Tightening presented as optional.

  • Ownership fixes without proven allocator/owner.

  • Editing generated/third-party outputs instead of source-of-truth.

  • Code edits without a failing proof hook when baseline was ok.

  • Using Residual risks / open questions as a substitute for fixable findings.

Activation cues

  • "resolve" / "fix" / "crash" / "data corruption"

  • "$fix this PR" / "$fix current branch" / "fix this branch"

  • "CI failed" / "fix the red checks" / "repair failing PR"

  • "footgun" / "misuse" / "should never happen"

  • "invariant" / "lifetime" / "nullable surprise"

  • "too complex" / "branch soup" / "cross-file hops"

  • "refine $fix" / "improve fix skill workflow" / "update fix skill docs"

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

grill-me

No summary provided by upstream source.

Repository SourceNeeds Review
General

complexity-mitigator

No summary provided by upstream source.

Repository SourceNeeds Review
General

mesh

No summary provided by upstream source.

Repository SourceNeeds Review
General

refine

No summary provided by upstream source.

Repository SourceNeeds Review