Reckit — Bulletproof AI Code Verification
Build it. Break it. Prove it works.
Philosophy
AI can't verify itself. Structure the pipeline so it can't silently agree with itself. Separate Builder/Tester/Breaker roles across fresh contexts. Use independent oracles.
Full 14-step framework:
references/verification-framework.md
Modes
Auto-detected from context:
| Mode | Trigger | Description |
|---|---|---|
| 🟢 BUILD | Empty repo + PRD | Full pipeline for greenfield |
| 🟡 REBUILD | Existing code + migration spec | BUILD + behavior capture + replay |
| 🔴 FIX | Existing code + bug report | Fix, verify, check regressions |
| 🔵 AUDIT | Existing code, no changes | Verify and report only |
Gates
Read the gate file before executing it. Each contains: question, checks, pass/fail criteria.
| Gate | BUILD | REBUILD | FIX | AUDIT | File |
|---|---|---|---|---|---|
| AI Slop Scan | ✅ | ✅ | ✅ | ✅ | references/gates/slop-scan.md |
| Type Check | ✅ | ✅ | ✅ | ✅ | references/gates/type-check.md |
| Ralph Loop | ✅ | ✅ | ✅ | ❌ | references/gates/ralph-loop.md |
| Test Quality | ✅ | ✅ | ✅ | ✅ | references/gates/test-quality.md |
| Mutation Kill | ✅ | ✅ | ✅ | ✅ | references/gates/mutation-kill.md |
| Cross-Verify | ✅ | ❌ | ❌ | ❌ | references/gates/cross-verify.md |
| Behavior Capture | ❌ | ✅ | ❌ | ❌ | references/gates/behavior-capture.md |
| Regression | ❌ | ✅ | ✅ | ❌ | references/gates/regression.md |
| SAST | ❌ | ❌ | ✅ | ✅ | references/gates/sast.md |
| LLM-as-Judge | opt | opt | opt | opt | references/gates/llm-judge.md |
| Design Review | ❌ | ❌ | ❌ | ✅ | references/gates/design-review.md |
| CI Integration | ✅ | ✅ | ❌ | ✅ | references/gates/ci-integration.md |
| Proof Bundle | ✅ | ✅ | ✅ | ✅ | references/gates/proof-bundle.md |
Scripts
Deterministic helpers — run these, don't rewrite them:
Core (all modes):
scripts/project-type.sh [path]— classify project context + calibration profile (skip_gates, thresholds, tolerated warns)scripts/detect-stack.sh [path]— auto-detect language, framework, test runner → JSONscripts/check-deps.sh [path]— verify all deps exist in registries (hallucination check)scripts/slop-scan.sh [path]— semantic slop scan (tracked vs untracked debt, categorized output) → JSONscripts/type-check.sh [path]— run type checker (tsc/mypy/cargo/go vet) → JSONscripts/ralph-loop.sh [path]— validate IMPLEMENTATION_PLAN.md structure → JSONscripts/coverage-stats.sh [path]— extract raw coverage numbers from test runnerscripts/mutation-test.sh [path] [test-cmd]— mutation testing (mutmut/cargo-mutants/Stryker/AI)scripts/mutation-test-stryker.sh [path]— Stryker-specific mutation testing → JSONscripts/red-team.sh [path]— SAST + 20+ vulnerability patterns → JSONscripts/regex-complexity.sh [path] [--context library|app]— targeted ReDoS analysis → JSONscripts/proof-bundle.sh [path] [mode]— corroboration-based aggregation + proof bundle writerscripts/run-all-gates.sh [path] [mode] [--log-file]— sequential gate runner with telemetry + adaptive skipping/tolerance
Mode-specific:
scripts/behavior-capture.sh [path]— capture golden fixtures before rebuild (REBUILD)scripts/design-review.sh [path]— dep graph, coupling, circular deps (AUDIT/REBUILD) → JSONscripts/ci-integration.sh [path]— CI config detection and scoring → JSONscripts/differential-test.sh [path]— oracle comparison, golden tests (BUILD/REBUILD) → JSON
Extended verification:
scripts/dynamic-analysis.sh [path]— memory leaks, race conditions, FD leaks → JSONscripts/perf-benchmark.sh [path]— benchmark detection + regression vs baseline → JSONscripts/property-test.sh [path]— property-based/fuzz testing, generates stubs → JSON
Bootstrap:
scripts/run-audit.sh [path] [mode] [--spawn]— generate orchestrator task + optional spawn
Swarm Architecture
For multi-gate parallel execution, read references/swarm/orchestrator.md.
Quick overview:
Main agent → wreckit orchestrator (depth 1)
├─ Planning: Architect worker
├─ Building: Sequential Implementer workers
├─ Verification: Parallel gate workers
├─ Sequential: Cross-verify / regression / judge
└─ Decision: Proof bundle → Ship / Caution / Blocked
Critical: Read references/swarm/collect.md before spawning workers.
Never fabricate results. Wait for all workers to report back.
Worker output format: references/swarm/handoff.md.
Config required:
{ "agents.defaults.subagents": { "maxSpawnDepth": 2, "maxChildrenPerAgent": 8 } }
Decision Framework
| Verdict | Criteria |
|---|---|
| Ship ✅ | No hard blocks; no corroborated multi-domain fail evidence above block threshold |
| Caution ⚠️ | Single non-hard fail, warning-only risk, or corroboration below block threshold |
| Blocked 🚫 | Any hard block OR corroborated non-hard failure pattern (multi-signal, multi-domain, high-confidence) |
Hard-block + corroboration rule details: references/gates/corroboration.md
Supported Languages & Stacks
| Language | Gates Available | Notes |
|---|---|---|
| TypeScript/JS | 11/11 | Full support via Stryker, tsc, vitest/jest |
| Python | 11/11 | Full support via mutmut, mypy/pyright, pytest |
| Rust | 11/11 | Full support via cargo-mutants, cargo check/test |
| Go | 11/11 | Full support via go vet, go test |
| Swift (SPM) | 9/11 | mutation = AI-estimated CAUTION, cross-verify = manual |
| Swift (Xcode) | 7/11 | type-check = xcodebuild, mutation = AI-estimated, coverage = limited |
| iOS apps | 7/11 | Same as Xcode projects |
| Java/Kotlin | 10/11 | Gradle/Maven, mutation via PIT (manual setup) |
| Shell | 8/11 | shellcheck, limited mutation testing |
Swift Notes
- Mutation testing requires manual verification — no automated mutation testing tool exists for Swift as of 2026. The mutation gate uses AI-estimated analysis (counts mutation surface, compares to test count) and always outputs
CAUTION, neverSHIP. - SPM projects get high-confidence type checking via
swift build(the compiler IS the type checker). - Xcode projects get medium-confidence type checking via
xcodebuildwith auto-detected schemes. - Dependency checking lists SPM dependencies but notes that no automated CVE database exists for Swift packages — manual review is always recommended.
- CocoaPods projects:
pod outdatedis checked if Podfile present. - Build systems detected: SPM, xcodebuild, CocoaPods, Carthage, mixed.
Running an Audit (Single-Agent, No Swarm)
For small projects or when swarm isn't needed, run gates sequentially:
scripts/detect-stack.sh→ know your target (language, test cmd, type checker)scripts/check-deps.sh→ verify deps are real (not hallucinated)scripts/slop-scan.sh→ find placeholders, template artifacts, empty stubs- Run type checker (from detect-stack output) →
references/gates/type-check.md - Run tests +
scripts/coverage-stats.sh→references/gates/test-quality.md scripts/mutation-test.sh→references/gates/mutation-kill.md(uses mutmut/cargo-mutants/Stryker if available)scripts/red-team.sh→references/gates/sast.md(20+ vulnerability patterns, JSON report)scripts/design-review.sh→references/gates/design-review.md(dep graph, circular deps, god modules)scripts/ci-integration.sh→references/gates/ci-integration.md(CI config detection + scoring)scripts/dynamic-analysis.sh→references/gates/dynamic-analysis.md(memory leaks, race conditions)scripts/perf-benchmark.sh→references/gates/performance.md(benchmark detection + regression)scripts/property-test.sh→references/gates/property-based.md(fuzzing, invariant checks)scripts/differential-test.sh→references/gates/differential.md(oracle comparison, metamorphic tests)- echo '[...gate-results-json...]' |
scripts/proof-bundle.sh [path] [mode]→ writes.wreckit/proof.json,dashboard.json,decision.md
Quick Start
"Use wreckit to audit [project]. Don't change anything."
"Use wreckit to build [project] from this PRD."
"Use wreckit to fix [bug]. Prove nothing else breaks."
"Use wreckit to rebuild [project] in [framework]."
Dashboard
assets/dashboard/ contains a local web dashboard for viewing proof bundles across repos.
Run: node assets/dashboard/server.mjs (port 3939). Reads .wreckit/dashboard.json from projects.
Codex CLI Notes (2026-02-22)
When using Codex CLI to build/run projects:
--full-autosandbox blocksnpm installnetwork access (ENOTFOUND registry.npmjs.org)- Fix: use
--dangerously-bypass-approvals-and-sandboxflag instead - Auth:
echo "$OPENAI_API_KEY" | codex login --with-api-keystores credentials to~/.codex/auth.json - Config:
~/.codex/config.tomlwithmodel = "gpt-5.2-codex"and[shell_environment_policy] inherit = "all" gpt-5.3-codexis Copilot/VS Code only — not available via direct API. Usegpt-5.2-codex.