System Supervisor - Architectural Integrity Scanner
Detects specification drift, silent failures, hallucinated assertions, and incomplete features. Provides a post-execution audit layer that complements the Execution Guardian's pre-execution gates.
Description
Scans the codebase for architectural integrity issues that accumulate over time. Compares implementation against specifications, detects dead code and silent failures, validates technical assertions, and measures feature completeness across all layers. Distinct from the Execution Guardian (pre-execution safety) and the Council of Logic (code quality).
When to Apply
Positive Triggers
-
Before merge to main: Final integrity check before code reaches production branch
-
After Genesis Orchestrator execution phase: Post-section or post-phase audit
-
Explicit audit keywords: "audit", "drift", "dead code", "completeness", "integrity", "silent failure"
-
Feature declared complete: Verify all layers are implemented before closing
-
Periodic review: At least once per milestone boundary
-
SCALE mode: Full audit capability available
Negative Triggers (Delegate to Other Systems)
-
Active coding/implementation → genesis-orchestrator (phase-locked execution)
-
Pre-execution risk assessment → execution-guardian (validation gates)
-
Code quality/complexity → council-of-logic (mathematical principles)
-
Runtime error handling → error-taxonomy (structured error codes)
-
EXPLORATION mode → No auditing needed
-
STRATEGY mode → Planning phase, no code to audit
Architecture Drift Detection
How It Works
-
Parse specifications: Scan docs/phases/ , docs/features/ , and spec files for declared features, endpoints, models, and components
-
Scan implementation: Walk the codebase for matching implementations
-
Compare and report: Identify MISSING, ORPHANED, and DIVERGED items
Drift Categories
Category Meaning Severity
MISSING Spec declares it, codebase does not implement it HIGH
ORPHANED Codebase implements it, no spec references it MEDIUM
DIVERGED Both exist but implementation differs from spec HIGH
ALIGNED Spec and implementation match OK
Drift Report Format
Architecture Drift Report — {date}
| Item | Spec Location | Code Location | Status | Severity |
|---|---|---|---|---|
| User authentication | docs/features/auth/spec.md | apps/backend/src/auth/ | ALIGNED | OK |
| Contractor API | docs/phases/phase-2-spec.md | apps/backend/src/api/contractors.py | DIVERGED | HIGH |
| Analytics dashboard | docs/features/analytics/spec.md | — | MISSING | HIGH |
| Legacy helper | — | apps/backend/src/utils/old_helper.py | ORPHANED | MEDIUM |
Summary
- ALIGNED: {n}
- MISSING: {n} (HIGH priority)
- ORPHANED: {n} (review for removal)
- DIVERGED: {n} (reconcile spec or code)
Drift Scanning Rules
See references/drift-rules.md for:
-
Spec-to-code location mapping conventions
-
Severity assignment rules
-
Ignore patterns for generated/config files
Silent Failure Detection
Silent failures are code constructs that fail without alerting. They accumulate technical debt invisibly.
What to Scan For
Silent Failure Type Detection Pattern Severity
Dead imports import X where X is never used in the file LOW
Empty catch blocks except: or catch {} with no logging or re-raise HIGH
Unused API endpoints Route defined but no frontend calls reference it MEDIUM
Orphaned DB columns Column in model but never read/written by any query MEDIUM
Unchecked return values Async function called without await or return value discarded HIGH
Unreachable code paths Code after unconditional return , raise , or break
LOW
Stale environment variables .env.example references variable not used in code LOW
Unhandled promise rejections .then() without .catch() or missing error boundary HIGH
Type assertion bypasses as any , # type: ignore without justification comment MEDIUM
Commented-out code blocks More than 5 consecutive commented lines of code LOW
Silent Failure Report Format
Silent Failure Scan — {date}
| # | Type | Location | Line | Severity | Recommendation |
|---|---|---|---|---|---|
| 1 | Empty catch | apps/backend/src/api/main.py | 47 | HIGH | Add logging or re-raise |
| 2 | Dead import | apps/web/lib/api/client.ts | 3 | LOW | Remove unused import |
| 3 | Unchecked await | apps/backend/src/agents/base.py | 92 | HIGH | Add await or handle return |
Summary
- HIGH: {n} (fix immediately)
- MEDIUM: {n} (fix before merge)
- LOW: {n} (fix when convenient)
Hallucination Prevention
The Problem
AI coding agents can make assertions about code behaviour that are not verified. These assertions become "technical hallucinations" — statements treated as fact that may be incorrect.
Assertion Classification
When the agent makes a technical assertion (e.g., "this function handles errors correctly", "this endpoint returns 404"), classify it:
Classification Meaning Action
CONFIRMED Verified by test, code inspection, or documentation No action needed
INFERRED Logically follows from confirmed facts but not directly verified Acceptable with note
ASSUMED Plausible but not verified Trigger verification pass
FABRICATED No evidence supports the assertion Block and correct immediately
Verification Protocol
When an assertion is classified as ASSUMED:
-
Locate the source: Find the code or spec that should confirm the assertion
-
Run targeted test: Execute relevant test if available
-
Reclassify: Move to CONFIRMED or FABRICATED based on evidence
-
Report: Document the verification result
Hallucination Report Format
Assertion Verification — {date}
| # | Assertion | Classification | Evidence | Action |
|---|---|---|---|---|
| 1 | "Auth middleware validates JWT expiry" | CONFIRMED | test_auth.py:test_expired_token passes | None |
| 2 | "Rate limiter returns 429 after 100 req/min" | ASSUMED | No rate limiter test exists | Write test |
| 3 | "Contractor API returns paginated results" | FABRICATED | API returns full list, no pagination | Correct claim |
Scope Boundary
This hallucination check applies to technical code assertions only. For content-level claims (marketing copy, user-facing text), defer to content review processes.
Feature Completeness Matrix
How It Works
For each declared feature, verify implementation across all required layers:
Layer What to Check Path Pattern
UI React component exists and renders apps/web/app//*.tsx , apps/web/components//*.tsx
API Route Backend endpoint defined apps/backend/src/api/**/*.py
Data Model SQLAlchemy model or Pydantic schema apps/backend/src/db//*.py , apps/backend/src/models//*.py
Validation Input validation (Zod frontend, Pydantic backend) Inline in route handlers and components
Tests At least one test per layer apps/backend/tests//*.py , apps/web//*.test.{ts,tsx}
Docs Feature documented docs/features//*.md , docs/reference//*.md
Error Handling Errors follow error-taxonomy patterns Inline in route handlers
Completeness Report Format
Feature Completeness — {date}
| Feature | UI | API | Model | Validation | Tests | Docs | Errors | Score |
|---|---|---|---|---|---|---|---|---|
| Auth (login/logout) | Y | Y | Y | Y | Y | Y | Y | 100% |
| Contractor profiles | Y | Y | Y | P | N | N | P | 57% |
| Document management | Y | Y | Y | Y | P | Y | Y | 86% |
| Analytics dashboard | N | P | N | N | N | N | N | 14% |
Legend: Y = Complete, P = Partial, N = Missing
Thresholds
- 100%: Release-ready
- 80-99%: Acceptable for merge (document gaps)
- 50-79%: Requires completion plan before merge
- <50%: Not ready — must complete core layers first
See references/completeness-matrix.md for detailed layer-by-layer checklists and path patterns.
Strategic Intelligence (Lightweight)
Portable, project-scoped signals only. No investor readiness, competitive analysis, or business metrics — those are non-portable and outside this skill's scope.
Technical Debt Trajectory
Track across audit runs:
Technical Debt Signals
| Metric | Previous | Current | Trend |
|---|---|---|---|
| Silent failures (HIGH) | 12 | 8 | Improving |
| Architecture drift items | 5 | 7 | Worsening |
| Feature completeness avg | 72% | 78% | Improving |
| Type assertion bypasses | 3 | 1 | Improving |
| Empty catch blocks | 6 | 4 | Improving |
Dependency Health
Dependency Health
| Package | Current | Latest | Status |
|---|---|---|---|
| next | 15.x.x | 15.x.x | Current |
| fastapi | 0.x.x | 0.x.x | Current |
| sqlalchemy | 2.x.x | 2.x.x | Current |
| {package} | {ver} | {ver} | Outdated (n versions behind) |
Feature Completion Percentage
Overall completion: {n}% ({completed}/{total} features at 80%+ threshold)
Audit Workflow
Full Audit (SCALE mode or explicit request)
Run all four scans in sequence:
-
Architecture Drift Detection → Drift Report
-
Silent Failure Detection → Silent Failure Report
-
Hallucination Prevention → Assertion Verification
-
Feature Completeness Matrix → Completeness Report
-
Strategic Intelligence → Debt Trajectory + Dependency Health
Quick Audit (BUILD mode phase boundary)
Run abbreviated scans:
-
Architecture Drift → Changed files only (not full codebase)
-
Silent Failures → Changed files only
-
Feature Completeness → Affected features only
On-Demand Scan
Run individual scans based on user request:
-
"check for drift" → Architecture Drift only
-
"dead code scan" → Silent Failure Detection only
-
"is this feature complete?" → Feature Completeness for specified feature
-
"verify my assertions" → Hallucination Prevention on recent claims
Integration Points
Execution Guardian
-
Guardian handles pre-execution safety; Supervisor handles post-execution integrity
-
Guardian's risk score can trigger a Supervisor audit (HIGH risk operations → auto-audit after completion)
-
No overlap: Guardian gates operations, Supervisor audits results
Genesis Orchestrator
-
Supervisor scans activate at phase boundaries (between SECTION_A → SECTION_B, etc.)
-
Quick audit after each section; full audit after phase completion
-
Supervisor does not interrupt mid-section execution
Council of Logic
-
Supervisor's silent failure detection complements Turing's complexity analysis
-
Shannon compression applies to all Supervisor report formats
-
Von Neumann architecture review feeds into drift detection baseline
Execution Modes
-
EXPLORATION: Supervisor off
-
BUILD: Quick audit at phase boundaries only
-
SCALE: Full audit capability
-
STRATEGY: Supervisor off (planning phase)
Anti-Patterns
Pattern Problem Correct Approach
Running full audit during active coding Interrupts flow, wastes tokens Audit at phase boundaries or on request
Flagging all ORPHANED code as errors Some utilities are intentionally decoupled Check if referenced in tests or docs before flagging
Treating INFERRED assertions as FABRICATED Over-verification wastes effort INFERRED is acceptable when logic chain is clear
Auditing generated/config files for drift False positives from auto-generated content Apply ignore patterns from drift-rules.md
Running Strategic Intelligence every scan Token-heavy, low signal frequency Run at milestone boundaries only
Checklist
-
Drift scan covers all spec files in docs/phases/ and docs/features/
-
Silent failure scan checks all detection patterns
-
Assertions classified (CONFIRMED/INFERRED/ASSUMED/FABRICATED)
-
ASSUMED assertions trigger verification pass
-
Feature completeness checks all 7 layers
-
Completeness score calculated correctly
-
Quick audit used at phase boundaries (not full audit)
-
Full audit reserved for SCALE mode, merge prep, or explicit request
-
Reports use compressed table format (Shannon principle)
-
Strategic intelligence limited to portable, technical metrics
Response Format
[AGENT_ACTIVATED]: System Supervisor [MODE]: {BUILD | SCALE} [SCAN_TYPE]: {full | quick | on-demand} [STATUS]: {scanning | complete}
{audit reports}
[NEXT_ACTION]: {proceed with merge | fix {n} issues | schedule follow-up}
Australian Localisation (en-AU)
-
Date Format: DD/MM/YYYY
-
Currency: AUD ($)
-
Spelling: colour, behaviour, optimisation, analyse, centre, authorisation
-
Tone: Direct, factual — report findings without hedging