Codebase Explorer
A field manual for understanding unfamiliar code. Produces navigable maps of system behavior, not summaries of file contents.
Core Principle
The file tree is storage, not understanding. Directory structure reflects how a team organized artifacts. Logic traces reveal why code exists and what it does. When these conflict, trust the trace.
Understanding is not "I have read the code." Understanding is "I can tell you what happens when X changes."
You cannot understand a whole system. Build a portfolio of vertical slices — complete paths from input to output that let you predict behavior in bounded regions.
Who This Is For
This skill produces different value for different audiences. Adapt depth and vocabulary to the user's role.
| Audience | What They Need | Emphasize |
|---|---|---|
| Developer inheriting code | Where to start, what's safe to change | Vertical slices, shared fate groups, implicit contracts |
| Tech lead / architect | Risk map, coupling analysis, refactoring targets | Smells, shared fate groups, substrate constraints |
| Engineering manager | Change cost estimates, team coordination needs | Risk profile, shared fate groups, unknowns |
| Product manager | What's hard to change vs. easy, where bottlenecks live | Risk profile (in business terms), substrate constraints |
| Security auditor | Attack surface, data flow, trust boundaries | I/O boundary map, implicit contracts, data lineage |
| New hire onboarding | Mental model of how the system works | Pipeline flow / architecture diagram, atmosphere, vertical slices |
| Consultant / contractor | Fast orientation before making changes | Quick Assessment template, archetype classification |
| Due diligence reviewer | Code quality signal, technical debt inventory | Smells, activity heat map, unknowns |
When the user's role is unclear, default to the developer perspective but include risk profile and unknowns sections — these are universally useful.
When This Skill Shines
This skill adds the most value in these situations:
- Inherited codebases with poor or missing documentation
- Multi-service architectures where no one person understands the whole system
- Pre-rewrite assessment — understanding what exists before deciding what to replace
- Incident response on unfamiliar systems — quickly identifying blast radius
- Open source contribution — understanding a project before submitting PRs
- Post-acquisition evaluation — assessing code quality and technical debt
- Compliance / security audits — mapping data flow and trust boundaries
- Onboarding — giving new team members a navigable mental model
Workflow
The exploration follows five phases, numbered 0 through 4. Complete each before moving to the next.
Phase 0: Archetype Detection (First 2 Minutes)
Before reading any business logic, determine the archetype of the codebase. The archetype determines which vertical slice strategy to use, which smells to prioritize, and what the primary output artifact should be.
Detection method: Check the root directory for signature files.
| Signal Files | Archetype |
|---|---|
| template.yaml / serverless.yml / statemachine/ / stepfunctions/ | Serverless Pipeline |
| Dockerfile + route handlers (app.py, server.js, main.go) | Backend API |
| package.json with React/Vue/Svelte/Next + src/components/ | Frontend Application |
| dags/ / pipeline/ / airflow.cfg / dbt_project.yml | Data Pipeline |
| setup.py / pyproject.toml with lib structure, no main entry point | Library / SDK |
| main.tf / *.tf / cdk.json / CloudFormation without Lambda code | Infrastructure as Code |
| CLI entry points (__main__.py, bin/, Makefile as primary interface) | CLI Tool |
| train.py / model/ / notebooks + requirements.txt with ML libs | ML System |
If multiple signals match, pick the primary archetype based on where the most code lives. Note secondary archetypes.
Output: Archetype classification with confidence level. This gates all subsequent phases.
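The signature-file check can be sketched as a lookup against the root directory listing. This is a minimal sketch, not a definitive detector: the `SIGNATURES` table and the `detect_archetypes` name are hypothetical, it covers only a subset of the rows above, and it ranks by the number of matched signal files. Deciding the primary archetype by where the most code lives still requires looking at the tree.

```python
# Hypothetical signature table distilled from part of the signal-file table above.
# Each entry: (archetype, names that count as a signal when found at the repo root).
SIGNATURES = [
    ("Serverless Pipeline", {"template.yaml", "serverless.yml", "statemachine", "stepfunctions"}),
    ("Data Pipeline", {"dags", "pipeline", "airflow.cfg", "dbt_project.yml"}),
    ("Infrastructure as Code", {"main.tf", "cdk.json"}),
    ("ML System", {"train.py", "model", "notebooks"}),
    ("Backend API", {"Dockerfile", "app.py", "server.js", "main.go"}),
]


def detect_archetypes(root_entries):
    """Return (archetype, matched signals) pairs, most matched signals first."""
    # Directory entries may arrive with a trailing slash, e.g. "dags/".
    names = {entry.rstrip("/") for entry in root_entries}
    hits = []
    for archetype, signals in SIGNATURES:
        matched = sorted(names & signals)
        if matched:
            hits.append((archetype, matched))
    # The sort is stable, so ties keep the order of the SIGNATURES table.
    hits.sort(key=lambda hit: len(hit[1]), reverse=True)
    return hits
```

The first pair is the candidate primary archetype; any remaining pairs are candidates for secondary archetypes to note in the classification.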
Phase 1: Orientation
Before reading business logic, answer four questions about the shape of the system. For each, produce the specified output artifact.
Q1: Where is the Dirt?
Map the I/O boundary — where the system touches the world.
- Identify entry points (main functions, route handlers, event listeners, exported APIs)
- For each entry point, find the first external call
- Mark functions: ○ Pure (no I/O) · ● Impure (touches world) · ◐ Mixed
- Map the boundary: where data enters, where it leaves
Signals: fetch/axios/http → network · fs/readFile → filesystem · query/execute/find/save → database · Date.now()/Math.random() → non-determinism · process.env/config.get → environment coupling
Output: I/O boundary map
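A first pass at the boundary map can be approximated by scanning source text for the signals listed above. The sketch below is a rough text match, assuming JavaScript-style source; the category names, the `IO_SIGNALS` table, and `io_boundary_map` are hypothetical, and a regex scan will produce both false positives and misses that a manual trace has to correct.

```python
import re

# Patterns derived from the Signals line above; category labels are chosen for this sketch.
IO_SIGNALS = {
    "network": re.compile(r"\b(fetch|axios|http)\b"),
    "filesystem": re.compile(r"\b(fs|readFile)\b"),
    "database": re.compile(r"\.(query|execute|find|save)\("),
    "non-determinism": re.compile(r"Date\.now\(\)|Math\.random\(\)"),
    "environment": re.compile(r"process\.env|config\.get"),
}


def io_boundary_map(source):
    """Map each I/O category to the 1-based line numbers where a signal appears."""
    boundary = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        for category, pattern in IO_SIGNALS.items():
            if pattern.search(line):
                boundary.setdefault(category, []).append(lineno)
    return boundary
```

Lines that appear in no category are candidates for the pure (○) side of the map; lines that appear in one or more categories mark where to start tracing the impure (●) side.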
Q2: What is the Substrate?
Identify the hard limits — memory, payload size, timeouts, rate limits — that constrain what the code can actually do.
- Check dependency manifests for runtime constraints
- Find config files, note limits: maxPayloadSize, timeout, poolSize, maxRetries
- For web apps: measure initial payload size and request count
- Search for memory patterns: streams, pagination, cursors, chunking — absence is a smell
Signals: Hardcoded round-number timeouts (30000, 60000) · Missing pagination on list endpoints · Synchronous reads of user-provided paths · Unbounded array accumulation
Output: Constraint list with compliance notes
Q3: What are the Implicit Contracts?
Find magic values — literals that encode business semantics without naming them.
- Search for string literals in conditionals: if (x === "some_string")
- Search for numeric literals other than -1, 0, 1, 2
- For each: would this break if the meaning of data changed?
- Flag literals that assume specific values exist in databases or external systems
Distinguish: Config values (ports, batch sizes) are tuning parameters, not magic. Magic values encode semantics: status codes, type discriminators, string-matched business logic.
Output: Fragility list
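The two searches in the checklist above can be sketched with regular expressions. This is a minimal sketch under stated assumptions: it targets C-style or JavaScript-style comparisons, ignores floating-point literals, and the names `find_magic_values`, `STRING_COMPARE`, and `NUMBER` are hypothetical. It only collects candidates; separating tuning parameters from semantic magic values remains a manual judgment.

```python
import re

# A quoted string on the right-hand side of ==, ===, != or !==.
STRING_COMPARE = re.compile(r"""[!=]==?\s*(["'])(.*?)\1""")
# Standalone integer literals (not part of an identifier or a decimal number).
NUMBER = re.compile(r"(?<![\w.])-?\d+(?![\w.])")
# Allow-list taken from the checklist above.
ALLOWED = {"-1", "0", "1", "2"}


def find_magic_values(source):
    """Return (line number, kind, literal) tuples for candidate magic values."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in STRING_COMPARE.finditer(line):
            findings.append((lineno, "string", match.group(2)))
        for match in NUMBER.finditer(line):
            if match.group(0) not in ALLOWED:
                findings.append((lineno, "number", match.group(0)))
    return findings
```

Each tuple is a starting point for the question in the checklist: would this line break if the meaning of the data changed?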
Q4: Where is Attention Flowing?
Use git history to map organizational memory. Start with a single probe command before investing in detailed analysis.
Step 1 — Probe: Run git log --oneline -1 first. If this fails (orphan branch, no commits), skip to fallback signals.
Step 2a — If history exists:
- git log --oneline -100: recent activity themes
- git log --format='%H' --since='6 months ago' | wc -l: velocity
- For key files: git log --oneline -10 <file>: change recency
- Identify files with zero commits in 12+ months
Signals: 50+ commits in 3 months → hot zone · Unchanged 2+ years → frozen · Recent "fix"/"hotfix"/"revert" clusters → instability · Single-author files → knowledge silos
Step 2b — If NO history exists (orphan branch, fresh repo, extracted code):
- Check git status: staged vs. untracked files reveal intent
- Check git remote -v: presence/absence of remote reveals deployment maturity
- Check git branch -a: orphan branch or missing main = pre-initial-commit
- Look for deleted files in staging (AD prefix): reveals prior architecture that was replaced
- Check file modification times via ls -lt on key directories
Output: Activity heat map (from git history) OR maturity assessment (from fallback signals). State which signal source was used.
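When history exists, the hot-zone and frozen thresholds from the Signals line can be applied mechanically. The sketch below assumes the per-file counts come from the text output of `git log --since='3 months ago' --name-only --pretty=format:`, which prints one path per touched file per commit. The function names and the "normal" label are hypothetical, and the days-since-last-change figure is assumed to be computed separately.

```python
from collections import Counter


def count_touches(name_only_log):
    """Count how often each path appears in name-only git log output."""
    return Counter(line.strip() for line in name_only_log.splitlines() if line.strip())


def classify_activity(commits_last_3_months, days_since_last_change):
    """Apply the thresholds from the Signals line; 'normal' is this sketch's own label."""
    if commits_last_3_months >= 50:
        return "hot zone"
    if days_since_last_change >= 2 * 365:
        return "frozen"
    return "normal"
```

Instability clusters and single-author silos need commit messages and author data, so they are out of scope for this sketch.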
Phase 2: Vertical Slices (The Value Trail)
Do not survey the whole system. Trace a single observable value from one end of the system to the other. One complete vertical slice is worth more than a shallow survey of everything.
The slicing strategy depends on the archetype detected in Phase 0.
Strategy A: Frontend Applications — Trace a Visible Value
Step 1 — Pick a Value: Something visible and concrete. "The green marker." "The number 63 in the header."
Step 2 — Trace Upstream (Render → Logic): Find the code that produces this value. You want the render site.
Step 3 — Trace Data (Logic → State): Follow backward through assignments. You want the decision point.
Step 4 — Trace Source (State → Input): Find where the data originates — API, database, user input. You want the boundary.
Step 5 — Record: Entry point, decision points, render site, impure calls along the path.
Strategy B: Pipelines — Trace a Record Through Stages
Pipelines have no "render site." Data transforms as it flows. The natural slice is: follow one entity from ingestion to final output.
Step 1 — Pick an Entity: A concrete data object the pipeline processes. "One arXiv paper." "One customer order."
Step 2 — Trace Forward (Ingestion → Output): At each stage, answer:
- What fields does this stage read?
- What fields does this stage add, modify, or remove?
- What external calls happen?
- What is the storage location where the result lands?
- What implicit contracts exist with the next stage?
Step 3 — Map Schema Evolution: How does the entity's shape change at each stage? This is the pipeline's real architecture.
Step 4 — Identify Inter-Stage Contracts: For each handoff:
- Serialization format (JSON, JSONL, Parquet)
- Storage location convention (hardcoded vs. config)
- Which fields the next stage assumes exist
- Whether validation happens at the boundary
Step 5 — Record: Produce a data lineage map: stage → transformation → output location → next stage. Mark each boundary with its contract type.
This is the primary artifact for pipeline codebases.
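Schema evolution (Step 3) can be sketched by diffing the field names of one sample record as captured after each stage. This is an illustrative sketch: `schema_evolution` is a hypothetical helper, and it assumes you have already collected one record per stage as a flat dictionary, for example from each stage's output location.

```python
def schema_evolution(stage_records):
    """Given ordered (stage name, record dict) pairs, report fields added and removed per stage."""
    lineage = []
    previous = set()
    for stage, record in stage_records:
        fields = set(record)
        lineage.append({
            "stage": stage,
            "added": sorted(fields - previous),
            "removed": sorted(previous - fields),
        })
        previous = fields
    return lineage
```

A field that a later stage reads but that appears in some earlier stage's "removed" list is a likely implicit-contract violation and worth recording at that boundary.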
Strategy C: Backend APIs — Trace a Request
Step 1 — Pick a Request: A concrete endpoint that touches multiple layers.
Step 2 — Trace Inward: Route → middleware → handler → service → persistence.
Step 3 — Trace the Response: How is data shaped for return? Where are errors caught?
Step 4 — Record: Route → middleware chain → handler → service calls → data access → response shaping.
Strategy D: Libraries / SDKs — Trace a Public API
Step 1 — Pick an Exported Function: The most-used public API (check README or tests).
Step 2 — Trace Inward: Public interface → internal modules → core logic.
Step 3 — Map Abstraction Layers: Where does the library hide complexity?
Step 4 — Record: Public API → internal dispatch → core logic → edge case handling.
Strategy E: Infrastructure as Code — Trace Resource Dependencies
Step 1 — Pick a Core Resource: The one other resources depend on most.
Step 2 — Trace Dependents: What breaks if this changes? Follow Ref, !GetAtt, depends_on.
Step 3 — Identify Blast Radius: What must deploy together?
Step 4 — Record: Resource → dependents → blast radius → deployment order.
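Once the dependency references have been collected by hand, the blast radius of a resource is the set of its transitive dependents. The sketch below assumes a hypothetical `depends_on` mapping from each resource name to the resources it references; extracting that mapping from templates is not shown.

```python
from collections import deque


def blast_radius(depends_on, resource):
    """Return every resource that directly or transitively depends on `resource`."""
    # Invert the edges: dependency -> set of direct dependents.
    dependents = {}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    # Breadth-first walk over the inverted graph.
    seen, queue = set(), deque([resource])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

The size of the returned set gives a rough ranking of core resources; the deployment order still has to be derived from the original dependency direction.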
General Guidance
Repeat for 3–5 slices. These usually reveal the system's true architecture.
For advanced techniques (handling trace convergence, recognizing dead code), see references/advanced-techniques.md.
Phase 3: Smell Detection
Scan for architectural smells — patterns that signal structural problems. Prioritize smells that match the detected archetype.
Read references/smell-catalog.md for the full catalog with signals, risks, and verification procedures.
Universal smells (check regardless of archetype):
| Smell | Signal | Risk |
|---|---|---|
| Hidden Schema | String-matching values in conditionals | Fragility to data migration |
| Silent Overflow | Unbounded iteration without size guards | Scale failure |
| God Object | 1000+ line files, 20+ method classes | Comprehension collapse |
| Orphaned Error | Empty catch blocks, log-and-forget | Silent failure |
| Time Bomb | Hardcoded years, fixed date comparisons | Predictable future failure |
Pipeline-specific smells (prioritize for Data Pipeline / Serverless Pipeline):
| Smell | Signal | Risk |
|---|---|---|
| Drifting Duplicates | Same config value in multiple stages | Silent inconsistency |
| Implicit Inter-Stage Contract | Dict access without schema validation at boundaries | Silent breakage when upstream changes |
| Path Convention as Schema | Storage paths hardcoded in every stage | Rename requires finding all stages |
| Assembly Without Validation | Final stage assembles N inputs without completeness check | Partial output published as complete |
| Shared Utility SPOF | One function imported by 50%+ of stages | Total pipeline failure from single bug |
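The Shared Utility SPOF signal (one function imported by 50%+ of stages) reduces to a counting exercise once each stage's imports are listed. A minimal sketch, assuming a hypothetical `stage_imports` mapping built by hand or by a separate import scan:

```python
from collections import Counter


def shared_utility_spofs(stage_imports, threshold=0.5):
    """Return imports used by at least `threshold` of stages, mapped to their share."""
    # set() ensures a stage that imports the same name twice is counted once.
    counts = Counter(name for imports in stage_imports.values() for name in set(imports))
    total = len(stage_imports)
    return {name: n / total for name, n in counts.items() if n / total >= threshold}
```

Third-party packages will often cross the threshold too; the smell applies to in-repo utilities, so filter the result accordingly.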
Phase 4: Risk Profile and Unknowns
This is where exploration becomes actionable. Synthesize findings into decisions the reader can make.
Risk Profile — be specific, not vague:
- Not "complex" but: "Changing the User model requires migration + API update + frontend update."
- Not "risky" but: "This function has 8 callers and no tests. A bug here breaks scoring for all papers."
Unknowns — explicitly mark what you didn't explore and why. Unstated assumptions become false beliefs.
Verification pointers — for each high-risk area, note what you'd test first. See references/advanced-techniques.md for verification procedures.
Output Artifacts
Read references/documentation-templates.md for complete templates. Choose based on archetype and audience.
Choosing the Right Template
| Situation | Template |
|---|---|
| Full exploration, any archetype | Full Exploration Report |
| Full exploration, pipeline archetype | Pipeline Exploration Report |
| Time-limited or narrow scope | Quick Assessment |
| Tracing a single value | Vertical Slice Record |
| Pipeline data flow | Data Lineage Map |
| Focused health check | Smell Report |
| Non-technical stakeholder briefing | Stakeholder Brief |
Output Principles
Maps, not lists. Group code by shared fate — components that fail together, change together, or depend on the same external resource.
Decisions, not descriptions. Every section should help the reader decide something: where to look, what to change, what to avoid, who to coordinate with.
Audience-appropriate language. For technical audiences, use file paths and function names. For PMs and managers, translate to business impact: "This component is fragile" → "Changes to the scoring algorithm require updating two files that aren't linked — if someone updates one and forgets the other, newsletter quality degrades silently."
Quick Reference
Phase 0: Archetype → Strategy Mapping
| Archetype | Slice Strategy | Primary Artifact |
|---|---|---|
| Frontend Application | A: Trace visible value | Render-site trace |
| Data / Serverless Pipeline | B: Trace record through stages | Data lineage map |
| Backend API | C: Trace a request | Request lifecycle trace |
| Library / SDK | D: Trace public API | Abstraction layer map |
| Infrastructure as Code | E: Trace resource dependencies | Blast radius map |
| CLI Tool | D (adapted): Trace a command | Command execution trace |
| ML System | B (adapted): Trace training data | Data + model lineage map |
Orientation Checklist
| Question | Output |
|---|---|
| What is this? (Phase 0) | Archetype classification |
| Where is the dirt? | I/O boundary map |
| What is the substrate? | Constraint list |
| What are the implicit contracts? | Fragility list |
| Where is attention flowing? | Activity heat map OR maturity assessment |
Value Trail Steps (by archetype)
Frontend (A): Pick value → render site → decision point → input boundary → record
Pipeline (B): Pick entity → trace through stages → map schema evolution → identify contracts → data lineage map
Backend API (C): Pick request → trace inward → trace response → record
Library (D): Pick public API → trace inward → map abstractions → record
IaC (E): Pick resource → trace dependents → identify blast radius → record