Fossil Record
"Code tells you what a system does. History tells you what a system survived."
What It Does
git blame tells you WHO changed a line. git log tells you WHEN. Fossil Record tells you WHY — by analyzing patterns across the entire commit history to reconstruct the evolutionary pressures that shaped the codebase.
Every line of code is the result of a decision. Most decisions aren't documented. But they leave fossils: commit patterns, revert sequences, hotfix clusters, refactor waves, and the sediment of a hundred small choices that accumulated into the architecture you see today.
The Geological Model
Fossil Record treats your git history as a geological record, with distinct layers and eras:
| Geological Concept | Code Equivalent |
|---|---|
| Sediment Layers | Periods of steady development (feature commits) |
| Fault Lines | Major refactors, rewrites, or architecture changes |
| Impact Craters | Incident responses, emergency hotfixes, reverts |
| Fossil Beds | Code that hasn't changed in a long time (stable or forgotten?) |
| Erosion Patterns | Gradual drift from original design intent |
| Extinction Events | Deleted modules, abandoned features, removed dependencies |
| Adaptive Radiation | Rapid diversification after a major change (new abstraction spawning many implementations) |
The Eight Excavation Modes
1. Pressure Analysis
Question: What external forces shaped this code?
Analyzes commit message patterns, timing, and clustering to identify:
- Deadline pressure: Commits accelerating toward a date, then stopping
- Incident pressure: Hotfix → fix → fix-the-fix → revert → different-fix patterns
- Stakeholder pressure: Feature requests appearing as interruptive commit sequences
- Technical debt pressure: Refactors that are started, abandoned, restarted
Output: Timeline of external pressures with their impact on code quality.
Example: "Between March 3-17, commit velocity tripled and test coverage
dropped from 84% to 61%. Three hotfixes followed in the next week.
This region of code still carries the scars of that deadline."
2. Decision Reconstruction
Question: What decisions were made here, and what alternatives were considered?
Analyzes:
- Reverted commits (something was tried and rejected)
- Branches that were created but never merged (abandoned approaches)
- Comments that reference alternatives ("we could have used X but...")
- Sequential implementations of the same feature (iteration history)
Output: Decision tree showing what was tried, what stuck, and what was abandoned.
Example: "Authentication was implemented 3 times:
v1 (session-based, commits a1b2..c3d4, reverted)
v2 (JWT, commits e5f6..g7h8, lived 4 months)
v3 (OAuth2, commits i9j0..k1l2, current)
Pressure: v1→v2 driven by scaling issues. v2→v3 driven by SSO requirement."
3. Hotspot Archaeology
Question: Why is this specific area of code so volatile?
Goes beyond "this file changes often" to ask "what kind of changes happen here and what drives them?"
CHANGE TAXONOMY:
├── Bug Fix: Same function modified to fix different bugs (fragile design)
├── Feature Accretion: Function grows as features are bolted on (missing abstraction)
├── Config Churn: Constants/thresholds repeatedly adjusted (unclear requirements)
├── Refactor Oscillation: Code restructured back and forth (no consensus on design)
└── Dependency Turbulence: Changes driven by upstream library updates (fragile coupling)
4. Extinction Mapping
Question: What used to be here, and why did it die?
Traces deleted code through git history to reconstruct what was removed and the conditions of its removal:
- Was it replaced? By what?
- Was it gradually abandoned or suddenly deleted?
- Did its removal cause any subsequent issues (fixes referencing the deleted module)?
- Is anything still alive that was designed to work with the extinct module?
Output: Extinction timeline showing what disappeared, when, and what it left behind.
Example: "The 'recommendations' module was deleted in commit x1y2z3 (June 2024).
3 orphaned database tables still exist.
2 API routes still reference recommendation types in their schemas.
1 test file still imports a mock of the recommendation engine."
5. Sediment Dating
Question: How old is this code really, and has it been maintained or just preserved?
For each module/file, determines:
- Birth date: When was it first created?
- Last meaningful change: Not just whitespace/formatting — actual behavior change
- Maintenance frequency: Is it regularly updated or untouched?
- Author diversity: Has only one person ever modified this? (bus factor = 1)
- Era classification: Which architectural era does this code belong to?
Output: Age map of the codebase with era boundaries.
Example:
src/auth/ Born: 2023-01, Last modified: 2025-11, Era: "Current" (3rd gen)
src/utils/ Born: 2021-06, Last modified: 2022-03, Era: "Founding" (1st gen)
src/payments/ Born: 2024-08, Last modified: 2024-08, Era: "Growth" (2nd gen)
⚠️ src/utils/ hasn't been meaningfully modified in 3 years. Fossil bed.
6. Fault Line Detection
Question: Where are the tectonic boundaries in this codebase?
Identifies major architectural shifts by finding:
- Large-scale rename/move operations
- Dependency replacements (library A → library B)
- Directory restructuring
- Changes to build systems, frameworks, or deployment targets
Output: Fault line map showing architectural eras and their boundaries.
Example: "3 major fault lines detected:
1. [2022-09] Monolith → microservices split (142 files moved)
2. [2023-06] REST → GraphQL migration (89 files modified)
3. [2024-03] JavaScript → TypeScript conversion (204 files renamed)
Warning: Fault line #2 is incomplete. 23 endpoints still REST."
7. Author Topology
Question: How was knowledge distributed, and where are the gaps?
Maps which developers contributed to which areas, and identifies:
- Knowledge monopolies: Areas only one person has ever touched
- Knowledge transfers: When a new contributor takes over an area
- Knowledge voids: When all contributors to an area have left the project
- Collaboration patterns: Which areas have healthy multi-author contribution
Output: Knowledge topology map with risk assessment.
Example: "src/billing/ — ALL 247 commits by developer X (last active: 2024-01).
Developer X is no longer on the team.
No other contributor has ever modified this module.
Knowledge void. Recommend: dedicated onboarding session for this module."
8. Evolution Trajectory
Question: Where is this codebase heading?
Extrapolates from historical patterns to predict:
- Which areas are actively evolving (increasing commit diversity and frequency)
- Which areas are calcifying (decreasing modifications, aging contributors)
- Which architectural patterns are expanding vs. contracting
- What the next likely "extinction event" or "fault line" might be
Output: Trajectory forecast based on historical momentum.
Example: "The codebase is trending toward:
✓ Full TypeScript adoption (92% converted, ~2 months to completion)
✓ GraphQL as primary API layer (78% migrated)
⚠ Growing divergence between /api and /services naming conventions
⚠ Test coverage declining in modules > 2 years old (neglect pattern)"
Integration
Invoke Fossil Record when:
├── Joining a new project → Run full geological survey
├── Before modifying old code → Run sediment dating + decision reconstruction
├── After an incident → Run pressure analysis on the affected area
├── During architecture review → Run fault line detection + evolution trajectory
├── When someone asks "why?" → Run decision reconstruction on that specific area
└── Onboarding new developers → Generate the complete evolutionary narrative
Output: The Geological Survey
╔══════════════════════════════════════════════════════════════╗
║ FOSSIL RECORD: GEOLOGICAL SURVEY ║
║ Repository: acme-platform ║
║ History depth: 3 years, 4,721 commits ║
╠══════════════════════════════════════════════════════════════╣
║ ║
║ ERAS IDENTIFIED: 3 ║
║ ├── Founding (2022-01 → 2022-09): Monolith, Express, JS ║
║ ├── Growth (2022-09 → 2024-03): Microservices, REST, JS/TS ║
║ └── Current (2024-03 → now): Microservices, GraphQL, TS ║
║ ║
║ FAULT LINES: 3 major, 7 minor ║
║ IMPACT CRATERS: 12 incidents (3 P0, 5 P1, 4 P2) ║
║ FOSSIL BEDS: 4 modules unchanged > 18 months ║
║ KNOWLEDGE VOIDS: 2 modules (all authors departed) ║
║ EXTINCTION EVENTS: 8 modules deleted, 3 left artifacts ║
║ ║
║ TRAJECTORY: Healthy evolution with 2 risk areas ║
╚══════════════════════════════════════════════════════════════╝
Why It Matters
Code review looks at the present. Testing validates the expected. Fossil Record illuminates the past — because a codebase that doesn't understand its own history is condemned to repeat its own mistakes.
Zero external dependencies. Pure git analysis. No APIs, no cloud, no cost.