# Documentation Reading

Systematic approaches for efficiently extracting actionable information from project documentation while identifying gaps, contradictions, and outdated content.
## When to Use

- Onboarding to an unfamiliar codebase or service
- Verifying implementation matches specification requirements
- Understanding API contracts before integration
- Parsing configuration files for deployment or debugging
- Investigating discrepancies between docs and actual behavior
- Preparing to extend or modify existing functionality
## Reading Strategies by Document Type

### README Files

READMEs are entry points. Extract these elements in order:

1. **Project Purpose**: The first paragraph usually states what the project does
2. **Quick Start**: Look for "Getting Started", "Installation", or "Usage" sections
3. **Prerequisites**: Dependencies, environment requirements, version constraints
4. **Architecture Hints**: Links to other docs, directory structure descriptions
5. **Maintenance Status**: Last updated date, badges, contribution activity
**Reading Pattern:**

1. Scan headings to build a mental map (30 seconds)
2. Read the purpose/description section fully
3. Locate quick start commands and test whether they work
4. Note any "gotchas" or "known issues" sections
5. Identify links to deeper documentation
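The heading scan in step 1 can be sketched as a short script; `scan_headings` and the sample README below are illustrative, not taken from any real project:

```python
import re

def scan_headings(readme_text: str) -> list[str]:
    """Return Markdown headings in document order, to build a quick mental map."""
    return [m.group(2).strip()
            for m in re.finditer(r"^(#{1,6})[ \t]+(.*)$", readme_text, re.MULTILINE)]

sample = """# MyProject
A tool for doing things.
## Getting Started
## Known Issues
"""
print(scan_headings(sample))  # ['MyProject', 'Getting Started', 'Known Issues']
```

Thirty seconds over this list tells you where the quick start and the gotchas live before you read anything in depth.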
**Red Flags:**

- No updates in 12+ months on an active project
- Quick start commands that fail
- References to deprecated dependencies
- Missing license or security sections
### API Documentation

Extract information in this priority order:

1. **Authentication**: How to authenticate (API keys, OAuth, tokens)
2. **Base URL / Endpoints**: Entry points and environment variations
3. **Request Format**: Headers, body structure, content types
4. **Response Format**: Success/error shapes, status codes
5. **Rate Limits**: Throttling, quotas, retry policies
6. **Versioning**: How versions are specified, deprecation timelines
**Reading Pattern:**

1. Find the authentication section first; nothing works without it
2. Locate a simple endpoint (health check, list operation)
3. Trace a complete request/response cycle
4. Note pagination patterns for list endpoints
5. Identify the error response structure
6. Check for SDK/client library availability
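Steps 3 and 4 (tracing a request cycle and handling pagination) might look like the sketch below. The `fetch_page` stub stands in for a real HTTP client, and the cursor-based response shape is an assumption; check your API's documented pagination pattern before relying on it:

```python
def fetch_all(fetch_page):
    """Follow cursor pagination until the API stops returning a next cursor.

    `fetch_page(cursor)` is assumed to return a dict shaped like
    {"items": [...], "next_cursor": "abc" or None} -- verify against
    the actual documented response before use.
    """
    items, cursor = [], None
    while True:
        page = fetch_page(cursor)
        items.extend(page["items"])
        cursor = page.get("next_cursor")
        if not cursor:
            return items

# Stub standing in for a real paginated endpoint.
def fake_fetch(cursor):
    pages = {None: {"items": [1, 2], "next_cursor": "p2"},
             "p2": {"items": [3], "next_cursor": None}}
    return pages[cursor]

print(fetch_all(fake_fetch))  # [1, 2, 3]
```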
**Cross-Reference Checks:**

- Compare documented endpoints against actual network calls
- Verify that response schemas match real responses
- Test that documented error codes actually occur
### Technical Specifications

Specifications define expected behavior. Extract:

- **Requirements List**: Numbered requirements, acceptance criteria
- **Constraints**: Technical limitations, compatibility requirements
- **Data Models**: Entity definitions, relationships, constraints
- **Interfaces**: API contracts, message formats, protocols
- **Non-Functional Requirements**: Performance, security, scalability targets
**Reading Pattern:**

1. Identify the document type (PRD, SDD, RFC, ADR)
2. Locate the requirements or acceptance criteria section
3. Extract testable assertions (MUST, SHALL, SHOULD language)
4. Map requirements to implementation locations
5. Note any open questions or TBD items
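Extracting testable assertions (step 3) can be partly automated with an RFC-2119 keyword filter. The sentence-splitting heuristic here is an assumption and will miss unusual punctuation:

```python
import re

def extract_assertions(spec_text: str) -> list[str]:
    """Pull out sentences containing RFC-2119 keywords (MUST, SHALL, SHOULD)."""
    keywords = r"\b(MUST NOT|SHALL NOT|SHOULD NOT|MUST|SHALL|SHOULD)\b"
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", spec_text)
    return [s.strip() for s in sentences if re.search(keywords, s)]

spec = ("The server MUST validate tokens. "
        "Clients SHOULD retry with backoff. "
        "Logging is optional.")
print(extract_assertions(spec))
# ['The server MUST validate tokens.', 'Clients SHOULD retry with backoff.']
```

Each extracted sentence becomes a candidate checklist item to map against the implementation.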
**Verification Approach:**

1. Create a checklist from the requirements
2. Mark each as: Implemented / Partial / Missing / Contradicted
3. Document gaps for follow-up
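A minimal sketch of that checklist, using hypothetical REQ-x identifiers:

```python
# Statuses taken from the verification approach above.
STATUSES = {"Implemented", "Partial", "Missing", "Contradicted"}

def build_checklist(requirements):
    """Start every requirement as Missing until evidence says otherwise."""
    return {req_id: "Missing" for req_id in requirements}

def mark(checklist, req_id, status):
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    checklist[req_id] = status

def gaps(checklist):
    """Everything not fully implemented needs follow-up."""
    return sorted(r for r, s in checklist.items() if s != "Implemented")

cl = build_checklist(["REQ-1", "REQ-2", "REQ-3"])
mark(cl, "REQ-1", "Implemented")
mark(cl, "REQ-2", "Partial")
print(gaps(cl))  # ['REQ-2', 'REQ-3']
```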
### Configuration Files

Configuration files control runtime behavior. Approach by file type:

#### Package Manifests (package.json, Cargo.toml, pyproject.toml)

- Project metadata: name, version, description
- Entry points: main, bin, exports
- Dependencies: runtime vs dev, version constraints
- Scripts/commands: available automation
- Engine requirements: Node version, Python version
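These fields can be pulled from a package.json with the standard `json` module; the manifest below is a made-up example:

```python
import json

manifest = json.loads("""
{
  "name": "example-app",
  "version": "1.4.0",
  "main": "dist/index.js",
  "engines": {"node": ">=18"},
  "scripts": {"build": "tsc", "test": "vitest run"},
  "dependencies": {"express": "^4.18.0"},
  "devDependencies": {"typescript": "^5.0.0"}
}
""")

# Surface the elements listed above: entry point, automation, deps, engines.
print("entry point:", manifest.get("main"))
print("scripts:", sorted(manifest.get("scripts", {})))
print("runtime deps:", sorted(manifest.get("dependencies", {})))
print("dev-only deps:", sorted(manifest.get("devDependencies", {})))
print("engine constraint:", manifest.get("engines", {}).get("node"))
```

The same extraction applies to Cargo.toml or pyproject.toml with a TOML parser instead of `json`.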
#### Environment Configuration (.env, config.yaml, settings.json)

- Required variables (those without defaults)
- Environment-specific overrides
- Secret references (never actual values)
- Feature flags and toggles
- Service URLs and connection strings
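One way to surface required variables from a .env file. Treating empty values as "must be supplied" is a convention assumed here, not a universal rule; confirm how your application actually loads defaults:

```python
def parse_env(text):
    """Parse KEY=VALUE lines from a .env-style file, skipping comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs

def required_keys(pairs):
    """Keys with no value are treated as required (an assumed convention)."""
    return sorted(k for k, v in pairs.items() if v == "")

env = parse_env("# service config\nDATABASE_URL=\nLOG_LEVEL=info\nAPI_KEY=\n")
print(required_keys(env))  # ['API_KEY', 'DATABASE_URL']
```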
#### Build/Deploy Configuration (Dockerfile, CI configs, Terraform)

- Base images or providers
- Build stages and dependencies
- Environment variable injection points
- Secret management approach
- Output artifacts and destinations
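Base images and build stages can be listed from a Dockerfile with a small regex sketch; the Dockerfile below is invented for illustration:

```python
import re

def base_images(dockerfile_text):
    """List FROM lines: (image, stage_name) pairs; stage_name is None
    when the stage is unnamed."""
    pattern = re.compile(r"^FROM\s+(\S+)(?:\s+AS\s+(\S+))?",
                         re.IGNORECASE | re.MULTILINE)
    return [(m.group(1), m.group(2)) for m in pattern.finditer(dockerfile_text)]

dockerfile = """\
FROM node:20-alpine AS build
RUN npm ci && npm run build
FROM nginx:1.27
COPY --from=build /app/dist /usr/share/nginx/html
"""
print(base_images(dockerfile))
# [('node:20-alpine', 'build'), ('nginx:1.27', None)]
```

Multiple FROM lines signal a multi-stage build; the final stage determines what actually ships.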
**Reading Pattern:**

1. Identify the configuration format and schema (if available)
2. List all configurable options
3. Determine which have defaults and which require values
4. Trace where configuration values are consumed in code
5. Note any environment-specific overrides
### Architecture Decision Records (ADRs)

ADRs capture why decisions were made. Extract:

- **Context**: What problem prompted the decision
- **Decision**: What was chosen
- **Consequences**: Trade-offs accepted
- **Status**: Accepted, Deprecated, Superseded
- **Related Decisions**: Links to related ADRs
**Reading Pattern:**

1. Read the context to understand the problem space
2. Note alternatives that were considered
3. Understand why the current approach was chosen
4. Check whether the decision is still active or has been superseded
5. Consider whether the context has changed since the decision
## Identifying Documentation Issues

### Outdated Documentation

Signals that documentation may be stale:

- **Version Mismatches**: Docs reference v1.x, code is v2.x
- **Missing Features**: Code has capabilities not in docs
- **Dead Links**: References to moved or deleted resources
- **Deprecated Patterns**: Docs use patterns code has abandoned
- **Date Indicators**: "Last updated 2 years ago" on an active project
**Verification Steps:**

1. Check doc commit history against code commit history
2. Compare the documented API against actual code signatures
3. Run documented examples to confirm they still work
4. Search the code for terms used in the docs to confirm they are present
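Step 4 reduces to a set difference; in practice `code_text` would come from grep or from reading source files, but the substring check is the same:

```python
def missing_terms(doc_terms, code_text):
    """Doc terms that never appear in the code -- candidates for staleness."""
    return sorted(t for t in doc_terms if t not in code_text)

# Hypothetical identifiers mentioned in the docs vs a code snippet.
code = "def create_user(name):\n    return save(name)\n"
terms = ["create_user", "delete_user", "save"]
print(missing_terms(terms, code))  # ['delete_user']
```

A term missing from the code either indicates stale docs or a rename worth investigating.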
### Conflicting Documentation

When multiple docs disagree:

1. **Identify the conflict explicitly**: Quote both sources
2. **Check timestamps**: Newer usually wins
3. **Check authority**: Official > community; code > docs
4. **Test behavior**: What does the system actually do?
5. **Document the resolution**: Note which source was correct
**Resolution Priority:**

1. Actual system behavior (empirical truth)
2. Most recent official documentation
3. Code comments and inline documentation
4. External/community documentation
5. Older official documentation
### Missing Documentation

Recognize documentation gaps:

- **Undocumented Endpoints**: Routes exist in code but not docs
- **Hidden Configuration**: Env vars used but not listed
- **Implicit Requirements**: Dependencies not in requirements file
- **Tribal Knowledge**: Processes that exist only in team memory
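A sketch for the "Hidden Configuration" case: diffing env vars read in Python code against the documented list. The regexes only cover the common `os.environ` access patterns and will miss indirection (e.g. variables read via a config library):

```python
import re

def undocumented_env_vars(code_text, documented):
    """Env vars read in code but absent from the documented list."""
    used = set(re.findall(r"os\.environ(?:\.get)?\(['\"](\w+)['\"]", code_text))
    used |= set(re.findall(r"os\.environ\[['\"](\w+)['\"]\]", code_text))
    return sorted(used - set(documented))

code = (
    "import os\n"
    "db = os.environ['DATABASE_URL']\n"
    "key = os.environ.get('API_KEY')\n"
)
print(undocumented_env_vars(code, ["DATABASE_URL"]))  # ['API_KEY']
```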
**Gap Documentation Template:**

```
Documentation Gap: [Topic]
Discovered: [Date]
Location: [Where this should be documented]
Current State: [What exists now]
Required Information: [What's missing]
Source of Truth: [Where to get correct info]
```
## Cross-Referencing Documentation with Code

### Tracing Requirements to Implementation

1. Extract the requirement ID or description
2. Search the codebase for the requirement reference
3. If not found, search for key domain terms
4. Locate the implementation and verify its behavior
5. Document the mapping: Requirement -> File:Line
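The steps above can be sketched as a simple mapping. Here `files` is an in-memory dict standing in for real file reads (or a grep pass), and the REQ-xxx IDs are hypothetical:

```python
def trace_requirements(req_ids, files):
    """Map each requirement ID to the files that mention it.

    `files` is {path: contents}; in practice you would read these from
    disk (e.g. with pathlib) or shell out to grep.
    """
    return {rid: sorted(p for p, text in files.items() if rid in text)
            for rid in req_ids}

files = {
    "src/auth.py": "# Implements REQ-101: token validation\n...",
    "src/users.py": "def create_user(): ...",
}
print(trace_requirements(["REQ-101", "REQ-102"], files))
# {'REQ-101': ['src/auth.py'], 'REQ-102': []}
```

An empty list (REQ-102 here) is exactly the gap to record: either the requirement is unimplemented or the code never cites it.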
### Validating API Documentation

1. Find the endpoint in the documentation
2. Locate the route definition in code
3. Compare: method, path, parameters
4. Trace to the handler implementation
5. Verify the response shape matches the docs
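The comparison in step 3 can be sketched as a set diff over (method, path) pairs; the endpoints below are invented examples:

```python
def diff_endpoints(documented, implemented):
    """Compare (method, path) pairs from the docs against route definitions."""
    doc, impl = set(documented), set(implemented)
    return {
        "undocumented": sorted(impl - doc),   # in code, missing from docs
        "unimplemented": sorted(doc - impl),  # in docs, missing from code
    }

documented = [("GET", "/users"), ("POST", "/users"), ("GET", "/health")]
implemented = [("GET", "/users"), ("POST", "/users"), ("DELETE", "/users/{id}")]
print(diff_endpoints(documented, implemented))
# {'undocumented': [('DELETE', '/users/{id}')], 'unimplemented': [('GET', '/health')]}
```

Both buckets are findings: the first is a documentation gap, the second is either stale docs or a missing feature.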
### Configuration Value Tracing

1. Identify the configuration key in the docs
2. Search for the key in the codebase
3. Find where the value is read/consumed
4. Trace through to its actual usage
5. Verify documented behavior matches the code
## Best Practices

- **Read completely before acting**: Avoid skimming that misses critical details
- **Verify before trusting**: Test documented commands and examples
- **Note contradictions immediately**: Document conflicts as you find them
- **Maintain a questions list**: Track unclear items for follow-up
- **Cross-reference constantly**: Docs without code verification are unreliable
- **Update as you learn**: Fix documentation issues you discover
## Anti-Patterns

- **Assuming documentation is current**: Always verify against code
- **Reading without testing**: Documentation lies; code reveals truth
- **Ignoring "Notes" and "Warnings"**: These often contain critical information
- **Skipping prerequisites**: Missing requirements cause cascading failures
- **Trusting examples blindly**: Examples may be simplified or outdated