Analyze Open-Source Project
Systematically analyze an open-source project's codebase to help the user quickly understand its architecture, core logic, data flows, key APIs, and algorithm implementations.
All analysis output MUST be in Chinese (zh-CN).
Execution Workflow
Follow these steps strictly in order. Use parallel subagents (Task tool with subagent_type="explore") where noted.
Phase 1: Context Gathering
Read these files first (use parallel reads):
- README.md (or README.rst, README.txt): project purpose, features, quick start
- Primary config/dependency file, used to detect the tech stack:
  - Node.js: package.json
  - Python: pyproject.toml > setup.py > requirements.txt
  - Go: go.mod
  - Java/Kotlin: pom.xml or build.gradle
  - Rust: Cargo.toml
  - C/C++: CMakeLists.txt or Makefile
  - .NET: *.csproj or *.sln
- CI/Docker files if present (Dockerfile, .github/workflows/): these reveal build and deploy info
Summarize: project name, purpose, tech stack, major dependencies, and build/run commands.
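The manifest lookup in this phase can be sketched as a small helper. This is a minimal illustration, assuming the manifests sit in the repository root; the names `MANIFESTS` and `detect_stacks` are hypothetical and not part of any tool referenced here.

```python
from pathlib import Path

# Manifest patterns per stack, in the priority order listed above (assumed layout:
# manifests live in the repository root; monorepos would need a deeper search).
MANIFESTS = [
    ("Node.js", ["package.json"]),
    ("Python", ["pyproject.toml", "setup.py", "requirements.txt"]),
    ("Go", ["go.mod"]),
    ("Java/Kotlin", ["pom.xml", "build.gradle"]),
    ("Rust", ["Cargo.toml"]),
    ("C/C++", ["CMakeLists.txt", "Makefile"]),
    (".NET", ["*.csproj", "*.sln"]),
]


def detect_stacks(root):
    """Return (stack, manifest filename) pairs, one per stack that has a manifest."""
    root = Path(root)
    found = []
    for stack, patterns in MANIFESTS:
        for pattern in patterns:
            matches = sorted(root.glob(pattern))
            if matches:
                # Stop at the highest-priority manifest for this stack.
                found.append((stack, matches[0].name))
                break
    return found
```

A repository can match more than one stack (for example a Python backend with a Node.js frontend), which is why the helper returns a list rather than a single value.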
Phase 2: Directory Structure Scan
Run a directory listing (depth 2) to map out the project layout.
Classify each top-level directory into one of:
- core: main business logic
- api: HTTP/gRPC/CLI interface layer
- model/entity: data models or domain objects
- config: configuration and environment
- util/common: shared utilities
- test: test suites
- docs: documentation
- scripts/tools: build or deployment scripts
- other: anything else
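A first-pass classification can be approximated by matching directory names against common conventions. The sketch below is only a heuristic: the hint sets are assumptions chosen for illustration, and any directory it labels should still be confirmed by reading its contents.

```python
# Hypothetical name hints per category; real projects often deviate from these.
CATEGORY_HINTS = {
    "core": {"src", "core", "lib", "internal", "pkg"},
    "api": {"api", "routes", "handlers", "controllers", "cli"},
    "model/entity": {"model", "models", "entity", "entities", "domain", "schemas"},
    "config": {"config", "configs", "conf", "settings"},
    "util/common": {"util", "utils", "common", "shared", "helpers"},
    "test": {"test", "tests", "spec", "__tests__", "e2e"},
    "docs": {"doc", "docs", "documentation"},
    "scripts/tools": {"scripts", "tools", "build", "deploy"},
}


def classify_directory(name):
    """Map a top-level directory name to one of the categories above, else 'other'."""
    lowered = name.lower()
    for category, hints in CATEGORY_HINTS.items():
        if lowered in hints:
            return category
    return "other"
```

Names the heuristic does not recognize fall into "other" and are the ones most worth inspecting by hand.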
Phase 3: Entry Point Identification
Search for program entry points based on the detected tech stack:
| Tech Stack | Typical Entry Points |
|---|---|
| Node.js | package.json "main"/"scripts.start", index.js, src/index.ts, app.js |
| Python | __main__.py, main.py, app.py, manage.py, cli.py |
| Go | main.go, cmd/*/main.go |
| Java | classes with public static void main, @SpringBootApplication |
| Rust | src/main.rs, src/lib.rs |
| C/C++ | main.c, main.cpp |
| Web Frontend | src/index.tsx, src/main.ts, src/App.vue |
Read the entry point file(s) and trace the initialization/bootstrap sequence.
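The path-based rows of the table can be checked mechanically. The following sketch assumes paths are relative to the repository root; `ENTRY_PATTERNS` and `find_entry_points` are hypothetical names. Entries that require reading file contents (package.json fields, Java `main` methods, `@SpringBootApplication`) are deliberately left out and still need a text search.

```python
from pathlib import Path

# Path-based entries from the table above; content-based entries are not covered.
ENTRY_PATTERNS = {
    "Node.js": ["index.js", "src/index.ts", "app.js"],
    "Python": ["__main__.py", "main.py", "app.py", "manage.py", "cli.py"],
    "Go": ["main.go", "cmd/*/main.go"],
    "Rust": ["src/main.rs", "src/lib.rs"],
    "C/C++": ["main.c", "main.cpp"],
    "Web Frontend": ["src/index.tsx", "src/main.ts", "src/App.vue"],
}


def find_entry_points(root, stack):
    """Return root-relative POSIX paths of candidate entry points for a stack."""
    root = Path(root)
    hits = []
    for pattern in ENTRY_PATTERNS.get(stack, []):
        # Keep table order across patterns; sort within a single wildcard pattern.
        hits.extend(
            sorted(p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file())
        )
    return hits
```

An empty result does not mean the project has no entry point, only that it does not follow the conventional file names.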
Phase 4: Deep Analysis
Analyze all four dimensions below. Use parallel explore subagents for dimensions that are independent of each other.
4a. Architecture & Module Dependencies
- Identify the architectural pattern (MVC, Clean Architecture, Hexagonal, Microservices, Monolith, etc.)
- Map module dependencies — which modules import/call which
- Produce a Mermaid graph showing module relationships
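Once the import relationships are collected, the Mermaid graph can be produced from a plain mapping. This is a minimal sketch: the function name `to_mermaid` and the module names in the example are hypothetical, and it assumes module names are simple identifiers (names containing characters such as `/` or `-` would need quoted Mermaid labels).

```python
def to_mermaid(dependencies):
    """Render {module: [modules it imports]} as a Mermaid 'graph TD' definition."""
    lines = ["graph TD"]
    for module in sorted(dependencies):
        for target in sorted(dependencies[module]):
            # One edge per import relationship: importer --> imported.
            lines.append(f"    {module} --> {target}")
    return "\n".join(lines)


# Hypothetical module names, for illustration only.
example = {"api": ["service"], "service": ["util", "repository"], "repository": []}
print(to_mermaid(example))
```

Sorting both modules and targets keeps the output stable between runs, which makes successive reports easier to compare.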
4b. Core Business Flow & Data Flow
- Trace the primary user-facing workflow(s) end-to-end
- Identify how data enters, transforms, persists, and exits the system
- Produce a Mermaid flowchart or sequence diagram for the most important flow
4c. Key API Interfaces & Call Chains
- List public API endpoints or exported interfaces
- For the 3-5 most important APIs, trace the call chain from handler to data layer
- Note middleware, interceptors, or decorators in the chain
4d. Algorithm & Function Implementation
- Identify non-trivial algorithms or complex business logic
- Extract the key code snippets (keep concise, max ~30 lines each)
- Annotate each snippet explaining the logic step by step
Output Format
Use the template defined in template.md to structure the final report.
Key formatting rules:
- Use Markdown headings (##, ###) for clear hierarchy
- Include at least 2 Mermaid diagrams (architecture graph + primary flow)
- Code snippets use CODE REFERENCE format (startLine:endLine:filepath) when citing existing code
- Keep the entire report readable in under 15 minutes
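The reference string itself is easy to build consistently. The helper below is a hypothetical sketch that only produces the `startLine:endLine:filepath` string; it assumes 1-based, inclusive line numbers, and how the string is attached to a snippet is left to template.md.

```python
def code_reference(path, start_line, end_line):
    """Build a 'startLine:endLine:filepath' reference (assumed 1-based, inclusive)."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("line range must be 1-based and non-empty")
    return f"{start_line}:{end_line}:{path}"
```

For example, `code_reference("src/app.py", 12, 40)` yields `12:40:src/app.py` (the path is illustrative).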
Guidelines
- Depth over breadth: It is better to deeply explain 3 critical modules than to shallowly list 20.
- Follow the data: When in doubt about what to analyze next, follow the data flow.
- Cite code: Always reference specific files and line numbers — never make vague claims.
- Be opinionated: State clearly what the architectural strengths and weaknesses are.
- Progressive disclosure: Start with executive summary; put detailed analysis in later sections. The user should get 80% of the value from the first 20% of the report.