# Project Onboarding
Help developers quickly build mental models of unfamiliar projects through systematic analysis and staged delivery.
## Interactive Workflow
This skill uses a phased approach with user checkpoints to manage context efficiently and focus analysis where needed.
### Phase 0: Context Collection
Before starting analysis, gather essential context. If the user provides only a project path, auto-detect the rest:
I'll help you understand this project. First, confirm a few details:
1. **Project path**: [confirm or ask]
2. **Your goal**:
- Quick onboarding (run locally within 1 hour)
- Fix specific issue
- Add new feature
- Understand for maintenance
- Other: _____
3. **Time budget**: 1 hour / 1 day / 1 week / custom
4. **Known tech stack** (optional, auto-detect if empty): _____
If you just want to start quickly, I'll auto-detect everything from the code.
Skip questions for which answers are already clear from context.
### Phase 1: Project Health Check (5-10 minutes)

Run a rapid diagnostic scan to identify the project type, quality signals, and immediate concerns.
#### Automated Detection Steps
- Identify project type
# Check for ecosystem markers
ls -la | grep -E "package.json|pom.xml|go.mod|Cargo.toml|requirements.txt|Gemfile|composer.json"
- Health indicators
# Check documentation
ls README* 2>/dev/null && echo "✅ README exists" || echo "❌ No README"
# Check git history activity
git log --oneline --since="6 months ago" | wc -l
# Find configuration examples
ls .env.example .env.sample 2>/dev/null || echo "⚠️ No config examples"
# Test coverage indicators
find . -type d \( -name "__tests__" -o -name "test" -o -name "tests" \) 2>/dev/null
- Code quality signals (especially for poorly maintained code)
# Large files (complexity indicator)
find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.py" -o -name "*.java" \) \
-exec wc -l {} \; | sort -rn | head -10
# TODO/FIXME density
find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.py" \) \
-exec grep -Hn "TODO\|FIXME\|HACK\|XXX\|BUG" {} \; | head -20
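# Aggregate per-file marker counts (one possible way to fill the report's "top locations" figure)
grep -rc "TODO\|FIXME\|HACK" --include="*.js" --include="*.ts" --include="*.py" . \
  | awk -F: '$2 > 0' | sort -t: -k2 -rn | head -5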
# Commented-out code (smell for technical debt)
find . -name "*.js" | xargs grep -c "^[[:space:]]*\/\/" | \
awk -F: '$2>50 {print $1 ": " $2 " lines"}' | head -5
# Dead code indicators
find . \( -name "*backup*" -o -name "*old*" -o -name "*temp*" -o -name "*.bak" \) 2>/dev/null
- Find entry points
# Server/service entry points
grep -r "listen\|createServer\|app.run\|main()" --include="*.ts" --include="*.js" \
--include="*.py" --include="*.go" | head -5
# Build/start scripts
grep -A 5 '"scripts"' package.json 2>/dev/null
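- Package manager and build tool (a possible heuristic; the lock-file and config names below are common defaults, not project-specific)
# Infer package manager from lock files (illustrative checks - extend per ecosystem)
[ -f pnpm-lock.yaml ]    && echo "Package manager: pnpm"
[ -f yarn.lock ]         && echo "Package manager: yarn"
[ -f package-lock.json ] && echo "Package manager: npm"
[ -f poetry.lock ]       && echo "Package manager: poetry"
[ -f Cargo.lock ]        && echo "Package manager: cargo"
# Infer build tool from well-known config files
ls vite.config.* webpack.config.* rollup.config.* tsconfig.json build.gradle* pom.xml Makefile 2>/dev/null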
#### Output Format: Health Report
# Project Health Check: [PROJECT_NAME]
## Basic Info
- **Type**: [monolith/microservice/library/CLI/frontend app/monorepo]
- **Primary language**: [Language] ([version if detectable])
- **Package manager**: [npm/yarn/pnpm/pip/cargo/...]
- **Build tool**: [webpack/vite/gradle/...]
## Health Score: [X/10]
### ✅ Strengths
- [List positive findings]
### ⚠️ Concerns
- [List medium-priority issues]
### ❌ Critical Issues
- [List serious problems]
### 🔍 Code Quality Indicators
- **TODO/FIXME count**: X occurrences ([top 5 locations with file:line])
- **Large files (>1000 lines)**: [count] files ([top 3: filename:lines])
- **Commented code density**: [high/medium/low]
- **Dead code artifacts**: [list temp/backup files if found]
- **Naming inconsistencies**: [examples if detected]
## One-Line Summary
[Inferred purpose from README and code structure]
## Core Value (3 points)
1. [What problem it solves]
2. [Key user/stakeholder]
3. [Primary capability]
## Entry Points
- **Main file**: `[path:line]`
- **Start command**: `[command]` (from `[source]`)
- **Routes/API**: `[path]` (if applicable)
---
**Next steps**: Choose what to explore:
1. 📄 Quick start guide (run locally in 15 min)
2. 🏗️ Architecture & code map (deep dive)
3. 📊 Specific analysis (pick dimensions)
[WAIT FOR USER CHOICE]
### Phase 2: Quick Start Guide

Provide this when the user wants to run the project immediately or chooses option 1.
# Quick Start Guide: [PROJECT_NAME]
## Prerequisites
```bash
# Version check commands
[language] --version # Need: [version requirement]
[package_manager] --version
# Installation if missing:
[specific installation commands or links]
```

## Startup Sequence (Target: 15 minutes)

### Step 1: Install dependencies
cd [project_path]
[install_command] # e.g., npm install, pip install -r requirements.txt
# Expected output:
# [sample successful output]
# Common failures:
# ❌ "EACCES permission denied" → Solution: [specific fix]
# ❌ "Cannot find module X" → Solution: [specific fix]
Evidence:
- Dependency file: `[path]`
- Lock file: `[path]` (checksum: `[hash]`)
### Step 2: Configure environment
# If .env.example exists:
cp .env.example .env
# Otherwise create manually:
cat > .env << 'EOF'
[required variables with defaults]
EOF
Evidence:
- Variables loaded at: `[file:line]`
- Required vars: [list from code analysis]
- Optional vars: [list with defaults]
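To derive the required/optional variable lists above, scanning for env access patterns is one option; a minimal sketch assuming a Node.js codebase (`process.env` is the Node API - adjust the pattern for other stacks):

```bash
# List distinct environment variables referenced in the code (Node.js assumption)
grep -rhoE "process\.env\.[A-Z0-9_]+" --include="*.js" --include="*.ts" . | sort -u
# Variables read with a fallback are likely optional (e.g. process.env.PORT || 3000)
grep -rnE "process\.env\.[A-Z0-9_]+[[:space:]]*\|\|" --include="*.js" --include="*.ts" . | head -10
```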
### Step 3: Initialize data (if needed)
[migration/seed commands if applicable]
# Verify:
[verification command]
### Step 4: Start service
[start command]
# Success indicators:
# [specific log messages to look for]
# Server ready at: http://localhost:[port]
# Verification:
curl [health_check_endpoint] || [alternative verification]
# Expected: [response]
Evidence:
- Port config: `[file:line]`
- Startup logs: `[file:line where logging configured]`
## Test Execution
[test command]
# Expected:
# [sample passing output]
# Common failures:
1. **Database not ready**
Error: `[specific error text]`
Fix: `[specific command]` ([source: file:line])
2. **Port conflict**
Error: `EADDRINUSE`
Fix: `lsof -i :[port] | grep LISTEN | awk '{print $2}' | xargs kill`
## Verification Checklist
- Service starts without errors
- Health endpoint responds (or equivalent test)
- Basic test suite passes
- Logs show expected initialization
- Can access UI/API (if applicable)
## Command Reference
| Action | Command | Source |
|---|---|---|
| Dev mode | [command] | [package.json:line or equivalent] |
| Build | [command] | [source] |
| Test | [command] | [source] |
| Lint | [command] | [source] |
## What's next?
- Understand architecture → Ask for "architecture analysis"
- Trace specific request → Tell me the endpoint/feature
- Start coding → Ask for "safe contribution points"
[WAIT FOR USER CHOICE]
### Phase 3: Deep Analysis (On Demand)
Provide only when the user explicitly requests it. Offer a menu of analysis dimensions:
```markdown
Select analysis areas (can choose multiple):
1. 🏗️ **Architecture view** - Modules, dependencies, data flow
2. 🗺️ **Code map** - Directory structure, key files, coupling
3. 📊 **Business logic** - End-to-end flow tracing
4. 💾 **Data model** - Database, schemas, relationships
5. ⚙️ **Configuration** - Environment vars, secrets management
6. 📈 **Observability** - Logging, debugging, monitoring
7. 🧪 **Testing** - Test structure, coverage, strategies
8. 🚀 **CI/CD** - Build pipeline, deployment process
9. 🔒 **Security** - Auth, permissions, vulnerabilities
Input: [numbers] or "describe your need"
```

#### Architecture View Template

When the user selects architecture analysis:
## Architecture Analysis
### System Topology
```mermaid
graph TB
Client[Client/Browser]
API[API Server<br/>[main entry file]]
DB[(Database)]
Cache[(Cache)]
Client -->|HTTP| API
API -->|Query| DB
API -->|Get/Set| Cache
```

Evidence:
- API entry: `[file:line]`
- DB connection: `[file:line]`
- Cache client: `[file:line]`
### Module Dependencies

```mermaid
graph LR
Routes[routes/]
Services[services/]
Data[data/]
Utils[utils/]
Routes --> Services
Services --> Data
Services --> Utils
```

Evidence: Import analysis from `[grep command used]`
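One way to produce this evidence is to count import targets per layer; a sketch for a JS/TS codebase (the `src/routes/` path and file extensions are placeholders):

```bash
# Which modules does the routes/ layer import from? (illustrative path)
grep -rhoE "from ['\"][^'\"]+['\"]" --include="*.ts" --include="*.js" src/routes/ \
  | sed -E "s/from ['\"](.+)['\"]/\1/" \
  | sort | uniq -c | sort -rn | head -20
```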
### Key Modules
| Module | Purpose | Key Files | Risks |
|---|---|---|---|
| `[path]` | [responsibility] | [file] ([N] lines) | [⚠️ issues if any] |
### External Dependencies
| Type | Service | Config Location | Local Dev Option |
|---|---|---|---|
| Database | [name] | [env var] at [file:line] | Docker Compose / SQLite |
| Cache | [name] | [env var] at [file:line] | Redis local / in-memory |
### Critical Data Flow: [Example Use Case]

```mermaid
sequenceDiagram
participant C as Client
participant A as API
participant S as Service
participant D as Database
C->>A: [Request]
A->>S: [Method call]
S->>D: [Query]
D-->>S: [Data]
S-->>A: [Response]
A-->>C: [JSON]
```

Evidence:
- Route handler: `[file:line-range]`
- Service method: `[file:line-range]`
- Database query: `[file:line-range]`
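To locate the route handler that anchors this flow, a targeted grep usually suffices; a sketch assuming Express-style routing (the endpoint value is hypothetical - adapt to the framework found in Phase 1):

```bash
# Find where a given endpoint is registered (Express-style routing assumed)
ENDPOINT="/users"   # hypothetical example
grep -rnE "\.(get|post|put|delete)\(['\"]${ENDPOINT}" --include="*.ts" --include="*.js" . | head -5
```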
### 🚨 Architecture Risks

1. **[Risk title]**
   - Location: `[file:line]`
   - Impact: [description]
   - Evidence: [code snippet or pattern]
   - Mitigation: [suggestion]

[Continue with other risk areas]
### Phase 4: Action Plans
Provide when the user has completed orientation and wants concrete next steps:
```markdown
## Action Plan: [TIME_BUDGET]
### 1-Hour Plan: Run & Understand
**Goal**: Service running + core flow traced
Tasks:
- [ ] Complete environment setup
- [ ] Start service and verify
- [ ] Execute one complete request: [specific example]
```bash
[curl or API call]
- Review logs for this request
[log location or command]
Verification:
[specific verification steps with expected output]
### 1-Day Plan: Navigate & Modify

**Goal**: Locate modules + make safe change
**Morning: Code navigation drill**
- Pick API endpoint: `[example endpoint]`
- Trace from route → service → data
  - Route: `[file:line]`
  - Service: `[file:line]`
  - Query: `[file:line]`
- Practice change: Add field to response
  - Location: `[file:function]`
  - Change: [specific modification]
  - Test: [verification]
**Afternoon: Fix simple bug**

Priority targets (low risk):
- Documentation typos
- Log message improvements
- Missing input validation on `[specific field]`
Find candidate bugs:
grep -rn "TODO\|FIXME" [source_dir]/
git log --grep="bug" --grep="fix" --oneline -20   # multiple --grep patterns are OR'ed
### 1-Week Plan: Own Feature

**Goal**: Complete small feature independently
Knowledge gaps to fill:
- [Domain concept] - Read `[doc/file]`
- [Tech stack detail] - Review `[framework docs section]`
- [Test patterns] - Study `[test_file]`
Reading order:
[entry_file] (15 min)
↓
[routes_file] (30 min - map all endpoints)
↓
[services_dir] (1-2 hours - core logic)
↓
[data_dir] (30 min - understand persistence)
↓
[Choose one module to deep dive] (2 hours)
Recommended first issues:
- 🟢 Add tests for `[uncovered module]` (see the sketch below for finding untested files)
- 🟢 Improve error messages in `[file]`
- 🟡 Implement `[simple CRUD endpoint]`
- 🟡 Add caching to `[specific query]`
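For the "add tests" items, comparing source files against sibling test files is one rough way to spot gaps; a sketch assuming a JS/TS layout where `foo.ts` pairs with `foo.test.ts`/`foo.spec.ts` (the `src` path is a placeholder, and projects that keep tests in `__tests__/` directories need a different check):

```bash
# Source files without a sibling .test/.spec file (rough heuristic)
find src -name "*.ts" ! -name "*.test.ts" ! -name "*.spec.ts" 2>/dev/null | while read -r f; do
  base="${f%.ts}"
  [ -e "${base}.test.ts" ] || [ -e "${base}.spec.ts" ] || echo "no test: $f"
done | head -20
```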
## Contribution Safety Guide

### Code Style (Inferred)
Analysis:
# File naming
find [src] -name "*.ts" | head -20
# Result: [camelCase / kebab-case / mixed]
# Function naming
grep -rh "^export function\|^export const" [src]/ | head -10
# Pattern: [observed pattern]
Conventions (with evidence):
- Files: [naming style] (see [example files])
- Functions: [naming style] (see [file:line examples])
- Classes: [naming style]
### Directory Organization
[project_root]/
├── [dir]/ - [inferred purpose]
├── [dir]/ - [inferred purpose]
└── [dir]/ - [inferred purpose]
Where to add new code:
- New API: `[directory/file pattern]`
- New logic: `[directory/file pattern]`
- New data model: `[directory/file pattern]`
### Commit Strategy
Git history analysis:
git log --format="%s" -50 | head -10
Pattern detected: [conventional commits / freeform / other]
Recommendation:
[type]([scope]): [subject]
Examples:
- feat(api): add user profile endpoint
- fix(auth): handle expired tokens correctly
### Common Pitfalls
| Issue | Symptom | Cause | Solution |
|---|---|---|---|
| [problem] | [error message] | [root cause at file:line] | [fix command] |
### Safe Contribution Points (3 examples)
1. Add logging (lowest risk)
Location: [file:function]
Change:
// Add at line [N]:
console.log('[Module] Action:', [data]);
Verify: [how to see log]
Rollback: Delete line
2. Adjust config (low risk)
Location: [config_file:line]
Change: [parameter] from [old] to [new]
Verify: [test command]
Rollback: Revert value
3. Add validation (medium risk)
Location: [file:function]
Change:
if (![validation]) {
throw new Error('[message]');
}
Verify: [test with invalid input]
Rollback: Remove check
## Follow-Up Questions (If Needed)

Ask ONLY about what cannot be inferred from the codebase (max 6 questions):

1. **Database migrations**: Found `[migration_dir]` but unclear if migrations are manual SQL or ORM-driven. How are they run?
2. **Test data**: How is the test database populated? (no seed scripts found)
3. **Deployment**: Is production containerized? (no Dockerfile found, but [evidence of X])
4. **Branch policy**: Is PR review required? (no `.github/CODEOWNERS` or branch protection visible)
5. **Secrets management**: Are production secrets handled via `[vault/AWS/other]`? (`.env` not in `.gitignore` - risky)
6. **Code ownership**: Do any modules have designated owners?

Where possible, pre-answer these with the quick checks below before asking.
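These pre-checks use only standard git/filesystem commands; the file names mirror the conventions mentioned above and may not match every project:

```bash
# Quick pre-checks before asking the follow-up questions
ls .github/CODEOWNERS CODEOWNERS 2>/dev/null || echo "no CODEOWNERS file"
ls Dockerfile docker-compose.yml docker-compose.yaml 2>/dev/null || echo "no container config at repo root"
grep -qxF ".env" .gitignore 2>/dev/null && echo ".env is gitignored" || echo "⚠️ .env not listed in .gitignore"
ls -d migrations db/migrate prisma/migrations alembic 2>/dev/null || echo "no obvious migration directory"
```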
## Tool Usage Guidelines
### Efficient Scanning
```bash
# Project type identification
ls -la | grep -E "package.json|pom.xml|go.mod|Cargo.toml|requirements.txt"
# Entry point discovery
grep -r "listen\|createServer\|app.run\|if __name__" \
--include="*.ts" --include="*.js" --include="*.py"
# Configuration files
find . -maxdepth 2 \( -name "*.config.*" -o -name ".*rc" -o -name ".env*" \)
# Code quality signals
find . -name "*.ts" -exec wc -l {} \; | sort -rn | head -10 # Large files
grep -rn "console.log" --include="*.ts" | wc -l # Debug statements
grep -rn "http://\|https://" --include="*.ts" | grep -v "^//" # Hardcoded URLs
```

### File Reading Priority
Must read (Phase 1):
- README.md
- package.json (or equivalent manifest)
- .env.example (if exists)
- Main entry file
Important (Phase 2):
- Route definitions
- Core service/controller files
- Config initialization
On-demand (Phase 3):
- Specific modules as needed
- Test files for patterns
- Migration scripts
### Handling Large Files
For files >500 lines:
# Read strategically
view(path="large_file.ts", view_range=[1, 50]) # Start
view(path="large_file.ts", view_range=[-50, -1]) # End
# Then middle sections as needed
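If the `view` tool is not available, plain shell can slice a file the same way (standard `sed`/`tail`/`grep`, nothing project-specific):

```bash
wc -l large_file.ts                        # how large is it?
sed -n '1,50p' large_file.ts               # start: imports and top-level structure
tail -n 50 large_file.ts                   # end: exports / main entry
grep -n "^export\|^class\|^function\|^def " large_file.ts | head -30   # rough outline
```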
### Evidence Citation Format
Every conclusion must cite its source:
- File content: `path/to/file.ts:45-67` or `path:line`
- Configuration: `package.json → scripts.start → "node server.js"`
- Command output: result of `[bash command]`
- Inference: "Based on [pattern/structure/naming], likely [conclusion]"

Flag uncertainty: `[UNCERTAIN]` [reason] - verify via [command/file]
## Output Principles
- Progressive disclosure: Deliver in phases, wait for user choice
- Evidence-based: All claims cite file:line or command output
- Action-oriented: Every section ends with "What's next?"
- Risk-aware: Highlight quality issues without judgment
- Beginner-friendly: Assume smart but unfamiliar reader
## References

See `references/` for:
- `analysis-patterns.md` - Templates for different project types
- `quality-signals.md` - Code smell detection patterns
- `recovery-strategies.md` - How to handle missing docs/tests