---
name: project-onboarding
description: Rapidly understand unfamiliar codebases through structured analysis. Use when the user needs to understand a new project, especially poorly documented or non-standard code. Triggers include "help me understand this project", "analyze this codebase", "onboard me to [project]", "what does this project do", or when the user provides a project path for exploration. Particularly effective for legacy code, hack projects, and codebases lacking documentation.
---


# Project Onboarding

Help developers quickly build mental models of unfamiliar projects through systematic analysis and staged delivery.

## Interactive Workflow

This skill uses a phased approach with user checkpoints to manage context efficiently and focus analysis where needed.

### Phase 0: Context Collection

Before starting analysis, gather essential context. If the user provides only a project path, auto-detect the rest:

```markdown
I'll help you understand this project. First, confirm a few details:

1. **Project path**: [confirm or ask]
2. **Your goal**:
   - Quick onboarding (run locally within 1 hour)
   - Fix specific issue
   - Add new feature
   - Understand for maintenance
   - Other: _____
3. **Time budget**: 1 hour / 1 day / 1 week / custom
4. **Known tech stack** (optional, auto-detect if empty): _____

If you just want to start quickly, I'll auto-detect everything from the code.
```

Skip questions for which answers are already clear from context.

### Phase 1: Project Health Check (5-10 minutes)

Execute rapid diagnostic scan to identify project type, quality signals, and immediate concerns.

#### Automated Detection Steps

1. **Identify project type**

   ```bash
   # Check for ecosystem markers
   ls -la | grep -E "package.json|pom.xml|go.mod|Cargo.toml|requirements.txt|Gemfile|composer.json"
   ```

2. **Health indicators**

   ```bash
   # Check documentation
   ls README* 2>/dev/null && echo "✅ README exists" || echo "❌ No README"

   # Check git history activity
   git log --oneline --since="6 months ago" | wc -l

   # Find configuration examples
   ls .env.example .env.sample 2>/dev/null || echo "⚠️ No config examples"

   # Test coverage indicators (note the grouping: -type d applies to all names)
   find . -type d \( -name "__tests__" -o -name "test" -o -name "tests" \) 2>/dev/null
   ```

3. **Code quality signals** (especially for poorly maintained code)

   ```bash
   # Large files (complexity indicator)
   find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.py" -o -name "*.java" \) \
     -exec wc -l {} \; | sort -rn | head -10

   # TODO/FIXME density
   find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.py" \) \
     -exec grep -Hn "TODO\|FIXME\|HACK\|XXX\|BUG" {} \; | head -20

   # Commented-out code (technical-debt smell)
   find . -name "*.js" | xargs grep -c "^[[:space:]]*\/\/" | \
     awk -F: '$2>50 {print $1 ": " $2 " lines"}' | head -5

   # Dead code indicators
   find . \( -name "*backup*" -o -name "*old*" -o -name "*temp*" -o -name "*.bak" \) 2>/dev/null
   ```

4. **Find entry points**

   ```bash
   # Server/service entry points
   grep -r "listen\|createServer\|app.run\|main()" --include="*.ts" --include="*.js" \
     --include="*.py" --include="*.go" | head -5

   # Build/start scripts
   grep -A 3 '"scripts"' package.json 2>/dev/null
   ```
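The individual checks above can also be bundled into a single pass. The sketch below is illustrative: the four checks and the 0-4 "quick score" are assumptions, not a standard metric, and the file names mirror the commands above.

```shell
#!/usr/bin/env bash
# Sketch: one-shot health pass over the current directory.
score=0

# Documentation present?
if ls README* >/dev/null 2>&1; then echo "✅ README exists"; score=$((score + 1)); else echo "❌ No README"; fi

# Recent git activity?
if [ -d .git ]; then
  recent=$(git log --oneline --since="6 months ago" 2>/dev/null | wc -l)
  echo "Commits in last 6 months: $recent"
  [ "$recent" -gt 0 ] && score=$((score + 1))
fi

# Config examples?
if ls .env.example .env.sample >/dev/null 2>&1; then echo "✅ Config example found"; score=$((score + 1)); else echo "⚠️ No config examples"; fi

# Test directories?
if find . -type d \( -name "__tests__" -o -name "test" -o -name "tests" \) 2>/dev/null | grep -q .; then
  echo "✅ Test directory found"; score=$((score + 1))
else
  echo "⚠️ No test directory"
fi

echo "Quick score: $score/4"
```

The score is only a conversation starter for the Health Report below, not a substitute for reading the code.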

#### Output Format: Health Report

```markdown
# Project Health Check: [PROJECT_NAME]

## Basic Info
- **Type**: [monolith/microservice/library/CLI/frontend app/monorepo]
- **Primary language**: [Language] ([version if detectable])
- **Package manager**: [npm/yarn/pnpm/pip/cargo/...]
- **Build tool**: [webpack/vite/gradle/...]

## Health Score: [X/10]

### ✅ Strengths
- [List positive findings]

### ⚠️ Concerns
- [List medium-priority issues]

### ❌ Critical Issues
- [List serious problems]

### 🔍 Code Quality Indicators
- **TODO/FIXME count**: X occurrences ([top 5 locations with file:line])
- **Large files (>1000 lines)**: [count] files ([top 3: filename:lines])
- **Commented code density**: [high/medium/low]
- **Dead code artifacts**: [list temp/backup files if found]
- **Naming inconsistencies**: [examples if detected]

## One-Line Summary
[Inferred purpose from README and code structure]

## Core Value (3 points)
1. [What problem it solves]
2. [Key user/stakeholder]
3. [Primary capability]

## Entry Points
- **Main file**: `[path:line]`
- **Start command**: `[command]` (from `[source]`)
- **Routes/API**: `[path]` (if applicable)

---
**Next steps**: Choose what to explore:
1. 📄 Quick start guide (run locally in 15 min)
2. 🏗️ Architecture & code map (deep dive)
3. 📊 Specific analysis (pick dimensions)
```

[WAIT FOR USER CHOICE]

### Phase 2: Quick Start Guide

Provide when user wants to run the project immediately or chooses option 1.

# Quick Start Guide: [PROJECT_NAME]

## Prerequisites
```bash
# Version check commands
[language] --version  # Need: [version requirement]
[package_manager] --version

# Installation if missing:
[specific installation commands or links]
```

## Startup Sequence (Target: 15 minutes)

### Step 1: Install dependencies

```bash
cd [project_path]
[install_command]  # e.g., npm install, pip install -r requirements.txt

# Expected output:
# [sample successful output]

# Common failures:
# ❌ "EACCES permission denied" → Solution: [specific fix]
# ❌ "Cannot find module X" → Solution: [specific fix]
```

Evidence:

  • Dependency file: [path]
  • Lock file: [path] (checksum: [hash])

### Step 2: Configure environment

```bash
# If .env.example exists:
cp .env.example .env

# Otherwise create manually:
cat > .env << 'EOF'
[required variables with defaults]
EOF
```

Evidence:

  • Variables loaded at: [file:line]
  • Required vars: [list from code analysis]
  • Optional vars: [list with defaults]

### Step 3: Initialize data (if needed)

```bash
[migration/seed commands if applicable]

# Verify:
[verification command]
```

### Step 4: Start service

```bash
[start command]

# Success indicators:
# [specific log messages to look for]
# Server ready at: http://localhost:[port]

# Verification:
curl [health_check_endpoint] || [alternative verification]
# Expected: [response]
```

Evidence:

  • Port config: [file:line]
  • Startup logs: [file:line where logging configured]

## Test Execution

```bash
[test command]

# Expected:
# [sample passing output]
```

# Common failures:
1. **Database not ready**
   Error: `[specific error text]`
   Fix: `[specific command]` ([source: file:line])

2. **Port conflict**
   Error: `EADDRINUSE`
   Fix: `lsof -i :[port] | grep LISTEN | awk '{print $2}' | xargs kill`

## Verification Checklist

- [ ] Service starts without errors
- [ ] Health endpoint responds (or equivalent test)
- [ ] Basic test suite passes
- [ ] Logs show expected initialization
- [ ] Can access UI/API (if applicable)
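The checklist lends itself to a small script. In the sketch below, `BASE_URL`, the `/health` path, and `npm test` are placeholder assumptions — substitute whatever the health check discovered for this project.

```shell
# Sketch: scripted pass over the verification checklist.
BASE_URL="${BASE_URL:-http://localhost:3000}"  # assumed port; override per project

# Does the (assumed) health endpoint answer?
if curl -fsS --max-time 2 "$BASE_URL/health" >/dev/null 2>&1; then
  health_status="ok"
else
  health_status="unreachable"
fi
echo "health endpoint: $health_status"

# Does the (assumed) test command pass?
if npm test --silent >/dev/null 2>&1; then
  tests_status="pass"
else
  tests_status="fail-or-missing"
fi
echo "test suite: $tests_status"
```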

## Command Reference

| Action | Command | Source |
|--------|---------|--------|
| Dev mode | [command] | [package.json:line or equivalent] |
| Build | [command] | [source] |
| Test | [command] | [source] |
| Lint | [command] | [source] |

## What's next?

  • Understand architecture → Ask for "architecture analysis"
  • Trace specific request → Tell me the endpoint/feature
  • Start coding → Ask for "safe contribution points"

[WAIT FOR USER CHOICE]


### Phase 3: Deep Analysis (On Demand)

Only provide when user explicitly requests. Offer menu of analysis dimensions:

```markdown
Select analysis areas (can choose multiple):

1. 🏗️ **Architecture view** - Modules, dependencies, data flow
2. 🗺️ **Code map** - Directory structure, key files, coupling
3. 📊 **Business logic** - End-to-end flow tracing
4. 💾 **Data model** - Database, schemas, relationships
5. ⚙️ **Configuration** - Environment vars, secrets management
6. 📈 **Observability** - Logging, debugging, monitoring
7. 🧪 **Testing** - Test structure, coverage, strategies
8. 🚀 **CI/CD** - Build pipeline, deployment process
9. 🔒 **Security** - Auth, permissions, vulnerabilities

Input: [numbers] or "describe your need"
```

#### Architecture View Template

When user selects architecture analysis:

## Architecture Analysis

### System Topology
```mermaid
graph TB
    Client[Client/Browser]
    API[API Server<br/>[main entry file]]
    DB[(Database)]
    Cache[(Cache)]
    
    Client -->|HTTP| API
    API -->|Query| DB
    API -->|Get/Set| Cache
```

Evidence:

  • API entry: [file:line]
  • DB connection: [file:line]
  • Cache client: [file:line]

### Module Dependencies

```mermaid
graph LR
    Routes[routes/]
    Services[services/]
    Data[data/]
    Utils[utils/]

    Routes --> Services
    Services --> Data
    Services --> Utils
```

Evidence: Import analysis from [grep command used]
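One way to derive those edges is to grep relative imports and reduce each to a `fromDir -> toDir` pair. This is a sketch, not a full import analyzer: the `demo-src/` fixture below only stands in for a real source tree, and the `sed` pattern assumes single-quoted ES-style `import ... from '../...'` lines.

```shell
# Build a tiny stand-in source tree (point the pipeline at the real src/ in practice).
mkdir -p demo-src/routes demo-src/services
echo "import { userService } from '../services/user';" > demo-src/routes/user.ts
echo "import { db } from '../data/client';" > demo-src/services/user.ts

# Reduce each cross-directory import to a "fromDir -> toDir" edge and count duplicates.
grep -rn "from '\.\./" --include="*.ts" demo-src/ \
  | sed -E "s|^demo-src/([^/]+)/.*from '\.\./([^/']+).*|\1 -> \2|" \
  | sort | uniq -c | sort -rn
```

The counted edges map directly onto the `graph LR` diagram above; heavily weighted edges are candidates for closer coupling review.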

### Key Modules

| Module | Purpose | Key Files | Risks |
|--------|---------|-----------|-------|
| [path] | [responsibility] | [file] ([N] lines) | [⚠️ issues if any] |

### External Dependencies

| Type | Service | Config Location | Local Dev Option |
|------|---------|-----------------|------------------|
| Database | [name] | [env var] at [file:line] | Docker Compose / SQLite |
| Cache | [name] | [env var] at [file:line] | Redis local / in-memory |

### Critical Data Flow: [Example Use Case]

```mermaid
sequenceDiagram
    participant C as Client
    participant A as API
    participant S as Service
    participant D as Database

    C->>A: [Request]
    A->>S: [Method call]
    S->>D: [Query]
    D-->>S: [Data]
    S-->>A: [Response]
    A-->>C: [JSON]
```

Evidence:

  • Route handler: [file:line-range]
  • Service method: [file:line-range]
  • Database query: [file:line-range]

### 🚨 Architecture Risks

  1. [Risk title]
    • Location: [file:line]
    • Impact: [description]
    • Evidence: [code snippet or pattern]
    • Mitigation: [suggestion]

[Continue with other risk areas]


### Phase 4: Action Plans

Provide when user has completed orientation and wants concrete next steps:

## Action Plan: [TIME_BUDGET]

### 1-Hour Plan: Run & Understand
**Goal**: Service running + core flow traced

Tasks:
- [ ] Complete environment setup
- [ ] Start service and verify
- [ ] Execute one complete request: [specific example]
  ```bash
  [curl or API call]
  ```
- [ ] Review logs for this request: [log location or command]

Verification:

[specific verification steps with expected output]

### 1-Day Plan: Navigate & Modify

Goal: Locate modules + make safe change

**Morning: Code navigation drill**

  1. Pick API endpoint: [example endpoint]
  2. Trace from route → service → data
    • Route: [file:line]
    • Service: [file:line]
    • Query: [file:line]
  3. Practice change: Add field to response
    • Location: [file:function]
    • Change: [specific modification]
    • Test: [verification]

**Afternoon: Fix simple bug**

Priority targets (low risk):

  1. Documentation typos
  2. Log message improvements
  3. Missing input validation on [specific field]

Find bugs:

```bash
grep -rn "TODO\|FIXME" [source_dir]/
git log --grep="bug\|fix" --oneline -20
```

### 1-Week Plan: Own Feature

Goal: Complete small feature independently

Knowledge gaps to fill:

  1. [Domain concept] - Read [doc/file]
  2. [Tech stack detail] - Review [framework docs section]
  3. [Test patterns] - Study [test_file]

Reading order:

```text
[entry_file] (15 min)
  ↓
[routes_file] (30 min - map all endpoints)
  ↓
[services_dir] (1-2 hours - core logic)
  ↓
[data_dir] (30 min - understand persistence)
  ↓
[Choose one module to deep dive] (2 hours)
```

Recommended first issues:

  • 🟢 Add tests for [uncovered module]
  • 🟢 Improve error messages in [file]
  • 🟡 Implement [simple CRUD endpoint]
  • 🟡 Add caching to [specific query]

## Contribution Safety Guide

### Code Style (Inferred)

Analysis:

```bash
# File naming
find [src] -name "*.ts" | head -20
# Result: [camelCase / kebab-case / mixed]

# Function naming
grep -rh "^export function\|^export const" [src]/ | head -10
# Pattern: [observed pattern]
```

Conventions (with evidence):

  • Files: [naming style] (see [example files])
  • Functions: [naming style] (see [file:line examples])
  • Classes: [naming style]

### Directory Organization

```text
[project_root]/
├── [dir]/ - [inferred purpose]
├── [dir]/ - [inferred purpose]
└── [dir]/ - [inferred purpose]
```

Where to add new code:

  • New API: [directory/file pattern]
  • New logic: [directory/file pattern]
  • New data model: [directory/file pattern]

### Commit Strategy

Git history analysis:

```bash
git log --oneline --format="%s" -50 | head -10
```

Pattern detected: [conventional commits / freeform / other]

Recommendation:

```text
[type]([scope]): [subject]
```

Examples:

- feat(api): add user profile endpoint
- fix(auth): handle expired tokens correctly
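To check how well history actually matches that shape, count conventional-looking subjects. The throwaway demo repo below only makes the example self-contained — in practice, run the two `git log` commands against the real project, and tune the `-50` window and the type list as needed.

```shell
# Throwaway demo repo (substitute the real project path in practice).
tmp=$(mktemp -d)
git -C "$tmp" init -q
git -C "$tmp" -c user.email=dev@example.com -c user.name=dev \
  commit -q --allow-empty -m "feat(api): add user profile endpoint"
git -C "$tmp" -c user.email=dev@example.com -c user.name=dev \
  commit -q --allow-empty -m "tweak stuff"

# Count subjects that match the conventional-commit shape: type(scope)!: subject
total=$(git -C "$tmp" log --format="%s" -50 | wc -l)
conv=$(git -C "$tmp" log --format="%s" -50 \
  | grep -cE "^(feat|fix|docs|chore|refactor|test|perf|ci|build|style)(\([^)]+\))?\!?:" || true)
echo "conventional commits: $conv of $total"
```

A low ratio suggests "freeform" history; recommend the convention for new commits without rewriting old ones.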

### Common Pitfalls

| Issue | Symptom | Cause | Solution |
|-------|---------|-------|----------|
| [problem] | [error message] | [root cause at file:line] | [fix command] |

### Safe Contribution Points (3 examples)

#### 1. Add logging (lowest risk)

Location: [file:function]
Change:

```js
// Add at line [N]:
console.log('[Module] Action:', [data]);
```

Verify: [how to see log]
Rollback: Delete line

#### 2. Adjust config (low risk)

Location: [config_file:line]
Change: [parameter] from [old] to [new]
Verify: [test command]
Rollback: Revert value

#### 3. Add validation (medium risk)

Location: [file:function]
Change:

```js
if (![validation]) {
  throw new Error('[message]');
}
```

Verify: [test with invalid input]
Rollback: Remove check


## Follow-Up Questions (If Needed)

Ask ONLY if cannot infer from codebase (max 6):

  1. Database migrations: Found [migration_dir] but unclear if manual SQL or ORM-driven. How are migrations run?

  2. Test data: How is test database populated? (no seed scripts found)

  3. Deployment: Is production containerized? (no Dockerfile found, but [evidence of X])

  4. Branch policy: PR review required? (no .github/CODEOWNERS or branch protection visible)

  5. Secrets management: Production secrets via [vault/AWS/other]? (.env not in .gitignore - risky)

  6. Code ownership: Any modules with designated owners?


## Tool Usage Guidelines

### Efficient Scanning

```bash
# Project type identification
ls -la | grep -E "package.json|pom.xml|go.mod|Cargo.toml|requirements.txt"

# Entry point discovery  
grep -r "listen\|createServer\|app.run\|if __name__" \
  --include="*.ts" --include="*.js" --include="*.py"

# Configuration files
find . -maxdepth 2 \( -name "*.config.*" -o -name ".*rc" -o -name ".env*" \)

# Code quality signals
find . -name "*.ts" -exec wc -l {} \; | sort -rn | head -10  # Large files
grep -rn "console.log" --include="*.ts" | wc -l  # Debug statements
grep -rn "http://\|https://" --include="*.ts" | grep -v "^//"  # Hardcoded URLs

### File Reading Priority

Must read (Phase 1):

  1. README.md
  2. package.json (or equivalent manifest)
  3. .env.example (if exists)
  4. Main entry file

Important (Phase 2):

  1. Route definitions
  2. Core service/controller files
  3. Config initialization

On-demand (Phase 3):

  1. Specific modules as needed
  2. Test files for patterns
  3. Migration scripts

### Handling Large Files

For files >500 lines:

```text
# Read strategically (illustrative file-viewer tool calls)
view(path="large_file.ts", view_range=[1, 50])    # Start
view(path="large_file.ts", view_range=[-50, -1])  # End
# Then middle sections as needed
```
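The same strategy works with plain shell tools when no viewer tool is available. The 1,000-line `large_file.txt` generated below only stands in for a real large module; the line ranges are illustrative.

```shell
# Generate a demo file standing in for a real >500-line module.
seq 1 1000 | sed 's/^/line /' > large_file.txt

sed -n '1,5p' large_file.txt                 # start: imports / top-level setup live here
tail -n 5 large_file.txt                     # end: exports / entry point often live here
awk 'NR >= 500 && NR <= 504' large_file.txt  # a middle slice, once grep tells you where to look
```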

### Evidence Citation Format

Every conclusion must cite source:

  • File content: path/to/file.ts:45-67 or path:line
  • Configuration: package.json → scripts.start → "node server.js"
  • Command output: Result of [bash command]
  • Inference: "Based on [pattern/structure/naming], likely [conclusion]"

Flag uncertainty: [UNCERTAIN] [reason] - Verify via [command/file]

## Output Principles

  1. Progressive disclosure: Deliver in phases, wait for user choice
  2. Evidence-based: All claims cite file:line or command output
  3. Action-oriented: Every section ends with "What's next?"
  4. Risk-aware: Highlight quality issues without judgment
  5. Beginner-friendly: Assume smart but unfamiliar reader

## References

See references/ for:

  • analysis-patterns.md - Templates for different project types
  • quality-signals.md - Code smell detection patterns
  • recovery-strategies.md - How to handle missing docs/tests
