debug-systematic

Systematic debugging workflow with hypothesis testing

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "debug-systematic" with this command: npx skills add manastalukdar/claude-devstudio/manastalukdar-claude-devstudio-debug-systematic

Systematic Debugging Workflow

I'll help you debug issues systematically using the scientific method - hypothesis formation, testing, and iterative refinement.

Arguments: $ARGUMENTS - error description, reproduction steps, or context

Token Optimization

Target: 50% reduction (4,000-6,000 → 1,500-3,000 tokens)

Core Optimization Strategies

1. Hypothesis-Driven Debugging (Not Exhaustive Analysis)

  • AVOID: Reading entire codebase to find bugs
  • DO: Form hypotheses about likely causes, test top 2-3 first
  • Token savings: 90% (200 tokens vs 2,000+ tokens)
  • Pattern: Prioritize recently changed files, common failure patterns

2. Git Diff for Recently Changed Files (Likely Bug Source)

  • AVOID: ls -R then reading all files
  • DO: git diff --name-only HEAD~3..HEAD to find changed files
  • DO: git log --oneline --since="3 days ago" for recent commits
  • Token savings: 85% (300 tokens vs 2,000+ tokens)
  • Pattern: Bugs often introduced in recent changes

3. Stack Trace Parsing with Grep

  • AVOID: Reading entire log files with Read tool
  • DO: grep -i "error\|exception\|fatal" logs/*.log | tail -20
  • DO: Parse stack traces to extract file paths and line numbers
  • Token savings: 95% (100 tokens vs 2,000+ tokens for large logs)
  • Pattern: Stack traces reveal exact failure locations
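
As a concrete sketch of the extraction step (assuming Node-style frames like `at handler (src/routes/users.js:42:15)` and unix paths):

#!/bin/bash
# Extract "file:line" pairs from Node-style stack frames in error.log
grep -oE '\([^)]+:[0-9]+:[0-9]+\)' error.log \
    | tr -d '()' \
    | cut -d: -f1,2 \
    | sort | uniq -c | sort -rn | head -5
# Output: the five most frequent failure locations, best candidates first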

4. Test Failure Analysis Caching

  • ✅ Cache test results in debug/state.json
  • ✅ Cache hypothesis outcomes to avoid retesting
  • ✅ Cache reproduction steps once confirmed
  • Token savings: 70% on subsequent debugging turns
  • Pattern: Multi-turn debugging sessions benefit from state
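
The exact state schema isn't prescribed by this skill; as one plausible sketch of what debug/state.json could hold:

#!/bin/bash
# Hypothetical session state layout (schema is illustrative only)
mkdir -p debug
cat > debug/state.json << 'EOF'
{
  "issue": "API returns 500 on POST /users",
  "status": "investigating",
  "reproduction_confirmed": true,
  "hypotheses": [
    {"id": 1, "theory": "recent schema migration", "result": "disproved"},
    {"id": 2, "theory": "missing request validation", "result": "pending"}
  ],
  "last_checkpoint": "hypothesis-2"
}
EOF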

5. Progressive Investigation (Narrow Before Deep)

  • ✅ Start with stack trace → identify file → read specific function
  • ✅ Hypothesis testing: test most likely causes first
  • ✅ Binary search through git history when needed
  • Token savings: 60% (stop early when cause found)
  • Pattern: Most bugs have obvious causes in changed code
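
For instance, a sketch of the narrow-before-deep step: given a top stack frame (hypothetical here), read only a small window around the failing line instead of the whole file:

#!/bin/bash
# Read ~30 lines around the failing line instead of the entire file
frame="src/routes/users.js:42"      # hypothetical top stack frame
file="${frame%%:*}"
line="${frame##*:}"
start=$(( line > 15 ? line - 15 : 1 ))
sed -n "${start},$(( line + 15 ))p" "$file"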

6. Session State Tracking for Multi-Turn Debugging

  • ✅ Session files in debug/ directory
  • ✅ Track tested hypotheses to avoid repetition
  • ✅ Resume from last checkpoint on subsequent runs
  • Token savings: 80% on resumed sessions (skip completed work)
  • Pattern: Complex bugs require multiple debugging turns

Token Usage by Operation

Operation | Unoptimized | Optimized | Savings
Initial bug analysis | 2,000-3,000 | 500-1,000 | 60-75%
Hypothesis formation | 1,500-2,000 | 400-800 | 60-73%
Stack trace parsing | 2,000+ | 100-200 | 90-95%
File investigation | 2,000+ | 300-600 | 70-85%
Test reproduction | 1,000-1,500 | 200-400 | 73-80%
Session resume | 2,000-3,000 | 300-600 | 80-85%

Average Reduction: 50% (4,000-6,000 → 1,500-3,000 tokens)

Debugging-Specific Patterns

Stack Trace Analysis:

# Extract file paths and line numbers from stack traces
grep -E "at .+ \(.+:[0-9]+:[0-9]+\)" error.log | head -10
# Focus investigation on these specific files/lines

Recent Changes Focus:

# Find files changed in the last 10 commits (likely bug sources)
git diff --name-only HEAD~10..HEAD
# Only read files that changed recently

Hypothesis Prioritization:

  1. Recent changes (80% of bugs) - Check git diff first
  2. Stack trace files (90% reliability) - Read exact failure locations
  3. Error message patterns (70% of bugs) - Grep for similar errors
  4. Environment/config (20% of bugs) - Check if configs changed
  5. External dependencies (10% of bugs) - Check updates
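
A sketch of walking this priority list and stopping at the first source that yields candidates (the log file name is an assumption):

#!/bin/bash
# Check likely bug sources in priority order; stop at the first hit
triage() {
    local hits
    # 1. Recent changes (most likely source)
    hits=$(git diff --name-only HEAD~3..HEAD 2>/dev/null)
    [ -n "$hits" ] && { echo "Recent changes:"; echo "$hits" | head -10; return; }

    # 2. Stack trace locations
    hits=$(grep -oE '[A-Za-z0-9_./-]+:[0-9]+' error.log 2>/dev/null)
    [ -n "$hits" ] && { echo "Stack trace locations:"; echo "$hits" | head -5; return; }

    # 3. Similar error patterns in logs
    grep -ri "error" logs/ 2>/dev/null | tail -10
}
triage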

Binary Search for Regressions:

# Use git bisect to find regression commit
git bisect start HEAD v1.2.3
git bisect run npm test  # Automated testing
# Saves 95% tokens vs manual testing each commit

Caching Behavior

Session Location: debug/ (in project root)

  • debug/plan.md - Debugging plan with hypotheses and results
  • debug/state.json - Session state and test results
  • debug/reproduction.log - Issue reproduction steps and logs

Cache Location: .claude/cache/debug/

  • hypotheses.json - Tested hypotheses and outcomes
  • stack-traces.json - Parsed stack trace information
  • changed-files.json - Recently changed files analysis

Cache Validity:

  • Until issue resolved (status: "solved" in state.json)
  • Until source files change (checksum-based)
  • 7 days maximum for stale sessions
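
A sketch of the checksum check, assuming the cache file records a digest of the tracked sources it was derived from (jq is also assumed):

#!/bin/bash
# Treat a cache entry as valid only if fresh and sources are unchanged
cache_valid() {
    local cache="$1"
    [ -f "$cache" ] || return 1
    # 7-day staleness cap (GNU find)
    [ -n "$(find "$cache" -mtime +7 2>/dev/null)" ] && return 1
    # Compare the stored digest against the current source tree
    local stored current
    stored=$(jq -r '.checksum' "$cache" 2>/dev/null) || return 1
    current=$(git ls-files -z | xargs -0 sha256sum | sha256sum | cut -d' ' -f1)
    [ "$stored" = "$current" ]
}
cache_valid .claude/cache/debug/changed-files.json && echo "fresh" || echo "recompute"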

Shared With:

  • /debug-root-cause - Root cause analysis skill
  • /debug-session - Debug session documentation
  • /test - Test execution for verification

Usage Examples

Start New Debugging Session:

debug-systematic "API returns 500 on POST /users"
# Expected tokens: 1,500-3,000 (full analysis)

Resume Existing Session:

/debug-systematic resume
# Expected tokens: 800-1,500 (skips completed hypotheses)

Test Specific Hypothesis:

/debug-systematic test 1
# Expected tokens: 500-1,000 (focused testing)

Check Debugging Progress:

/debug-systematic status
# Expected tokens: 200-500 (read session state only)

Mark Issue as Solved:

/debug-systematic solved
# Expected tokens: 300-600 (generate summary)

Early Exit Conditions

Exit immediately (saves 90% tokens) when:

  • ✅ Issue already solved (check debug/state.json status)
  • ✅ No test framework available (can't reproduce)
  • ✅ Not a git repository (can't check recent changes)
  • ✅ Root cause already identified in session state
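
These guards as a minimal sketch, assuming jq and the state layout sketched earlier:

#!/bin/bash
# Early-exit guards: run before any expensive analysis
if [ -f debug/state.json ] && \
   [ "$(jq -r '.status' debug/state.json 2>/dev/null)" = "solved" ]; then
    echo "Issue already marked solved - nothing to do"
    exit 0
fi
if ! git rev-parse --git-dir > /dev/null 2>&1; then
    echo "Not a git repository - skipping recent-change analysis"
fi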

Progressive disclosure saves 60-80% tokens:

  • Show hypothesis formation → wait for user confirmation
  • Test one hypothesis at a time → report results
  • Only deep dive when hypothesis confirms

Implementation Checklist

  • ✅ Git diff analysis for recent changes (PRIMARY optimization)
  • ✅ Stack trace parsing with Grep (saves 90-95%)
  • ✅ Session-based hypothesis tracking (saves 70-80% on reruns)
  • ✅ Progressive hypothesis testing (most likely → least likely)
  • ✅ Bash-based log analysis (minimal tokens)
  • ✅ Test failure result caching
  • ✅ Early exit when issue resolved
  • ✅ Binary search for regressions (git bisect)
  • ✅ Focus area flags (specific file/function debugging)

Optimization Status: ✅ Optimized (Phase 2 Batch 2, 2026-01-26)
Expected Tokens: 1,500-3,000 (vs. 4,000-6,000 unoptimized)
Achieved Reduction: 50% average across all debugging operations

Session Intelligence

I'll maintain debugging session continuity:

Session Files (in current project directory):

  • debug/plan.md - Debugging plan with hypotheses and results
  • debug/state.json - Session state and test results
  • debug/reproduction.log - Issue reproduction steps and logs

IMPORTANT: Session files are stored in a debug folder in your current project root

Auto-Detection:

  • If session exists: Resume debugging from last hypothesis
  • If no session: Create debugging plan and initial reproduction
  • Commands: resume, reproduce, status, solved
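
A sketch of the command dispatch (the jq calls assume the state layout sketched earlier):

#!/bin/bash
# Route $ARGUMENTS: known subcommand first, otherwise treat input as issue text
case "$1" in
    resume)     echo "Resuming from last hypothesis..." ;;
    reproduce)  echo "Building reproduction steps..." ;;
    status)     jq -r '.status' debug/state.json 2>/dev/null || echo "no session" ;;
    solved)     jq '.status = "solved"' debug/state.json > debug/state.tmp \
                    && mv debug/state.tmp debug/state.json ;;
    *)          echo "Starting new session for: $*" ;;
esac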

Phase 1: Issue Reproduction & Information Gathering

Extended Thinking for Complex Debugging

For complex or elusive bugs, I'll use extended thinking to explore debugging strategies:

<think>
When debugging complex issues:
- Multiple potential root causes that interact
- Timing-sensitive or race condition bugs
- Environment-specific failures
- Subtle state corruption scenarios
- Performance degradation patterns
- Security vulnerability exploitation paths
</think>

Triggers for Extended Analysis:

  • Intermittent or non-deterministic bugs
  • Production-only failures
  • Performance issues without obvious cause
  • Security vulnerabilities
  • Multi-component system failures

MANDATORY FIRST STEPS:

  1. Check if debug directory exists in current working directory
  2. If directory exists, check for session files:
    • Look for debug/state.json
    • Look for debug/plan.md
    • If found, resume from last hypothesis
  3. If no directory or session exists:
    • Gather error information
    • Create reproduction steps
    • Initialize debugging session
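
These first steps as a minimal sketch:

#!/bin/bash
# Mandatory first steps: detect an existing session before anything else
init_or_resume() {
    if [ -f debug/state.json ] && [ -f debug/plan.md ]; then
        echo "Existing session found - resuming from checkpoint:"
        jq -r '.last_checkpoint // "start"' debug/state.json 2>/dev/null
    else
        echo "No session found - initializing"
        mkdir -p debug
        # Next: gather errors, write reproduction steps, create state.json
    fi
}
init_or_resume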

Information Gathering (Token-Efficient):

#!/bin/bash
# Systematic Debugging - Information Gathering

gather_debug_info() {
    echo "=== Issue Reproduction Information ==="
    echo ""

    # 1. Error logs (use Grep, not cat)
    echo "Recent error logs:"
    if [ -d "logs" ]; then
        grep -i "error\|exception\|fatal" logs/*.log 2>/dev/null | tail -20 || echo "  No errors in logs"
    fi

    # 2. Git status (what changed recently)
    echo ""
    echo "Recent changes:"
    if git rev-parse --git-dir > /dev/null 2>&1; then
        git log --oneline --since="3 days ago" | head -10
    else
        echo "  Not a git repository"
    fi

    # 3. Environment info
    echo ""
    echo "Environment:"
    if [ -f "package.json" ]; then
        echo "  Node: $(node --version 2>/dev/null || echo 'not installed')"
        echo "  NPM: $(npm --version 2>/dev/null || echo 'not installed')"
    elif [ -f "requirements.txt" ]; then
        echo "  Python: $(python --version 2>/dev/null || echo 'not installed')"
    fi

    # 4. System resources
    echo ""
    echo "System resources:"
    echo "  Memory: $(free -h 2>/dev/null | grep Mem | awk '{print $3 "/" $2}' || echo 'N/A')"
    echo "  Disk: $(df -h . 2>/dev/null | tail -1 | awk '{print $3 "/" $2 " (" $5 ")"}' || echo 'N/A')"

    # 5. Running processes (if server issue)
    echo ""
    echo "Relevant processes:"
    ps aux | grep -E "node|python|java" | grep -v grep | head -5 || echo "  No relevant processes"
}

mkdir -p debug
gather_debug_info > debug/initial-state.log
cat debug/initial-state.log

Reproduction Steps:

#!/bin/bash
# Create reproducible test case

create_reproduction() {
    mkdir -p debug
    cat > debug/reproduction.sh << 'EOF'
#!/bin/bash
# Minimal reproduction script

echo "=== Bug Reproduction Steps ==="
echo ""
echo "Step 1: Setup environment"
# TODO: Add setup commands

echo "Step 2: Execute actions that trigger bug"
# TODO: Add trigger commands

echo "Step 3: Verify bug occurs"
# TODO: Add verification

echo ""
echo "Expected: [describe expected behavior]"
echo "Actual: [describe actual behavior]"
EOF

    chmod +x debug/reproduction.sh
    echo "Created reproduction script: debug/reproduction.sh"
}

create_reproduction

Phase 2: Hypothesis Formation

I'll formulate testable hypotheses about the root cause:

Hypothesis Generation Framework:

# Debugging Plan - [timestamp]

## Issue Description
**Summary**: [brief description]
**Severity**: Critical | High | Medium | Low
**Impact**: [affected users/systems]
**Frequency**: Always | Intermittent | Rare

## Error Details

[Full error message/stack trace]


## Environment
- **Platform**: [OS, runtime version]
- **Configuration**: [relevant settings]
- **Recent Changes**: [commits/deployments]

## Hypotheses (Prioritized)

### Hypothesis 1: [Most likely cause] - PRIORITY: HIGH
**Theory**: [explanation of suspected cause]
**Evidence**: [supporting observations]
**Test**: [how to verify/disprove]
**Expected**: [what should happen if correct]
**Result**: [ ] Pending | [ ] Confirmed | [ ] Disproved

### Hypothesis 2: [Second most likely] - PRIORITY: MEDIUM
**Theory**: [explanation]
**Evidence**: [observations]
**Test**: [verification method]
**Expected**: [expected outcome]
**Result**: [ ] Pending | [ ] Confirmed | [ ] Disproved

### Hypothesis 3: [Alternative cause] - PRIORITY: LOW
**Theory**: [explanation]
**Evidence**: [observations]
**Test**: [verification method]
**Expected**: [expected outcome]
**Result**: [ ] Pending | [ ] Confirmed | [ ] Disproved

## Investigation Log
- [timestamp]: Initial reproduction successful
- [timestamp]: Hypothesis 1 testing in progress

Hypothesis Prioritization:

  1. Recent changes - Check git history
  2. Common patterns - Known bug categories
  3. Environment issues - Dependencies, config
  4. Logic errors - Code analysis
  5. External factors - Third-party services

Phase 3: Systematic Testing

I'll test each hypothesis methodically:

Testing Framework:

#!/bin/bash
# Hypothesis Testing Script

test_hypothesis() {
    local hypothesis_num="$1"
    local test_description="$2"

    echo "=== Testing Hypothesis $hypothesis_num ==="
    echo "Test: $test_description"
    echo ""

    # Create checkpoint before testing
    git stash push -m "Debug checkpoint before hypothesis $hypothesis_num"

    # Run test
    local result="PENDING"

    # Log result
    echo "[$hypothesis_num] $test_description: $result" >> debug/test-results.log
}

# Example: Test hypothesis about missing dependency
test_dependency_hypothesis() {
    echo "Hypothesis: Missing or incompatible dependency"

    # Check dependency versions
    if [ -f "package.json" ]; then
        echo "Checking npm dependencies..."
        npm list --depth=0 2>&1 | grep -i "missing\|error" && {
            echo "❌ CONFIRMED: Missing dependencies detected"
            return 0
        }
    fi

    echo "✓ DISPROVED: All dependencies present"
    return 1
}

# Example: Test hypothesis about race condition
test_race_condition_hypothesis() {
    echo "Hypothesis: Race condition in async code"

    # Add delays to test timing sensitivity
    echo "Running test with delays..."
    # TODO: Add test with deliberate delays

    echo "Running test rapidly..."
    for i in {1..10}; do
        # TODO: Run test in tight loop
        true
    done
}

# Test each hypothesis in priority order
test_dependency_hypothesis
test_race_condition_hypothesis

Binary Search Debugging:

#!/bin/bash
# Binary search through git history to find regression

git_bisect_debug() {
    echo "=== Git Bisect Debugging ==="

    # Find last known good commit
    read -p "Enter last known good commit (or tag): " good_commit
    read -p "Enter first known bad commit (or 'HEAD'): " bad_commit

    git bisect start
    git bisect bad "$bad_commit"
    git bisect good "$good_commit"

    cat > debug/bisect-test.sh << 'EOF'
#!/bin/bash
# Automated bisect test script

# Automated check: exit 0 if this commit is good, non-zero if bad
npm test
exit $?

# Manual alternative (replace the three lines above with this block):
# echo "Test the current commit and press:"
# echo "  g - if this commit is good"
# echo "  b - if this commit is bad"
# read -n 1 response
# [ "$response" = "g" ] && exit 0 || exit 1
EOF

    chmod +x debug/bisect-test.sh
    echo "Run: git bisect run ./debug/bisect-test.sh"
}

Phase 4: Isolation & Simplification

I'll create minimal test cases:

Issue Isolation:

#!/bin/bash
# Create minimal reproducible example

create_minimal_reproduction() {
    local issue_type="$1"

    mkdir -p debug/minimal-case

    case $issue_type in
        "api")
            cat > debug/minimal-case/test.js << 'EOF'
// Minimal API test case
const fetch = require('node-fetch');

async function testIssue() {
    const response = await fetch('http://localhost:3000/api/endpoint');
    const data = await response.json();
    console.log('Response:', data);
    // Add assertion that fails
}

testIssue().catch(console.error);
EOF
            ;;

        "frontend")
            cat > debug/minimal-case/test.html << 'EOF'
<!DOCTYPE html>
<html>
<head>
    <title>Minimal Test Case</title>
</head>
<body>
    <button id="testBtn">Click to trigger issue</button>
    <div id="output"></div>

    <script>
        document.getElementById('testBtn').addEventListener('click', () => {
            // Minimal code to reproduce issue
            console.log('Testing...');
        });
    </script>
</body>
</html>
EOF
            ;;

        "database")
            cat > debug/minimal-case/test.sql << 'EOF'
-- Minimal database query to reproduce issue
BEGIN TRANSACTION;

-- Setup test data
CREATE TEMP TABLE test_data (id INT, value TEXT);
INSERT INTO test_data VALUES (1, 'test');

-- Query that demonstrates the issue (replace the WHERE clause with the failing condition)
SELECT * FROM test_data WHERE id = 1;

ROLLBACK;
EOF
            ;;
    esac

    echo "Created minimal test case in debug/minimal-case/"
}

Phase 5: Solution Implementation

Once root cause is identified, I'll implement the fix:

Fix Validation:

#!/bin/bash
# Validate fix before committing

validate_fix() {
    echo "=== Fix Validation ==="

    # 1. Run original reproduction - should now pass
    echo "Step 1: Run original reproduction..."
    if [ -f "debug/reproduction.sh" ]; then
        ./debug/reproduction.sh && echo "✓ Original issue resolved" || {
            echo "❌ Issue still reproduces"
            return 1
        }
    fi

    # 2. Run full test suite
    echo "Step 2: Run test suite..."
    npm test 2>&1 | tee debug/post-fix-tests.log

    # 3. Check for regressions
    echo "Step 3: Check for regressions..."
    git diff HEAD -- . | grep -E "^\+" | grep -v "^+++" | head -20

    # 4. Verify no new errors
    echo "Step 4: Lint check..."
    npm run lint 2>&1 | grep -i "error" && {
        echo "⚠️  New linting errors introduced"
    } || echo "✓ No new linting errors"

    echo ""
    echo "✓ Fix validation complete"
}

validate_fix

Fix Documentation:

## Solution

### Root Cause
[Detailed explanation of what caused the issue]

### Fix Applied
[Description of the solution]

// Before
- problematic code

// After
+ corrected code

### Verification

  • Original reproduction no longer triggers issue
  • All tests passing
  • No regressions introduced
  • Edge cases handled

### Prevention

[How to prevent similar issues in the future]

  • Add test coverage for [scenario]
  • Update validation to catch [condition]
  • Add monitoring for [metric]

Phase 6: Regression Prevention

I'll add safeguards to prevent recurrence:

Test Addition:
#!/bin/bash
# Add regression test

add_regression_test() {
    local test_framework="$1"

    case $test_framework in
        "jest")
            cat >> tests/regression.test.js << 'EOF'

describe('Regression: [Issue Description]', () => {
  test('should not reproduce issue #123', async () => {
    // Reproduce the scenario that previously failed
    const result = await functionThatHadBug();

    // Assert correct behavior
    expect(result).toBe(expectedValue);
  });
});
EOF
            ;;

        "pytest")
            cat >> tests/test_regression.py << 'EOF'

def test_issue_123_regression():
    """Regression test for [issue description]"""
    # Reproduce the scenario
    result = function_that_had_bug()

    # Assert correct behavior
    assert result == expected_value
EOF
            ;;
    esac

    echo "Added regression test to prevent future occurrence"
}

Context Continuity

Session Resume: When you return and run /debug-systematic or /debug-systematic resume:

  • Load debugging plan and hypothesis results
  • Show which hypotheses have been tested
  • Continue from next untested hypothesis
  • Track full debugging timeline
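
A sketch of producing the summary below from the session file (jq and the earlier state schema are assumptions):

#!/bin/bash
# Render a resume summary from debug/state.json
s=debug/state.json
echo "RESUMING DEBUGGING SESSION"
echo "├── Issue: $(jq -r '.issue' "$s")"
echo "├── Hypotheses: $(jq '.hypotheses | length' "$s") total"
echo "├── Tested: $(jq '[.hypotheses[] | select(.result != "pending")] | length' "$s")"
echo "└── Status: $(jq -r '.status' "$s")"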

Progress Example:

RESUMING DEBUGGING SESSION
├── Issue: API timeout on user search
├── Hypotheses: 5 total
├── Tested: 3 (2 disproved, 1 confirmed)
├── Current: Testing database query optimization
└── Status: Root cause identified

Continuing investigation...

Practical Examples

Start Debugging:

/debug-systematic "API returns 500 on POST /users"
/debug-systematic reproduce    # Create reproduction steps
/debug-systematic             # Auto-resume if session exists

Hypothesis Testing:

/debug-systematic test 1      # Test specific hypothesis
/debug-systematic isolate     # Create minimal reproduction
/debug-systematic bisect      # Git bisect to find regression

Session Control:

/debug-systematic resume      # Continue debugging
/debug-systematic status      # Show current progress
/debug-systematic solved      # Mark as solved and summarize

Debugging Techniques

Common Debugging Patterns:

  1. Print Debugging:

add_debug_logging() {
    echo "Adding strategic debug points..."
    # Log state before the suspected issue
    # Log state after the suspected issue
    # Compare outputs to see where values diverge
}

  2. Rubber Duck Debugging:

## Explain to Rubber Duck
1. What the code should do: [expected behavior]
2. What the code actually does: [actual behavior]
3. Step-by-step execution: [trace through]
4. Where it diverges: [AHA moment]

  3. Divide and Conquer:

# Comment out half the code
# Does the issue persist?
# - Yes: the issue is in the remaining half
# - No: the issue is in the commented half
# Repeat until isolated

Safety Guarantees

Protection Measures:

  • Git checkpoints before each test
  • Automated state restoration
  • No destructive operations without confirmation
  • Clear rollback paths

Important: I will NEVER:

  • Modify production code without validation
  • Skip hypothesis testing
  • Apply fixes without verification
  • Add AI attribution

Skill Integration

When appropriate, I may suggest:

  • /test - Run comprehensive test suite
  • /security-scan - Check if bug is security-related
  • /commit - Commit fix with clear message

Advanced Debugging Tools

Performance Profiling:

profile_performance() {
    # Node.js profiling
    node --prof app.js
    node --prof-process isolate-*.log > profile.txt

    # Python profiling
    python -m cProfile -o profile.stats script.py
    python -m pstats profile.stats
}

Memory Leak Detection:

detect_memory_leak() {
    # Monitor memory over time
    while true; do
        ps aux | grep "[n]ode" | awk '{print $6}' | head -1
        sleep 5
    done | tee memory.log

    # Analyze pattern
    gnuplot << 'EOF'
set terminal png
set output 'memory-usage.png'
plot 'memory.log' with lines
EOF
}

Network Debugging:

debug_network() {
    # Capture network traffic (typically requires root privileges)
    tcpdump -i any -w debug/network.pcap port 3000

    # Analyze with tshark
    tshark -r debug/network.pcap -Y "http.response.code >= 400"
}

What I'll Actually Do

  1. Gather information - Comprehensive context using Grep
  2. Reproduce issue - Create reliable reproduction
  3. Form hypotheses - Prioritized theories about cause
  4. Test systematically - Validate each hypothesis
  5. Isolate problem - Minimal reproducible case
  6. Implement fix - Targeted solution
  7. Prevent regression - Add tests and monitoring

I'll maintain complete debugging session continuity, tracking all hypotheses and results across sessions.

Credits: Systematic debugging methodology based on scientific method and debugging best practices from "Debugging: The 9 Indispensable Rules" by David Agans.
