# Testing & Coverage Skill

Test execution with coverage analysis, interactive gap identification, test generation, and iterative coverage improvement.
## Quick Start

```
# Run all tests
/sc:test

# Unit tests with coverage
/sc:test src/components --type unit --coverage

# Coverage gap analysis - identify untested code
/sc:test --gap-analysis --target 80

# Generate missing tests to reach target coverage
/sc:test --generate --target 80 --module src/services

# Watch mode with auto-fix
/sc:test --watch --fix

# Fix existing failures before adding coverage
/sc:test --fix-first --generate --target 80

# Web search for testing guidance (uses Rube MCP's LINKUP_SEARCH)
/sc:test --linkup --query "pytest asyncio best practices"
```
## Philosophy: Ask Early, Ask Often

This skill liberally uses AskUserQuestion at decision points. Test coverage involves tradeoffs between speed and thoroughness. Validate priorities rather than guessing what the user wants covered.

- Before writing tests -- confirm which modules to prioritize
- When gaps are large -- ask where to focus effort first
- When source code looks buggy -- ask before fixing vs just testing
- After coverage run -- ask about iteration vs stopping
## Behavioral Flow

1. **Discover** - Categorize tests using runner patterns
2. **Configure** - Set up test environment and parameters
3. **Execute** - Run tests with real-time progress tracking
4. **Analyze** - Generate coverage reports and diagnostics
5. **Confirm Priorities** - Interactive checkpoint: ask the user what to tackle
6. **Study Patterns** - Read test infrastructure and existing conventions
7. **Generate** - Write missing tests following project conventions
8. **Validate** - Run new tests, fix failures, lint
9. **Verify Coverage** - Re-run coverage and compare before/after
10. **Iterate** - Ask the user: continue, stop, or change focus
11. **Report** - Provide recommendations and quality metrics
## Flags

| Flag | Type | Default | Description |
|---|---|---|---|
| `--type` | string | all | unit, integration, e2e, all |
| `--coverage` | bool | false | Generate coverage report |
| `--watch` | bool | false | Continuous watch mode |
| `--fix` | bool | false | Auto-fix simple failures |
| `--gap-analysis` | bool | false | Identify coverage gaps without generating tests |
| `--generate` | bool | false | Generate missing tests to reach target |
| `--target` | int | 80 | Coverage percentage target |
| `--module` | string | | Restrict scope to a specific module |
| `--fix-first` | bool | false | Fix existing failures before generating new tests |
| `--dry-run` | bool | false | Show gap analysis and test plan without writing |
| `--linkup` | bool | false | Web search for guidance (via Rube MCP) |
| `--query` | string | | Search query for LINKUP_SEARCH |
## Personas Activated

- qa-specialist - Test analysis and quality assessment
## MCP Integration

### PAL MCP (Quality & Debugging)

| Tool | When to Use | Purpose |
|---|---|---|
| `mcp__pal__debug` | Test failures | Root cause analysis for failing tests |
| `mcp__pal__codereview` | Test quality | Review test coverage and quality |
| `mcp__pal__thinkdeep` | Complex failures | Multi-stage investigation of flaky tests |
| `mcp__pal__consensus` | Test strategy | Multi-model validation of testing approach |
| `mcp__pal__apilookup` | Framework docs | Get current testing framework documentation |
### PAL Usage Patterns

```python
# Debug failing test
mcp__pal__debug(
    step="Investigating intermittent test failure",
    hypothesis="Race condition in async setup",
    confidence="medium",
    relevant_files=["/tests/test_api.py"]
)

# Review test quality
mcp__pal__codereview(
    review_type="full",
    findings="Test coverage, assertion quality, edge cases",
    focus_on="test isolation and mocking patterns"
)

# Validate testing strategy
mcp__pal__consensus(
    models=[{"model": "gpt-5.2", "stance": "neutral"},
            {"model": "gemini-3-pro", "stance": "neutral"}],
    step="Evaluate: Is integration testing sufficient for this feature?"
)
```
### Rube MCP (Automation & Research)

| Tool | When to Use | Purpose |
|---|---|---|
| `mcp__rube__RUBE_SEARCH_TOOLS` | CI/CD integration | Find test reporting tools |
| `mcp__rube__RUBE_MULTI_EXECUTE_TOOL` | Notifications | Post results to Slack, update tickets |
| `mcp__rube__RUBE_REMOTE_WORKBENCH` | Bulk processing | Analyze large test result sets |
### Rube Usage Patterns

```python
# Search for testing best practices (--linkup flag uses LINKUP_SEARCH)
mcp__rube__RUBE_MULTI_EXECUTE_TOOL(tools=[
    {"tool_slug": "LINKUP_SEARCH", "arguments": {
        "query": "pytest fixtures best practices",
        "depth": "deep",
        "output_type": "sourcedAnswer"
    }}
])

# Post test results to Slack
mcp__rube__RUBE_MULTI_EXECUTE_TOOL(tools=[
    {"tool_slug": "SLACK_SEND_MESSAGE", "arguments": {
        "channel": "#ci-results",
        "text": "Test run complete: 95% pass rate, 87% coverage"
    }}
])
```
## Evidence Requirements

This skill requires evidence. You MUST:

- Show test execution output and pass/fail counts
- Reference coverage metrics when `--coverage` is used
- Provide actual error messages for failures
## Test Type Definitions

| Type | What It Tests | Markers |
|---|---|---|
| Unit | Single function/class in isolation, mocked dependencies | (none or framework default) |
| Integration | Multiple components together, real DB via fixtures | `@pytest.mark.integration`, `describe("integration")` |
| E2E | Full pipeline with real or mocked external APIs | `@pytest.mark.slow`, `@pytest.mark.e2e` |
## Phase 1: Coverage Baseline

### 1.1 Run Current Coverage Report

Detect the test framework and run coverage:

```bash
# Python (pytest)
pytest --cov=src --cov-report=term-missing --cov-report=json:coverage.json -q --tb=no

# JavaScript (jest/vitest)
npx jest --coverage --coverageReporters=json-summary
# or: npx vitest run --coverage

# Go
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out

# Rust
cargo tarpaulin --out json
```
### 1.2 Parse Coverage Gaps

Read the coverage report to extract:

- Per-file coverage percentage
- Uncovered line numbers per file
- Total project coverage

If `--module` is provided, filter coverage data to the specified module only.
### 1.3 Build Gap Report

Create a ranked list of files by coverage gap (lowest coverage first):

**Coverage Gap Report** -- Current Coverage: 52% | Target: 80% | Gap: 28%

| File | Coverage | Missing Lines | Priority |
|---|---|---|---|
| src/services/payment.py | 0% | 1-120 | CRITICAL |
| src/utils/validator.py | 15% | 12-45, 67-89 | HIGH |
| src/api/routes/users.py | 42% | 55-70, 88-102 | MEDIUM |

Priority rules:

- CRITICAL: 0% coverage (no tests at all)
- HIGH: < 30% coverage
- MEDIUM: 30-60% coverage
- LOW: > 60% but below target
### 1.4 Confirm Coverage Priorities with User

After building the gap report, ask the user what to prioritize:

```yaml
AskUserQuestion:
  question: "Found <N> files below target. Which should I tackle first?"
  header: "Priority"
  multiSelect: false
  options:
    - label: "Highest-impact first (Recommended)"
      description: "<N CRITICAL + M HIGH priority files -- start with 0% coverage modules>"
    - label: "Quick wins first"
      description: "Start with files that need only 1-2 tests to reach target"
    - label: "Specific module"
      description: "I want to focus on a specific area of the codebase"
    - label: "Dry run only"
      description: "Just show me the gap report -- don't write any tests yet"
```

If "Specific module": ask which module to focus on. If "Dry run only": present the gap report and stop.
Skip these files by default:

- Protocol/interface definitions with no logic
- Migration files
- `__init__.py` / `index.ts` with only re-exports
- Generated code (protobuf, OpenAPI clients)
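A path-based filter for these default exclusions might look like the sketch below. The patterns are illustrative assumptions; generated-code detection in particular usually needs project-specific globs, and re-export-only modules still require reading the file to confirm:

```python
import re

# Illustrative patterns for the default exclusions above; real projects
# typically extend these with their own codegen output directories.
SKIP_PATTERNS = [
    r"(^|/)migrations?/",          # migration files
    r"(^|/)__init__\.py$",         # re-export-only packages (confirm before skipping)
    r"(^|/)index\.ts$",
    r"_pb2(\.pyi?|_grpc\.py)$",    # protobuf-generated code
    r"(^|/)generated/",
]

def should_skip(path: str) -> bool:
    """True if the file matches a default-exclusion pattern."""
    return any(re.search(p, path) for p in SKIP_PATTERNS)
```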
## Phase 2: Analyze Existing Test Patterns

Before writing ANY tests, study the project's conventions.

### 2.1 Read Test Infrastructure

Discover and read test setup files:

| Framework | Files to Read |
|---|---|
| pytest | `conftest.py`, `pyproject.toml` `[tool.pytest]` |
| jest | `jest.config.*`, `setupTests.*`, `__mocks__/` |
| vitest | `vitest.config.*`, `setup.*` |
| Go | `*_test.go` helpers, `testdata/` |
| Rust | `tests/common/mod.rs` |
### 2.2 Find Pattern Examples

For each gap file, find the closest existing test as a template:

- Search for tests in the same directory/module
- Identify import patterns, fixture usage, assertion style
- Note any test base classes or shared helpers
### 2.3 Identify Available Fixtures/Helpers

Map available test fixtures to their use cases. Build a reference table:

| Fixture/Helper | Purpose | Used By |
|---|---|---|
| db_session | Database access | Integration tests |
| mock_api_client | Mock external API | Unit tests |
| ... | ... | ... |
### 2.4 Classify Functions for Test Type

| Function Characteristic | Test Type |
|---|---|
| Pure function (no I/O, no DB) | Unit test |
| Uses validation only | Unit test |
| Calls database/ORM | Integration test (needs DB fixture) |
| Calls external API | Unit test with mock |
| HTTP endpoint handler | Integration test |
| Full pipeline execution | E2E test |
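The table above can be encoded as an ordered rule lookup over observed dependency traits. The trait names here are illustrative labels for what module analysis finds; only the mapping itself comes from the table:

```python
# Ordered rules: the first matching trait decides the test type.
# Trait names are illustrative; real ones come from reading the source.
CLASSIFICATION_RULES = [
    ("full_pipeline", "e2e"),
    ("http_handler", "integration"),
    ("database", "integration"),
    ("external_api", "unit (with mock)"),
    ("validation_only", "unit"),
]

def classify_test_type(traits: set[str]) -> str:
    """Pick a test type for a function given its observed traits."""
    for trait, test_type in CLASSIFICATION_RULES:
        if trait in traits:
            return test_type
    return "unit"  # default: pure functions get plain unit tests
```

Rule order matters: an HTTP handler that also touches the database is still one integration test, and anything driving the full pipeline is e2e regardless of its other traits.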
## Phase 3: Generate Tests by Type

Process gaps in priority order (CRITICAL first). For each file:

### 3.1 Read the Source Module

Identify:

- All public functions and classes
- Their signatures, return types, and dependencies
- Which dependencies need mocking vs real fixtures
- Edge cases: empty inputs, None values, error conditions
### 3.2 Write Unit Tests

Follow the AAA pattern (Arrange/Act/Assert):

```python
# Python example (assumes pytest-asyncio or anyio is configured for async tests)
import pytest

class TestFunctionName:
    """Tests for function_name."""

    # Happy path
    async def test_returns_expected_result(self):
        """function_name returns correct output for valid input."""
        # Arrange
        input_data = ...
        # Act
        result = await function_under_test(input_data)
        # Assert
        assert result.field == expected_value

    # Edge cases
    async def test_handles_empty_input(self):
        """function_name handles empty input correctly."""
        ...

    # Error conditions (Let It Crash - verify errors propagate)
    async def test_raises_on_invalid_input(self):
        """function_name raises ValueError for invalid input."""
        with pytest.raises(ValueError):
            await function_under_test(invalid_input)
```
```typescript
// TypeScript/Jest example
describe('functionName', () => {
  it('returns expected result for valid input', () => {
    // Arrange
    const input = { ... };
    // Act
    const result = functionName(input);
    // Assert
    expect(result.field).toBe(expectedValue);
  });

  it('throws on invalid input', () => {
    expect(() => functionName(invalidInput)).toThrow();
  });
});
```
```go
// Go example
func TestFunctionName(t *testing.T) {
	t.Run("returns expected result", func(t *testing.T) {
		// Arrange
		input := ...
		// Act
		result, err := FunctionName(input)
		// Assert
		if err != nil {
			t.Fatalf("unexpected error: %v", err)
		}
		if result.Field != expected {
			t.Errorf("got %v, want %v", result.Field, expected)
		}
	})
}
```
Rules for all languages:

- Use AAA pattern (Arrange/Act/Assert) with comments
- Docstrings/descriptions describe expected behavior, not implementation
- Mock only external dependencies (APIs, DB, network)
- Do NOT add try/except or try/catch in tests (Let It Crash principle)
- Do NOT modify source code to make tests pass (unless there's a genuine bug)
### 3.3 Write Integration Tests

For tests requiring a database or multiple components:

- Mark with the appropriate marker (`@pytest.mark.integration`, tagged describe blocks)
- Use real DB fixtures or test containers
- Test real component interactions, not mocked ones
- Cleanup is automatic via fixture teardown
### 3.4 Write E2E Tests

For full pipeline or API flow tests:

- Mark with the slow/e2e marker
- Use HTTP test clients (httpx AsyncClient, supertest, net/http/httptest)
- Mock external API responses via fixtures
- Test the full request/response cycle including auth
- Verify the response schema matches expected models
## Phase 4: Validate Tests

### 4.1 Run New Tests Only

```bash
# Run just the newly created/modified test file
pytest tests/test_<area>/test_<module>.py -v --tb=short
npx jest tests/<module>.test.ts --verbose
go test -v -run TestNewFunction ./pkg/...
```
### 4.2 Fix Failures

If tests fail:

- Read the error output carefully
- Determine if the failure is in the TEST or the SOURCE code
- Fix the test if the source code behavior is correct
- NEVER modify source code to make tests pass (unless there's a genuine bug)
- If the source code has a bug -- ask the user before fixing it vs just testing current behavior
### 4.3 Run Full Suite

After all new tests pass individually:

```bash
# Full test suite - ensure no regressions
pytest -v --tb=short
npx jest --verbose
go test ./...
```
### 4.4 Lint Test Files

```bash
# Python
ruff check tests/ --fix && ruff format tests/

# TypeScript/JavaScript
npx eslint tests/ --fix && npx prettier tests/ --write

# Go
gofmt -w *_test.go
```
## Phase 5: Coverage Verification

### 5.1 Re-run Coverage

Run the same coverage command from Phase 1 to get updated metrics.

### 5.2 Compare Before/After

Report the coverage delta:

| Metric | Before | After | Delta |
|---|---|---|---|
| Total Coverage | 52% | 78% | +26% |
| Files with 0% | 5 | 1 | -4 |

Per-file improvements:

| File | Before | After | Tests Added |
|---|---|---|---|
| src/services/payment.py | 0% | 85% | 8 |
| src/utils/validator.py | 15% | 72% | 5 |
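Given two per-file coverage snapshots (e.g. from the Phase 1.2 parse of each run), the before/after comparison reduces to a simple diff. The function and key names below are illustrative:

```python
def coverage_delta(before: dict[str, float], after: dict[str, float]) -> dict:
    """Compare per-file coverage percentages between two runs."""
    zero_before = sum(1 for pct in before.values() if pct == 0)
    zero_after = sum(1 for pct in after.values() if pct == 0)
    per_file = {
        path: {
            "before": before.get(path, 0.0),  # new files count as previously 0%
            "after": pct,
            "delta": round(pct - before.get(path, 0.0), 1),
        }
        for path, pct in after.items()
    }
    return {"files_at_zero": (zero_before, zero_after), "per_file": per_file}
```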
### 5.3 Ask Whether to Iterate

If coverage is still below target, ask the user rather than auto-iterating:

```yaml
AskUserQuestion:
  question: "Coverage is at <M%> (target: <N%>). Want to continue adding tests?"
  header: "Iterate"
  multiSelect: false
  options:
    - label: "Continue -- add more tests"
      description: "<K files remaining below target -- next batch would cover <list>"
    - label: "Good enough -- stop here"
      description: "Accept current coverage and move on"
    - label: "Raise the target"
      description: "Current coverage exceeded expectations -- set a higher target"
    - label: "Switch focus"
      description: "Cover a different area instead of continuing the current batch"
```

If continuing:

- Re-run Phase 1 to identify remaining gaps
- Focus on files with the largest uncovered line counts
- Repeat Phases 3-5 until the target is reached or the user stops

Maximum iterations: 5 (per `--loop` convention).
## Phase 6: Summary

Present final results:

```
Test Coverage Update Summary
Target: N% | Achieved: M% | Tests Added: X unit, Y integration, Z e2e
```

**New Test Files Created**

- tests/test_<area>/test_<module>.py (N tests)
- ...

**Modified Test Files**

- tests/test_<area>/test_<module>.py (+N tests)
- ...

**Coverage by Category**

| Category | Coverage | Status |
|---|---|---|
| src/services/ | ??% | OK/NEEDS WORK |
| src/routers/ | ??% | OK/NEEDS WORK |
| src/models/ | ??% | OK/NEEDS WORK |

**Remaining Gaps** (if any)

- src/services/transformer.py - Excluded (needs full pipeline data)
- ...

**Verification Commands**

```
<framework-specific commands to re-run tests and coverage>
```
## Anti-Patterns to Avoid

DO NOT write tests that:

- Test private/internal methods directly (test via the public API)
- Duplicate existing test coverage (check first!)
- Require real API keys to pass (mock external services)
- Depend on test execution order
- Use time.sleep() for async synchronization
- Catch exceptions that should propagate (Let It Crash)
- Add defensive `if x is not None` checks (test the contract)

DO NOT:

- Modify source code to make it "more testable" (test what exists)
- Add type stubs or docstrings to source files (only touch test files)
- Create test utility frameworks or base classes (KISS)
- Write parameterized tests for < 3 cases (just write separate tests)
- Add comments or docstrings to code you didn't change
## Coverage Analysis (standalone)

When only `--coverage` is enabled (without `--generate`), report:

- Line coverage metrics
- Branch coverage metrics
- Uncovered code identification
- Coverage trend comparison
## Tool Coordination

- Bash - Test runner execution
- Glob - Test file discovery
- Grep - Result parsing, failure analysis
- Read - Source and test file inspection
- Write - New test files, coverage reports
- Edit - Extending existing test files
- AskUserQuestion - Priority confirmation, iteration decisions