Test-Driven Development

Collaborating skills

Refactor: skill: refactor for systematic code improvement during the refactor phase
Vitest: skill: vitest for JavaScript/TypeScript testing with Vitest framework
Design Patterns: skill: design-pattern-adopter for applying design patterns when refactoring test-covered code

Guide for Red-Green-Refactor workflow. Supports Python (pytest) and Ruby (RSpec).

Quick Reference

	Python (pytest)	Ruby (RSpec)
Naming	`test_<fn>_<scenario>_<expected>`	`describe ".method"` / `"#method"`
Structure	Mirror `src/` in `tests/`	One expectation per `it`
Data	`@pytest.fixture` + type hints	`let` / `let!`
Variants	`@pytest.mark.parametrize`	`context` blocks
Exceptions	`pytest.raises(Error, match=)`	`expect { }.to raise_error`
Run	`uv run pytest`	`bundle exec rspec`
Lint	`uv run ruff check src/`	`bundle exec rubocop`

The Cycle

🔴 RED → Write a failing test

Write the smallest test that defines the desired behavior. Confirm it fails.

🟢 GREEN → Make it pass

Write minimal production code. Run tests until they pass.

🔵 REFACTOR → Improve

Clean up test and production code. Ensure tests still pass.

Use the refactor skill for systematic code improvement during this phase.

Scenarios

New Feature

Follow the Red-Green-Refactor cycle. Start with one test, make it pass, then add the next.

<details> <summary>Example: Testing a new function</summary>

Python:

import pytest
from src.cart import calculate_total, Item


class TestCalculateTotal:
    @pytest.mark.parametrize("items,expected", [
        ([], 0.0),
        ([Item("Coke", 1.50)], 1.50),
    ], ids=["empty", "single"])
    def test_with_valid_items_returns_total(
        self, items: list[Item], expected: float
    ) -> None:
        assert calculate_total(items) == expected

    def test_with_negative_price_raises_error(self) -> None:
        with pytest.raises(ValueError, match="Price cannot be negative"):
            calculate_total([Item("Coke", -1.50)])

Ruby:

RSpec.describe Cart do
  describe '#calculate_total' do
    subject { described_class.calculate_total(items) }

    context 'when cart is empty' do
      let(:items) { [] }
      it { is_expected.to eq(0.0) }
    end

    context 'when item has negative price' do
      let(:items) { [Item.new(name: 'Coke', price: -1.50)] }
      it { expect { subject }.to raise_error(ArgumentError, /Price cannot be negative/) }
    end
  end
end

</details>

Bug Fix

Never fix a bug without writing a test first.

Write test reproducing the bug
Confirm test fails
Fix the bug
Confirm test passes

<details> <summary>Example: Bug fix test</summary>

Python:

def test_calculate_total_with_negative_price_raises_value_error(self) -> None:
    """Regression test for issue #123: negative prices should raise."""
    items = [Item(name="Coke", price=-1.50)]
    with pytest.raises(ValueError, match="Price cannot be negative"):
        calculate_total(items)

Ruby:

context 'when item has negative price' do
  # Regression test for issue #123
  let(:items) { [Item.new(name: 'Coke', price: -1.50)] }

  it 'raises ArgumentError' do
    expect { subject }.to raise_error(ArgumentError, /Price cannot be negative/)
  end
end

</details>

Legacy Code

Write characterization tests capturing current behavior
Run tests to establish baseline
Make small changes with test coverage
Refactor incrementally

Refactoring Opportunities

After each cycle, check for:

Implementation:

Duplicate logic → Consolidate
Long functions → Break down
Unclear naming → Rename
Complex conditionals → Simplify

Tests:

Redundant tests → Parametrize
Duplicate fixtures → Extract to shared
Testing private methods → Test public interface instead
Vague assertions → Make specific

Best Practices

<details> <summary>Python (pytest)</summary>

Organization

test_*.py naming, mirror src/ structure
Group related tests in classes

Naming

# Good
def test_calculate_tax_with_negative_amount_raises_value_error():
    pass

# Bad - too vague
def test_calculate_tax():  # ❌
    pass

Fixtures

@pytest.fixture
def sample_user() -> dict[str, str]:
    """Provide sample user data."""
    return {"id": "123", "name": "John Doe"}

Parametrization

@pytest.mark.parametrize("input,expected", [
    (0, 0), (1, 1), (2, 4),
], ids=["zero", "one", "two"])
def test_square(input: int, expected: int) -> None:
    assert square(input) == expected

Exceptions

def test_divide_by_zero_raises_error() -> None:
    with pytest.raises(ValueError, match="Cannot divide by zero"):
        divide(10, 0)

Mocking

def test_fetch_user(mocker) -> None:
    mock_get = mocker.patch('requests.get')
    mock_get.return_value.json.return_value = {"id": 1}

    result = fetch_user(1)

    mock_get.assert_called_once_with("https://api.example.com/users/1")

</details> <details> <summary>Ruby (RSpec)</summary>

Structure

Test class methods (.method) before instance methods (#method)
One expectation per it block

RSpec.describe User do
  describe '.find' do
    # Class methods first
  end

  describe '#full_name' do
    # Instance methods second
  end
end

Test Variables

Use let instead of instance variables:

# Good
let(:user) { described_class.new(name: "John") }

# Bad
before { @user = User.new(name: "John") }  # ❌

Subject

describe '#full_name' do
  subject { user.full_name }
  it { is_expected.to eq "John Doe" }
end

Predicate Matchers

# Good
it { is_expected.to be_admin }

# Bad
it "returns true" do
  expect(subject.admin?).to be true  # ❌
end

Doubles

# Good
let(:notifier) { instance_double(Notifier) }

# Bad
let(:notifier) { double("notifier") }  # ❌

Private Methods

Don't test private methods unless explicitly required. Test the public interface.

</details>

Verification

<details> <summary>Python (pytest) Commands</summary>

uv run pytest                              # All tests
uv run pytest tests/models/test_user.py -v # Specific file
uv run pytest tests/test_file.py::TestClass::test_fn -vv  # Specific test
uv run pytest --cov=src --cov-report=term-missing  # With coverage
uv run pytest --durations=10               # Check timing
uv run pytest -m "not slow"                # Fast tests only
uv run mypy src/                           # Type check
uv run ruff check src/                     # Lint

</details> <details> <summary>Ruby (RSpec) Commands</summary>

bundle exec rspec                              # All tests
bundle exec rspec spec/models/user_spec.rb     # Specific file
bundle exec rspec spec/models/user_spec.rb:42  # Specific line
bundle exec rspec --format documentation       # Doc format
COVERAGE=true bundle exec rspec                # With coverage
bundle exec rspec --profile                    # Check timing
bundle exec rspec --tag ~slow                  # Fast tests only
bundle exec rubocop                            # Lint
bundle exec rubocop --autocorrect              # Auto-fix

</details>

Checklist

	Python	Ruby
Tests pass	`uv run pytest`	`bundle exec rspec`
Coverage ok	`uv run pytest --cov=src`	`COVERAGE=true bundle exec rspec`
No lint errors	`uv run ruff check src/`	`bundle exec rubocop`
Types check	`uv run mypy src/`	—

Key Principles

Test First — Write tests before implementation
Verify Red — Confirm tests fail before implementing
One at a Time — Focus on one failing test
Maintain Coverage — Never decrease coverage
Small Changes — Incremental changes, run tests frequently
Refactor Systematically — Use the refactor skill during the refactor phase

tdd

Safety Notice

Copy this and send it to your AI assistant to learn

Test-Driven Development

Collaborating skills

Quick Reference

The Cycle

🔴 RED → Write a failing test

🟢 GREEN → Make it pass

🔵 REFACTOR → Improve

Scenarios

New Feature

Bug Fix

Legacy Code

Refactoring Opportunities

Best Practices

Organization

Naming

Fixtures

Parametrization

Exceptions

Mocking

Structure

Test Variables

Subject

Predicate Matchers

Doubles

Private Methods

Verification

Checklist

Key Principles

Source Transparency

Related Skills

testing

code-quality

code-quality

tdd