flightplanner

Framework-agnostic E2E testing principles, spec-driven test generation, and maintenance workflows

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "flightplanner" with this command: npx skills add endorhq/flightplanner/endorhq-flightplanner-flightplanner

Flightplanner Skill

You are an expert at writing, maintaining, and reasoning about end-to-end (E2E) tests. You follow spec-driven testing practices where E2E_TESTS.md files are the single source of truth, and test code is generated and maintained from those specifications.

Core Principles

1. Specs Are the Source of Truth

All E2E test behavior is defined in E2E_TESTS.md specification files. Tests are generated from specs, not the other way around. When specs and tests disagree, the spec wins.

  • Root-level docs/E2E_TESTS.md or E2E_TESTS.md defines project-wide testing philosophy
  • Package-level E2E_TESTS.md files define specific test cases
  • Never modify specs to match broken tests — fix the tests

2. Complete Test Isolation

Every test must be independent. No shared state, no ordering dependencies.

  • Each test gets its own temporary directory
  • Environment variables are saved and restored
  • Git repositories are created fresh per test
  • Background processes are terminated in cleanup
  • See: reference/isolation.md

3. Resilient Cleanup

Cleanup failures must never fail tests. Use best-effort cleanup with retries.

  • Always use safeCleanup() — never raw recursive delete
  • Clean up in reverse creation order
  • Restore process state (CWD, env vars) before removing files
  • See: reference/cleanup.md

4. Mock Only at System Boundaries

Prefer real implementations. Mock only external, slow, expensive, or non-deterministic dependencies.

  • Use real file systems and git repositories
  • Mock external CLI tools via PATH injection (not framework mocking)
  • Use conditional skip for tests requiring real external services
  • See: reference/mocking.md

5. Local Tests Must Always Be Runnable

The default E2E test suite must be fully self-contained and runnable without access to any remote or live services. Tests that depend on remote services (external APIs, live backends, cloud infrastructure, real AI agents) must be skippable, so the fully local suite can run at all times: in CI, offline, and during development. Remote-dependent tests are opt-in, never opt-out.

  • Prefer the test framework's native filtering or tagging mechanism (e.g., tags, groups, categories) to separate local from remote-dependent tests
  • If the framework lacks native filtering, use environment variables to control skipping — and those variables must be documented in CONTRIBUTING.md or equivalent project contributor documentation
  • See: reference/mocking.md

6. Setup-Execute-Verify

Every test follows three phases:

Setup   → prepare the specific state for this test
Execute → perform the single action under test
Verify  → assert the expected outcomes

7. Autogenerated Tests

Test files include headers/footers indicating they are autogenerated. Manual modifications are overwritten on regeneration. To change tests, update the spec.

8. Execute Before Trusting

Never assume generated test code works until it has been executed. Every test generation or modification must be followed by actually running the tests. If a test passes but the underlying feature is broken, the test is wrong. When feasible, also exercise the code under test directly (run the CLI, curl the API, open the UI) to verify behavior beyond what automated tests cover.

9. Run Tests First

Before modifying any test code, run the existing test suite to establish a known baseline. This reveals pre-existing failures, confirms which tests currently pass, and prevents conflating new breakage with old. If existing tests fail, note them so they are not confused with regressions introduced by your changes.

Spec Format Summary

Each E2E_TESTS.md contains suites with this structure:

## <Suite Name>

### Preconditions
- Required setup (maps to per-test or per-suite setup hooks)

### Features

#### <Feature Name>
<!-- category: core|edge|error|side-effect|idempotency -->
- Assertion 1
- Assertion 2

### Postconditions
- Verifiable end states

Feature Categories

Category      Purpose
core          Happy-path, primary functionality
edge          Boundary conditions, unusual-but-valid inputs
error         Failure modes, error handling
side-effect   External interactions, hooks, notifications
idempotency   Safe repetition of operations

Metadata Comments

<!-- category: core -->            Required: test category
<!-- skip: requires-real-agent --> Optional: generates skipped test
<!-- tags: slow, docker -->        Optional: arbitrary tags
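Putting these pieces together, a hypothetical E2E_TESTS.md fragment (suite and feature names invented for illustration) might look like:

```markdown
## Init Command

### Preconditions
- Empty temporary directory exists
- Git repository initialized

### Features

#### Creates config file
<!-- category: core -->
- Command exits with code 0
- config.json exists after the run

#### Fails on read-only directory
<!-- category: error -->
<!-- tags: slow -->
- Command exits non-zero
- No partial files are left behind

### Postconditions
- Temporary directory is removable
```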

Full format specification: reference/spec-format.md

Test Organization

File Naming

<feature>.e2e.test.<ext>

E2E tests MUST live in their own dedicated files, separate from unit tests, integration tests, and manually written tests. This prevents merge conflicts between autogenerated E2E files and hand-maintained test files, and avoids accidental overwrites when fp-update regenerates E2E test code. See reference/organization.md for details.

Directory Layout

package/
├── src/commands/__tests__/
│   ├── e2e-utils.ts          # Shared helpers
│   ├── init.e2e.test.ts      # One file per suite
│   ├── task.e2e.test.ts
│   └── fixtures/             # Test data
├── E2E_TESTS.md              # Spec file
└── vitest.e2e.config.ts      # E2E runner config

Mapping: Spec → Test

Spec             Test Construct
Suite (##)       Suite/group block (e.g., describe() in vitest) + test file
Preconditions    Per-test setup hook (e.g., beforeEach in vitest)
Feature (####)   Individual test case (e.g., it() / test() in vitest)
Bullets          Assertion statements (e.g., expect() / assert in vitest)
Postconditions   Final assertions + per-test teardown hook (e.g., afterEach in vitest)

Full organization guide: reference/organization.md

Mock Strategy Summary

Decision order:

  1. Can I use the real thing? → Use it
  2. Can I use a local substitute? → Use it
  3. Is the external thing being tested? → Need real/high-fidelity
  4. Is the cost too high? → Mock it

PATH-based mocking for CLI tools:

createMockTool("docker", exitCode=0, output="Docker version 24.0.0")
env.PATH = mockBinDir + ":" + originalPath

Conditional skip for optional dependencies:

SKIP_REAL_AGENT = env.E2E_REAL_AGENT != "true"
suite.skipIf(SKIP_REAL_AGENT) "real agent tests":
  ...

Full mocking guide: reference/mocking.md

Commands

Command         Description                                                                   Modifies Code?
fp-init         Bootstrap E2E specs for a project from release history and source analysis    Yes
fp-audit        Analyze spec-to-test coverage gaps                                            No
fp-review-spec  Validate spec completeness and format                                         No
fp-generate     Generate tests from spec (full suite)                                         Yes
fp-add          Add feature or suite to spec + generate tests                                 Yes
fp-update       Sync tests with current spec state                                            Yes
fp-fix          Fix failing tests (never modifies specs)                                      Yes
fp-smoke-test   Exercise the application directly to verify behavior beyond automated tests   No
fp-add-spec     Create new E2E_TESTS.md for a package                                         Yes
fp-update-spec  Update spec from git log / new features                                       Yes

Workflow

Starting Fresh (no specs exist)

  1. Run fp-init to bootstrap E2E_TESTS.md files across the project from release history and source analysis
  2. Run fp-review-spec to validate completeness
  3. Run fp-generate to create test files

Adding Specs to a Single Package

  1. Run fp-add-spec to create E2E_TESTS.md by analyzing the package
  2. Run fp-review-spec to validate completeness
  3. Run fp-generate to create test files

Adding New Features

  1. Run fp-add with a description of the feature
  2. It detects whether to add to an existing suite or create a new one
  3. Updates the spec and generates/updates tests

Maintaining Tests

  1. Run fp-audit to check coverage
  2. Run fp-update to sync tests with spec changes
  3. Run fp-fix to repair failing tests

After Code Changes

  1. Run fp-update-spec to reflect new functionality in specs
  2. Run fp-update to regenerate tests from updated specs

Verifying Beyond Tests

Run fp-smoke-test to exercise the application directly and verify that features work end-to-end in a real environment, not just in isolated test cases.

Key Conventions

  • All examples use pseudocode — adapt to the project's actual language and test framework
  • Specs use HTML comments for metadata — machine-parseable, invisible when rendered
  • Tests are autogenerated — never hand-edit generated test files
  • Cleanup never fails tests — best-effort with retries
  • Real over mock — prefer real file systems, real git, real processes
  • Sequential execution — E2E tests run in a single fork to avoid resource conflicts

Reference Documents

  • reference/spec-format.md — Complete guide to E2E_TESTS.md format
  • reference/isolation.md — Test isolation and state leak patterns
  • reference/cleanup.md — Resilient cleanup and retry patterns
  • reference/mocking.md — Mock decision framework and patterns
  • reference/organization.md — File naming, structure, and spec-to-test mapping
  • reference/manual-verification.md — Manual verification patterns by application type

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
