Feature Flag Management Skill
Overview
Feature flags decouple deployment from release, enabling trunk-based development, safe rollouts, and instant rollbacks. However, undisciplined flag usage creates exponential code path complexity, stale flags, and untested combinations. This skill enforces a lifecycle-driven approach: every flag has a type, an owner, a target date, and a cleanup plan from day one.
When to Use
-
When implementing trunk-based development with continuous deployment
-
When rolling out features gradually to reduce risk
-
When setting up A/B testing infrastructure
-
When auditing existing codebases for stale or orphaned feature flags
-
When choosing between feature flag platforms
-
When implementing kill-switches for critical features
Iron Laws
-
ALWAYS assign an owner and expiration date to every feature flag -- orphaned flags without owners accumulate indefinitely and become permanent tech debt.
-
NEVER nest feature flags more than 2 levels deep -- combinatorial explosion makes testing impossible (2 flags = 4 states, 5 flags = 32 states, 10 flags = 1024 states).
-
ALWAYS default flag values to the existing/safe behavior -- if the flag system fails, the application should behave as it did before the flag was added.
-
NEVER use feature flags as a substitute for configuration management -- flags are temporary toggles for release control, not permanent application settings.
-
ALWAYS clean up flags within 30 days of full rollout -- stale flags in code increase cognitive load, slow onboarding, and hide dead code paths.
Anti-Patterns
Anti-Pattern Why It Fails Correct Approach
Creating flags without expiration dates or owners Flags become permanent; nobody knows if they can be removed Require owner and target-date fields at creation time; alert on overdue flags
Nesting 3+ flags in conditional logic Testing requires covering all combinations; bugs hide in untested paths Limit nesting to 2 levels; combine related flags into a single multi-valued flag
Defaulting new features to ON when flag is missing Flag system outage enables untested features for all users Default to OFF (existing behavior); explicitly enable after validation
Using flags for permanent configuration Config changes require code deploys to remove; defeats the purpose of config Use environment variables or config files for permanent settings; flags are temporary
Testing only with flags ON or only with flags OFF Misses interaction bugs between flag states Test both states; add flag-combination matrix to CI for critical flags
Workflow
Step 1: Flag Classification
Classify every flag before creation:
Type Purpose Lifetime Example
Release Control feature visibility during rollout Days to weeks enable_new_checkout
Experiment A/B test with metrics collection Weeks to months experiment_pricing_page_v2
Ops Kill-switch for operational control Permanent (with review) circuit_breaker_payments
Permission User/role-based access control Permanent enable_admin_dashboard
Step 2: Implementation Pattern
// OpenFeature SDK pattern (vendor-neutral) import { OpenFeature } from '@openfeature/server-sdk';
const client = OpenFeature.getClient();
// Typed flag evaluation with safe default const showNewUI = await client.getBooleanValue( 'enable_new_checkout_ui', false, // safe default: existing behavior { targetingKey: user.id, attributes: { plan: user.plan } } );
if (showNewUI) { renderNewCheckout(); } else { renderLegacyCheckout(); }
Step 3: Gradual Rollout Strategy
Phase 1: Internal (0-1 day)
- Enable for development team
- Verify in production environment
Phase 2: Canary (1-3 days)
- Enable for 1% of users
- Monitor error rates, latency, business metrics
Phase 3: Controlled Rollout (3-7 days)
- Ramp: 5% -> 10% -> 25% -> 50% -> 100%
- Hold at each stage for minimum 24 hours
- Define rollback criteria before advancing
Phase 4: Cleanup (within 30 days of 100%)
- Remove flag checks from code
- Remove flag from platform
- Update documentation
Step 4: Flag-Aware Testing
// Test both flag states in CI describe('Checkout Flow', () => { describe('with new_checkout_ui enabled', () => { beforeEach(() => { flagProvider.setOverride('enable_new_checkout_ui', true); });
it('should render new checkout components', () => {
// test new path
});
});
describe('with new_checkout_ui disabled', () => { beforeEach(() => { flagProvider.setOverride('enable_new_checkout_ui', false); });
it('should render legacy checkout components', () => {
// test legacy path
});
}); });
Step 5: Stale Flag Detection
Find flags older than 30 days that are fully rolled out
Custom script pattern for codebase scanning
grep -rn 'isEnabled|getBooleanValue|getFlag' src/ |
awk -F"'" '{print $2}' |
sort -u > active_flags.txt
Compare against flag platform inventory
Flag any that are 100% enabled for > 30 days
Step 6: Cleanup Checklist
For each flag being retired:
-
Remove all flag evaluation calls from code
-
Remove unused code path (the one not selected)
-
Remove flag from platform/configuration
-
Remove flag from test overrides
-
Update documentation referencing the flag
-
Verify no other flags depend on this flag
-
Deploy and verify behavior matches full-rollout state
Complementary Skills
Skill Relationship
tdd
Test-driven development for flag-guarded features
ci-cd-implementation-rule
CI/CD pipeline integration with flag-aware deploys
qa-workflow
QA validation across flag combinations
proactive-audit
Audit for stale or orphaned flags in codebase
Memory Protocol (MANDATORY)
Before starting:
Read .claude/context/memory/learnings.md for prior feature flag patterns and platform-specific decisions.
After completing:
-
New pattern -> .claude/context/memory/learnings.md
-
Issue found -> .claude/context/memory/issues.md
-
Decision made -> .claude/context/memory/decisions.md
ASSUME INTERRUPTION: Your context may reset. If it's not in memory, it didn't happen.