Distributed Systems Patterns

Comprehensive patterns for building reliable distributed systems. Each category has individual rule files in rules/ that are loaded on demand.

Quick Reference

| Category | Rules | Impact | When to Use |
|----------|-------|--------|-------------|
| Distributed Locks | 3 | CRITICAL | Redis/Redlock locks, PostgreSQL advisory locks, fencing tokens |
| Resilience | 3 | CRITICAL | Circuit breakers, retry with backoff, bulkhead isolation |
| Idempotency | 3 | HIGH | Idempotency keys, request dedup, database-backed idempotency |
| Rate Limiting | 3 | HIGH | Token bucket, sliding window, distributed rate limits |
| Edge Computing | 2 | HIGH | Edge workers, V8 isolates, CDN caching, geo-routing |
| Event-Driven | 2 | HIGH | Event sourcing, CQRS, transactional outbox, sagas |

Total: 16 rules across 6 categories

Quick Start

Redis distributed lock with Lua scripts

async with RedisLock(redis_client, "payment:order-123"):
    await process_payment(order_id)

Circuit breaker for external APIs

@circuit_breaker(failure_threshold=5, recovery_timeout=30)
@retry(max_attempts=3, base_delay=1.0)
async def call_external_api(): ...

Idempotent API endpoint

@router.post("/payments")
async def create_payment(
    data: PaymentCreate,
    idempotency_key: str = Header(..., alias="Idempotency-Key"),
):
    return await idempotent_execute(db, idempotency_key, "/payments", process)

Token bucket rate limiting

limiter = TokenBucketLimiter(redis_client, capacity=100, refill_rate=10)
if await limiter.is_allowed(f"user:{user_id}"):
    await handle_request()

Distributed Locks

Coordinate exclusive access to resources across multiple service instances.

| Rule | File | Key Pattern |
|------|------|-------------|
| Redis & Redlock | ${CLAUDE_SKILL_DIR}/rules/locks-redis-redlock.md | Lua scripts, SET NX, multi-node quorum |
| PostgreSQL Advisory | ${CLAUDE_SKILL_DIR}/rules/locks-postgres-advisory.md | Session/transaction locks, lock ID strategies |
| Fencing Tokens | ${CLAUDE_SKILL_DIR}/rules/locks-fencing-tokens.md | Owner validation, TTL, heartbeat extension |
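As a rough illustration of the fencing-token idea, the sketch below has a lock service hand out a strictly increasing token with each grant, and a resource that rejects any write carrying a token older than one it has already accepted. `LockService`, `FencedResource`, and `StaleTokenError` are hypothetical names invented for this example, and everything runs in one process; it is not taken from the rule file.

```python
from typing import Optional


class StaleTokenError(Exception):
    """Raised when a write carries a token older than one already accepted."""


class LockService:
    """Hypothetical lock service: every grant gets a strictly increasing token."""

    def __init__(self) -> None:
        self._counter = 0

    def acquire(self) -> int:
        self._counter += 1
        return self._counter


class FencedResource:
    """Hypothetical storage that checks the fencing token on every write."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.value: Optional[str] = None

    def write(self, token: int, value: str) -> None:
        # A holder whose lock expired still has its old, lower token.
        if token < self.highest_token:
            raise StaleTokenError(f"token {token} is older than {self.highest_token}")
        self.highest_token = token
        self.value = value
```

The point of the check living in the resource is that a client paused past its lock TTL cannot overwrite data written by the next holder, even though it still believes it owns the lock.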

Resilience

Production-grade fault tolerance for distributed systems.

| Rule | File | Key Pattern |
|------|------|-------------|
| Circuit Breaker | ${CLAUDE_SKILL_DIR}/rules/resilience-circuit-breaker.md | CLOSED/OPEN/HALF_OPEN states, sliding window |
| Retry & Backoff | ${CLAUDE_SKILL_DIR}/rules/resilience-retry-backoff.md | Exponential backoff, jitter, error classification |
| Bulkhead Isolation | ${CLAUDE_SKILL_DIR}/rules/resilience-bulkhead.md | Semaphore tiers, rejection policies, queue depth |
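To make the CLOSED/OPEN/HALF_OPEN cycle concrete, here is a minimal in-process sketch. It counts consecutive failures rather than using a sliding window, which is a simplification, and the class name, the `clock` parameter, and `CircuitOpenError` are assumptions made for this example rather than the rule file's API.

```python
import time
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker; `clock` is injectable so tests need not sleep."""

    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self.state = State.CLOSED
        self._failures = 0
        self._opened_at = 0.0

    def call(self, fn: Callable[[], T]) -> T:
        if self.state is State.OPEN:
            if self._clock() - self._opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit is open")
            # Recovery timeout elapsed: allow a single probe call through.
            self.state = State.HALF_OPEN
        try:
            result = fn()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self) -> None:
        self._failures = 0
        self.state = State.CLOSED

    def _on_failure(self) -> None:
        self._failures += 1
        # A failed probe, or too many consecutive failures, opens the circuit.
        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()
```

While the circuit is open, callers fail fast with `CircuitOpenError` instead of waiting on a dependency that is known to be down.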

Idempotency

Ensure operations can be safely retried without unintended side effects.

| Rule | File | Key Pattern |
|------|------|-------------|
| Idempotency Keys | ${CLAUDE_SKILL_DIR}/rules/idempotency-keys.md | Deterministic hashing, Stripe-style headers |
| Request Dedup | ${CLAUDE_SKILL_DIR}/rules/idempotency-dedup.md | Event consumer dedup, Redis + DB dual layer |
| Database-Backed | ${CLAUDE_SKILL_DIR}/rules/idempotency-database.md | Unique constraints, upsert, TTL cleanup |
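One way the database-backed approach could look, sketched with the standard-library `sqlite3` module so it runs anywhere: the key is claimed by an INSERT guarded by a primary-key constraint, the operation runs in the same transaction, and a duplicate key returns the stored response. The table name `idempotency_keys` and the helpers `init_schema` and `run_once` are hypothetical, and the sketch assumes the operation itself does not raise `sqlite3.IntegrityError`.

```python
import json
import sqlite3
from typing import Callable


def init_schema(conn: sqlite3.Connection) -> None:
    # created_at is kept so a periodic job can delete expired keys (TTL cleanup).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS idempotency_keys ("
        " key TEXT PRIMARY KEY,"
        " response TEXT,"
        " created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )


def run_once(conn: sqlite3.Connection, key: str,
             operation: Callable[[sqlite3.Connection], dict]) -> dict:
    """Run `operation` at most once per key; replay the stored result afterwards."""
    try:
        with conn:  # commits on success, rolls back if anything raises
            # Claim the key first; the PRIMARY KEY constraint rejects duplicates.
            conn.execute("INSERT INTO idempotency_keys (key) VALUES (?)", (key,))
            result = operation(conn)
            conn.execute(
                "UPDATE idempotency_keys SET response = ? WHERE key = ?",
                (json.dumps(result), key),
            )
        return result
    except sqlite3.IntegrityError:
        # Key already claimed by an earlier, committed request: replay its response.
        row = conn.execute(
            "SELECT response FROM idempotency_keys WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0])
```

Because the claim and the operation share a transaction, a failed operation rolls the key back as well, so errors are not recorded and the client is free to retry.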

Rate Limiting

Protect APIs with distributed rate limiting using Redis.

| Rule | File | Key Pattern |
|------|------|-------------|
| Token Bucket | ${CLAUDE_SKILL_DIR}/rules/ratelimit-token-bucket.md | Redis Lua scripts, burst capacity, refill rate |
| Sliding Window | ${CLAUDE_SKILL_DIR}/rules/ratelimit-sliding-window.md | Sorted sets, precise counting, no boundary spikes |
| Distributed Limits | ${CLAUDE_SKILL_DIR}/rules/ratelimit-distributed.md | SlowAPI + Redis, tiered limits, response headers |
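The token bucket arithmetic itself is small enough to show in full. The sketch below is a single-process illustration of the algorithm only: a distributed limiter would keep the same two values (token count and last refill time) in Redis and update them atomically, for example in a Lua script. The class name and the injectable `clock` are assumptions for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """In-process token bucket; illustrates the algorithm, not the Redis version."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity          # maximum burst size
        self.refill_rate = refill_rate    # tokens added per second
        self._clock = clock
        self._tokens = float(capacity)    # start full so an initial burst is allowed
        self._last_refill = clock()

    def is_allowed(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last_refill
        # Refill lazily, based on time since the last call, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last_refill = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

With `capacity=100` and `refill_rate=10`, a client can burst 100 requests at once and then sustain 10 per second.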

Edge Computing

Edge runtime patterns for Cloudflare Workers, Vercel Edge, and Deno Deploy.

| Rule | File | Key Pattern |
|------|------|-------------|
| Edge Workers | ${CLAUDE_SKILL_DIR}/rules/edge-workers.md | V8 isolate constraints, Web APIs, geo-routing, auth at edge |
| Edge Caching | ${CLAUDE_SKILL_DIR}/rules/edge-caching.md | Cache-aside at edge, CDN headers, KV storage, stale-while-revalidate |

Event-Driven

Event sourcing, CQRS, saga orchestration, and reliable messaging patterns.

| Rule | File | Key Pattern |
|------|------|-------------|
| Event Sourcing | ${CLAUDE_SKILL_DIR}/rules/event-sourcing.md | Event-sourced aggregates, CQRS read models, optimistic concurrency |
| Event Messaging | ${CLAUDE_SKILL_DIR}/rules/event-messaging.md | Transactional outbox, saga compensation, idempotent consumers |
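A transactional outbox can be sketched with `sqlite3` as follows: the business row and the event row are written in one transaction, and a separate relay step publishes unpublished events and marks them. The `orders` and `outbox` tables, the `OrderPlaced` event name, and the `publish` callback are all hypothetical, chosen only to keep the example self-contained.

```python
import json
import sqlite3
from typing import Callable


def init_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS orders (
            id TEXT PRIMARY KEY,
            total_cents INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def place_order(conn: sqlite3.Connection, order_id: str, total_cents: int) -> None:
    # Both inserts commit together or not at all, so no event exists without its order.
    with conn:
        conn.execute("INSERT INTO orders (id, total_cents) VALUES (?, ?)",
                     (order_id, total_cents))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "total_cents": total_cents})),
        )


def relay_once(conn: sqlite3.Connection, publish: Callable[[str, dict], None]) -> int:
    """Publish pending events in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        # Mark only after a successful publish; a crash here means a re-send later.
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Delivery in this sketch is at-least-once: an event may be published twice if the relay stops between publishing and marking, which is why the pattern is paired with idempotent consumers.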

Key Decisions

| Decision | Recommendation |
|----------|----------------|
| Lock backend | Redis for speed, PostgreSQL if already using it, Redlock for HA |
| Lock TTL | 2-3x expected operation time |
| Circuit breaker recovery | Half-open probe with sliding window |
| Retry algorithm | Exponential backoff + full jitter |
| Bulkhead isolation | Semaphore-based tiers (Critical/Standard/Optional) |
| Idempotency storage | Redis (speed) + DB (durability), 24-72h TTL |
| Rate limit algorithm | Token bucket for most APIs, sliding window for strict quotas |
| Rate limit storage | Redis (distributed, atomic Lua scripts) |

When NOT to Use

There are no separate event-sourcing, saga, or CQRS skills; they are rules within distributed-systems. Most projects never need them.

| Pattern | Interview | Hackathon | MVP | Growth | Enterprise | Simpler Alternative |
|---------|-----------|-----------|-----|--------|------------|---------------------|
| Event sourcing | OVERKILL | OVERKILL | OVERKILL | OVERKILL | WHEN JUSTIFIED | Append-only table with status column |
| Saga orchestration | OVERKILL | OVERKILL | OVERKILL | SELECTIVE | APPROPRIATE | Sequential service calls with manual rollback |
| Circuit breaker | OVERKILL | OVERKILL | BORDERLINE | APPROPRIATE | REQUIRED | Try/except with timeout |
| Distributed locks | OVERKILL | OVERKILL | BORDERLINE | APPROPRIATE | REQUIRED | Database row-level lock (SELECT FOR UPDATE) |
| CQRS | OVERKILL | OVERKILL | OVERKILL | OVERKILL | WHEN JUSTIFIED | Single model for read/write |
| Transactional outbox | OVERKILL | OVERKILL | OVERKILL | SELECTIVE | APPROPRIATE | Direct publish after commit |
| Rate limiting | OVERKILL | OVERKILL | SIMPLE ONLY | APPROPRIATE | REQUIRED | Nginx rate limit or cloud WAF |

Rule of thumb: If you have a single server process, you do not need distributed systems patterns. Use in-process alternatives. Add distribution only when you actually have multiple instances.

Anti-Patterns (FORBIDDEN)

LOCKS: Never forget TTL (causes deadlocks)

await redis.set(f"lock:{name}", "1") # WRONG - no expiry!

LOCKS: Never release without owner check

await redis.delete(f"lock:{name}") # WRONG - might release others' lock
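The two lock anti-patterns above can be avoided roughly as follows: acquire with `SET ... NX PX` so the key always carries an expiry, store a random owner token as the value, and release through a Lua compare-and-delete. This sketch assumes `redis` is a `redis.asyncio.Redis` client (or anything with compatible `set` and `eval` methods); the function names `acquire_lock` and `release_lock` are hypothetical.

```python
import uuid
from typing import Any, Optional

# Compare-and-delete: remove the key only if it still holds our owner token.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


async def acquire_lock(redis: Any, name: str, ttl_ms: int = 30_000) -> Optional[str]:
    """Try to take the lock; return the owner token on success, None otherwise."""
    token = uuid.uuid4().hex
    # nx=True: set only if the key is absent. px=ttl_ms: always attach an expiry.
    acquired = await redis.set(f"lock:{name}", token, nx=True, px=ttl_ms)
    return token if acquired else None


async def release_lock(redis: Any, name: str, token: str) -> bool:
    """Release the lock only if `token` still owns it."""
    deleted = await redis.eval(RELEASE_SCRIPT, 1, f"lock:{name}", token)
    return deleted == 1
```

Running the check and the delete inside one script matters because a separate GET followed by DEL leaves a window in which the lock can expire and be taken by another instance.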

RESILIENCE: Never retry non-retryable errors

@retry(max_attempts=5, retryable_exceptions={Exception}) # Retries 401!
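A corrected version might classify errors first and retry only the transient ones, sleeping with exponential backoff and full jitter between attempts. The helper names `backoff_delay` and `retry_async`, the default set of retryable exceptions, and the injectable `sleep` are assumptions for this sketch; `PermissionError` in the usage note merely stands in for a 401 response.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 30.0) -> float:
    """Full jitter: uniform random delay in [0, min(max_delay, base_delay * 2**attempt)]."""
    return random.uniform(0.0, min(max_delay, base_delay * (2 ** attempt)))


async def retry_async(
    fn: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    retryable: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry `fn` only on exceptions listed in `retryable`; others propagate at once."""
    for attempt in range(max_attempts):
        try:
            return await fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last transient error
            await sleep(backoff_delay(attempt, base_delay))
    raise ValueError("max_attempts must be at least 1")
```

With these defaults a `ConnectionError` is retried up to three times, while an authentication failure raised as `PermissionError` reaches the caller after a single attempt.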

RESILIENCE: Never put retry outside circuit breaker

@retry  # Would retry when circuit is open!
@circuit_breaker
async def call(): ...

IDEMPOTENCY: Never use non-deterministic keys

key = str(uuid.uuid4()) # Different every time!
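By contrast, a key derived from the request content comes out the same on every retry. A minimal sketch, assuming the key is built from a user ID, an operation name, and a JSON-serializable payload (the function name and its parameters are hypothetical):

```python
import hashlib
import json
from typing import Any, Mapping


def idempotency_key(user_id: str, operation: str, payload: Mapping[str, Any]) -> str:
    """Derive a stable key: the same logical request always hashes to the same value."""
    # sort_keys and fixed separators make the JSON independent of dict ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    material = f"{user_id}:{operation}:{canonical}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

A client-generated UUID can also work, Stripe-style, provided it is generated once per logical operation and reused on every retry rather than regenerated per attempt.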

IDEMPOTENCY: Never cache error responses

if response.status_code >= 400:
    await cache_response(key, response)  # Errors should retry!

RATE LIMITING: Never use in-memory counters in distributed systems

request_counts = {} # Lost on restart, not shared across instances

Detailed Documentation

| Resource | Description |
|----------|-------------|
| ${CLAUDE_SKILL_DIR}/scripts/ | Templates: lock implementations, circuit breaker, rate limiter |
| ${CLAUDE_SKILL_DIR}/checklists/ | Pre-flight checklists for each pattern category |
| ${CLAUDE_SKILL_DIR}/references/ | Deep dives: Redlock algorithm, bulkhead tiers, token bucket |
| ${CLAUDE_SKILL_DIR}/examples/ | Complete integration examples |

Related Skills

caching: Redis caching patterns, cache as fallback
background-jobs: Job deduplication, async processing with retry
observability-monitoring: Metrics and alerting for circuit breaker state changes
error-handling-rfc9457: Structured error responses for resilience failures
auth-patterns: API key management, authentication integration
