Error Handling Patterns
Build resilient applications with robust error handling strategies that gracefully handle failures and provide excellent debugging experiences.
When to Use This Skill
- Implementing error handling in new features
- Designing error-resilient APIs
- Debugging production issues
- Improving application reliability
- Creating better error messages for users and developers
- Implementing retry and circuit breaker patterns
- Handling async/concurrent errors
- Building fault-tolerant distributed systems
Core Concepts
Error Handling Philosophies
Exceptions vs Result Types:
- Exceptions: Traditional try-catch; disrupts normal control flow
- Result Types: Explicit success/failure values; a functional approach
- Error Codes: C-style return values; requires discipline to check every call
- Option/Maybe Types: For values that may be absent
When to Use Each:
- Exceptions: Unexpected errors, exceptional conditions
- Result Types: Expected errors, validation failures
- Panics/Crashes: Unrecoverable errors, programming bugs
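The difference between raising and returning a failure can be sketched in Python. This is a minimal illustration, not a library API: `Ok`, `Err`, `Result`, `parse_age`, and `describe` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# Hypothetical alias: a Result is either a success or a failure value.
Result = Union[Ok[T], Err[E]]


def parse_age(raw: str) -> Result[int, str]:
    """Expected failures (bad input) are returned, not raised."""
    try:
        age = int(raw)
    except ValueError:
        return Err(f"not a number: {raw!r}")
    if age < 0:
        return Err("age must be non-negative")
    return Ok(age)


def describe(result: Result[int, str]) -> str:
    # The caller has to inspect the result before touching the value.
    if isinstance(result, Ok):
        return f"age={result.value}"
    return f"invalid: {result.error}"
```

The signature of `parse_age` tells the caller that failure is a normal outcome, which is the main argument for Result types in validation code.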
Error Categories
Recoverable Errors:
- Network timeouts
- Missing files
- Invalid user input
- API rate limits
Unrecoverable Errors:
- Out of memory
- Stack overflow
- Programming bugs (null pointer, etc.)
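One common way to act on this split is to retry only the recoverable errors and let everything else propagate. The sketch below assumes that `TimeoutError` and `ConnectionError` are the recoverable set; `retry` and its parameters are hypothetical names, and the injectable `sleep` exists only to make the function easy to test.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    recoverable: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry func on recoverable errors; anything else propagates at once."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except recoverable:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A programming bug such as a `KeyError` is not in the recoverable set, so it fails on the first attempt instead of being retried.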
Language-Specific Patterns
For detailed code examples in Python, TypeScript, Rust, and Go, see: 👉 examples/language-patterns.md
Universal Patterns
Pattern 1: Circuit Breaker
Prevent cascading failures in distributed systems.
    from datetime import datetime, timedelta
    from enum import Enum
    from typing import Callable, Optional, TypeVar

    T = TypeVar('T')

    class CircuitState(Enum):
        CLOSED = "closed"        # Normal operation
        OPEN = "open"            # Failing, reject requests
        HALF_OPEN = "half_open"  # Testing if recovered

    class CircuitBreaker:
        def __init__(
            self,
            failure_threshold: int = 5,
            timeout: timedelta = timedelta(seconds=60),
            success_threshold: int = 2
        ):
            self.failure_threshold = failure_threshold
            self.timeout = timeout
            self.success_threshold = success_threshold
            self.failure_count = 0
            self.success_count = 0
            self.state = CircuitState.CLOSED
            self.last_failure_time: Optional[datetime] = None

        def call(self, func: Callable[[], T]) -> T:
            if self.state == CircuitState.OPEN:
                # Once the timeout has elapsed, let trial requests through
                if datetime.now() - self.last_failure_time > self.timeout:
                    self.state = CircuitState.HALF_OPEN
                    self.success_count = 0
                else:
                    raise Exception("Circuit breaker is OPEN")
            try:
                result = func()
                self.on_success()
                return result
            except Exception:
                self.on_failure()
                raise

        def on_success(self):
            self.failure_count = 0
            if self.state == CircuitState.HALF_OPEN:
                self.success_count += 1
                if self.success_count >= self.success_threshold:
                    self.state = CircuitState.CLOSED
                    self.success_count = 0

        def on_failure(self):
            self.failure_count += 1
            self.last_failure_time = datetime.now()
            # A failed trial request in HALF_OPEN reopens the circuit immediately
            if (self.state == CircuitState.HALF_OPEN
                    or self.failure_count >= self.failure_threshold):
                self.state = CircuitState.OPEN

    # Usage (external_api stands in for your own client)
    circuit_breaker = CircuitBreaker()

    def fetch_data():
        return circuit_breaker.call(lambda: external_api.get_data())
Pattern 2: Error Aggregation
Collect multiple errors instead of failing on the first one.
    class ErrorCollector {
      private errors: Error[] = [];

      add(error: Error): void {
        this.errors.push(error);
      }

      hasErrors(): boolean {
        return this.errors.length > 0;
      }

      getErrors(): Error[] {
        return [...this.errors];
      }

      // Throws a single error as-is, or wraps several in an AggregateError
      throw(): never {
        if (this.errors.length === 1) {
          throw this.errors[0];
        }
        throw new AggregateError(
          this.errors,
          `${this.errors.length} errors occurred`,
        );
      }
    }

    // Usage: Validate multiple fields
    // (ValidationError, isValidEmail, and User are assumed to be defined elsewhere)
    function validateUser(data: any): User {
      const errors = new ErrorCollector();

      if (!data.email) {
        errors.add(new ValidationError("Email is required"));
      } else if (!isValidEmail(data.email)) {
        errors.add(new ValidationError("Email is invalid"));
      }

      if (!data.name || data.name.length < 2) {
        errors.add(new ValidationError("Name must be at least 2 characters"));
      }

      if (!data.age || data.age < 18) {
        errors.add(new ValidationError("Age must be 18 or older"));
      }

      if (errors.hasErrors()) {
        errors.throw();
      }

      return data as User;
    }
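For readers working in Python, the same aggregation idea might look like the sketch below. `ValidationErrors` and `validate_user` are hypothetical names, the rules mirror the TypeScript version, and the email check is deliberately naive.

```python
from typing import Any, Dict, List


class ValidationErrors(Exception):
    """Carries every validation failure found, not just the first."""

    def __init__(self, messages: List[str]):
        summary = f"{len(messages)} validation error(s): " + "; ".join(messages)
        super().__init__(summary)
        self.messages = messages


def validate_user(data: Dict[str, Any]) -> Dict[str, Any]:
    messages: List[str] = []

    email = data.get("email")
    if not email:
        messages.append("Email is required")
    elif "@" not in email:  # Naive check, for illustration only
        messages.append("Email is invalid")

    name = data.get("name")
    if not name or len(name) < 2:
        messages.append("Name must be at least 2 characters")

    age = data.get("age")
    if age is None or age < 18:
        messages.append("Age must be 18 or older")

    # Raise once, after every field has been checked
    if messages:
        raise ValidationErrors(messages)
    return data
```

Reporting all failures in one exception lets a form or API client fix every problem in a single round trip.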
Pattern 3: Graceful Degradation
Provide fallback functionality when errors occur.
    import logging
    from typing import Callable, Optional, TypeVar

    logger = logging.getLogger(__name__)

    T = TypeVar('T')

    def with_fallback(
        primary: Callable[[], T],
        fallback: Callable[[], T],
        log_error: bool = True
    ) -> T:
        """Try primary function, fall back to fallback on error."""
        try:
            return primary()
        except Exception as e:
            if log_error:
                logger.error(f"Primary function failed: {e}")
            return fallback()

    # Usage
    def get_user_profile(user_id: str) -> UserProfile:
        return with_fallback(
            primary=lambda: fetch_from_cache(user_id),
            fallback=lambda: fetch_from_database(user_id)
        )

    # Multiple fallbacks
    def get_exchange_rate(currency: str) -> float:
        # `or` moves to the next source when one raises or returns a falsy value
        return (
            try_function(lambda: api_provider_1.get_rate(currency))
            or try_function(lambda: api_provider_2.get_rate(currency))
            or try_function(lambda: cache.get_rate(currency))
            or DEFAULT_RATE
        )

    def try_function(func: Callable[[], Optional[T]]) -> Optional[T]:
        try:
            return func()
        except Exception:
            return None
Best Practices
- Fail Fast: Validate input early, fail quickly
- Preserve Context: Include stack traces, metadata, timestamps
- Meaningful Messages: Explain what happened and how to fix it
- Log Appropriately: Log unexpected errors; don't flood the logs with expected failures
- Handle at Right Level: Catch errors where you can meaningfully handle them
- Clean Up Resources: Use try-finally, context managers, defer
- Don't Swallow Errors: Log or re-throw, don't silently ignore
- Type-Safe Errors: Use typed errors when possible
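Two of these practices, cleaning up resources and preserving context, can be shown together in a short sketch. `ConfigError`, `scratch_file`, and `load_port` are hypothetical names chosen for illustration; only the standard library is used.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


class ConfigError(Exception):
    """Hypothetical domain error for this sketch."""


@contextmanager
def scratch_file() -> Iterator[str]:
    """Yield a temp file path and delete the file even if the body raises."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        yield path
    finally:
        os.remove(path)  # Runs on success and on error


def load_port(text: str) -> int:
    try:
        return int(text)
    except ValueError as exc:
        # `from exc` keeps the original error as __cause__ in the traceback
        raise ConfigError(f"port must be an integer, got {text!r}") from exc
```

The `finally` clause guarantees cleanup, while `raise ... from` lets the caller see both the domain-level message and the low-level cause.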
    # Good error handling example
    def process_order(order_id: str) -> Order:
        """Process order with comprehensive error handling."""
        try:
            # Validate input
            if not order_id:
                raise ValidationError("Order ID is required")

            # Fetch order
            order = db.get_order(order_id)
            if not order:
                raise NotFoundError("Order", order_id)

            # Process payment
            try:
                payment_result = payment_service.charge(order.total)
            except PaymentServiceError as e:
                # Log and wrap external service error
                logger.error(f"Payment failed for order {order_id}: {e}")
                raise ExternalServiceError(
                    "Payment processing failed",
                    service="payment_service",
                    details={"order_id": order_id, "amount": order.total}
                ) from e

            # Update order
            order.status = "completed"
            order.payment_id = payment_result.id
            db.save(order)

            return order

        except ApplicationError:
            # Re-raise known application errors
            raise
        except Exception as e:
            # Log unexpected errors
            logger.exception(f"Unexpected error processing order {order_id}")
            raise ApplicationError(
                "Order processing failed",
                code="INTERNAL_ERROR"
            ) from e
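The example relies on an exception hierarchy that it does not define. One way it could look is sketched below. The class names and constructor calls are taken from the example; the stored fields, default codes, and message format are assumptions.

```python
from typing import Any, Dict, Optional


class ApplicationError(Exception):
    """Base class, so `except ApplicationError` catches every known error."""

    def __init__(
        self,
        message: str,
        code: str = "APPLICATION_ERROR",  # Assumed default code
        details: Optional[Dict[str, Any]] = None,
    ):
        super().__init__(message)
        self.message = message
        self.code = code
        self.details = details or {}


class ValidationError(ApplicationError):
    def __init__(self, message: str):
        super().__init__(message, code="VALIDATION_ERROR")


class NotFoundError(ApplicationError):
    def __init__(self, resource: str, resource_id: str):
        super().__init__(
            f"{resource} not found: {resource_id}",
            code="NOT_FOUND",
            details={"resource": resource, "id": resource_id},
        )


class ExternalServiceError(ApplicationError):
    def __init__(
        self,
        message: str,
        service: str,
        details: Optional[Dict[str, Any]] = None,
    ):
        super().__init__(message, code="EXTERNAL_SERVICE_ERROR", details=details)
        self.service = service
```

Because every class derives from `ApplicationError`, the `except ApplicationError: raise` branch passes known errors through unchanged, and only truly unexpected exceptions are wrapped as `INTERNAL_ERROR`.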
Common Pitfalls
- Catching Too Broadly: A blanket except Exception hides bugs
- Empty Catch Blocks: Silently swallowing errors
- Logging and Re-throwing: Doing both at every layer creates duplicate log entries
- Not Cleaning Up: Forgetting to close files, connections
- Poor Error Messages: "Error occurred" is not helpful
- Returning Error Codes: Prefer exceptions or Result types
- Ignoring Async Errors: Unhandled promise rejections
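The async pitfall applies to Python's asyncio as much as to JavaScript promises. A minimal sketch of collecting every outcome from concurrent tasks follows; `fetch` and `fetch_all` are hypothetical stand-ins for real I/O.

```python
import asyncio
from typing import List, Tuple


async def fetch(name: str, fail: bool = False) -> str:
    """Stand-in for real I/O; fails on demand."""
    await asyncio.sleep(0)
    if fail:
        raise ConnectionError(f"{name} unreachable")
    return f"{name}: ok"


async def fetch_all() -> Tuple[List[str], List[BaseException]]:
    # return_exceptions=True hands errors back as values, so one failing
    # task neither hides the other results nor goes unobserved.
    results = await asyncio.gather(
        fetch("users"),
        fetch("orders", fail=True),
        fetch("stock"),
        return_exceptions=True,
    )
    successes = [r for r in results if not isinstance(r, BaseException)]
    failures = [r for r in results if isinstance(r, BaseException)]
    return successes, failures
```

Calling `asyncio.run(fetch_all())` yields the successful results and the captured exceptions separately, leaving the caller to decide whether to log, retry, or re-raise.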
Resources
- references/exception-hierarchy-design.md: Designing error class hierarchies
- references/error-recovery-strategies.md: Recovery patterns for different scenarios
- references/async-error-handling.md: Handling errors in concurrent code
- assets/error-handling-checklist.md: Review checklist for error handling
- assets/error-message-guide.md: Writing helpful error messages
- scripts/error-analyzer.py: Analyze error patterns in logs