Error Handling

Overview

Standardize error handling across a codebase by implementing a unified error taxonomy, stable error codes, proper propagation chains, and structured logging. The core principle: every error must be categorized, coded, wrapped with context, and split into a safe user-facing message and a detailed internal log entry.

When to use

When adding error handling to an API or service
When implementing custom error/exception classes
When standardizing how errors propagate through layers
When designing error response formats for an API
When the user mentions "error handling", "exceptions", "error codes", or "error responses"
When adding structured logging for errors with correlation IDs

Do NOT use when:

The user needs to debug a specific runtime error (use a debugging skill)
The user wants monitoring/alerting setup (use a monitoring skill)
The task is only about input validation logic, not error handling patterns

Workflow

1. Audit existing error handling

Scan the codebase for current error patterns:

Search for try/catch, try/except, .catch(), rescue, error middleware
Identify swallowed exceptions (empty catch blocks, catch-and-ignore)
Find bare throws/raises without context wrapping
Note inconsistent error response formats across endpoints
Check for leaked internal details in user-facing responses

Output: A list of error handling gaps and inconsistencies.

2. Define the error taxonomy

Create error categories that map to HTTP status codes (for APIs) or exit conditions (for services). Every error in the system must belong to exactly one category.

Category	HTTP Status	Error Code Prefix	Description
`validation`	400	`ERR_VALIDATION_`	Invalid input, malformed request
`authentication`	401	`ERR_AUTH_`	Missing or invalid credentials
`authorization`	403	`ERR_FORBIDDEN_`	Valid credentials but insufficient access
`not_found`	404	`ERR_NOT_FOUND_`	Requested resource does not exist
`conflict`	409	`ERR_CONFLICT_`	State conflict (duplicate, version mismatch)
`rate_limit`	429	`ERR_RATE_LIMIT_`	Too many requests
`internal`	500	`ERR_INTERNAL_`	Unexpected server error
`service_unavailable`	503	`ERR_UPSTREAM_`	Dependency failure (DB, external API)

3. Implement error class hierarchy

Create a base error class and category-specific subclasses. Every error class must carry:

code — Stable string code clients can match on (e.g., ERR_USER_NOT_FOUND)
message — Safe, user-facing message (no stack traces, no internal paths)
statusCode — HTTP status code for API responses
category — Error taxonomy category
details — Optional structured data (field validation errors, constraints)
cause — Original error for propagation chain (preserves stack trace)

TypeScript example:

export class AppError extends Error {
  public readonly code: string;
  public readonly statusCode: number;
  public readonly category: string;
  public readonly details?: Record<string, unknown>;
  public readonly isOperational: boolean;

  constructor(params: {
    code: string;
    message: string;
    statusCode: number;
    category: string;
    details?: Record<string, unknown>;
    cause?: Error;
    isOperational?: boolean;
  }) {
    super(params.message, { cause: params.cause });
    this.code = params.code;
    this.statusCode = params.statusCode;
    this.category = params.category;
    this.details = params.details;
    this.isOperational = params.cause !== undefined
      ? true
      : (params.isOperational ?? true);
    this.name = this.constructor.name;
    Error.captureStackTrace(this, this.constructor);
  }
}

export class ValidationError extends AppError {
  constructor(
    message: string,
    details?: Record<string, unknown>,
    cause?: Error,
  ) {
    super({
      code: "ERR_VALIDATION",
      message,
      statusCode: 400,
      category: "validation",
      details,
      cause,
    });
  }
}

export class NotFoundError extends AppError {
  constructor(resource: string, identifier: string, cause?: Error) {
    super({
      code: `ERR_NOT_FOUND_${resource.toUpperCase()}`,
      message: `${resource} not found`,
      statusCode: 404,
      category: "not_found",
      details: { resource, identifier },
      cause,
    });
  }
}

export class ConflictError extends AppError {
  constructor(
    message: string,
    details?: Record<string, unknown>,
    cause?: Error,
  ) {
    super({
      code: "ERR_CONFLICT",
      message,
      statusCode: 409,
      category: "conflict",
      details,
      cause,
    });
  }
}

export class AuthenticationError extends AppError {
  constructor(message = "Authentication required", cause?: Error) {
    super({
      code: "ERR_AUTH_INVALID",
      message,
      statusCode: 401,
      category: "authentication",
      cause,
    });
  }
}

export class AuthorizationError extends AppError {
  constructor(message = "Insufficient permissions", cause?: Error) {
    super({
      code: "ERR_FORBIDDEN",
      message,
      statusCode: 403,
      category: "authorization",
      cause,
    });
  }
}

export class RateLimitError extends AppError {
  constructor(retryAfterSeconds?: number, cause?: Error) {
    super({
      code: "ERR_RATE_LIMIT",
      message: "Too many requests",
      statusCode: 429,
      category: "rate_limit",
      details: retryAfterSeconds
        ? { retryAfter: retryAfterSeconds }
        : undefined,
      cause,
    });
  }
}

export class InternalError extends AppError {
  constructor(message: string, cause?: Error) {
    super({
      code: "ERR_INTERNAL",
      message: "An unexpected error occurred",
      statusCode: 500,
      category: "internal",
      cause,
      isOperational: false,
    });
  }
}

Python equivalent:

class AppError(Exception):
    def __init__(
        self,
        code: str,
        message: str,
        status_code: int,
        category: str,
        details: dict | None = None,
        cause: Exception | None = None,
        is_operational: bool = True,
    ):
        super().__init__(message)
        self.code = code
        self.message = message
        self.status_code = status_code
        self.category = category
        self.details = details or {}
        self.is_operational = is_operational
        self.__cause__ = cause

class NotFoundError(AppError):
    def __init__(self, resource: str, identifier: str, cause: Exception | None = None):
        super().__init__(
            code=f"ERR_NOT_FOUND_{resource.upper()}",
            message=f"{resource} not found",
            status_code=404,
            category="not_found",
            details={"resource": resource, "identifier": identifier},
            cause=cause,
        )

class ValidationError(AppError):
    def __init__(self, message: str, details: dict | None = None, cause: Exception | None = None):
        super().__init__(
            code="ERR_VALIDATION",
            message=message,
            status_code=400,
            category="validation",
            details=details,
            cause=cause,
        )

4. Implement error propagation rules

Never swallow exceptions. Every catch block must either:

Re-raise the error unchanged (if this layer can't add context)
Wrap the error in a domain-specific error with the original as cause
Handle the error completely (log it, return a response, trigger recovery)

Always preserve the chain. When wrapping, pass the original error as cause so the full stack trace is available in logs:

// WRONG: swallows the original error
try {
  await db.query(sql);
} catch (err) {
  throw new Error("Database query failed"); // original error lost
}

// RIGHT: wraps with context, preserves cause
try {
  await db.query(sql);
} catch (err) {
  throw new InternalError("Database query failed", err as Error);
}

5. Separate user-facing from internal errors

User-facing response — safe, minimal, actionable:

{
  "error": {
    "code": "ERR_NOT_FOUND_USER",
    "message": "User not found",
    "details": {
      "resource": "user",
      "identifier": "usr_abc123"
    }
  },
  "requestId": "req_7f3a2b1c"
}

Internal log entry — full diagnostic context:

{
  "level": "error",
  "code": "ERR_NOT_FOUND_USER",
  "message": "User not found",
  "category": "not_found",
  "correlationId": "req_7f3a2b1c",
  "userId": "usr_abc123",
  "path": "/api/users/usr_abc123",
  "method": "GET",
  "stack": "NotFoundError: User not found\n    at UserService.getById ...",
  "cause": "MongoError: connection refused at 10.0.0.5:27017",
  "timestamp": "2026-03-06T10:15:32.456Z",
  "service": "user-api",
  "environment": "production"
}

Rules:

Never expose stack traces, file paths, or database errors to users
Never expose internal service names or infrastructure details
Always include the error code in both user response and log
Always include a requestId / correlationId in both
Log the full causal chain internally; show only the top-level message to users
For InternalError (500), always use a generic message: "An unexpected error occurred"

6. Implement error response middleware

Centralize error-to-response conversion in middleware (Express) or exception handlers (FastAPI, Django). This is the single place where errors become HTTP responses.

// Express error middleware
function errorHandler(
  err: Error,
  req: Request,
  res: Response,
  next: NextFunction,
) {
  const correlationId = req.headers["x-request-id"] || crypto.randomUUID();

  if (err instanceof AppError) {
    // Operational error — expected, safe to expose
    logger.error({
      code: err.code,
      message: err.message,
      category: err.category,
      correlationId,
      path: req.path,
      method: req.method,
      stack: err.stack,
      cause: err.cause?.message,
    });

    return res.status(err.statusCode).json({
      error: {
        code: err.code,
        message: err.message,
        ...(err.details && { details: err.details }),
      },
      requestId: correlationId,
    });
  }

  // Unexpected error — do NOT expose details
  logger.error({
    code: "ERR_INTERNAL",
    message: err.message,
    category: "internal",
    correlationId,
    path: req.path,
    method: req.method,
    stack: err.stack,
  });

  return res.status(500).json({
    error: {
      code: "ERR_INTERNAL",
      message: "An unexpected error occurred",
    },
    requestId: correlationId,
  });
}

7. Add structured error logging

Every error log entry must include:

Field	Required	Description
`level`	Yes	`error`, `warn`, or `fatal`
`code`	Yes	Stable error code (e.g., `ERR_NOT_FOUND_USER`)
`message`	Yes	Human-readable description
`category`	Yes	Error taxonomy category
`correlationId`	Yes	Request ID for tracing across services
`path`	Yes	Request path or operation name
`method`	Yes	HTTP method or operation type
`stack`	Yes	Full stack trace
`cause`	No	Original error message if wrapped
`timestamp`	Yes	ISO 8601 timestamp
`service`	Yes	Service name for multi-service architectures
`userId`	No	Authenticated user ID if available
`details`	No	Structured error details (validation fields)

Checklist

Error Response Schema

All API error responses must follow this schema:

{
  "error": {
    "code": "ERR_VALIDATION",
    "message": "Invalid email format",
    "details": {
      "field": "email",
      "constraint": "Must be a valid email address",
      "received": "not-an-email"
    }
  },
  "requestId": "req_7f3a2b1c"
}

Field	Type	Required	Description
`error.code`	string	Yes	Stable error code for programmatic matching
`error.message`	string	Yes	Human-readable description, safe for users
`error.details`	object	No	Structured context (validation fields, etc.)
`requestId`	string	Yes	Correlation ID for support and debugging

Common mistakes

Mistake	Fix
Swallowing exceptions in empty catch blocks	Every catch must re-raise, wrap, or fully handle. Log at minimum.
Leaking stack traces to API consumers	Error middleware must strip internals. Only expose code + message + safe details.
Using HTTP status codes as error codes	Status codes are transport-level. Use stable string codes (ERR_USER_NOT_FOUND) for programmatic use.
Inconsistent error response format	Centralize in error middleware. Every error response uses the same JSON schema.
Throwing raw strings instead of error objects	Always throw typed error instances with code, category, and cause chain.
Missing correlation IDs	Generate a request ID at the edge (middleware/gateway) and propagate through all layers and logs.
Logging user-facing message only	Internal logs must include stack trace, cause chain, request context, and correlation ID.
Different error formats per endpoint	One error middleware, one response schema. Endpoints throw typed errors; middleware formats responses.
Catching too broadly (catch Exception)	Catch specific error types when possible. Use broad catch only at the top-level error boundary.
Not distinguishing operational vs programmer errors	Operational errors (bad input, not found) are expected. Programmer errors (null deref) need alerts.

Key principles

Every error gets a stable code — HTTP status codes change meaning across contexts. String error codes like ERR_USER_NOT_FOUND are stable contracts that clients, monitoring, and documentation can rely on. Never use numeric codes alone.
Never swallow, always wrap — Empty catch blocks hide bugs. Every caught error must be re-raised, wrapped with domain context (preserving the original as cause), or fully handled. The causal chain must survive from origin to log.
User-facing and internal are separate concerns — Users see a safe message and an error code. Logs see the full stack trace, causal chain, correlation ID, and request context. The error middleware is the boundary between these two worlds.
Correlation IDs connect everything — A single request ID generated at the edge must appear in the HTTP response, every log entry, and any downstream service calls. Without this, debugging production errors across services is impossible.
Centralize the error boundary — One error middleware, one response schema, one logging format. Individual endpoints throw typed errors; they never format error responses directly. This eliminates inconsistency.

error-handling

Safety Notice

Copy this and send it to your AI assistant to learn