Performance Optimization

Performance work follows one rule above all others: measure before you change anything. Intuition about bottlenecks is wrong more often than it is right. Every optimization should start with profiling, produce a hypothesis, apply a targeted fix, and verify with another measurement.

Core Principles

| Principle | Meaning |
| --- | --- |
| Measure first | Never optimize without profiling data -- gut feelings about bottlenecks are unreliable |
| Optimize the critical path | Focus on the code that runs most frequently or blocks user-visible latency |
| Set budgets | Define acceptable latency, throughput, and resource usage before you start |
| Avoid premature optimization | Readable, correct code first -- optimize only when measurements show a real problem |
| Know your tradeoffs | Every optimization trades something (memory for speed, complexity for throughput, freshness for latency) |

Profiling and Benchmarking

Profiling identifies where time and resources are spent. Without it, you are guessing.

Types of Profiling

| Type | What It Reveals | When to Use |
| --- | --- | --- |
| CPU profiling | Hot functions, call frequency, execution time distribution | Slow request handling, high CPU usage |
| Memory profiling | Allocation rates, heap size, object retention, leaks | Growing memory usage, OOM errors, GC pressure |
| I/O profiling | Disk reads/writes, network calls, blocking waits | Slow file operations, external service latency |
| Database profiling | Query execution time, query count per request, slow queries | High DB load, N+1 patterns, missing indexes |

The Profiling Workflow

Baseline -- Capture metrics under normal conditions before any changes
Identify -- Find the hotspot consuming the most time or resources
Hypothesize -- Form a specific theory about why it is slow
Fix -- Apply a single, targeted change
Verify -- Measure again to confirm improvement and check for regressions
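The workflow above can be sketched with the standard library alone. This is a minimal illustration, not a prescribed setup: `slow_sum`, `fast_sum`, `measure`, and `profile` are hypothetical names, and the "fix" is only a hypothesis that the second measurement has to confirm or reject.

```python
import cProfile
import io
import pstats
import time


def slow_sum(n: int) -> int:
    """Hypothetical hotspot: builds a full list before summing."""
    return sum([i * i for i in range(n)])


def fast_sum(n: int) -> int:
    """Candidate fix (hypothesis: skipping the intermediate list helps)."""
    return sum(i * i for i in range(n))


def measure(func, *args, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def profile(func, *args) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


if __name__ == "__main__":
    baseline = measure(slow_sum, 200_000)   # Baseline
    print(profile(slow_sum, 200_000))       # Identify
    after = measure(fast_sum, 200_000)      # Fix, then Verify
    print(f"baseline={baseline:.4f}s after={after:.4f}s")
```

If the second number is not clearly better, the hypothesis was wrong and the change should be reverted rather than kept on faith.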

Performance Budgets

Define limits that trigger action when exceeded:

Response time: P50, P95, P99 latency targets per endpoint
Throughput: Minimum requests per second under expected load
Resource usage: CPU, memory, and connection limits per service
Page weight: Maximum transfer size for frontend assets
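A response-time budget can be checked mechanically once latency samples are available. The sketch below assumes a nearest-rank percentile definition and a hypothetical budget dictionary keyed by percentile name; the sample values and limits are invented for illustration.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def check_budget(samples_ms: list[float], budget_ms: dict[str, float]) -> dict:
    """Return the percentiles that exceed their limit as {name: (observed, limit)}."""
    violations = {}
    for name, limit in budget_ms.items():
        observed = percentile(samples_ms, float(name.lstrip("p")))
        if observed > limit:
            violations[name] = (observed, limit)
    return violations


# Hypothetical latency samples (milliseconds) and budget for one endpoint.
latencies_ms = [12, 15, 14, 13, 18, 22, 17, 16, 250, 19]
budget = {"p50": 20, "p95": 100, "p99": 300}
print(check_budget(latencies_ms, budget))  # {'p95': (250, 100)}
```

Here the median is well within budget while a single slow request breaks the P95 target, which is the kind of tail behavior an average would hide.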

See Profiling Patterns Reference for detailed profiling workflows, bottleneck signatures, and load testing strategies.

Caching Strategies

Caching eliminates redundant computation and data fetching by storing results closer to where they are needed.

Cache Layers

| Layer | Location | Latency | Use Case |
| --- | --- | --- | --- |
| L1 -- In-process | Application memory (object cache, memoization) | Nanoseconds | Hot data accessed many times per request |
| L2 -- Distributed | Redis, Memcached, shared cache | Sub-millisecond to low milliseconds | Data shared across application instances |
| HTTP cache | Browser, reverse proxy (Varnish, Nginx) | Zero network round-trip for client cache | Static assets, cacheable API responses |
| CDN | Edge servers worldwide | Low latency from geographic proximity | Static files, pre-rendered pages, media |
| Database cache | Query result cache, buffer pool | Varies | Repeated identical queries |

Invalidation Approaches

| Strategy | How It Works | Best For |
| --- | --- | --- |
| TTL-based | Cache entries expire after a fixed duration | Data that tolerates bounded staleness |
| Event-based | Cache is cleared when the source data changes | Data that must stay fresh after writes |
| Write-through | Writes update both the cache and the backing store simultaneously | Read-heavy workloads needing strong consistency |
| Write-behind | Writes update the cache immediately; backing store is updated asynchronously | High write throughput where eventual consistency is acceptable |
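As a rough illustration of the first two strategies, the following in-process cache combines TTL expiry with an explicit `invalidate` hook for event-based clearing. `TTLCache` and its methods are hypothetical names for this sketch; it is not thread-safe and has no size limit.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Minimal in-process cache with TTL expiry and explicit invalidation."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[Hashable, tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Event-based invalidation: call this when the source data changes."""
        self._entries.pop(key, None)
```

A write path would call `invalidate(key)` right after updating the backing store, while the TTL bounds staleness for any change that bypasses that path.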

Cache Stampede Prevention

When a popular cache key expires, many concurrent requests may all try to regenerate it at once, overwhelming the backend. Three approaches prevent this:

Locking -- Only one request regenerates; others wait or serve stale data
Probabilistic early recomputation -- Requests randomly refresh the cache before expiration, spreading regeneration over time
Request coalescing -- Duplicate in-flight requests are collapsed into a single backend call
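The locking approach can be sketched with one lock per key and a second cache check after the lock is acquired. `SingleFlightCache` is a hypothetical name, expiry is omitted for brevity, and the sketch only covers threads inside a single process; coordinating several application instances would need a distributed lock instead.

```python
import threading
import time
from typing import Any, Callable, Hashable


class SingleFlightCache:
    """Per-key locking: only one thread regenerates a missing key, the rest wait."""

    def __init__(self) -> None:
        self._values: dict[Hashable, Any] = {}
        self._locks: dict[Hashable, threading.Lock] = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, key: Hashable) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key: Hashable, regenerate: Callable[[], Any]) -> Any:
        if key in self._values:
            return self._values[key]
        with self._lock_for(key):
            # Re-check: another thread may have filled the key while we waited.
            if key in self._values:
                return self._values[key]
            value = regenerate()
            self._values[key] = value
            return value


backend_calls = []


def slow_regenerate() -> str:
    """Stand-in for an expensive backend call."""
    backend_calls.append(1)
    time.sleep(0.05)
    return "rendered-page"


cache = SingleFlightCache()
results = []
threads = [
    threading.Thread(target=lambda: results.append(cache.get("home", slow_regenerate)))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(backend_calls), len(results))  # 1 8
```

Eight concurrent requests produce a single backend call; the other seven block briefly and then read the freshly stored value.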

See Caching Strategies Reference for implementation patterns with multi-language examples.

Database Optimization

Database queries are among the most common performance bottlenecks in web applications.

Index Strategy

Create indexes on columns used in WHERE, JOIN, and ORDER BY clauses
Use composite indexes that match your most frequent query patterns (leftmost prefix rule)
Covering indexes include all columns a query needs, avoiding table lookups entirely
Monitor unused indexes -- they slow down writes without helping reads
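The effect of a composite, covering index can be observed with a query plan. The sketch below uses SQLite through Python's `sqlite3` module purely because it needs no server; the `orders` table, its columns, and the index name are invented, and other databases expose plans through their own `EXPLAIN` variants with different output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
    [(i % 100, "paid" if i % 2 else "open", i * 1.5) for i in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ? AND status = ?"


def plan(sql: str, params: tuple) -> str:
    """Return SQLite's query plan description for the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


before = plan(QUERY, (42, "paid"))

# The index leads with the equality-filtered columns (leftmost prefix) and
# also contains `total`, so the query can be answered from the index alone.
conn.execute(
    "CREATE INDEX idx_orders_customer_status_total ON orders (customer_id, status, total)"
)
after = plan(QUERY, (42, "paid"))

print("before:", before)  # typically a full table SCAN
print("after: ", after)   # typically a SEARCH using the covering index
```

The exact plan wording varies between SQLite versions, so treat the printed text as indicative rather than stable output.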

N+1 Query Prevention

The N+1 problem occurs when code fetches a list of N records, then issues one additional query per record to load related data. Instead of one or two queries, the code executes N+1.

Detection signals:

Query count scales linearly with result set size
Many nearly identical queries differing only in a single parameter
Profiler shows dozens or hundreds of queries for a single page load

Prevention strategies:

Eager loading (JOIN or separate batch query upfront)
Batch loading (collect IDs, fetch all related records in one query)
DataLoader pattern (automatic batching and deduplication within a request)
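The difference between the N+1 pattern and batch loading shows up directly in the query count. This sketch uses an in-memory SQLite database with made-up `authors` and `books` tables and a hypothetical `run` helper that counts queries.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors (id, name) VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edsger');
    INSERT INTO books (author_id, title) VALUES (1, 'A1'), (1, 'A2'), (2, 'G1');
""")

counter = {"queries": 0}


def run(sql: str, params=()) -> list:
    """Execute a query and count it, so both approaches can be compared."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def books_n_plus_one() -> dict:
    """One query for the authors, then one more query per author."""
    authors = run("SELECT id, name FROM authors ORDER BY id")
    return {
        name: [title for (title,) in run(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id", (author_id,))]
        for author_id, name in authors
    }


def books_batched() -> dict:
    """One query for the authors, one query for all of their books."""
    authors = run("SELECT id, name FROM authors ORDER BY id")
    ids = [author_id for author_id, _ in authors]
    placeholders = ",".join("?" * len(ids))
    rows = run(
        f"SELECT author_id, title FROM books WHERE author_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    by_author = defaultdict(list)
    for author_id, title in rows:
        by_author[author_id].append(title)
    return {name: by_author[author_id] for author_id, name in authors}
```

With three authors the first function issues four queries and the second issues two; the gap grows linearly with the number of authors, which matches the first detection signal listed above.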

Connection Pooling

Opening a database connection is expensive (TCP handshake, authentication, TLS negotiation). Connection pools maintain a set of reusable connections:

Size the pool based on expected concurrency -- too small causes queueing, too large overwhelms the database
Always return connections to the pool promptly -- leaked connections exhaust the pool
Set idle timeouts to reclaim unused connections
Use external poolers (like PgBouncer for PostgreSQL) when application-level pooling is insufficient
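A fixed-size pool can be approximated with a thread-safe queue, which makes the sizing and leak concerns above concrete. `ConnectionPool` is a hypothetical class for illustration; production code would normally rely on the pooling built into the database driver or framework, and idle timeouts are left out here.

```python
import queue
import sqlite3
from contextlib import contextmanager
from typing import Callable


class ConnectionPool:
    """Fixed-size pool: callers borrow a connection and always hand it back."""

    def __init__(self, factory: Callable[[], sqlite3.Connection], size: int,
                 acquire_timeout: float = 5.0):
        self._timeout = acquire_timeout
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # pay the connection cost once, up front

    @contextmanager
    def connection(self):
        # Blocks while the pool is exhausted; raises queue.Empty after the timeout.
        conn = self._idle.get(timeout=self._timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # returned even if the caller raised

    def idle_count(self) -> int:
        return self._idle.qsize()


pool = ConnectionPool(
    lambda: sqlite3.connect(":memory:", check_same_thread=False), size=2
)
with pool.connection() as conn:
    result = conn.execute("SELECT 1").fetchone()[0]
print(result, pool.idle_count())  # 1 2
```

The `try`/`finally` in `connection` is what prevents leaks: a connection goes back to the queue no matter how the `with` block exits.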

See Database Optimization Reference for query patterns, explain plan analysis, and multi-language examples.

Memory and Resource Management

Memory Optimization Patterns

| Pattern | Description |
| --- | --- |
| Object pooling | Reuse expensive objects instead of allocating and discarding them |
| Streaming | Process large datasets as streams instead of loading everything into memory |
| Lazy initialization | Defer creation of expensive objects until they are actually needed |
| Weak references | Hold references that do not prevent garbage collection |
| Buffer reuse | Allocate buffers once and reuse them across operations |
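To illustrate the streaming pattern from the table, the sketch below sums values from a line-oriented source with a generator, so only one line is held in memory at a time. The function names and the one-number-per-line input format are assumptions made for the example.

```python
import io
from typing import Iterable, Iterator, TextIO


def read_amounts(source: TextIO) -> Iterator[float]:
    """Yield one parsed value per non-empty line without loading the whole file."""
    for line in source:
        line = line.strip()
        if line:
            yield float(line)


def running_total(amounts: Iterable[float]) -> float:
    """Consume the stream incrementally; memory use does not grow with input size."""
    total = 0.0
    for amount in amounts:
        total += amount
    return total


# io.StringIO stands in for a large file opened with open(path).
data = io.StringIO("1.5\n2.5\n\n4.0\n")
total = running_total(read_amounts(data))
print(total)  # 8.0
```

The same shape works for any iterable source, such as a database cursor or a paginated API, as long as each stage consumes items one at a time.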

Lazy Loading

Lazy loading defers work until the result is actually needed. It reduces startup time and memory usage but adds complexity and can cause unexpected latency later.

Where lazy loading helps:

Loading related database records only when accessed
Initializing expensive service connections on first use
Loading UI components or assets only when they become visible

Where lazy loading hurts:

When the deferred work always happens anyway (just adds overhead)
When it moves latency from a predictable startup phase to unpredictable user interactions
When it creates N+1 query patterns (see Database Optimization above)
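Initializing an expensive dependency on first use can be sketched with `functools.cached_property` from the standard library. `ReportService` and its `client` attribute are hypothetical, and the dictionary stands in for a real connection object.

```python
from functools import cached_property


class ReportService:
    """Defers creating an expensive client until something actually needs it."""

    def __init__(self) -> None:
        self.connect_calls = 0  # instrumentation for the example only

    @cached_property
    def client(self) -> dict:
        # Runs on first access only; the result is then stored on the instance.
        self.connect_calls += 1
        return {"connected": True}  # stand-in for an expensive connection


service = ReportService()
print(service.connect_calls)  # 0: nothing created at construction time
service.client
service.client
print(service.connect_calls)  # 1: created once, reused afterwards
```

The cost has not disappeared: whichever request touches `client` first pays it, which is the latency shift described in the list above.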

Batch Operations

Replace individual operations with batch alternatives wherever possible:

Batch inserts instead of inserting one row at a time
Batch API calls instead of calling an external service N times
Bulk file operations instead of processing files individually
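For batch inserts, the contrast looks like this with Python's `sqlite3` module. The `events` table and both helper functions are invented for the sketch; the relative speedup depends on the database and driver, so measure it in your own environment.

```python
import sqlite3

INSERT_SQL = "INSERT INTO events (name, value) VALUES (?, ?)"


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (name TEXT, value INTEGER)")
    return conn


def insert_one_by_one(conn: sqlite3.Connection, rows: list) -> None:
    """One statement and one commit per row: maximum per-row overhead."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_batched(conn: sqlite3.Connection, rows: list) -> None:
    """One executemany call inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(INSERT_SQL, rows)


rows = [(f"event-{i}", i) for i in range(1000)]

slow_db, fast_db = make_db(), make_db()
insert_one_by_one(slow_db, rows)
insert_batched(fast_db, rows)
print(fast_db.execute("SELECT COUNT(*) FROM events").fetchone()[0])  # 1000
```

Both functions leave the table in the same state; the batched version simply avoids paying per-row transaction overhead a thousand times.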

Quick Reference: Common Bottleneck Patterns

| Symptom | Likely Cause | First Investigation Step |
| --- | --- | --- |
| Slow response times, low CPU | I/O waits (database, network, disk) | Profile I/O and check query logs |
| High CPU, normal response times | Inefficient algorithms or excessive computation | CPU profile to find hot functions |
| Growing memory over time | Memory leak (unreleased references, unbounded caches) | Heap dump comparison over time |
| Intermittent slowness under load | Resource contention (locks, connection pool exhaustion) | Check pool sizes and lock wait times |
| Fast locally, slow in production | Network latency, missing caches, different data volumes | Compare profiling data between environments |

Reference Files

| Reference | Contents |
| --- | --- |
| Caching Strategies | Cache layers, invalidation patterns, stampede prevention with multi-language examples |
| Database Optimization | Query optimization, N+1 prevention, connection pooling, batch operations with multi-language examples |
| Profiling Patterns | Profiling workflows, bottleneck signatures, performance budgets, load testing strategies |

Integration with Other Skills

| Situation | Recommended Skill |
| --- | --- |
| Performance issues caused by poor architecture | Install knowledge-virtuoso from krzysztofsurdy/code-virtuoso for clean architecture guidance |
| Need to refactor slow code paths | Install knowledge-virtuoso from krzysztofsurdy/code-virtuoso for refactoring techniques |
| API response time optimization | Install knowledge-virtuoso from krzysztofsurdy/code-virtuoso for API design principles |
| Database schema and query design | Install knowledge-virtuoso from krzysztofsurdy/code-virtuoso for testing strategies to verify optimizations |
