postgresql-optimization

PostgreSQL Optimization

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "postgresql-optimization" with this command: npx skills add erichowens/some_claude_skills/erichowens-some-claude-skills-postgresql-optimization

PostgreSQL Optimization

Overview

Expert in PostgreSQL performance tuning, query optimization, and database administration. Specializes in EXPLAIN analysis, indexing strategies, connection pooling, partitioning, and production-grade PostgreSQL operations.

When to Use

  • Diagnosing slow queries with EXPLAIN ANALYZE

  • Creating optimal indexes for query patterns

  • Designing database schemas for performance

  • Configuring PostgreSQL for production workloads

  • Implementing connection pooling (PgBouncer, Supavisor)

  • Setting up partitioning for large tables

  • Analyzing and reducing lock contention

  • Migrating or upgrading PostgreSQL versions

Capabilities

Query Optimization

  • EXPLAIN / EXPLAIN ANALYZE interpretation

  • Query plan analysis and optimization

  • Identifying sequential scans vs index scans

  • Join optimization and query rewriting

  • CTE vs subquery performance trade-offs

  • Window function optimization

Indexing Strategies

  • B-tree, GIN, GiST, BRIN index selection

  • Partial indexes for filtered queries

  • Expression indexes for computed values

  • Covering indexes (INCLUDE clause)

  • Index-only scans optimization

  • Concurrent index creation

Schema Design

  • Normalization vs denormalization trade-offs

  • JSONB column design and indexing

  • Array columns and operations

  • Enum types vs lookup tables

  • Foreign key cascade strategies

  • Table inheritance and partitioning

Configuration Tuning

  • Memory settings (shared_buffers, work_mem, effective_cache_size)

  • Connection limits and pooling

  • WAL and checkpoint tuning

  • Autovacuum configuration

  • Statistics collection settings

Advanced Features

  • Partitioning (range, list, hash)

  • Materialized views with refresh strategies

  • Full-text search with tsvector/tsquery

  • PostGIS geospatial queries

  • Logical replication setup

  • pg_stat_statements analysis

Dependencies

Works well with:

  • database-modeler

  • Schema design and ERD creation

  • data-pipeline-engineer

  • ETL and data processing

  • site-reliability-engineer

  • Database monitoring and alerting

  • nextjs-app-router-expert

  • Full-stack data fetching

Examples

Reading EXPLAIN ANALYZE Output

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT u.*, COUNT(o.id) as order_count FROM users u LEFT JOIN orders o ON o.user_id = u.id WHERE u.created_at > '2024-01-01' GROUP BY u.id;

-- Key metrics to look for: -- - "Seq Scan" on large tables → needs index -- - "Rows Removed by Filter" high → filter before join -- - "Sort Method: external merge" → increase work_mem -- - "Buffers: shared hit" vs "shared read" → cache efficiency

Creating Effective Indexes

-- Basic B-tree for equality and range queries CREATE INDEX CONCURRENTLY idx_orders_user_created ON orders (user_id, created_at DESC);

-- Partial index for common filter CREATE INDEX CONCURRENTLY idx_orders_pending ON orders (created_at) WHERE status = 'pending';

-- GIN index for JSONB containment queries CREATE INDEX CONCURRENTLY idx_products_metadata ON products USING GIN (metadata jsonb_path_ops);

-- Covering index to enable index-only scans CREATE INDEX CONCURRENTLY idx_users_email_covering ON users (email) INCLUDE (name, created_at);

-- Expression index for case-insensitive search CREATE INDEX CONCURRENTLY idx_users_email_lower ON users (LOWER(email));

Optimizing N+1 Queries

-- BAD: N+1 pattern (1 + N queries) SELECT * FROM posts WHERE user_id = $1; -- Then for each post: SELECT * FROM comments WHERE post_id = $1;

-- GOOD: Single query with lateral join SELECT p., c.comments FROM posts p LEFT JOIN LATERAL ( SELECT json_agg(c.) as comments FROM comments c WHERE c.post_id = p.id ) c ON true WHERE p.user_id = $1;

-- GOOD: Window function for aggregates SELECT p., COUNT() OVER (PARTITION BY p.user_id) as user_post_count FROM posts p WHERE p.user_id = $1;

Table Partitioning

-- Create partitioned table by date range CREATE TABLE events ( id BIGSERIAL, event_type TEXT NOT NULL, payload JSONB, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), PRIMARY KEY (id, created_at) ) PARTITION BY RANGE (created_at);

-- Create monthly partitions CREATE TABLE events_2024_01 PARTITION OF events FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE events_2024_02 PARTITION OF events FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Automate partition creation with pg_partman CREATE EXTENSION pg_partman; SELECT partman.create_parent('public.events', 'created_at', 'native', 'monthly');

Connection Pooling Config (PgBouncer)

; pgbouncer.ini

[databases] myapp = host=localhost dbname=myapp

[pgbouncer] listen_addr = 0.0.0.0 listen_port = 6432 auth_type = md5 auth_file = /etc/pgbouncer/userlist.txt

; Pool settings pool_mode = transaction ; Recommended for most apps max_client_conn = 1000 default_pool_size = 20 reserve_pool_size = 5

; Timeouts server_idle_timeout = 600 client_idle_timeout = 0

Performance Configuration

-- Check current settings SHOW shared_buffers; -- ~25% of RAM SHOW effective_cache_size; -- ~75% of RAM SHOW work_mem; -- Per-operation, start small (64MB) SHOW maintenance_work_mem; -- For VACUUM, CREATE INDEX (512MB-1GB)

-- Recommended production settings (for 32GB RAM server) ALTER SYSTEM SET shared_buffers = '8GB'; ALTER SYSTEM SET effective_cache_size = '24GB'; ALTER SYSTEM SET work_mem = '64MB'; ALTER SYSTEM SET maintenance_work_mem = '1GB'; ALTER SYSTEM SET random_page_cost = 1.1; -- For SSD storage ALTER SYSTEM SET effective_io_concurrency = 200; -- For SSD

-- Reload configuration SELECT pg_reload_conf();

Finding Slow Queries

-- Enable pg_stat_statements CREATE EXTENSION pg_stat_statements;

-- Top 10 slowest queries by total time SELECT round(total_exec_time::numeric, 2) as total_ms, calls, round(mean_exec_time::numeric, 2) as avg_ms, round((100 * total_exec_time / sum(total_exec_time) OVER())::numeric, 2) as pct, query FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;

-- Queries with most I/O SELECT round(shared_blks_read::numeric, 2) as disk_reads, round(shared_blks_hit::numeric, 2) as cache_hits, round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 2) as cache_hit_ratio, query FROM pg_stat_statements ORDER BY shared_blks_read DESC LIMIT 10;

Analyzing Table Bloat

-- Check table bloat SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) as total_size, pg_size_pretty(pg_relation_size(schemaname || '.' || tablename)) as table_size, n_dead_tup, n_live_tup, round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) as dead_pct FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 10;

-- Manual VACUUM for critical tables VACUUM (VERBOSE, ANALYZE) orders;

-- Reclaim space (requires exclusive lock) VACUUM FULL orders; -- Use during maintenance window

Best Practices

  • Always use EXPLAIN ANALYZE - Don't guess, measure actual query performance

  • Create indexes CONCURRENTLY - Avoid blocking writes during index creation

  • Partial indexes for hot paths - Index only the rows you query frequently

  • Use connection pooling - PgBouncer or Supavisor for production

  • Monitor pg_stat_statements - Track query performance over time

  • Regular ANALYZE - Keep statistics current for query planner

  • **Avoid SELECT *** - Only fetch columns you need

  • Batch large updates - Process in chunks to avoid lock contention

  • Use prepared statements - Reduce parsing overhead for repeated queries

Common Pitfalls

  • Missing indexes - Check for sequential scans on large tables

  • Over-indexing - Too many indexes slow down writes

  • work_mem too low - Causes disk-based sorts and hash joins

  • Connection exhaustion - Not using connection pooling

  • Stale statistics - Autovacuum not running frequently enough

  • Bloated tables - Not vacuuming after large deletes/updates

  • N+1 queries - Fetching related data in loops instead of joins

  • SELECT * everywhere - Fetching unnecessary columns

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

video-processing-editing

No summary provided by upstream source.

Repository SourceNeeds Review
General

cv-creator

No summary provided by upstream source.

Repository SourceNeeds Review
General

mobile-ux-optimizer

No summary provided by upstream source.

Repository SourceNeeds Review