🐘 PostgreSQL Optimization
"Beyond 'just add an index' — creative solutions for real performance problems."
Unconventional optimization techniques for PostgreSQL that go beyond standard DBA playbooks.
Purpose
When conventional approaches fall short — query rewrites, adding indexes, VACUUM, ANALYZE — these techniques offer creative solutions:
- Eliminate impossible query scans with constraint exclusion
- Reduce index size with function-based indexes
- Enforce uniqueness with hash indexes instead of B-Trees
When to Use
- Ad-hoc query environments where users make mistakes
- Large indexes approaching table size
- Uniqueness constraints on large text values (URLs, documents)
- Timestamp columns queried at coarser granularity
Technique 1: Constraint Exclusion
The Problem
Check constraints prevent invalid data, but PostgreSQL doesn't use them to optimize queries by default.
CREATE TABLE users (
id INT PRIMARY KEY,
username TEXT NOT NULL,
plan TEXT NOT NULL,
CONSTRAINT plan_check CHECK (plan IN ('free', 'pro'))
);
An analyst writes:
SELECT * FROM users WHERE plan = 'Pro'; -- Note: capital P
Despite the check constraint making this condition impossible, PostgreSQL scans the entire table.
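With the default setting, the planner never consults the check constraint, so the plan is a full sequential scan. An illustrative plan (costs and row counts will vary with your data):

```sql
-- constraint_exclusion at its default ('partition'):
-- the impossible predicate still forces a full scan.
EXPLAIN SELECT * FROM users WHERE plan = 'Pro';

--  Seq Scan on users  (cost=0.00..18584.00 rows=1 width=44)
--    Filter: (plan = 'Pro'::text)
```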
The Solution
SET constraint_exclusion TO 'on';
With constraint exclusion enabled:
EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
Execution Time: 0.008 ms
PostgreSQL recognizes the condition contradicts the constraint and skips the scan entirely.
When to Enable
| Environment | Recommendation |
|---|---|
| OLTP production | Leave as 'partition' (default) |
| BI / Data Warehouse | Set to 'on' |
| Ad-hoc query tools | Set to 'on' |
| Reporting databases | Set to 'on' |
Tradeoffs
- Benefit: Eliminates impossible query scans
- Cost: Extra planning overhead evaluating constraints against conditions
- Default: 'partition' — only used for partition pruning
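Because of the extra planning cost, it can make sense to scope the setting to the reporting workload instead of the whole cluster. A sketch, assuming a hypothetical `reporting` role and `analytics` database:

```sql
-- Enable only for the BI role (applies to new sessions):
ALTER ROLE reporting SET constraint_exclusion = 'on';

-- Or for a dedicated reporting database:
ALTER DATABASE analytics SET constraint_exclusion = 'on';

-- Or just for the current session:
SET constraint_exclusion TO 'on';
```

OLTP sessions keep the cheaper 'partition' default either way.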
Technique 2: Function-Based Indexes for Lower Cardinality
The Problem
You have a sales table with timestamps:
CREATE TABLE sale (
id INT PRIMARY KEY,
sold_at TIMESTAMPTZ NOT NULL,
charged INT NOT NULL
);
Analysts query by day:
SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE sold_at BETWEEN '2025-01-01 UTC' AND '2025-02-01 UTC'
GROUP BY 1;
You add a B-Tree index on sold_at — 214 MB for a 160 MB table. The index is larger than the table itself!
The Solution
Index only what queries need:
CREATE INDEX sale_sold_at_date_ix
ON sale((date_trunc('day', sold_at AT TIME ZONE 'UTC')::date));
| Index | Size |
|---|---|
| sale_sold_at_ix (full timestamp) | 214 MB |
| sale_sold_at_date_ix (date only) | 66 MB |
The function-based index is 3x smaller because:
- Dates are 4 bytes vs 8 bytes for timestamptz
- Fewer distinct values enable deduplication
The Discipline Problem
Function-based indexes require exact expression match:
-- Uses the index ✓
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date
BETWEEN '2025-01-01' AND '2025-01-31'
-- Does NOT use the index ✗
WHERE (sold_at AT TIME ZONE 'UTC')::date
BETWEEN '2025-01-01' AND '2025-01-31'
Solution: Virtual Generated Columns (PostgreSQL 18+)
ALTER TABLE sale ADD sold_at_date DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date) VIRTUAL;
Now queries use the virtual column:
SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
Benefits:
- Smaller index
- Faster queries
- No discipline required — column guarantees correct expression
- No ambiguity about timezones
Limitation: PostgreSQL 18 doesn't support indexes directly on virtual columns (yet).
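Since a virtual column is expanded to its defining expression during planning, the expression index created earlier may still back queries that filter on the column; this is worth verifying on your version with EXPLAIN:

```sql
-- Check whether the plan uses sale_sold_at_date_ix
-- when filtering on the virtual column:
EXPLAIN
SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
```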
Technique 3: Hash Index for Uniqueness
The Problem
You have a table with large URLs:
CREATE TABLE urls (
id INT PRIMARY KEY,
url TEXT NOT NULL,
data JSON
);
You add a unique B-Tree index:
CREATE UNIQUE INDEX urls_url_unique_ix ON urls(url);
| Object | Size |
|---|---|
| Table | 160 MB |
| B-Tree index | 154 MB |
The index is almost as large as the table because B-Tree stores actual values in leaf blocks.
The Solution
Use an exclusion constraint with a hash index:
ALTER TABLE urls
ADD CONSTRAINT urls_url_unique_hash
EXCLUDE USING HASH (url WITH =);
| Index | Size |
|---|---|
| B-Tree | 154 MB |
| Hash | 32 MB |
The hash index is 5x smaller because it stores hash values, not the actual URLs.
Uniqueness Is Enforced
INSERT INTO urls (id, url) VALUES (1000002, 'https://example.com');
-- ERROR: conflicting key value violates exclusion constraint
Queries Still Fast
EXPLAIN ANALYZE SELECT * FROM urls WHERE url = 'https://example.com';
Index Scan using urls_url_unique_hash on urls
Execution Time: 0.022 ms -- Faster than B-Tree's 0.046 ms!
Limitations
| Feature | B-Tree Unique | Hash Exclusion |
|---|---|---|
| Foreign key reference | ✓ | ✗ |
| ON CONFLICT (column) | ✓ | ✗ |
| ON CONFLICT ON CONSTRAINT | ✓ | ✓ (DO NOTHING only) |
| ON CONFLICT DO UPDATE | ✓ | ✗ |
| MERGE | ✓ | ✓ |
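The one ON CONFLICT variant that works with the exclusion constraint is DO NOTHING, and only when the constraint is named explicitly:

```sql
-- Idempotent insert: silently skip rows that would collide.
INSERT INTO urls (id, url)
VALUES (1000003, 'https://example.com')
ON CONFLICT ON CONSTRAINT urls_url_unique_hash DO NOTHING;
```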
Workaround: Use MERGE
Instead of INSERT ... ON CONFLICT DO UPDATE:
MERGE INTO urls t
USING (VALUES (1000004, 'https://example.com')) AS s(id, url)
ON t.url = s.url
WHEN MATCHED THEN UPDATE SET id = s.id
WHEN NOT MATCHED THEN INSERT (id, url) VALUES (s.id, s.url);
Quick Reference
Diagnostic Queries
Check index sizes:
\di+ table_*
Compare index to table size:
SELECT
relname AS name,
pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relname LIKE 'your_table%'
ORDER BY pg_relation_size(oid) DESC;
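To flag tables whose indexes are approaching table size directly, the ratio can be computed in one pass (a sketch using pg_table_size and pg_indexes_size; 'your_table' is a placeholder):

```sql
SELECT
  c.relname AS table_name,
  pg_size_pretty(pg_table_size(c.oid)) AS table_size,
  pg_size_pretty(pg_indexes_size(c.oid)) AS index_size,
  round(pg_indexes_size(c.oid)::numeric
        / NULLIF(pg_table_size(c.oid), 0), 2) AS index_to_table_ratio
FROM pg_class c
WHERE c.relkind = 'r' AND c.relname = 'your_table';
```

A ratio near or above 1.0 is the signal to consider Techniques 2 and 3.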
Check constraint_exclusion setting:
SHOW constraint_exclusion;
Decision Tree
Does the query scan despite an impossible predicate?
├── Yes → Enable constraint_exclusion
└── No
↓
Is index nearly as large as table?
├── Yes, timestamp column → Function-based index on date
├── Yes, large text column → Hash exclusion constraint
└── No → Standard B-Tree is fine
Commands
| Command | Action |
|---|---|
| ANALYZE [table] | Analyze query performance |
| CHECK-CONSTRAINTS | Evaluate constraint exclusion opportunity |
| LOWER-CARDINALITY | Find function-based index opportunities |
| HASH-UNIQUE | Evaluate hash index for large values |
| COMPARE-INDEXES | Compare index sizes and performance |
Integration
| Direction | Skill | Relationship |
|---|---|---|
| ← | debugging | Query debugging leads here |
| → | plan-then-execute | Systematic optimization |