🐘 PostgreSQL Optimization
"Beyond 'just add an index' — creative solutions for real performance problems."
Unconventional optimization techniques for PostgreSQL that go beyond standard DBA playbooks.
Purpose
When conventional approaches fall short — query rewrites, adding indexes, VACUUM, ANALYZE — these techniques offer creative solutions:
- Eliminate impossible query scans with constraint exclusion
- Reduce index size with function-based indexes
- Enforce uniqueness with hash indexes instead of B-Trees
When to Use
- Ad-hoc query environments where users make mistakes
- Large indexes approaching table size
- Uniqueness constraints on large text values (URLs, documents)
- Timestamp columns queried at coarser granularity
Technique 1: Constraint Exclusion
The Problem
Check constraints prevent invalid data, but PostgreSQL doesn't use them to optimize queries by default.
CREATE TABLE users (
id INT PRIMARY KEY,
username TEXT NOT NULL,
plan TEXT NOT NULL,
CONSTRAINT plan_check CHECK (plan IN ('free', 'pro'))
);
An analyst writes:
SELECT * FROM users WHERE plan = 'Pro'; -- Note: capital P
Despite the check constraint making this condition impossible, PostgreSQL scans the entire table.
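With the default setting, the planner never consults the check constraint, so the plan is a full sequential scan. An illustrative plan (costs and row counts will vary with your data):

```sql
-- constraint_exclusion at its default ('partition'):
-- the impossible predicate still forces a full scan.
EXPLAIN SELECT * FROM users WHERE plan = 'Pro';

--  Seq Scan on users  (cost=0.00..18584.00 rows=1 width=44)
--    Filter: (plan = 'Pro'::text)
```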
The Solution
SET constraint_exclusion TO 'on';
With constraint exclusion enabled:
EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
Execution Time: 0.008 ms
PostgreSQL recognizes the condition contradicts the constraint and skips the scan entirely.
When to Enable
| Environment | Recommendation |
|---|---|
| OLTP production | Leave as 'partition' (default) |
| BI / Data Warehouse | Set to 'on' |
| Ad-hoc query tools | Set to 'on' |
| Reporting databases | Set to 'on' |
Tradeoffs
- Benefit: Eliminates impossible query scans
- Cost: Extra planning overhead evaluating constraints against conditions
- Default: 'partition' — only used for partition pruning
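Because of the extra planning cost, it can make sense to scope the setting to the reporting workload instead of the whole cluster. A sketch, assuming a hypothetical `reporting` role and `analytics` database:

```sql
-- Enable only for the BI role (applies to new sessions):
ALTER ROLE reporting SET constraint_exclusion = 'on';

-- Or for a dedicated reporting database:
ALTER DATABASE analytics SET constraint_exclusion = 'on';

-- Or just for the current session:
SET constraint_exclusion TO 'on';
```

OLTP sessions keep the cheaper 'partition' default either way.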
Technique 2: Function-Based Indexes for Lower Cardinality
The Problem
You have a sales table with timestamps:
CREATE TABLE sale (
id INT PRIMARY KEY,
sold_at TIMESTAMPTZ NOT NULL,
charged INT NOT NULL
);
Analysts query by day:
SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE sold_at BETWEEN '2025-01-01 UTC' AND '2025-02-01 UTC'
GROUP BY 1;
You add a B-Tree index on sold_at — 214 MB for a 160 MB table. The index is larger than the table itself!
The Solution
Index only what queries need:
CREATE INDEX sale_sold_at_date_ix
ON sale((date_trunc('day', sold_at AT TIME ZONE 'UTC')::date));
| Index | Size |
|---|---|
| sale_sold_at_ix (full timestamp) | 214 MB |
| sale_sold_at_date_ix (date only) | 66 MB |
The function-based index is 3x smaller because:
- Dates are 4 bytes vs 8 bytes for timestamptz
- Fewer distinct values enable deduplication
The Discipline Problem
Function-based indexes require exact expression match:
-- Uses the index ✓
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date
BETWEEN '2025-01-01' AND '2025-01-31'
-- Does NOT use the index ✗
WHERE (sold_at AT TIME ZONE 'UTC')::date
BETWEEN '2025-01-01' AND '2025-01-31'
Solution: Virtual Generated Columns (PostgreSQL 18+)
ALTER TABLE sale ADD sold_at_date DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date) VIRTUAL;
Now queries use the virtual column:
SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
Benefits:
- Smaller index
- Faster queries
- No discipline required — column guarantees correct expression
- No ambiguity about timezones
Limitation: PostgreSQL 18 doesn't support indexes directly on virtual columns (yet).
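Since a virtual column is expanded to its defining expression during planning, the expression index created earlier may still back queries that filter on the column; this is worth verifying on your version with EXPLAIN:

```sql
-- Check whether the plan uses sale_sold_at_date_ix
-- when filtering on the virtual column:
EXPLAIN
SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
```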
Technique 3: Hash Index for Uniqueness
The Problem
You have a table with large URLs:
CREATE TABLE urls (
id INT PRIMARY KEY,
url TEXT NOT NULL,
data JSON
);
You add a unique B-Tree index:
CREATE UNIQUE INDEX urls_url_unique_ix ON urls(url);
| Object | Size |
|---|---|
| Table | 160 MB |
| B-Tree index | 154 MB |
The index is almost as large as the table because B-Tree stores actual values in leaf blocks.
The Solution
Use an exclusion constraint with a hash index:
ALTER TABLE urls
ADD CONSTRAINT urls_url_unique_hash
EXCLUDE USING HASH (url WITH =);
| Index | Size |
|---|---|
| B-Tree | 154 MB |
| Hash | 32 MB |
The hash index is 5x smaller because it stores hash values, not the actual URLs.
Uniqueness Is Enforced
INSERT INTO urls (id, url) VALUES (1000002, 'https://example.com');
-- ERROR: conflicting key value violates exclusion constraint
Queries Still Fast
EXPLAIN ANALYZE SELECT * FROM urls WHERE url = 'https://example.com';
Index Scan using urls_url_unique_hash on urls
Execution Time: 0.022 ms -- Faster than B-Tree's 0.046 ms!
Limitations
| Feature | B-Tree Unique | Hash Exclusion |
|---|---|---|
| Foreign key reference | ✓ | ✗ |
| ON CONFLICT (column) | ✓ | ✗ |
| ON CONFLICT ON CONSTRAINT | ✓ | ✓ (DO NOTHING only) |
| ON CONFLICT DO UPDATE | ✓ | ✗ |
| MERGE | ✓ | ✓ |
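The one ON CONFLICT variant that works with the exclusion constraint is DO NOTHING, and only when the constraint is named explicitly:

```sql
-- Idempotent insert: silently skip rows that would collide.
INSERT INTO urls (id, url)
VALUES (1000003, 'https://example.com')
ON CONFLICT ON CONSTRAINT urls_url_unique_hash DO NOTHING;
```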
Workaround: Use MERGE
Instead of INSERT ... ON CONFLICT DO UPDATE:
MERGE INTO urls t
USING (VALUES (1000004, 'https://example.com')) AS s(id, url)
ON t.url = s.url
WHEN MATCHED THEN UPDATE SET id = s.id
WHEN NOT MATCHED THEN INSERT (id, url) VALUES (s.id, s.url);
Quick Reference
Diagnostic Queries
Check index sizes:
\di+ table_*
Compare index to table size:
SELECT
relname AS name,
pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relname LIKE 'your_table%'
ORDER BY pg_relation_size(oid) DESC;
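To flag tables whose indexes are approaching table size directly, the ratio can be computed in one pass (a sketch using pg_table_size and pg_indexes_size; 'your_table' is a placeholder):

```sql
SELECT
  c.relname AS table_name,
  pg_size_pretty(pg_table_size(c.oid)) AS table_size,
  pg_size_pretty(pg_indexes_size(c.oid)) AS index_size,
  round(pg_indexes_size(c.oid)::numeric
        / NULLIF(pg_table_size(c.oid), 0), 2) AS index_to_table_ratio
FROM pg_class c
WHERE c.relkind = 'r' AND c.relname = 'your_table';
```

A ratio near or above 1.0 is the signal to consider Techniques 2 and 3.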
Check constraint_exclusion setting:
SHOW constraint_exclusion;
Decision Tree
Does the query scan despite an impossible predicate?
├── Yes → Enable constraint_exclusion
└── No
↓
Is index nearly as large as table?
├── Yes, timestamp column → Function-based index on date
├── Yes, large text column → Hash exclusion constraint
└── No → Standard B-Tree is fine
Commands
| Command | Action |
|---|---|
| ANALYZE [table] | Analyze query performance |
| CHECK-CONSTRAINTS | Evaluate constraint exclusion opportunity |
| LOWER-CARDINALITY | Find function-based index opportunities |
| HASH-UNIQUE | Evaluate hash index for large values |
| COMPARE-INDEXES | Compare index sizes and performance |
Integration
| Direction | Skill | Relationship |
|---|---|---|
| ← | debugging | Query debugging leads here |
| → | plan-then-execute | Systematic optimization |