Decision Tree
Need query optimization? → Use EXPLAIN ANALYZE Need cached aggregations? → Use Materialized View Need complex breakdown? → Use CTEs Need row ranking? → Use Window Functions Need partial coverage? → Use Partial Index
Reporting & Optimization Guidelines
Reporting queries often scan large datasets. Inefficient queries can degrade the performance of the entire system. Follow these guidelines to ensure speed and stability.
- Query Optimization
1.1 SELECT Efficiency
-
No SELECT * : Fetching unnecessary columns increases I/O and network payload. Select only required fields.
-
SARGable Queries: Ensure predicates (WHERE clauses) are "Search ARGument ABLE" to perform Index Seeks instead of Scans.
-
Bad: WHERE YEAR(created_at) = 2023 (Function on column prevents index use).
-
Good: WHERE created_at >= '2023-01-01' AND created_at < '2024-01-01' .
1.2 CTEs vs Temporary Tables
-
Common Table Expressions (CTEs): Use WITH clauses for readability and to break down complex logic. In modern PostgreSQL (12+), CTEs are materialized when beneficial, making them performant.
-
Temporary Tables: For extremely complex multi-step processing involving heavy intermediate indexing, use CREATE TEMP TABLE .
1.3 Subqueries
-
Avoid Correlated Subqueries in the SELECT list that run once per row. Rewrite them as JOIN s or LATERAL joins.
-
Bad: SELECT id, (SELECT count(*) FROM orders WHERE user_id = u.id) FROM users u
-
Good: SELECT u.id, count(o.id) FROM users u LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.id
- Aggregation Strategies
2.1 Materialized Views
-
For dashboards requiring heavy aggregation (Count, Sum, Avg) over millions of rows, do not query the live transactional table every time.
-
Use Materialized Views to cache the result: CREATE MATERIALIZED VIEW mv_daily_sales AS SELECT day, SUM(total) FROM sales GROUP BY day;
-
Refresh explicitly: REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales; .
2.2 Window Functions
- Use Window Functions (ROW_NUMBER() , RANK() , LEAD() , LAG() ) for complex reporting intra-row logic instead of self-joins.
- Safety Guardrails
3.1 Timeouts
- Set statement_timeout for reporting roles to prevent a runaway query from locking resources indefinitely.
3.2 Result Limits
- Always verify the estimated cardinality. If a report could return 1M+ rows, enforce LIMIT or pagination logic.
- Query Profiling
4.1 EXPLAIN ANALYZE
Before deploying complex queries, use EXPLAIN ANALYZE to understand the execution plan.
-
Look for Seq Scans on large tables (potential missing index).
-
Check Actual vs Estimated Rows for significant discrepancies (stale statistics).
-
Identify Nested Loops with high row counts that could be rewritten as Hash Joins.
EXPLAIN ANALYZE SELECT u.id, COUNT(o.id) FROM USER u LEFT JOIN ORDER o ON u.id = o.user_id WHERE u.is_active = TRUE GROUP BY u.id;
4.2 Partial Indexes
Create indexes that only cover a subset of rows to save space and speed up specific queries.
-
Example: Index only active users. CREATE INDEX idx_user_active ON USER(email) WHERE is_active = TRUE;
-
Use for columns with low cardinality where only specific values are frequently queried.