SQL Guide

Applies to: PostgreSQL 15+, MySQL 8+, Database Migrations, Query Optimization

Core Principles

  • Parameterized Queries Always: Never concatenate user input into SQL strings -- use bind parameters ($1, ?, :name) without exception

  • Explicit Over Implicit: Name all constraints, specify column lists in INSERT, avoid SELECT * in production code

  • Migrations Are Immutable: Once applied to a shared environment, never modify a migration -- create a new one

  • Indexes Are Not Free: Every index speeds reads but slows writes; justify each index with a query plan

  • Transactions Are Boundaries: Keep transactions short, choose the correct isolation level, and always handle rollback

Guardrails

Naming Conventions

  • Tables: snake_case, plural (users, order_items, audit_logs)

  • Columns: snake_case, singular (email, created_at, is_active)

  • Primary keys: id (integer or UUID depending on project convention)

  • Foreign keys: <singular_table>_id (user_id, order_id)

  • Indexes: idx_<table>_<columns> (idx_users_email, idx_orders_user_id_created_at)

  • Unique constraints: uq_<table>_<columns> (uq_users_email)

  • Check constraints: ck_<table>_<description> (ck_orders_positive_total)

  • Boolean columns: is_ or has_ prefix (is_active, has_verified_email)

  • Timestamps: created_at, updated_at, deleted_at (always TIMESTAMPTZ)

Query Safety

  • ALWAYS use parameterized queries -- no string interpolation of user input

  • ALWAYS specify column lists in INSERT statements

  • NEVER use SELECT * in application code (acceptable in ad-hoc queries only)

  • ALWAYS add LIMIT to queries that return lists (prevent unbounded result sets)

  • NEVER use TRUNCATE or DROP in application code without explicit safeguards

  • ALWAYS use EXISTS instead of COUNT(*) > 0 for existence checks

  • ALWAYS qualify column names with table aliases in JOINs

```sql
-- GOOD: parameterized, explicit columns, bounded
SELECT u.id, u.email, u.created_at
FROM users u
WHERE u.email = $1
  AND u.is_active = true
LIMIT 1;

-- BAD: string concatenation, SELECT *, no LIMIT
SELECT * FROM users WHERE email = '" + userInput + "';
```
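The EXISTS-over-COUNT rule can be exercised end to end with Python's built-in sqlite3 module -- a stand-in here for psycopg2/asyncpg (its placeholder is ? rather than $1, and the tiny schema below is illustrative, not from the guide):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, is_active INTEGER)")
conn.execute("INSERT INTO users (email, is_active) VALUES (?, ?)", ("a@example.com", 1))

def user_exists(conn, email):
    # EXISTS stops at the first matching row; COUNT(*) > 0 would keep scanning.
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM users u WHERE u.email = ? AND u.is_active = 1)",
        (email,),  # bind parameter -- never string concatenation
    ).fetchone()
    return bool(row[0])

print(user_exists(conn, "a@example.com"))    # True
print(user_exists(conn, "nobody@example.com"))  # False
```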

Indexing

  • Create indexes to support WHERE, JOIN, and ORDER BY clauses

  • Composite index column order: equality columns first, then range, then sort

  • Use UNIQUE indexes to enforce business rules at the database level

  • Prefer partial indexes when filtering on a known subset (WHERE is_active = true)

  • Use CONCURRENTLY for index creation on live tables (PostgreSQL)

  • Review index usage periodically -- drop unused indexes

```sql
-- Composite index: equality (status) first, then range (created_at)
CREATE INDEX CONCURRENTLY idx_orders_status_created
ON orders (status, created_at DESC);

-- Partial index: only index active users
CREATE INDEX idx_users_email_active
ON users (email)
WHERE is_active = true;
```

Transactions

  • Keep transactions as short as possible (no I/O or external calls inside)

  • Use READ COMMITTED for most operations (PostgreSQL default)

  • Use REPEATABLE READ when a transaction reads the same data multiple times

  • Use SERIALIZABLE for financial or inventory operations requiring strict consistency

  • Always explicitly COMMIT or ROLLBACK -- never leave transactions hanging

  • Use SAVEPOINT for partial rollback within complex transactions

```sql
BEGIN;
SAVEPOINT before_update;

UPDATE accounts SET balance = balance - 100.00 WHERE id = $1;
UPDATE accounts SET balance = balance + 100.00 WHERE id = $2;

-- If the second update fails, roll back to the savepoint:
-- ROLLBACK TO SAVEPOINT before_update;

COMMIT;
```
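The same savepoint pattern driven from application code -- a minimal sqlite3 sketch (SQLite shares the SAVEPOINT / ROLLBACK TO syntax; account ids and balances are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

# Transfer with a savepoint: if the credit step fails, roll back only
# the transfer, leaving the enclosing transaction usable.
try:
    conn.execute("SAVEPOINT before_update")
    conn.execute("UPDATE accounts SET balance = balance - 100 WHERE id = ?", (1,))
    conn.execute("UPDATE accounts SET balance = balance + 100 WHERE id = ?", (2,))
    conn.execute("RELEASE SAVEPOINT before_update")
    conn.commit()
except sqlite3.Error:
    conn.execute("ROLLBACK TO SAVEPOINT before_update")
    conn.rollback()

print(dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id")))  # {1: 400, 2: 200}
```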

Migrations

  • Every migration MUST have both up and down functions

  • Migrations MUST be idempotent (IF NOT EXISTS, IF EXISTS guards)

  • Never rename or drop columns in a single step -- use a multi-step process

  • Add columns as NULL first, backfill, then add NOT NULL constraint

  • Never modify a migration that has been applied to a shared environment

  • Migration filenames: YYYYMMDDHHMMSS_description.sql or sequential numbering

  • Test both up and down in development before committing

```sql
-- UP: 20250115120000_add_users_phone.sql
-- NOTE: CREATE INDEX CONCURRENTLY cannot run inside a transaction block;
-- disable transaction wrapping for this migration in your tool.
ALTER TABLE users ADD COLUMN IF NOT EXISTS phone VARCHAR(20);
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_phone
ON users (phone)
WHERE phone IS NOT NULL;

-- DOWN: 20250115120000_add_users_phone.sql
DROP INDEX CONCURRENTLY IF EXISTS idx_users_phone;
ALTER TABLE users DROP COLUMN IF EXISTS phone;
```
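The immutability rule is usually enforced by a runner that records applied versions and skips them on later runs. A minimal sketch under stated assumptions -- the schema_migrations table name and the in-code migration list are illustrative; real tools (golang-migrate, Alembic, dbmate) do this bookkeeping for you:

```python
import sqlite3

MIGRATIONS = [
    # (version, up_sql) -- applied in order, each exactly once
    ("20250115120000", "ALTER TABLE users ADD COLUMN phone VARCHAR(20)"),
]

def migrate(conn):
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)")
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_migrations")}
    for version, up_sql in MIGRATIONS:
        if version in applied:
            continue  # never re-run (or modify) an applied migration
        conn.execute(up_sql)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
migrate(conn)
migrate(conn)  # second run is a no-op thanks to the version table
cols = [r[1] for r in conn.execute("PRAGMA table_info(users)")]
print(cols)  # ['id', 'email', 'phone']
```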

Schema Design

Standard Table Template

```sql
CREATE TABLE IF NOT EXISTS orders (
    id          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id     BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    status      VARCHAR(20) NOT NULL DEFAULT 'pending'
        CONSTRAINT ck_orders_valid_status
        CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled')),
    total_cents INTEGER NOT NULL
        CONSTRAINT ck_orders_positive_total CHECK (total_cents >= 0),
    notes       TEXT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Auto-update updated_at (PostgreSQL)
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_orders_updated_at
BEFORE UPDATE ON orders
FOR EACH ROW EXECUTE FUNCTION update_updated_at();
```

Design Rules

  • Use BIGINT GENERATED ALWAYS AS IDENTITY for auto-increment PKs (PostgreSQL 10+)

  • Use UUID PKs when IDs are exposed externally or distributed systems require it

  • Use TIMESTAMPTZ (not TIMESTAMP) for all time columns -- always store UTC

  • Add NOT NULL constraints by default; allow NULL only when semantically meaningful

  • Use CHECK constraints to enforce domain rules at the database level

  • Use ENUM types sparingly -- prefer VARCHAR with CHECK for easier migration

  • Add ON DELETE CASCADE or ON DELETE SET NULL explicitly to every foreign key

  • Use soft deletes (deleted_at TIMESTAMPTZ) when an audit trail is required

Key Patterns

Common Table Expressions (CTEs)

Prefer CTEs over subqueries for readability and maintainability.

```sql
-- Readable: each step has a name
WITH active_orders AS (
    SELECT user_id,
           COUNT(*) AS order_count,
           SUM(total_cents) AS total_spent
    FROM orders
    WHERE status != 'cancelled'
      AND created_at >= now() - INTERVAL '30 days'
    GROUP BY user_id
),
high_value_users AS (
    SELECT user_id
    FROM active_orders
    WHERE total_spent > 50000  -- over $500
)
SELECT u.id, u.email, ao.order_count, ao.total_spent
FROM users u
JOIN active_orders ao ON ao.user_id = u.id
WHERE u.id IN (SELECT user_id FROM high_value_users)
ORDER BY ao.total_spent DESC;
```

Window Functions

Use window functions for ranking, running totals, and row comparisons without self-joins.

```sql
-- Rank users by spending within each region
SELECT u.id,
       u.region,
       SUM(o.total_cents) AS total_spent,
       RANK() OVER (PARTITION BY u.region ORDER BY SUM(o.total_cents) DESC) AS rank
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.region;

-- Running total of daily revenue
SELECT date_trunc('day', created_at) AS day,
       SUM(total_cents) AS daily_revenue,
       SUM(SUM(total_cents)) OVER (ORDER BY date_trunc('day', created_at)) AS running_total
FROM orders
WHERE status = 'delivered'
GROUP BY date_trunc('day', created_at)
ORDER BY day;
```
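The running-total pattern ports to any engine with window functions (SQLite since 3.25). A runnable sketch with sqlite3 -- the sample rows are illustrative, and the aggregation is staged through a CTE for portability:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (day TEXT, total_cents INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("2025-01-01", 1000, "delivered"),
     ("2025-01-01", 500, "delivered"),
     ("2025-01-02", 2000, "delivered"),
     ("2025-01-02", 999, "cancelled")],  # excluded by the status filter
)

rows = conn.execute(
    """
    WITH daily AS (
        SELECT day, SUM(total_cents) AS daily_revenue
        FROM orders
        WHERE status = 'delivered'
        GROUP BY day
    )
    SELECT day,
           daily_revenue,
           SUM(daily_revenue) OVER (ORDER BY day) AS running_total
    FROM daily
    ORDER BY day
    """
).fetchall()
print(rows)  # [('2025-01-01', 1500, 1500), ('2025-01-02', 2000, 3500)]
```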

UPSERT (INSERT ... ON CONFLICT)

```sql
-- PostgreSQL: insert or update on conflict
INSERT INTO user_preferences (user_id, theme, language)
VALUES ($1, $2, $3)
ON CONFLICT (user_id) DO UPDATE
SET theme = EXCLUDED.theme,
    language = EXCLUDED.language,
    updated_at = now();
```
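SQLite shares this ON CONFLICT ... DO UPDATE syntax (since 3.24), so the upsert can be demonstrated with the standard-library driver; the user_preferences schema below mirrors the example, minus the updated_at column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_preferences (user_id INTEGER PRIMARY KEY, theme TEXT, language TEXT)"
)

def save_prefs(conn, user_id, theme, language):
    # Insert the row, or update it in place if user_id already exists.
    conn.execute(
        """
        INSERT INTO user_preferences (user_id, theme, language)
        VALUES (?, ?, ?)
        ON CONFLICT (user_id) DO UPDATE
        SET theme = excluded.theme, language = excluded.language
        """,
        (user_id, theme, language),
    )

save_prefs(conn, 1, "dark", "en")
save_prefs(conn, 1, "light", "fr")  # updates, no duplicate row
print(conn.execute("SELECT * FROM user_preferences").fetchall())  # [(1, 'light', 'fr')]
```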

Batch Operations

```sql
-- Batch insert with unnest (PostgreSQL)
INSERT INTO tags (name, category)
SELECT unnest($1::text[]), unnest($2::text[])
ON CONFLICT (name) DO NOTHING;

-- Batch update with VALUES list
UPDATE products AS p
SET price_cents = v.new_price
FROM (VALUES
    (1, 2999),
    (2, 4999),
    (3, 1499)
) AS v(id, new_price)
WHERE p.id = v.id;
```
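unnest and UPDATE ... FROM VALUES are PostgreSQL-specific; from application code the portable equivalent of a batch insert is executemany (one prepared statement, many parameter tuples). A sqlite3 sketch with illustrative data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (name TEXT PRIMARY KEY, category TEXT)")

rows = [("sql", "db"), ("python", "lang"), ("sql", "db")]  # note the duplicate key
# One executemany call instead of one round trip per row;
# ON CONFLICT DO NOTHING silently skips the duplicate.
conn.executemany(
    "INSERT INTO tags (name, category) VALUES (?, ?) ON CONFLICT (name) DO NOTHING",
    rows,
)
print(conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0])  # 2
```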

Parameterized Queries (Application Code)

Python (psycopg2/asyncpg) -- ALWAYS use parameterized queries

```python
cursor.execute(
    "SELECT id, email FROM users WHERE email = %s AND is_active = %s",
    (email, True),
)
```

Node.js (pg)

```javascript
const result = await pool.query(
  'SELECT id, email FROM users WHERE email = $1 AND is_active = $2',
  [email, true]
);
```

Go (database/sql)

```go
row := db.QueryRowContext(ctx,
    "SELECT id, email FROM users WHERE email = $1 AND is_active = $2",
    email, true,
)
```

Performance

EXPLAIN ANALYZE

Always use EXPLAIN ANALYZE to validate query plans before deploying.

```sql
-- Check execution plan and actual timing
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at >= '2025-01-01'
GROUP BY u.id, u.email
ORDER BY order_count DESC
LIMIT 20;
```

What to look for:

  • Seq Scan on large tables -- usually needs an index

  • Nested Loop with large outer sets -- consider Hash Join via index or rewrite

  • Sort operations with high cost -- add index matching ORDER BY

  • Rows Removed by Filter much larger than Actual Rows -- index the filter column

  • Buffers: shared hit vs shared read -- high reads indicate cold cache
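EXPLAIN ANALYZE is PostgreSQL syntax, but the scan-vs-index distinction in the first bullet is visible in any engine. A sqlite3 sketch using its analogous EXPLAIN QUERY PLAN command (the exact plan wording is engine-specific, so the comments only show typical output):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
    # the detail column holds text like "SCAN users".
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[3]

q = "SELECT id FROM users WHERE email = 'a@example.com'"
before = plan(q)  # no index on email yet: a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(q)   # now an index search

print(before)  # e.g. "SCAN users"
print(after)   # e.g. "SEARCH users USING ... idx_users_email (email=?)"
```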

Index Strategy

| Access Pattern | Index Type | Example |
| --- | --- | --- |
| Equality lookup | B-tree (default) | WHERE email = $1 |
| Range scan | B-tree | WHERE created_at > $1 |
| Full-text search | GIN + tsvector | WHERE search_vector @@ to_tsquery($1) |
| JSONB field access | GIN | WHERE metadata @> '{"key": "val"}' |
| Array containment | GIN | WHERE tags @> ARRAY['sql'] |
| Geometric/spatial | GiST | WHERE location <-> point($1,$2) < $3 |
| Pattern matching (LIKE 'abc%') | B-tree with text_pattern_ops | WHERE name LIKE 'abc%' |

N+1 Prevention

```sql
-- BAD: N+1 (one query per user to fetch orders)
-- Application loops: for each user, SELECT orders WHERE user_id = ?

-- GOOD: single query with JOIN
SELECT u.id, u.email, o.id AS order_id, o.total_cents
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.is_active = true
ORDER BY u.id, o.created_at DESC;

-- GOOD: batch with ANY (when you have a list of IDs)
SELECT id, email, created_at
FROM users
WHERE id = ANY($1::bigint[]);
```
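ANY($1::bigint[]) is PostgreSQL-specific. Drivers without array binds (sqlite3, for example) can still batch safely by generating one ? per id: the values stay bound, only the placeholder count is interpolated into the SQL text. A minimal sketch with illustrative data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(1, "a@x.com"), (2, "b@x.com"), (3, "c@x.com")],
)

def fetch_users(conn, ids):
    # Build "?, ?, ..." -- one placeholder per id; user data is never
    # concatenated into the statement.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT id, email FROM users WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, list(ids)).fetchall()

print(fetch_users(conn, [1, 3]))  # [(1, 'a@x.com'), (3, 'c@x.com')]
```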

Pagination

```sql
-- Cursor-based pagination (preferred for large datasets)
SELECT id, email, created_at
FROM users
WHERE created_at < $1  -- cursor: last seen created_at
  AND is_active = true
ORDER BY created_at DESC
LIMIT 25;

-- Offset-based pagination (simpler but slower for deep pages)
SELECT id, email, created_at
FROM users
WHERE is_active = true
ORDER BY created_at DESC
LIMIT 25 OFFSET $1;
```
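A minimal keyset-pagination loop in Python (sqlite3 as a stand-in; paginating on id rather than created_at to keep the sketch self-contained). Each page is an indexed range scan regardless of how deep you page:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(i, f"u{i}@x.com") for i in range(1, 8)],
)

def page(conn, cursor_id=None, limit=3):
    # The cursor is the last id seen on the previous page.
    if cursor_id is None:
        sql, args = "SELECT id FROM users ORDER BY id DESC LIMIT ?", (limit,)
    else:
        sql = "SELECT id FROM users WHERE id < ? ORDER BY id DESC LIMIT ?"
        args = (cursor_id, limit)
    return [r[0] for r in conn.execute(sql, args)]

p1 = page(conn)          # [7, 6, 5]
p2 = page(conn, p1[-1])  # [4, 3, 2]
p3 = page(conn, p2[-1])  # [1]
print(p1, p2, p3)
```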

Tooling

Essential Commands

PostgreSQL CLI

```sh
psql -h localhost -U myuser -d mydb   # Connect
psql -f migration.sql mydb            # Run migration file
pg_dump -Fc mydb > backup.dump        # Backup (custom format)
pg_restore -d mydb backup.dump        # Restore
```

pgcli (enhanced CLI with autocomplete)

```sh
pgcli -h localhost -U myuser -d mydb
```

Schema inspection

```
\dt             # List tables
\d+ table_name  # Describe table with details
\di             # List indexes
\df             # List functions
```

Migration Tools

golang-migrate

```sh
migrate create -ext sql -dir db/migrations -seq add_users_phone
migrate -path db/migrations -database "$DATABASE_URL" up
migrate -path db/migrations -database "$DATABASE_URL" down 1
```

Alembic (Python/SQLAlchemy)

```sh
alembic revision -m "add users phone"
alembic upgrade head
alembic downgrade -1
```

Prisma (Node.js)

```sh
npx prisma migrate dev --name add_users_phone
npx prisma migrate deploy
```

dbmate

```sh
dbmate new add_users_phone
dbmate up
dbmate rollback
```

Useful Diagnostic Queries (PostgreSQL)

```sql
-- Find slow queries (requires pg_stat_statements extension)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Find unused indexes
SELECT schemaname, relname AS tablename, indexrelname AS indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%pkey%'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Table sizes
SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

-- Active connections and queries
SELECT pid, state, query, query_start, now() - query_start AS duration
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
```

References

For detailed patterns and examples, see:

  • references/patterns.md -- Window functions, CTE patterns, indexing strategies, advanced query techniques

External References

  • PostgreSQL Documentation

  • Use The Index, Luke -- SQL indexing and tuning

  • Modern SQL -- Modern SQL features across databases

  • pgMustard EXPLAIN Guide -- Reading query plans

  • SQL Style Guide (Holywell) -- Formatting conventions

  • Postgres Wiki: Don't Do This -- Common PostgreSQL anti-patterns
