Database Migrations Core Knowledge

Deep Knowledge: Use mcp__documentation__fetch_docs with technology: migrations for comprehensive documentation.

Migration Fundamentals

What is a Migration?

A migration is a version-controlled change to your database schema or data.

Migration Types

Type Description Example

Schema DDL changes Add column, create index

Data DML changes Backfill data, transform values

Combined Both schema and data Add column with default, populate

Version Naming Conventions

Timestamp-based (Recommended)

V20240115103000__create_users_table.sql V20240115104500__add_email_index.sql V20240116090000__add_status_column.sql

Sequential

V001__create_users_table.sql V002__add_email_index.sql V003__add_status_column.sql

Semantic

V1.0.0__initial_schema.sql V1.1.0__add_orders_table.sql V1.1.1__fix_orders_constraint.sql

Migration File Structure

Flyway Format

-- V20240115103000__create_users_table.sql

CREATE TABLE users ( id SERIAL PRIMARY KEY, email VARCHAR(255) NOT NULL UNIQUE, name VARCHAR(100) NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP );

CREATE INDEX idx_users_email ON users(email);

With Rollback (Flyway Pro/Enterprise)

-- V20240115103000__create_users_table.sql CREATE TABLE users (...);

-- U20240115103000__create_users_table.sql (undo) DROP TABLE IF EXISTS users;

Liquibase Format

<databaseChangeLog> <changeSet id="1" author="dev"> <createTable tableName="users"> <column name="id" type="int" autoIncrement="true"> <constraints primaryKey="true"/> </column> <column name="email" type="varchar(255)"> <constraints nullable="false" unique="true"/> </column> </createTable> <rollback> <dropTable tableName="users"/> </rollback> </changeSet> </databaseChangeLog>

Migration Strategies

Expand-Contract Pattern

For backward-compatible changes:

Phase 1: EXPAND ├── Add new column (nullable or with default) ├── Add new table ├── Deploy new code that writes to both old and new └── Backfill existing data

Phase 2: CONTRACT ├── Remove old column usage from code ├── Make new column non-nullable if needed ├── Drop old column └── Deploy final code

Example - Renaming a column:

-- Phase 1: Expand ALTER TABLE users ADD COLUMN full_name VARCHAR(200); UPDATE users SET full_name = name; -- Deploy code that reads from both, writes to both

-- Phase 2: Contract (after verification) ALTER TABLE users DROP COLUMN name;

Blue-Green Deployment

┌─────────────┐ ┌─────────────┐ │ Blue │ │ Green │ │ (Current) │ │ (New) │ └──────┬──────┘ └──────┬──────┘ │ │ └───────┬───────────┘ │ ┌──────┴──────┐ │ Database │ │ (Shared) │ └─────────────┘

Green environment runs migrations
Test Green with new schema
Switch traffic Blue → Green
Blue becomes standby

Rolling Updates

Apply backward-compatible migration
Update servers one by one
Old code continues working
After all servers updated, remove old code paths
Apply cleanup migration

Zero-Downtime Patterns

Add Column (Safe)

-- Safe: Column added as nullable ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Safe: Column added with default ALTER TABLE users ADD COLUMN status VARCHAR(20) DEFAULT 'active';

-- PostgreSQL 11+: Fast default ALTER TABLE users ADD COLUMN created_at TIMESTAMP DEFAULT NOW();

Add Non-Nullable Column

-- Step 1: Add nullable ALTER TABLE users ADD COLUMN email_verified BOOLEAN;

-- Step 2: Backfill UPDATE users SET email_verified = FALSE WHERE email_verified IS NULL;

-- Step 3: Add constraint ALTER TABLE users ALTER COLUMN email_verified SET NOT NULL;

Rename Column

-- Step 1: Add new column ALTER TABLE users ADD COLUMN full_name VARCHAR(200);

-- Step 2: Copy data UPDATE users SET full_name = name;

-- Step 3: Add trigger for sync (during transition) CREATE TRIGGER sync_name BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE FUNCTION sync_name_columns();

-- Step 4: Update application to use new column

-- Step 5: Remove old column ALTER TABLE users DROP COLUMN name; DROP TRIGGER sync_name ON users;

Add Index (Non-Blocking)

-- PostgreSQL: CONCURRENTLY CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

-- MySQL: ALGORITHM=INPLACE, LOCK=NONE ALTER TABLE users ADD INDEX idx_email (email), ALGORITHM=INPLACE, LOCK=NONE;

-- SQL Server: ONLINE CREATE INDEX idx_users_email ON users(email) WITH (ONLINE = ON);

Drop Column (Safe)

-- Step 1: Stop writing to column (application change) -- Step 2: Deploy application -- Step 3: Drop column ALTER TABLE users DROP COLUMN deprecated_field;

Rename Table

-- Step 1: Create view with old name pointing to new table ALTER TABLE orders RENAME TO order_records; CREATE VIEW orders AS SELECT * FROM order_records;

-- Step 2: Update application to use new name -- Step 3: Drop view DROP VIEW orders;

Data Migration Patterns

Batch Processing

-- Process in batches to avoid locking DO $$ DECLARE batch_size INT := 1000; affected INT := 1; BEGIN WHILE affected > 0 LOOP UPDATE users SET status = 'active' WHERE id IN ( SELECT id FROM users WHERE status IS NULL LIMIT batch_size FOR UPDATE SKIP LOCKED ); GET DIAGNOSTICS affected = ROW_COUNT; COMMIT; PERFORM pg_sleep(0.1); -- Small delay END LOOP; END $$;

Background Job Migration

Instead of SQL, use application code

def migrate_user_status(): batch_size = 1000 offset = 0

while True:
    users = User.query.filter(User.status == None) \
                     .limit(batch_size).all()
    if not users:
        break

    for user in users:
        user.status = calculate_status(user)

    db.session.commit()
    time.sleep(0.1)  # Rate limiting

ETL Pattern

-- 1. Create new table with desired structure CREATE TABLE users_new ( id SERIAL PRIMARY KEY, email VARCHAR(255) NOT NULL, full_name VARCHAR(200) NOT NULL, -- Combined from first_name, last_name created_at TIMESTAMP DEFAULT NOW() );

-- 2. Copy and transform data INSERT INTO users_new (id, email, full_name, created_at) SELECT id, email, first_name || ' ' || last_name, created_at FROM users;

-- 3. Swap tables ALTER TABLE users RENAME TO users_old; ALTER TABLE users_new RENAME TO users;

-- 4. Update sequences SELECT setval('users_id_seq', (SELECT MAX(id) FROM users));

-- 5. Drop old table (after verification) DROP TABLE users_old;

Rollback Strategies

Immediate Rollback Script

-- migration.sql ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- rollback.sql ALTER TABLE users DROP COLUMN phone;

Point-in-Time Recovery

PostgreSQL

pg_restore --target-time="2024-01-15 10:00:00" -d mydb backup.dump

MySQL

mysqlbinlog --stop-datetime="2024-01-15 10:00:00" binlog.000001 | mysql

Forward-Fix (Preferred)

Instead of rollback, deploy a fix:

-- Original migration had bug -- V2: Create new migration to fix ALTER TABLE users ALTER COLUMN status SET DEFAULT 'pending'; -- Fix the default

Migration Testing

Pre-deployment Checklist

Migration tested on copy of production data
Rollback script tested
Application compatible with both old and new schema
Index creation time estimated
Lock duration estimated
Disk space requirements checked
Backup taken before migration

Test Environment Setup

Create production copy

pg_dump production_db | psql test_db

Run migration

flyway -url=jdbc:postgresql://localhost/test_db migrate

Run application tests

npm test

Verify schema

pg_dump --schema-only test_db > schema.sql diff schema.sql expected_schema.sql

Best Practices

Use version control for migrations
Test migrations on production-like data
Make migrations idempotent when possible
Document complex migrations
Keep migrations small and focused
Use expand-contract for breaking changes
Create indexes concurrently

DON'T

Mix schema and data changes
Run migrations during peak hours
Delete migration files
Edit applied migrations
Skip testing rollbacks
Make assumptions about data

Common Pitfalls

Lock Contention

-- Problem: Long-running transaction holds lock BEGIN; ALTER TABLE users ADD COLUMN x INT; -- ... long running queries ... COMMIT;

-- Solution: Keep transaction short ALTER TABLE users ADD COLUMN x INT;

Missing Index

-- Problem: Query becomes slow after adding data ALTER TABLE users ADD COLUMN status VARCHAR(20);

-- Solution: Add index in same migration ALTER TABLE users ADD COLUMN status VARCHAR(20); CREATE INDEX CONCURRENTLY idx_users_status ON users(status);

Constraint Violations

-- Problem: Existing data violates new constraint ALTER TABLE users ADD CONSTRAINT chk_email CHECK (email LIKE '%@%'); -- Fails if bad data exists

-- Solution: Clean data first UPDATE users SET email = 'invalid@example.com' WHERE email NOT LIKE '%@%'; ALTER TABLE users ADD CONSTRAINT chk_email CHECK (email LIKE '%@%');

When NOT to Use This Skill

Flyway specifics - Use flyway skill for Flyway commands and patterns
Prisma migrations - Use prisma skill for Prisma migrate
TypeORM migrations - Use typeorm skill for TypeORM migrations
Liquibase - Use specific Liquibase documentation

Anti-Patterns

Anti-Pattern Problem Solution

Mixing schema and data changes Hard to rollback Separate into different migrations

No rollback script Can't undo changes Always create undo migration

Long-running migrations Locks tables, downtime Use online DDL, batch processing

Editing applied migrations Version conflicts, checksum errors Create new migration

Missing backup Data loss risk Always backup before migrating

No testing on prod-like data Unexpected failures Test with production data copy

Quick Troubleshooting

Problem Diagnostic Fix

Migration fails mid-run Check transaction support Use smaller batches, manual fix

Lock timeout Check running queries Run during low traffic

Version conflicts Check migration history table Resolve conflicts, rebase

Checksum mismatch Compare file with history Repair or recreate migration

Out of disk space Check table sizes Clean old data first

Reference Documentation

Migration Strategies
Schema Versioning
Rollback Patterns
Data Migration

migrations

Safety Notice

Copy this and send it to your AI assistant to learn

Instead of SQL, use application code

PostgreSQL

MySQL

Create production copy

Run migration

Run application tests

Verify schema

Source Transparency

Related Skills

cron-scheduling

token-optimization

react-19