Data Retention & Archiving Planner

Manage data lifecycle with automated retention and archiving.

Retention Policy Document

Data Retention Policy

Retention Periods

Data Type	Hot Storage	Cold Storage	Total Retention	Reason
User accounts	Active	N/A	Indefinite	Business need
Order history	2 years	5 years	7 years	Tax compliance
Logs	30 days	90 days	120 days	Operational
Analytics events	90 days	1 year	15 months	Business insights
Audit trails	1 year	6 years	7 years	Legal compliance
User sessions	30 days	None	30 days	Security
Failed login attempts	90 days	None	90 days	Security
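
The retention periods above can also be kept as typed configuration so that archival and cleanup jobs read them from one place. The following is a minimal sketch, not part of the original planner: the names `RetentionRule`, `RETENTION_RULES`, and `storageTier` are hypothetical, only three rows of the table are shown, and one year is approximated as 365 days.

```typescript
// Hypothetical typed form of the retention table; all names are illustrative.
type Tier = "hot" | "cold" | "expired";

interface RetentionRule {
  hotDays: number; // days kept in the hot database
  coldDays: number; // additional days kept in cold storage (0 = none)
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Assumption: 1 year is treated as 365 days.
const RETENTION_RULES: Record<string, RetentionRule> = {
  orders: { hotDays: 2 * 365, coldDays: 5 * 365 },
  logs: { hotDays: 30, coldDays: 90 },
  sessions: { hotDays: 30, coldDays: 0 },
};

// Decide which tier a record created at `createdAt` belongs to at time `now`.
function storageTier(dataType: string, createdAt: Date, now: Date): Tier {
  const rule = RETENTION_RULES[dataType];
  if (!rule) {
    throw new Error(`No retention rule for "${dataType}"`);
  }
  const ageDays = (now.getTime() - createdAt.getTime()) / DAY_MS;
  if (ageDays < rule.hotDays) return "hot";
  if (ageDays < rule.hotDays + rule.coldDays) return "cold";
  return "expired";
}
```

A job could call `storageTier("logs", row.createdAt, new Date())` and act on the result; data types with indefinite retention, such as user accounts, would need a separate representation.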

Compliance Requirements

GDPR (EU)

Right to erasure (right to be forgotten)
Data minimization
Storage limitation

HIPAA (Healthcare)

Minimum 6 years retention for required documentation (policies, procedures, and related records)
Secure archival required

SOX (Financial)

7 years retention for financial records
Immutable audit trails

PCI DSS (Payments)

12 months minimum retention for audit logs
Most recent 3 months of audit logs immediately available for analysis

Archive Schema Design

-- Hot database: current active data.
-- The table is range-partitioned by created_at so that the partitions below are valid.
-- PostgreSQL requires the partition key to be part of the primary key.
CREATE TABLE orders (
  id BIGSERIAL,
  user_id BIGINT NOT NULL,
  total DECIMAL(10,2) NOT NULL,
  status TEXT NOT NULL,
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- Cold database: archived historical data
CREATE TABLE orders_archive (
  id BIGINT PRIMARY KEY,
  user_id BIGINT NOT NULL,
  total DECIMAL(10,2) NOT NULL,
  status TEXT NOT NULL,
  created_at TIMESTAMP NOT NULL,
  updated_at TIMESTAMP NOT NULL,
  archived_at TIMESTAMP NOT NULL DEFAULT NOW()
);

-- Create partitions for time-based archival
CREATE TABLE orders_2024_q1 PARTITION OF orders
  FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

CREATE TABLE orders_2024_q2 PARTITION OF orders
  FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');
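
Quarterly partitions like the two above have to exist before rows for that quarter arrive, so generating the statement tends to be less error-prone than writing each one by hand. The following sketch assumes a hypothetical helper named `quarterPartitionSql` and the `orders_<year>_q<quarter>` naming used above; it only builds the SQL string and does not run it.

```typescript
// Build the CREATE TABLE statement for one quarterly partition of `orders`.
// Hypothetical helper; the naming scheme follows the partitions shown above.
function quarterPartitionSql(year: number, quarter: 1 | 2 | 3 | 4): string {
  const startMonth = (quarter - 1) * 3; // 0-based month index
  const from = new Date(Date.UTC(year, startMonth, 1));
  // Month 12 rolls over to January of the next year for Q4.
  const to = new Date(Date.UTC(year, startMonth + 3, 1));
  const fmt = (d: Date): string => d.toISOString().slice(0, 10); // YYYY-MM-DD
  return (
    `CREATE TABLE IF NOT EXISTS orders_${year}_q${quarter} PARTITION OF orders ` +
    `FOR VALUES FROM ('${fmt(from)}') TO ('${fmt(to)}');`
  );
}
```

The upper bound is exclusive in PostgreSQL range partitions, which is why each quarter ends on the first day of the next one.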

Archival Job Implementation

// jobs/archive-orders.ts
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();
const archivePrisma = new PrismaClient({
  datasources: {
    db: {
      url: process.env.ARCHIVE_DATABASE_URL,
    },
  },
});

interface ArchivalJob {
  table: string;
  retentionDays: number;
  batchSize: number;
}

async function archiveOrders() {
  const cutoffDate = new Date();
  cutoffDate.setDate(cutoffDate.getDate() - 730); // 2 years

  console.log(`📦 Archiving orders older than ${cutoffDate.toISOString()}`);

  let archived = 0;
  let hasMore = true;

  while (hasMore) {
    await prisma.$transaction(async (tx) => {
      // Find orders to archive
      const ordersToArchive = await tx.order.findMany({
        where: {
          created_at: { lt: cutoffDate },
          status: { in: ["delivered", "cancelled"] },
        },
        take: 1000,
      });

      if (ordersToArchive.length === 0) {
        hasMore = false;
        return;
      }

      // Copy to archive database. This write is not part of the hot-database
      // transaction, so skipDuplicates keeps a retried batch idempotent.
      await archivePrisma.order.createMany({
        data: ordersToArchive.map((order) => ({
          ...order,
          archived_at: new Date(),
        })),
        skipDuplicates: true,
      });

      // Delete from hot database
      await tx.order.deleteMany({
        where: {
          id: { in: ordersToArchive.map((o) => o.id) },
        },
      });

      archived += ordersToArchive.length;
      console.log(`  Archived ${archived} orders...`);
    });

    // Rate limiting
    await new Promise((resolve) => setTimeout(resolve, 100));
  }

  console.log(`✅ Archived ${archived} orders total`);
}

// Schedule: Run nightly
archiveOrders().catch((error) => {
  console.error(error);
  process.exit(1);
});
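
The archival job above follows a copy-then-delete loop: read a batch, write it to the archive, delete it from the hot store, and repeat until nothing is left. The sketch below isolates that loop behind two small interfaces so the ordering can be tested without a database. `ArchiveSource`, `ArchiveSink`, and `archiveInBatches` are hypothetical names, not Prisma APIs.

```typescript
// Hypothetical storage interfaces; a real job would back them with Prisma calls.
interface Row {
  id: number;
  createdAt: Date;
}

interface ArchiveSource<T extends Row> {
  findOlderThan(cutoff: Date, limit: number): Promise<T[]>;
  deleteByIds(ids: number[]): Promise<void>;
}

interface ArchiveSink<T extends Row> {
  insertIgnoringDuplicates(rows: T[]): Promise<void>;
}

// Move rows older than `cutoff` from source to sink, `batchSize` rows at a time.
// Returns the number of rows archived.
async function archiveInBatches<T extends Row>(
  source: ArchiveSource<T>,
  sink: ArchiveSink<T>,
  cutoff: Date,
  batchSize: number
): Promise<number> {
  let archived = 0;
  for (;;) {
    const batch = await source.findOlderThan(cutoff, batchSize);
    if (batch.length === 0) {
      return archived;
    }
    // Copy first and delete second: a crash between the two steps leaves a
    // duplicate that the next run skips, rather than a lost row.
    await sink.insertIgnoringDuplicates(batch);
    await source.deleteByIds(batch.map((row) => row.id));
    archived += batch.length;
  }
}
```

The loop terminates only if `deleteByIds` actually removes the rows that `findOlderThan` returned, so an implementation should treat a failed delete as an error and not continue.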

Automated Cleanup Jobs

// jobs/cleanup-old-data.ts
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

interface CleanupJob {
  table: string;
  column: string;
  retentionDays: number;
}

const CLEANUP_JOBS: CleanupJob[] = [
  { table: "sessions", column: "created_at", retentionDays: 30 },
  { table: "password_reset_tokens", column: "created_at", retentionDays: 1 },
  { table: "failed_login_attempts", column: "attempted_at", retentionDays: 90 },
  { table: "analytics_events", column: "created_at", retentionDays: 90 },
];

async function runCleanupJobs() {
  console.log("🗑️ Running cleanup jobs...\n");

  for (const job of CLEANUP_JOBS) {
    const cutoffDate = new Date();
    cutoffDate.setDate(cutoffDate.getDate() - job.retentionDays);

    // Table and column names come only from the hard-coded list above;
    // never interpolate user input into $executeRawUnsafe.
    const result = await prisma.$executeRawUnsafe(
      `
      DELETE FROM "${job.table}"
      WHERE "${job.column}" < $1
    `,
      cutoffDate
    );

    console.log(
      `✅ ${job.table}: Deleted ${result} rows older than ${job.retentionDays} days`
    );
  }

  console.log("\n✅ Cleanup complete!");
}
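
The cleanup job above interpolates table and column names into raw SQL, which is safe only while those names come from trusted constants. One way to make that assumption explicit is to validate identifiers before building the statement. This is a minimal sketch with hypothetical names (`SAFE_IDENTIFIER`, `buildCleanupSql`); the accepted pattern (lowercase letters, digits, underscores) is an assumption that matches the table names used in this document.

```typescript
// Accept only simple lowercase snake_case identifiers (assumed naming convention).
const SAFE_IDENTIFIER = /^[a-z_][a-z0-9_]*$/;

// Build the parameterized DELETE statement for one cleanup job.
// Throws if either identifier contains anything outside the allowed pattern.
function buildCleanupSql(table: string, column: string): string {
  for (const name of [table, column]) {
    if (!SAFE_IDENTIFIER.test(name)) {
      throw new Error(`Unsafe SQL identifier: ${JSON.stringify(name)}`);
    }
  }
  // The cutoff date stays a bound parameter ($1); only identifiers are inlined.
  return `DELETE FROM "${table}" WHERE "${column}" < $1`;
}
```

The resulting string could be passed to `prisma.$executeRawUnsafe` together with the cutoff date, as in the job above.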

Soft Delete Pattern

// Soft delete for GDPR compliance
model User {
  id        Int       @id @default(autoincrement())
  email     String    @unique
  name      String
  deletedAt DateTime? // NULL = active, NOT NULL = deleted
  createdAt DateTime  @default(now())
  updatedAt DateTime  @updatedAt

  @@index([deletedAt])
}

// Middleware to filter soft-deleted records.
// Restricted to the User model, since other models have no deletedAt column.
// Note: $use is deprecated in newer Prisma releases in favor of client extensions.
prisma.$use(async (params, next) => {
  if (
    params.model === "User" &&
    (params.action === "findMany" || params.action === "findFirst")
  ) {
    params.args = params.args ?? {};
    params.args.where = {
      ...params.args.where,
      deletedAt: null, // Only show non-deleted
    };
  }
  return next(params);
});

// Hard delete after retention period
async function purgeDeletedUsers() {
  const cutoffDate = new Date();
  cutoffDate.setDate(cutoffDate.getDate() - 90); // 90 days retention

  const result = await prisma.user.deleteMany({
    where: {
      deletedAt: { lt: cutoffDate },
    },
  });

  console.log(`🗑️ Purged ${result.count} deleted users`);
}
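
The middleware above adds `deletedAt: null` to every matching query, which also overrides a caller that deliberately asks for deleted rows (an admin view, for example). A possible refinement, sketched here with the hypothetical helper `withSoftDeleteFilter`, applies the default only when the caller has not specified `deletedAt` at all.

```typescript
type Where = Record<string, unknown>;

// Add the default soft-delete filter unless the caller already set `deletedAt`.
function withSoftDeleteFilter(where: Where | undefined): Where {
  if (where && "deletedAt" in where) {
    return where; // caller opted in to an explicit deletedAt condition
  }
  return { ...where, deletedAt: null };
}
```

Inside the middleware, the assignment would become `params.args.where = withSoftDeleteFilter(params.args.where)`.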

Cold Storage Migration

#!/bin/bash
# scripts/migrate-to-s3.sh
# Dump old orders to S3 for cold storage

# Stop on the first failed command so a broken export or upload never reaches the delete step
set -euo pipefail

CUTOFF_DATE="2022-01-01"

echo "📦 Migrating orders to S3..."

# 1. Export to CSV (the \copy meta-command must stay on a single line)
psql "$DATABASE_URL" -c "\copy (SELECT * FROM orders WHERE created_at < '$CUTOFF_DATE') TO STDOUT WITH CSV HEADER" \
  | gzip > orders_archive.csv.gz

# 2. Upload to S3
aws s3 cp orders_archive.csv.gz s3://my-cold-storage/orders/

# 3. Verify upload
if aws s3 ls s3://my-cold-storage/orders/orders_archive.csv.gz; then
  echo "✅ Uploaded to S3"

  # 4. Delete from database
  psql "$DATABASE_URL" -c "DELETE FROM orders WHERE created_at < '$CUTOFF_DATE'"

  echo "✅ Deleted from database"
else
  echo "❌ S3 upload failed, skipping deletion"
  exit 1
fi

Compliance Automation

// Right to be forgotten (GDPR)
async function deleteUserData(userId: number) {
  console.log(`🗑️ Deleting user data for user ${userId}...`);

  await prisma.$transaction(async (tx) => {
    // 1. Anonymize orders (keep for business records).
    //    Assumes the order's userId column is nullable.
    await tx.order.updateMany({
      where: { userId },
      data: {
        userId: null,
        shippingAddress: "[DELETED]",
        billingAddress: "[DELETED]",
      },
    });

    // 2. Delete personal data.
    //    deleteMany does not throw when the user has no profile row.
    await tx.userProfile.deleteMany({ where: { userId } });
    await tx.paymentMethod.deleteMany({ where: { userId } });
    await tx.address.deleteMany({ where: { userId } });

    // 3. Soft delete user account
    await tx.user.update({
      where: { id: userId },
      data: {
        email: `deleted-${userId}@example.com`,
        name: "[DELETED]",
        deletedAt: new Date(),
      },
    });
  });

  console.log(`✅ User data deleted`);
}
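
The anonymization step above replaces identifying fields with placeholders. Keeping that mapping in a pure function makes it possible to unit-test the erasure rules without a database. This is a sketch under assumptions: `UserRecord` and `anonymizeUser` are hypothetical names, and the field set mirrors the `User` model shown earlier in this document.

```typescript
// Hypothetical shape mirroring the fields touched by the erasure routine.
interface UserRecord {
  id: number;
  email: string;
  name: string;
  deletedAt: Date | null;
}

// Return a copy of the user with personal fields replaced by placeholders.
// The input object is not mutated.
function anonymizeUser(user: UserRecord, now: Date): UserRecord {
  return {
    ...user,
    email: `deleted-${user.id}@example.com`, // keeps the unique constraint satisfied
    name: "[DELETED]",
    deletedAt: now,
  };
}
```

The result could be passed as the `data` argument of the `tx.user.update` call, minus the `id` field.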

Monitoring & Alerting

// Monitor archive job health
// (sendAlert is assumed to be provided by your alerting integration)
async function checkArchivalHealth() {
  // Check oldest active order
  const oldestOrder = await prisma.order.findFirst({
    orderBy: { created_at: "asc" },
  });

  // findFirst returns null when the table is empty
  if (oldestOrder) {
    const age = Date.now() - oldestOrder.created_at.getTime();
    const ageDays = age / (1000 * 60 * 60 * 24);

    if (ageDays > 750) {
      // > 2 years + buffer
      console.error("⚠️ Orders older than retention period found!");
      await sendAlert({
        title: "Archive job failing",
        message: `Oldest order is ${ageDays.toFixed(0)} days old`,
      });
    }
  }

  // Check archive database size
  const archiveCount = await archivePrisma.order.count();
  console.log(`📊 Archive database: ${archiveCount} orders`);

  // Check hot database size
  const hotCount = await prisma.order.count();
  console.log(`📊 Hot database: ${hotCount} orders`);
}
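
The age check in the health monitor can be separated from both the database query and the alert transport, which makes the threshold easy to test. The sketch below uses hypothetical names (`HealthAlert`, `evaluateOrderAge`) and takes the 750-day threshold from the check above as its default.

```typescript
interface HealthAlert {
  title: string;
  message: string;
}

const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Return an alert when the oldest hot order exceeds `maxAgeDays`, otherwise null.
// `oldestCreatedAt` is null when the hot table is empty.
function evaluateOrderAge(
  oldestCreatedAt: Date | null,
  now: Date,
  maxAgeDays: number = 750
): HealthAlert | null {
  if (oldestCreatedAt === null) {
    return null;
  }
  const ageDays = (now.getTime() - oldestCreatedAt.getTime()) / MS_PER_DAY;
  if (ageDays <= maxAgeDays) {
    return null;
  }
  return {
    title: "Archive job failing",
    message: `Oldest order is ${ageDays.toFixed(0)} days old`,
  };
}
```

A caller would query the oldest order, pass its timestamp in, and forward any non-null result to whatever alerting function is in use.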

Restore from Archive

// Restore archived order (e.g., for audit)
async function restoreArchivedOrder(orderId: number) {
  // Find in archive
  const archivedOrder = await archivePrisma.order.findUnique({
    where: { id: orderId },
  });

  if (!archivedOrder) {
    throw new Error("Order not found in archive");
  }

  // Copy to hot database, dropping the archive-only column
  const { archived_at, ...order } = archivedOrder;
  await prisma.order.create({ data: order });

  console.log(`✅ Restored order ${orderId} from archive`);
}

Schedule Configuration

# cron schedule for archival jobs
jobs:
  archive-orders:
    schedule: "0 2 * * *" # 2 AM daily
    command: "npm run job:archive-orders"

  cleanup-sessions:
    schedule: "0 3 * * *" # 3 AM daily
    command: "npm run job:cleanup-sessions"

  purge-deleted-users:
    schedule: "0 4 * * 0" # 4 AM Sunday
    command: "npm run job:purge-deleted"

  health-check:
    schedule: "0 */6 * * *" # Every 6 hours
    command: "npm run job:check-archival-health"

Best Practices

Define clear policies: Document retention periods
Automate everything: Manual cleanup is unreliable
Test restore: Regularly test archive restoration
Monitor job health: Alert on failures
Compliance first: Meet legal requirements
Soft delete first: Hard-delete only after the retention window has passed
Batch operations: Process rows in small batches to avoid long-held locks

Output Checklist

Retention policy documented
Archive schema designed
Archival jobs implemented
Cleanup jobs automated
Soft delete pattern (if applicable)
Cold storage migration
GDPR compliance (right to be forgotten)
Job scheduling configured
Monitoring and alerting
Restore procedure tested
