Storage

Choose and architect storage systems for applications with the right tradeoffs.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "Storage" with this command: npx skills add ivangdavila/storage

Object vs Block vs File

  • Object storage (S3, R2, GCS) for immutable blobs: images, videos, backups, logs — cheap, scales infinitely, but no partial updates
  • Block storage (EBS, Persistent Disks) for databases and apps needing filesystem semantics — faster, but tied to single instance
  • Network file systems (NFS, EFS) when multiple instances need shared filesystem access — convenient but latency and cost add up
  • Default to object storage for user uploads — block storage for database files only

When SQL vs NoSQL

  • SQL when you need joins, transactions, or complex queries — fighting against NoSQL for relational data wastes months
  • Document stores (MongoDB, Firestore) for nested/variable schemas where you always fetch the whole document
  • Key-value (Redis, DynamoDB) for simple lookups by ID at massive scale — not for complex queries
  • Time-series databases (InfluxDB, TimescaleDB) for metrics with timestamp-based queries — regular SQL struggles with retention policies
  • Start with PostgreSQL unless you have a specific reason not to — it handles JSON, full-text search, and scales further than most assume

Local vs Cloud Storage

  • Local disk for ephemeral data: temp files, build artifacts, caches — assume it disappears on restart
  • Cloud storage for anything that must survive instance termination — never store user data only on local disk
  • Local SSD for databases in production — network-attached storage adds latency to every query
  • Hybrid: local cache in front of cloud storage for frequently accessed files

CDN Patterns

  • Put CDN in front of static assets always — origin requests are slower and more expensive
  • Set long cache TTLs with versioned URLs (style.abc123.css) — cache invalidation is slow and unreliable
  • CDN for dynamic content only if latency matters more than freshness — adds complexity for marginal gains
  • Edge caching for API responses works but cache keys get tricky — start simple, add only when needed

Upload Handling

  • Never accept uploads directly to app server disk in production — use presigned URLs to cloud storage
  • Set file size limits at load balancer level, not just application — prevents memory exhaustion attacks
  • Generate unique keys for uploads (UUIDs) — user-provided filenames cause collisions and path traversal risks
  • Validate file types by content (magic bytes), not extension — extensions are trivially spoofed

Data Locality

  • Keep compute and storage in same region — cross-region data transfer adds latency and cost
  • Replicate data to regions where users are, not where developers are
  • Multi-region storage adds complexity — single region with backups elsewhere usually sufficient
  • Database read replicas in user regions for read-heavy workloads

Retention and Lifecycle

  • Define retention policy before storing data — "keep everything" becomes expensive and legally risky
  • Automate deletion of temporary data — manual cleanup never happens consistently
  • Tiered storage for aging data: hot → warm → cold → archive — but check retrieval costs before archiving
  • Separate storage for logs vs business data — different retention, different compliance requirements

Cost Traps

  • Egress fees dominate cloud storage costs — calculate before choosing provider
  • Many small files cost more than few large files — batch small writes when possible
  • Minimum storage duration on cold tiers — early deletion still charges full period
  • API request costs matter at scale — millions of LIST operations add up

Backup Strategy

  • 3-2-1 rule: 3 copies, 2 different media types, 1 offsite — cloud counts as one location
  • Test restores regularly — untested backups are not backups
  • Point-in-time recovery for databases — daily snapshots lose a day of data
  • Version important files — deletion or corruption often discovered late

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Dlazy Kling Audio Clone

Generate customized speech that highly restores the timbre by uploading reference audio using Kling Audio Clone.

Registry SourceRecently Updated
General

Dlazy Keling Sfx

Generate matching scene sound effects based on text descriptions or video frames using Kling SFX.

Registry SourceRecently Updated
General

Dlazy Keling Tts

Convert text into high-quality, emotional speech reading using Kling TTS.

Registry SourceRecently Updated
General

Dlazy Jimeng T2i

Text-to-image generation with Jimeng, quickly converting text to high-quality images.

Registry SourceRecently Updated