Prompt Cache
A lightweight caching layer that avoids regenerating identical content. In production it cut API quota usage by roughly 60% by catching duplicate prompts before they reached the API.
How It Works
- Normalize the prompt (lowercase, collapse whitespace)
- Combine with context keys (user name, language, model)
- SHA-256 hash the combined key
- Check cache table for existing result
- On miss: call API, store result. On hit: return cached result instantly.
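The steps above can be sketched as a single key-derivation function. This is an illustrative sketch: the helper name `cache_key` and the `\x1f` separator are assumptions, not taken from the actual module.

```python
import hashlib

def cache_key(prompt: str, child_name: str, language: str) -> str:
    """Derive a deterministic cache key from the prompt plus context."""
    # Normalize: lowercase and collapse runs of whitespace
    normalized = " ".join(prompt.lower().split())
    # Combine with context keys; a separator avoids ambiguous concatenations
    combined = "\x1f".join([normalized, child_name, language])
    # SHA-256 yields a fixed-length, collision-resistant key
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()
```

Because the prompt is normalized first, `"Tell me a  STORY"` and `"tell me a story"` hash to the same key.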
Usage
```python
import prompt_cache

async def tell_story():
    prompt = "Tell me a story about clouds"
    child_name = "Sophie"
    language = "fr"

    # Check before calling the expensive API
    cached = await prompt_cache.get_cached(
        prompt=prompt,
        child_name=child_name,
        language=language,
    )
    if cached:
        return cached  # Free! No API call needed.

    # Cache miss: call the API
    result = await generate_story(prompt, child_name, language)

    # Store for next time
    await prompt_cache.set_cached(prompt, child_name, language, result)
    return result
```
Schema
```sql
CREATE TABLE IF NOT EXISTS prompt_cache (
    prompt_hash TEXT NOT NULL,
    child_name  TEXT NOT NULL,
    language    TEXT NOT NULL,
    story_json  TEXT,
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (prompt_hash, child_name, language)
);
```
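A minimal lookup/store path against this schema might look like the sketch below. The shipped module is async and only 35 lines; this version uses `sqlite3` synchronously for brevity, and the class name `PromptCache` is an assumption. Note that only the prompt is hashed, since `child_name` and `language` are separate columns in the composite primary key.

```python
import hashlib
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS prompt_cache (
    prompt_hash TEXT NOT NULL,
    child_name  TEXT NOT NULL,
    language    TEXT NOT NULL,
    story_json  TEXT,
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (prompt_hash, child_name, language)
);
"""

def _hash(prompt: str) -> str:
    # Normalize before hashing so trivially different prompts collide on purpose
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class PromptCache:
    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.executescript(SCHEMA)

    def get_cached(self, prompt, child_name, language):
        row = self.db.execute(
            "SELECT story_json FROM prompt_cache "
            "WHERE prompt_hash = ? AND child_name = ? AND language = ?",
            (_hash(prompt), child_name, language),
        ).fetchone()
        return json.loads(row[0]) if row and row[0] else None

    def set_cached(self, prompt, child_name, language, result):
        # INSERT OR REPLACE makes repeated stores for the same key idempotent
        self.db.execute(
            "INSERT OR REPLACE INTO prompt_cache "
            "(prompt_hash, child_name, language, story_json) VALUES (?, ?, ?, ?)",
            (_hash(prompt), child_name, language, json.dumps(result)),
        )
        self.db.commit()
```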
Adapt the Keys
The default implementation uses `(prompt, child_name, language)` as the cache key. Adapt to your domain:
- Chat completions: `(system_prompt, user_message, model)`
- TTS: `(text, voice_id, model_id)`
- Image gen: `(prompt, seed, model, size)`
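One way to support any of these key tuples without rewriting the hashing logic is a variadic helper. This is a sketch, not part of the shipped module; the field values below are purely illustrative.

```python
import hashlib

def make_key(*fields: str) -> str:
    """Hash an arbitrary tuple of context fields into one cache key."""
    # Normalize the first field (the prompt/text); keep the rest verbatim
    normalized = " ".join(fields[0].lower().split())
    # Join with a control character unlikely to appear in real text
    joined = "\x1f".join((normalized, *fields[1:]))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

# Chat completions: (system_prompt, user_message, model)
chat_key = make_key("You are a helpful assistant.", "Summarize this.", "gpt-4o")
# TTS: (text, voice_id, model_id)
tts_key = make_key("Hello there!", "voice_abc", "tts_v2")
```

Any change to any field produces a different key, so a new voice or model never serves a stale cached result.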
Files
`scripts/prompt_cache.py` — Cache implementation (35 lines)