Failure Memory
Record failures. Learn from them. Never repeat them.
Core Concept
Every failure has three parts:
- What happened (error message, symptom)
- Why it happened (root cause)
- How to fix/avoid it (resolution)
This skill stores them in a searchable markdown file and provides a recall mechanism before starting similar tasks.
File Structure
memory/
└── failures.md # All failure records (append-only log)
Recording a Failure
When an error occurs during work, append to memory/failures.md:
## [YYYY-MM-DD HH:mm] <short title>
- **Category:** <build|deploy|config|api|permissions|data|logic|network|dependency>
- **Context:** <what you were trying to do>
- **Error:** `<exact error message or symptom>`
- **Root Cause:** <why it happened>
- **Resolution:** <what fixed it>
- **Prevention:** <how to avoid next time>
- **Tags:** <comma-separated keywords for search>
When to Record
Record AUTOMATICALLY when:
- A shell command exits non-zero and you identify why
- An API call fails and you find the cause
- A config/setup step fails and you resolve it
- You catch yourself repeating a previously-solved mistake
- A sub-agent reports an error with resolution
Do NOT record:
- Transient network timeouts (unless pattern emerges)
- Intentional test failures
- User-cancelled operations
Pre-Task Recall
Before starting any significant task, search failures for relevant history:
grep -i "<keyword>" memory/failures.md
Or use memory_search if vector search is available:
memory_search query="<task description> failure error"
If matches found, mention them briefly:
⚠️ Known pitfall: [title] — [prevention tip]
Failure Report
When asked for a failure report or review, generate a summary:
- Read
memory/failures.md - Group by category
- Identify repeat patterns (same root cause appearing multiple times)
- Suggest systemic fixes for patterns
Report Format
# Failure Report — YYYY-MM-DD
## Stats
- Total: N failures recorded
- Top category: <category> (N occurrences)
- Repeat offenders: N patterns seen 2+ times
## Repeat Patterns
### <pattern name>
- Seen: N times
- Root cause: <shared cause>
- Systemic fix: <recommendation>
## Recent Failures (last 7 days)
- [date] <title> — <resolution>
Initialization
Run scripts/init.sh to set up the failures file:
bash scripts/init.sh [memory_dir]
Default memory_dir: ./memory
Best Practices
- Be specific — "EACCES on /var/run/docker.sock" beats "permission error"
- Include the exact error — Future grep depends on it
- Tag generously — More tags = better recall
- Review monthly — Patterns reveal systemic issues
- Link to fixes — Reference commits, PRs, or config changes when possible