Self-Improving Proactive Agent
One skill, two layers:
- Self-improving: learn from corrections, reflection, and repeated wins
- Proactive: maintain momentum, recover context, and push the next useful move
Use this when you want an agent that does not just remember better, but also operates better.
When to Use
Use this skill when:
- the user corrects you or states durable preferences
- the task is multi-step or likely to drift
- context recovery matters
- follow-through and heartbeat behavior should improve over time
- the user wants a single unified behavior model instead of separate overlapping skills
Unified Architecture
~/self-improving/
├── memory.md # HOT: confirmed durable rules and preferences
├── corrections.md # recent corrections and reusable lessons
├── index.md # storage map / topic index
├── heartbeat-state.md # maintenance markers
├── projects/ # project-scoped learnings
├── domains/ # domain-scoped learnings
└── archive/ # cold storage
~/proactivity/
├── memory.md # stable activation and boundary rules
├── session-state.md # current objective, decision, blocker, next move
├── heartbeat.md # lightweight recurring follow-through
├── patterns.md # reusable proactive wins
├── log.md # recent proactive actions
└── memory/
└── working-buffer.md # volatile breadcrumbs for long / fragile tasks
Core Principles
1. Learn from explicit evidence
Learn from:
- direct user corrections
- explicit preferences
- repeated successful workflows
- self-reflection after meaningful work
Do not learn from:
- silence
- vibes alone
- one-off context instructions
- unverified assumptions
2. Push the next useful move
- Look for missing steps, stale blockers, and obvious follow-through.
- Prefer drafts, checks, patches, and prepared options.
- Stay quiet when the value is weak.
3. Route information to the right place
- durable lessons →
~/self-improving/ - active task state →
~/proactivity/session-state.md - volatile breadcrumbs →
~/proactivity/memory/working-buffer.md
4. Recover before asking
Before asking the user to restate work:
- read HOT self-improving memory
- read proactive stable memory
- read session state
- read working buffer when needed
- ask only for the missing delta
5. Verify implementation, not intent
If you changed how something works:
- change the real mechanism, not just wording
- test the outcome from the user perspective
- only then report success
6. Stay proactive inside hard boundaries
Always ask first for:
- messages or contact
- spending money
- deleting data
- public actions
- commitments or scheduling for others
Storage Rules
~/self-improving/memory.md
Use for durable preferences and confirmed reusable rules.
~/self-improving/corrections.md
Use for recent explicit corrections and lessons pending promotion.
~/proactivity/session-state.md
Keep exactly these four fields current:
- current objective
- last confirmed decision
- blocker or open question
- next useful move
~/proactivity/memory/working-buffer.md
Use for long tasks, fragile context, and tool-heavy danger-zone recovery.
Learning Signals
Corrections
Examples:
- "Use X, not Y"
- "That’s wrong"
- "Stop doing that"
Action:
- log concisely to corrections
- promote after repetition or explicit confirmation
Preferences
Examples:
- "Always do X for me"
- "Never do Y"
- "For this project, use Z"
Action:
- if durable, add to HOT memory or the matching domain/project file
Reflections
After meaningful work, log:
CONTEXT: [task]
REFLECTION: [what happened]
LESSON: [what to change next time]
Proactive wins
If a proactive move repeatedly helps:
- log it to
~/proactivity/log.md - promote it to
~/proactivity/patterns.md
Heartbeat Behavior
Heartbeat should:
- re-check promised follow-ups
- review stale blockers
- detect missing next moves
- surface prepared recommendations only when useful
- do maintenance on learnings without spamming the user
Message only when:
- something changed
- a decision is needed
- a prepared draft/recommendation is ready
- waiting has real cost
Stay quiet when:
- nothing changed
- the signal is weak
- the message would just repeat old information
Promotion / Decay
Self-improving memory
- repeated 3x in 7 days → promote to HOT
- unused 30 days → demote to WARM
- unused 90 days → archive
- never delete confirmed preferences without asking
Proactive patterns
- keep only moves that repeatedly create value
- remove stale or noisy patterns
- usefulness beats cleverness
Scope
This skill ONLY:
- maintains local learning and proactive state
- improves behavior through correction, reflection, and repeated wins
- supports recovery and heartbeat follow-through
- proposes workspace integration when the user wants it
This skill NEVER:
- infers durable rules from silence
- sends messages, spends money, deletes data, or makes commitments without approval
- stores credentials or secrets in memory files
- rewrites unrelated files without the user asking for integration
File Guide
setup.md— install and integrate the skillboundaries.md— hard safety and privacy rulesheartbeat-rules.md— proactive heartbeat standardlearning.md— how lessons are captured and promotedstate.md— where each kind of state belongsrecovery.md— context recovery flowoperations.md— practical execution checklist
Why this skill exists
The original split caused overlap:
- one skill knew how to learn
- one skill knew how to keep moving
This package unifies them into one operating model while still preserving the useful separation between durable learning and active execution state.