OpenClaw Self-Healing System

"The system that heals itself — or calls for help when it can't."

A 4-tier autonomous self-healing system for OpenClaw Gateway.

Architecture

Level 1: Watchdog (180s)     → Process monitoring (OpenClaw built-in)
Level 2: Health Check (300s) → HTTP 200 + 3 retries
Level 3: Claude Recovery     → 30min AI-powered diagnosis 🧠
Level 4: Discord Alert       → Human escalation
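
The Level 2 step in the chain above can be sketched in shell as follows. This is a minimal illustration, not the contents of `gateway-healthcheck.sh`: the function names `probe_status` and `check_gateway` and the `RETRY_DELAY` variable are assumptions, and the restart that the real script performs between attempts is left out because its command is not shown in this README. `OPENCLAW_GATEWAY_URL` and `HEALTH_CHECK_MAX_RETRIES` are the variables documented under Configuration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a Level 2 check; not the actual gateway-healthcheck.sh.

GATEWAY_URL="${OPENCLAW_GATEWAY_URL:-http://localhost:18789/}"
MAX_RETRIES="${HEALTH_CHECK_MAX_RETRIES:-3}"
RETRY_DELAY="${RETRY_DELAY:-5}"   # assumed: seconds to wait between attempts

# Print the HTTP status code for a URL (curl prints "000" if the connection fails).
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" 2>/dev/null || true
}

# Return 0 as soon as one probe sees HTTP 200; return 1 after MAX_RETRIES failures.
# The probe command is passed as an optional second argument so the loop can be
# exercised without a running gateway.
check_gateway() {
  local url="$1" probe="${2:-probe_status}" attempt
  for ((attempt = 1; attempt <= MAX_RETRIES; attempt++)); do
    if [ "$("$probe" "$url")" = "200" ]; then
      return 0
    fi
    if [ "$attempt" -lt "$MAX_RETRIES" ]; then
      sleep "$RETRY_DELAY"
    fi
  done
  return 1
}

# Usage (not executed here):
#   if check_gateway "$GATEWAY_URL"; then echo "Gateway healthy"; else echo "Escalating to Level 3"; fi
```

Returning a plain exit status keeps the escalation decision in the caller, which is one way to let the same check feed both the restart logic and the Level 3 hand-off.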

What's Special (v2.0)

Claude Code as the Level 3 emergency doctor, which the author describes as a world first
Persistent Learning - Automatic recovery documentation (symptom → cause → solution → prevention)
Reasoning Logs - Explainable AI decision-making process
Multi-Channel Alerts - Discord + Telegram support
Metrics Dashboard - Success rate, recovery time, trending analysis
Production-tested (verified recovery Feb 5-6, 2026)
macOS LaunchAgent integration
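
To make the Persistent Learning item concrete, here is a small sketch of how a recovery record with the four fields named above (symptom, cause, solution, prevention) might be appended to a log. The function name `record_recovery`, the `LEARNING_LOG` variable, and the file name are hypothetical; the README does not show how `emergency-recovery-v2.sh` actually stores its records.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real v2 script's file name and layout are not documented here.

LEARNING_LOG="${LEARNING_LOG:-$HOME/openclaw/memory/recovery-learnings.md}"

# Append one recovery record: symptom, cause, solution, prevention.
record_recovery() {
  local symptom="$1" cause="$2" solution="$3" prevention="$4"
  mkdir -p "$(dirname "$LEARNING_LOG")"
  {
    printf '## %s\n' "$(date '+%Y-%m-%d %H:%M:%S')"
    printf -- '- Symptom: %s\n' "$symptom"
    printf -- '- Cause: %s\n' "$cause"
    printf -- '- Solution: %s\n' "$solution"
    printf -- '- Prevention: %s\n' "$prevention"
    printf '\n'
  } >> "$LEARNING_LOG"
}

# Usage (not executed here):
#   record_recovery "Gateway returned 503" "Invalid JSON in config" \
#     "Restored config from backup" "Validate config before restart"
```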

Quick Setup

1. Install Dependencies

brew install tmux
npm install -g @anthropic-ai/claude-code

2. Configure Environment

# Copy template to OpenClaw config directory
cp .env.example ~/.openclaw/.env

# Edit and add your Discord webhook (optional)
nano ~/.openclaw/.env

3. Install Scripts

# Copy scripts (create the target directory first if it does not exist)
mkdir -p ~/openclaw/scripts
cp scripts/*.sh ~/openclaw/scripts/
chmod +x ~/openclaw/scripts/*.sh

# Install LaunchAgent
cp launchagent/com.openclaw.healthcheck.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.openclaw.healthcheck.plist

4. Verify

# Check that the Health Check agent is loaded
launchctl list | grep openclaw.healthcheck

# View logs
tail -f ~/openclaw/memory/healthcheck-$(date +%Y-%m-%d).log

Scripts

| Script | Level | Description |
| --- | --- | --- |
| `gateway-healthcheck.sh` | 2 | HTTP 200 check + 3 retries + escalation |
| `emergency-recovery.sh` | 3 | Claude Code PTY session for AI diagnosis (v1) |
| `emergency-recovery-v2.sh` | 3 | Enhanced with learning + reasoning logs (v2) ⭐ |
| `emergency-recovery-monitor.sh` | 4 | Discord/Telegram notification on failure |
| `metrics-dashboard.sh` | - | Visualize recovery statistics (NEW) |
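
As an illustration of the Level 4 notification, the sketch below posts a message to a Discord webhook, which accepts a JSON body with a `content` field. It is not the code of `emergency-recovery-monitor.sh`: the helper names `json_escape`, `build_alert_payload`, and `send_discord_alert` are assumptions, and the escaping covers only backslashes, double quotes, and newlines.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a Discord alert; not the actual monitor script.

# Escape backslashes, double quotes, and newlines for use inside a JSON string.
json_escape() {
  local s="$1"
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  printf '%s' "$s"
}

# Build the JSON body for a Discord webhook message.
build_alert_payload() {
  printf '{"content":"%s"}' "$(json_escape "$1")"
}

# Post the alert; do nothing if no webhook is configured (alerts are optional).
send_discord_alert() {
  [ -n "${DISCORD_WEBHOOK_URL:-}" ] || return 0
  curl -fsS -X POST -H 'Content-Type: application/json' \
    -d "$(build_alert_payload "$1")" "$DISCORD_WEBHOOK_URL" >/dev/null
}

# Usage (not executed here):
#   send_discord_alert "OpenClaw Gateway recovery failed; manual intervention needed"
```

Keeping payload construction separate from the network call makes the message format easy to check without sending anything.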

Configuration

All settings via environment variables in ~/.openclaw/.env:

| Variable | Default | Description |
| --- | --- | --- |
| `DISCORD_WEBHOOK_URL` | (none) | Discord webhook for alerts |
| `OPENCLAW_GATEWAY_URL` | `http://localhost:18789/` | Gateway health check URL |
| `HEALTH_CHECK_MAX_RETRIES` | `3` | Restart attempts before escalation |
| `EMERGENCY_RECOVERY_TIMEOUT` | `1800` | Claude recovery timeout in seconds (30 min) |
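
One way a script could pick up these settings is sketched below. It assumes the `.env` file uses plain shell `KEY=value` lines, which this README does not confirm, and the names `ENV_FILE` and `load_config` are hypothetical. The defaults are the ones listed in the table above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of config loading; assumes .env contains shell-style KEY=value lines.

ENV_FILE="${ENV_FILE:-$HOME/.openclaw/.env}"

# Source the .env file if present, then fall back to the documented defaults.
load_config() {
  if [ -f "$ENV_FILE" ]; then
    set -a            # export every variable assigned while sourcing
    . "$ENV_FILE"
    set +a
  fi
  : "${OPENCLAW_GATEWAY_URL:=http://localhost:18789/}"
  : "${HEALTH_CHECK_MAX_RETRIES:=3}"
  : "${EMERGENCY_RECOVERY_TIMEOUT:=1800}"
  : "${DISCORD_WEBHOOK_URL:=}"   # no default: alerts stay off when empty
}

# Usage (not executed here):
#   load_config
#   echo "Checking $OPENCLAW_GATEWAY_URL up to $HEALTH_CHECK_MAX_RETRIES times"
```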

Testing

Test Level 2 (Health Check)

# Run manually
bash ~/openclaw/scripts/gateway-healthcheck.sh

# Expected output:
# ✅ Gateway healthy

Test Level 3 (Claude Recovery)

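# Back up the config, then inject an error by hand (e.g. break the JSON syntax)
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak
nano ~/.openclaw/openclaw.json

# Wait for Health Check to detect and escalate (~8 min)
tail -f ~/openclaw/memory/emergency-recovery-*.log

# Afterwards, restore the backup if the config is still broken
cp ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json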

License

MIT License - do whatever you want with it.

Built by @ramsbaby + Jarvis 🦞
