openclaw-self-healing

4-tier autonomous self-healing and auto-recovery system for OpenClaw Gateway. Monitors gateway health, auto-restarts on crash, detects OAuth token expiry, kills zombie processes, and escalates to Claude Code AI for diagnosis when automated recovery fails. Use when your OpenClaw gateway crashes, stops responding, enters a restart loop, or needs automatic monitoring and recovery. Features watchdog, config validation, exponential backoff, Discord/Telegram alerts. macOS & Linux.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "openclaw-self-healing" with this command: npx skills add ramsbaby/openclaw-self-healing

OpenClaw Self-Healing System

"The system that heals itself — or calls for help when it can't."

A 4-tier autonomous recovery system for OpenClaw Gateway, featuring AI-powered diagnosis via Claude Code. Tested in production on macOS + Linux.

Architecture

Level 1: config-watch        → Config file change detection + instant reload
Level 2: Watchdog v4.4       → OAuth detection, zombie kill, exponential backoff
Level 3: Claude Code Doctor  → AI-powered diagnosis & repair (30 min window) 🧠
Level 4: Discord/Telegram    → Human escalation with full context
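
Level 2's exponential backoff can be pictured with a small shell sketch. The function names backoff_delay and backoff_schedule, the 5-second base, and the 300-second cap are assumptions made for illustration; the watchdog's actual values and implementation are not shown on this page. Only HEALTH_CHECK_MAX_RETRIES and its default of 3 come from the Configuration section.

```shell
# Hypothetical helper: seconds to wait before restart attempt N (1-based).
# The base (5 s) and cap (300 s) are illustrative, not the watchdog's real values.
backoff_delay() {
  attempt=$1
  delay=${2:-5}
  cap=${3:-300}
  i=1
  # Double the delay once per earlier attempt, stopping once the cap is reached.
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$cap" ]; then
    delay=$cap
  fi
  echo "$delay"
}

# Print the wait schedule for HEALTH_CHECK_MAX_RETRIES attempts (default 3).
backoff_schedule() {
  max=${HEALTH_CHECK_MAX_RETRIES:-3}
  n=1
  while [ "$n" -le "$max" ]; do
    echo "attempt $n: wait $(backoff_delay "$n")s"
    n=$((n + 1))
  done
}
```

With these illustrative defaults, backoff_schedule lists waits of 5, 10, and 20 seconds for attempts 1 to 3. Capping the delay keeps a long outage from pushing the next retry arbitrarily far into the future.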

What's New in v3.1.0

  • Complete healing chain fix — config-watch → Watchdog → Emergency Recovery now fully connected
  • Installer rewrite — single install.sh covers macOS (LaunchAgent) + Linux (systemd)
  • Watchdog v4.4 — OAuth token expiry detection, zombie process auto-kill, Exponential Backoff
  • Emergency Recovery v2 — persistent learning repo, reasoning logs, multi-model support (Claude Code + Aider)
  • Metrics dashboard — success rate, MTTR, trending analysis via tmux

Quick Setup

bash <(curl -fsSL https://raw.githubusercontent.com/Ramsbaby/openclaw-self-healing/main/install.sh)

Or install via ClawHub:

npx clawhub@latest install openclaw-self-healing

The 4 Tiers in Detail

Level  Script                         Trigger                    Action
L1     config-watch.sh                Config file change         Validate + reload gateway
L2     gateway-watchdog.sh            Process down / HTTP fail   Kill zombie → restart → backoff
L3     emergency-recovery-v2.sh       30 min continuous failure  Claude Code PTY diagnosis
L4     emergency-recovery-monitor.sh  L3 triggered               Discord + Telegram alert
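
The L2 "HTTP fail" trigger amounts to probing the gateway URL and treating any error as unhealthy. A minimal sketch, assuming curl is installed: the function name gateway_is_healthy and the 5-second timeout are hypothetical, and the real gateway-watchdog.sh may check health differently.

```shell
# Hypothetical probe: exit status 0 if the gateway answers with a non-error
# HTTP status within 5 seconds, non-zero otherwise.
gateway_is_healthy() {
  # Use the first argument if given, else OPENCLAW_GATEWAY_URL, else the documented default.
  url=${1:-${OPENCLAW_GATEWAY_URL:-http://localhost:18789/}}
  # -f: treat HTTP status >= 400 as failure; -sS: no progress meter, but still print errors
  curl -fsS --max-time 5 -o /dev/null "$url"
}
```

A watchdog loop could then branch on the exit status, for example restarting the gateway only after the probe has failed, and backing off between attempts.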

Configuration

All settings are environment variables, set in ~/.openclaw/.env:

Variable                    Default                  Description
DISCORD_WEBHOOK_URL         (none)                   Discord webhook for L4 alerts
OPENCLAW_GATEWAY_URL        http://localhost:18789/  Gateway health check URL
HEALTH_CHECK_MAX_RETRIES    3                        Restart attempts before L3 escalation
EMERGENCY_RECOVERY_TIMEOUT  1800                     Claude recovery timeout in seconds (30 min)
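
As an illustration, a ~/.openclaw/.env that spells out the documented defaults might look like the sketch below. The plain KEY=value syntax with comments on their own lines is an assumption about what the scripts accept, and the webhook value is left empty as a placeholder to fill in with your own URL.

```shell
# ~/.openclaw/.env (sketch; values are the documented defaults)

# Discord webhook for L4 alerts. No default; left empty here as a placeholder.
DISCORD_WEBHOOK_URL=

# Gateway health check URL
OPENCLAW_GATEWAY_URL=http://localhost:18789/

# Restart attempts before L3 escalation
HEALTH_CHECK_MAX_RETRIES=3

# Claude recovery timeout in seconds (1800 s = 30 min)
EMERGENCY_RECOVERY_TIMEOUT=1800
```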

Verified Recovery Cases

  • OAuth token expiry: Watchdog v4.4 detects a 401 in the logs and restarts before the agent dies
  • Zombie process: the preflight check detects a PID mismatch, then SIGKILL + launchctl kickstart
  • Config schema error: openclaw doctor --fix is auto-applied on the exit_1 pattern
  • Level 3 triggered: Claude Code diagnosed and fixed a broken config in < 15 min
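
The OAuth case above rests on spotting a 401 in the gateway logs. A rough sketch of such a check follows. The function name log_has_auth_failure, the 200-line window, and the assumption that a 401 appears as a standalone number in the log text are all hypothetical, since the real log format and the watchdog's matching rule are not documented here.

```shell
# Hypothetical check: exit status 0 if the last N lines of a log mention an
# HTTP 401 as a standalone number, 1 if not, 2 if the file is unreadable.
log_has_auth_failure() {
  log_file=$1
  lines=${2:-200}
  [ -r "$log_file" ] || return 2
  # Require non-digits around 401 so values like "14010" do not match.
  tail -n "$lines" "$log_file" | grep -Eq '(^|[^0-9])401([^0-9]|$)'
}
```

Looking only at the tail of the log keeps an old, already-handled 401 from triggering repeated restarts, although a real watchdog would more likely track timestamps or log offsets.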

License

MIT — built by @ramsbaby + Jarvis 🦞
