ProcessGuard — Critical Process Monitor & Auto-Restart

ProcessGuard monitors critical processes and restarts them automatically on failure. It tracks CPU and memory usage, escalates alerts via webhook, callback, or file, and writes a dead man's switch heartbeat so you know if ProcessGuard itself goes down. An HTTP dashboard is included. There are no required dependencies; CPU and memory monitoring is enabled by optionally installing pidusage.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "ProcessGuard — Critical Process Monitor & Auto-Restart" with this command: npx skills add theshadowrose/process-guard

Keep services running without babysitting. Define processes, configure health checks, and let ProcessGuard handle the rest.

What It Does

  • Health checks — HTTP, TCP port, PID file, or shell command
  • Auto-restart — configurable retry limits and cooldown delays
  • CPU & memory monitoring — per-process thresholds with alerts (requires npm install pidusage)
  • Alert escalation — warning → critical, delivered via callback / webhook / JSON file
  • Dead man's switch — heartbeat file updated every 10s so external monitors know if ProcessGuard itself crashes
  • HTTP dashboard — optional /status endpoint for real-time JSON status
  • Command allowlist — optionally restrict which executables restart/check commands may use
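The dead man's switch only helps if something outside ProcessGuard watches the heartbeat file. The following is a minimal sketch of such a watcher, not part of ProcessGuard itself. It assumes the heartbeat is an ordinary file whose modification time advances on each 10s update; the helper name `isHeartbeatStale`, the path `./heartbeat.json`, and the 30s threshold are all hypothetical, so substitute the heartbeat location from your own configuration.

```javascript
const fs = require('fs');

// Hypothetical helper (not a ProcessGuard API): returns true when the
// heartbeat file is missing or was last modified more than maxAgeMs ago.
function isHeartbeatStale(heartbeatPath, maxAgeMs, now = Date.now()) {
  let stats;
  try {
    stats = fs.statSync(heartbeatPath);
  } catch (err) {
    if (err.code === 'ENOENT') return true; // no file at all: treat as down
    throw err; // permission errors etc. should surface, not be hidden
  }
  return now - stats.mtimeMs > maxAgeMs;
}

// Assumed path and threshold: tolerate three missed 10s heartbeats.
const stale = isHeartbeatStale('./heartbeat.json', 30000);
console.log(stale ? 'ProcessGuard heartbeat is stale' : 'ProcessGuard heartbeat is fresh');
```

Run a check like this from cron or a separate supervisor, so that it keeps working when the ProcessGuard process has died.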

Quick Setup

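const { ProcessGuard } = require('./src/process-guard');

const guard = new ProcessGuard({
  processes: [
    {
      name: 'ollama',
      check: 'http://localhost:11434/api/tags', // HTTP health check
      restart: 'ollama serve',                  // command used to restart the process
      maxRestarts: 5,                           // retry limit
      cooldown: 5000                            // cooldown delay (presumably milliseconds)
    }
  ],
  checkInterval: 30000,                         // time between checks (presumably milliseconds)
  dashboardPort: 9090,                          // enables the optional HTTP dashboard
  alert: {
    // Callback delivery; webhook and JSON file delivery are also supported
    onAlert: async (alert) => console.error(`ALERT: ${alert.message}`)
  }
});

guard.start();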

See README.md for full documentation, all config options, and advanced examples.
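To make the roles of `maxRestarts` and `cooldown` concrete, here is a small sketch of how a retry limit and a cooldown delay can gate restart attempts. This is an illustration only, not ProcessGuard's internal logic: the function `canRestart` and the `restartCount` and `lastRestartAt` fields are hypothetical, and the code assumes `cooldown` is expressed in milliseconds.

```javascript
// Hypothetical helper, not ProcessGuard's own code: decides whether a
// process that failed its health check may be restarted right now.
function canRestart(state, limits, now = Date.now()) {
  const { restartCount, lastRestartAt } = state;
  const { maxRestarts, cooldown } = limits;

  // Retry budget exhausted: stop restarting and leave it to alerting.
  if (restartCount >= maxRestarts) return false;

  // Still inside the cooldown window since the previous restart.
  if (lastRestartAt !== null && now - lastRestartAt < cooldown) return false;

  return true;
}

// Values taken from the Quick Setup config; cooldown assumed to be ms.
const limits = { maxRestarts: 5, cooldown: 5000 };

console.log(canRestart({ restartCount: 0, lastRestartAt: null }, limits));       // true
console.log(canRestart({ restartCount: 2, lastRestartAt: Date.now() }, limits)); // false
console.log(canRestart({ restartCount: 5, lastRestartAt: null }, limits));       // false
```

Capping restarts keeps a crash-looping service from being relaunched forever, and the cooldown gives a freshly restarted process time to come up before it is judged again.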

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
