openclaw-guardian

A security-layer plugin for OpenClaw that intercepts dangerous tool calls (exec, write, edit) using two-tier regex blacklist rules and LLM-based intent verification. Critical operations require unanimous approval from three LLM votes; warning-level operations require one LLM confirmation. Calls that match no rule (roughly 99% of normal operations) pass through immediately with no LLM call. Guardian also includes bypass and pipe-attack detection, path canonicalization, and SHA-256 hash-chain audit logging, and it auto-discovers a cheap model from your existing provider config.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "openclaw-guardian" with this command: npx skills add fatcatmaofei/openclaw-guardian

OpenClaw Guardian

The missing safety layer for AI agents.

Why?

OpenClaw gives agents direct access to shell, files, email, browser, and more. 99% of that is harmless. Guardian catches the 1% that isn't — without slowing down the rest.

How It Works

Tool Call → Blacklist Matcher (regex rules, 0ms)
              ↓
   No match     → Pass instantly (99% of calls)
   Warning hit  → 1 LLM vote ("did the user ask for this?")
   Critical hit → 3 LLM votes (all must confirm user intent)

Two Blacklist Levels

Level      LLM Votes       Latency   Examples
No match   0               ~0ms      Reading files, git, normal ops
Warning    1               ~1-2s     rm -rf /tmp/cache, chmod 777, sudo apt
Critical   3 (unanimous)   ~2-4s     rm -rf ~/, mkfs, dd of=/dev/, shutdown

What Gets Checked

Only three tool types are inspected:

  • exec → command string matched against exec blacklist
  • write / edit → file path canonicalized and matched against path blacklist
  • Everything else passes through instantly
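The routing above can be sketched as a small dispatch function. This is a minimal illustration, not the plugin's code: the `ToolCall` shape and the `command` and `path` parameter names are assumptions, since the real hook payload is not shown here.

```typescript
// Hypothetical payload shape; the real before_tool_call payload may differ.
type ToolCall = { tool: string; params: Record<string, unknown> };

type Inspection =
  | { kind: "exec"; command: string }
  | { kind: "path"; path: string }
  | { kind: "skip" };

// Decide which part of a tool call, if any, is matched against a blacklist.
function selectInspection(call: ToolCall): Inspection {
  const command = call.params["command"];
  const target = call.params["path"];
  if (call.tool === "exec" && typeof command === "string") {
    return { kind: "exec", command };
  }
  if ((call.tool === "write" || call.tool === "edit") && typeof target === "string") {
    return { kind: "path", path: target };
  }
  // Every other tool passes through without inspection.
  return { kind: "skip" };
}
```

Returning a tagged union keeps the two blacklists separate: an exec command is never matched against path rules, and a file path is never matched against command rules.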

LLM Intent Verification

When a blacklist rule matches, Guardian asks a lightweight LLM: "Did the user explicitly request this?" It reads recent conversation context to prevent false positives.

  • Warning: 1 LLM call. Confirmed → proceed.
  • Critical: 3 parallel LLM calls. All 3 must confirm. Any "no" → block.
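The voting rule above can be expressed as a pure function plus a thin parallel wrapper. This is a sketch under assumptions: the `Voter` callback stands in for whatever LLM client the plugin uses, and treating a short vote list as a refusal is an illustrative choice, not something the source specifies.

```typescript
type Level = "warning" | "critical";

// Number of LLM votes each level asks for, per the rules above.
const VOTES_REQUIRED: Record<Level, number> = { warning: 1, critical: 3 };

// A vote is true when the model confirms the user explicitly asked for the operation.
function isApproved(level: Level, votes: boolean[]): boolean {
  // Assumption: missing votes count as a refusal, never as silent approval.
  if (votes.length < VOTES_REQUIRED[level]) return false;
  // Unanimity: a single "no" blocks the call.
  return votes.every((v) => v);
}

// Hypothetical voter signature; one call to the lightweight model per invocation.
type Voter = () => Promise<boolean>;

// Issue all required calls in parallel, as described for critical hits.
async function collectVotes(level: Level, ask: Voter): Promise<boolean[]> {
  const calls = Array.from({ length: VOTES_REQUIRED[level] }, () => ask());
  return Promise.all(calls);
}
```

Keeping the decision separate from the network calls makes the unanimity rule easy to unit-test without a model.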

Auto-discovers a cheap/fast model from your existing OpenClaw provider config (prefers Haiku). No separate API key needed.

LLM Fallback

  • Critical + LLM down → blocked (fail-safe)
  • Warning + LLM down → asks user for manual confirmation
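A minimal sketch of the fallback policy, assuming the caller represents an unreachable LLM as `null`. The `Outcome` names are hypothetical, and blocking an unconfirmed warning is an assumption: the rules above only state what happens when a warning is confirmed.

```typescript
type HitLevel = "warning" | "critical";
type Outcome = "allow" | "block" | "ask_user";

// `votes` is null when the LLM could not be reached (timeout, provider error, ...).
function resolveOutcome(level: HitLevel, votes: boolean[] | null): Outcome {
  if (votes === null) {
    // Fail safe: critical hits are blocked, warnings fall back to the human.
    return level === "critical" ? "block" : "ask_user";
  }
  // Assumption: anything short of full confirmation is blocked.
  return votes.length > 0 && votes.every((v) => v) ? "allow" : "block";
}
```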

Blacklist Rules

Critical (exec)

  • rm -rf on system paths (excludes /tmp/ and workspace)
  • mkfs, dd to block devices, redirects to /dev/sd*
  • Writes to /etc/passwd, /etc/shadow, /etc/sudoers
  • shutdown, reboot, disabling SSH
  • Bypass attempts: eval, rm invoked via an absolute path, interpreter one-liners (python -c, node -e)
  • Pipe attacks: curl | sh, wget | bash, base64 -d | sh
  • Chain attacks: download + chmod +x + execute

Warning (exec)

  • rm -rf on safe paths, sudo, chmod 777, chown root
  • Package install/remove, service management
  • Crontab mods, SSH/SCP, Docker ops, kill/killall
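To make the two tiers concrete, here is a small classifier that checks critical patterns before warning patterns. The regexes are illustrative stand-ins covering only a few of the bullets above; the plugin's actual rules live in src/blacklist.ts and are not reproduced here. The workspace exclusion for rm -rf is also left out.

```typescript
type MatchResult = "none" | "warning" | "critical";

// Illustrative patterns only, not the plugin's real rule set.
const CRITICAL_RULES: RegExp[] = [
  /\brm\s+-rf\s+(~|\/(?!tmp\/))/,          // rm -rf on home or absolute paths outside /tmp/
  /\bmkfs(\.\w+)?\b/,                      // formatting a filesystem
  /\bdd\b.*\bof=\/dev\//,                  // dd writing to a device
  /\b(shutdown|reboot)\b/,                 // power state changes
  /\b(curl|wget)\b[^|]*\|\s*(ba)?sh\b/,    // download piped straight into a shell
];

const WARNING_RULES: RegExp[] = [
  /\brm\s+-rf\b/,                          // any remaining rm -rf (safe paths)
  /\bsudo\b/,
  /\bchmod\s+777\b/,
  /\b(kill|killall)\b/,
];

// Critical rules win: a command matching both tiers is treated as critical.
function classify(command: string): MatchResult {
  if (CRITICAL_RULES.some((rule) => rule.test(command))) return "critical";
  if (WARNING_RULES.some((rule) => rule.test(command))) return "warning";
  return "none";
}
```

Checking the critical tier first matters because a broad warning rule such as `rm -rf` would otherwise shadow the narrower critical rule for system paths.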

Path Rules (write/edit)

  • Critical: system auth files, SSH keys, systemd units
  • Warning: dotfiles, /etc/ configs, .env files, authorized_keys
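The reason for canonicalizing before matching is that a path like /tmp/../etc/passwd would otherwise slip past a rule written for /etc/passwd. The sketch below assumes POSIX paths and uses illustrative patterns. It only expands a leading "~" and collapses "." and ".." segments; whether Guardian also resolves symlinks is not stated, so that step is omitted.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type PathLevel = "none" | "warning" | "critical";

// Illustrative patterns only, loosely following the bullets above.
const CRITICAL_PATHS: RegExp[] = [
  /^\/etc\/(passwd|shadow|sudoers)$/,  // system auth files
  /\/\.ssh\/id_[^\/]+$/,               // SSH key files
  /^\/etc\/systemd\/system\//,         // systemd units
];
const WARNING_PATHS: RegExp[] = [
  /^\/etc\//,                          // other /etc/ configs
  /\/\.ssh\/authorized_keys$/,
  /\/\.[^\/]+$/,                       // dotfiles, including .env
];

// Expand a leading "~" and collapse "." / ".." segments.
function canonicalize(p: string, home: string = os.homedir()): string {
  const expanded = p === "~" || p.startsWith("~/") ? path.posix.join(home, p.slice(1)) : p;
  return path.posix.resolve(expanded);
}

// Match the canonical form, critical tier first.
function classifyPath(p: string, home: string = os.homedir()): PathLevel {
  const canonical = canonicalize(p, home);
  if (CRITICAL_PATHS.some((rule) => rule.test(canonical))) return "critical";
  if (WARNING_PATHS.some((rule) => rule.test(canonical))) return "warning";
  return "none";
}
```

The `home` parameter exists only to make the function testable; in normal use the default from `os.homedir()` applies.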

Audit Log

Every blacklist hit is logged to ~/.openclaw/guardian-audit.jsonl with a SHA-256 hash chain. The log is tamper-evident: each entry's hash covers its full content plus the previous entry's hash.
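A hash chain of this kind can be sketched in a few lines with Node's built-in crypto module. The entry fields (`ts`, `tool`, `level`, `decision`), the all-zero genesis value, and the serialization order are assumptions made for illustration; the format Guardian actually writes is not documented here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry layout; the real JSONL schema may differ.
type AuditEntry = { ts: string; tool: string; level: string; decision: string; prevHash: string; hash: string };
type EntryBody = Omit<AuditEntry, "hash">;

// Assumed placeholder for the first entry, which has no predecessor.
const GENESIS = "0".repeat(64);

function digest(body: EntryBody): string {
  // Serialize fields in a fixed order so the same entry always hashes the same way.
  const canonical = JSON.stringify([body.ts, body.tool, body.level, body.decision, body.prevHash]);
  return createHash("sha256").update(canonical).digest("hex");
}

// Return a new log with the entry appended and linked to the previous hash.
function appendEntry(log: AuditEntry[], e: Omit<EntryBody, "prevHash">): AuditEntry[] {
  const last = log[log.length - 1];
  const body: EntryBody = { ...e, prevHash: last ? last.hash : GENESIS };
  return [...log, { ...body, hash: digest(body) }];
}

// Recompute every hash and check every link; any edit or deletion breaks the chain.
function verifyChain(log: AuditEntry[]): boolean {
  let prev = GENESIS;
  for (const entry of log) {
    const { hash, ...body } = entry;
    if (body.prevHash !== prev || digest(body) !== hash) return false;
    prev = hash;
  }
  return true;
}
```

Because each hash covers the previous one, changing an early entry invalidates that entry and every entry after it, which is what makes the log tamper-evident rather than tamper-proof.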

Installation

openclaw plugins install openclaw-guardian

Or manually:

cd ~/.openclaw/workspace
git clone https://github.com/fatcatMaoFei/openclaw-guardian.git

Token Cost

Scenario   % of Ops    Extra Cost
No match   ~99%        0
Warning    ~0.5-1%     ~500 tokens
Critical   <0.5%       ~1500 tokens
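As a rough worked example, the table's figures can be turned into an average overhead per tool call. The calculation below takes the upper end of each range (1% warnings, 0.5% critical), which is an assumption chosen to give a pessimistic estimate; it comes to about 12.5 extra tokens per call.

```typescript
type Scenario = { share: number; extraTokens: number };

// Weighted average of extra tokens across scenarios.
function expectedExtraTokens(mix: Scenario[]): number {
  return mix.reduce((sum, s) => sum + s.share * s.extraTokens, 0);
}

// Upper-bound shares taken from the table; no-match calls add nothing.
const upperBoundPerCall = expectedExtraTokens([
  { share: 0.01, extraTokens: 500 },   // warning hits
  { share: 0.005, extraTokens: 1500 }, // critical hits
]);
```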

Prefers cheap models (Haiku, GPT-4o-mini, Gemini Flash).

File Structure

extensions/guardian/
├── index.ts                # Entry — registers before_tool_call hook
├── src/
│   ├── blacklist.ts        # Two-tier regex rules (critical/warning)
│   ├── llm-voter.ts        # LLM intent verification
│   └── audit-log.ts        # SHA-256 hash-chain audit logger
├── test/
│   └── blacklist.test.ts   # Blacklist rule tests
├── openclaw.plugin.json    # Plugin manifest
└── default-policies.json   # Enable/disable toggle

License

MIT

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
