skill-defender

Scans installed OpenClaw skills for malicious patterns including prompt injection, credential theft, data exfiltration, obfuscated payloads, and backdoors. Use when installing new skills, after skill updates, or for periodic security scans. Runs deterministic pattern matching — fast, offline, no API cost.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "skill-defender" with this command: npx skills add itsclawdbro/skill-defender

Skill Defender — Malicious Pattern Scanner

When to Run

Automatic Triggers

  1. New skill installed — Immediately run scan_skill.py against it before allowing use
  2. Skill updated — Re-scan after any file changes in a skill directory
  3. Periodic audit — Run batch scan on all installed skills when requested

Manual Triggers

  • User says "scan skill X" → scan that specific skill
  • User says "scan all skills" → batch scan all skills
  • User says "security check" or "audit skills" → batch scan all skills

Scripts

scripts/scan_skill.py — Single Skill Scanner

Scans one skill directory for malicious patterns. Produces JSON or human-readable output.

scripts/aggregate_scan.py — Batch Scanner

Scans ALL installed skills and produces a single JSON report. Includes a built-in allowlist to reduce false positives from security-related skills, API skills, and other known-safe patterns.

How to Run

# Scan a single skill (human-readable)
python3 scripts/scan_skill.py /path/to/skill-dir

# Scan a single skill (JSON output)
python3 scripts/scan_skill.py /path/to/skill-dir --json

# Scan ALL installed skills (JSON aggregate report)
python3 scripts/aggregate_scan.py

# With custom skills directory
python3 scripts/aggregate_scan.py --skills-dir /path/to/skills

# With verbose warnings
python3 scripts/scan_skill.py /path/to/skill-dir --verbose

# Exclude false positives
python3 scripts/scan_skill.py /path/to/skill-dir --exclude "pattern1" "pattern2"

Exit Codes (scan_skill.py)

  • 0 = clean or informational only
  • 1 = suspicious (medium/high findings)
  • 2 = dangerous (critical findings)
  • 3 = error
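
The exit codes above can be consumed from a wrapper script. The following is a minimal sketch, not part of the skill itself: the helper names verdict_from_exit_code and scan are hypothetical, the default scanner path assumes the command is run from the skill's root directory, and treating an undocumented exit code as an error is an assumption of this sketch.

```python
import subprocess
import sys

# Exit codes as documented for scan_skill.py.
EXIT_CODE_VERDICTS = {
    0: "clean",       # clean or informational only
    1: "suspicious",  # medium/high findings
    2: "dangerous",   # critical findings
    3: "error",
}


def verdict_from_exit_code(code: int) -> str:
    """Map an exit code to a verdict.

    Assumption: any code outside 0-3 is treated as an error.
    """
    return EXIT_CODE_VERDICTS.get(code, "error")


def scan(skill_dir: str, scanner: str = "scripts/scan_skill.py") -> str:
    """Run the scanner on one skill directory and return its verdict."""
    result = subprocess.run(
        [sys.executable, scanner, skill_dir, "--json"],
        capture_output=True,
        text=True,
        check=False,  # non-zero exit codes carry meaning here
    )
    return verdict_from_exit_code(result.returncode)
```

Calling scan("/path/to/skill-dir") would return one of the four verdict strings, which a caller can then act on.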

Output Format (aggregate_scan.py)

{
  "skills": [
    {
      "name": "skill-name",
      "verdict": "clean|suspicious|dangerous|error",
      "findingsCount": 0,
      "findings": []
    }
  ],
  "summary": "All 37 skills passed with no significant issues.",
  "totalSkills": 37,
  "cleanCount": 37,
  "suspiciousCount": 0,
  "dangerousCount": 0,
  "errorCount": 0,
  "timestamp": "2026-02-02T06:00:00+00:00"
}
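
A report in this shape is straightforward to post-process. The sketch below is illustrative only: the sample report text and its skill names are made up, the helper names flagged_skills and counts_add_up are hypothetical, and the check that the four counts sum to totalSkills is an assumption suggested by the example above rather than something the listing states. The shape of individual findings entries is not documented here, so the sample leaves them out.

```python
import json

# Hypothetical report text following the documented top-level fields.
REPORT_TEXT = """
{
  "skills": [
    {"name": "example-a", "verdict": "clean", "findingsCount": 0},
    {"name": "example-b", "verdict": "suspicious", "findingsCount": 2}
  ],
  "totalSkills": 2,
  "cleanCount": 1,
  "suspiciousCount": 1,
  "dangerousCount": 0,
  "errorCount": 0
}
"""


def flagged_skills(report: dict) -> list:
    """Return (name, verdict, findingsCount) for every skill that is not clean."""
    return [
        (skill["name"], skill["verdict"], skill.get("findingsCount", 0))
        for skill in report.get("skills", [])
        if skill.get("verdict") != "clean"
    ]


def counts_add_up(report: dict) -> bool:
    """Check that the per-verdict counts sum to totalSkills (assumed invariant)."""
    keys = ("cleanCount", "suspiciousCount", "dangerousCount", "errorCount")
    return sum(report.get(key, 0) for key in keys) == report.get("totalSkills")


report = json.loads(REPORT_TEXT)
for name, verdict, count in flagged_skills(report):
    print(f"{name}: {verdict} ({count} findings)")
```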

Auto-Detection

Both scripts auto-detect paths:

  • Skills directory: Detected from the script location by walking up to find a skills/ parent directory; falls back to ~/clawd/skills, ~/skills, or ~/.openclaw/skills
  • Scanner script: aggregate_scan.py finds scan_skill.py co-located in the same directory
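
The detection logic described above might look roughly like this. It is a sketch under stated assumptions, not the scripts' actual code: find_skills_dir is a hypothetical name, "skills/ parent" is read as "the nearest ancestor directory named skills", and the fallbacks are tried in the order listed.

```python
from pathlib import Path
from typing import Optional

# Fallback locations as listed above; the ordering is an assumption.
FALLBACKS = ("~/clawd/skills", "~/skills", "~/.openclaw/skills")


def find_skills_dir(script_path: Path, fallbacks=FALLBACKS) -> Optional[Path]:
    """Locate the skills directory for a scanner script.

    First walk up from the script looking for an ancestor named "skills";
    if none exists, return the first fallback that is an existing directory.
    """
    for parent in script_path.resolve().parents:
        if parent.name == "skills":
            return parent
    for candidate in fallbacks:
        path = Path(candidate).expanduser()
        if path.is_dir():
            return path
    return None


def find_scanner(aggregate_script: Path) -> Path:
    """Return the expected path of scan_skill.py next to aggregate_scan.py."""
    return aggregate_script.resolve().parent / "scan_skill.py"
```

Inside a script, find_skills_dir(Path(__file__)) would be the natural call; the --skills-dir flag shown earlier overrides detection entirely.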

Handling Results

✅ Clean (verdict: "clean")

  • No action needed: the scan found no malicious patterns

⚠️ Suspicious (verdict: "suspicious")

  • Warn the user with a summary of findings
  • Show the category and severity of each finding

🚨 Dangerous (verdict: "dangerous")

  • Block the skill — do not proceed with installation or use
  • Show the full detailed findings to the user
  • Require explicit user override to proceed
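
The three verdict policies above can be written as a small decision function. This is a hedged sketch: handle_verdict and the keys of the returned dictionary are hypothetical, allowing a suspicious skill after a warning is an interpretation (the listing only says to warn), and blocking on "error" or any unrecognised verdict is an assumption, since the listing gives no guidance for that case.

```python
def handle_verdict(verdict: str, user_override: bool = False) -> dict:
    """Return the action to take for a scan verdict.

    Keys: "allow" (may the skill be used), "warn" (notify the user),
    "detail" (how much of the findings to show).
    """
    if verdict == "clean":
        return {"allow": True, "warn": False, "detail": "none"}
    if verdict == "suspicious":
        # Warn with a summary: category and severity of each finding.
        return {"allow": True, "warn": True, "detail": "summary"}
    if verdict == "dangerous":
        # Blocked unless the user explicitly overrides.
        return {"allow": user_override, "warn": True, "detail": "full"}
    # Assumption: errors and unknown verdicts fail closed.
    return {"allow": False, "warn": True, "detail": "full"}
```

Note that user_override only affects the "dangerous" branch; it cannot unblock a scan that errored.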

Built-in Allowlist

The aggregate scanner includes an allowlist for known false positives:

  • Security scanners (skill-defender, clawdbot-security-check) — their docs/scripts contain the very patterns they detect
  • Auth-dependent skills (tailscale, reddit, n8n, event-planner) — legitimately reference credential paths and API keys
  • Config-aware skills (memory-setup, eightctl, summarize) — reference config paths in documentation
  • Agent-writing skills (self-improving-agent) — designed to modify agent files
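
For illustration, the allowlist categories above can be expressed as a lookup table. The real data structure inside aggregate_scan.py is not shown in this listing, so the layout below, the category keys, and the helper name allowlist_reason are all assumptions; only the skill names come from the list above.

```python
from typing import Optional

# Skill names taken from the listing; the grouping keys are hypothetical labels.
ALLOWLIST = {
    "security-scanner": {"skill-defender", "clawdbot-security-check"},
    "auth-dependent": {"tailscale", "reddit", "n8n", "event-planner"},
    "config-aware": {"memory-setup", "eightctl", "summarize"},
    "agent-writing": {"self-improving-agent"},
}


def allowlist_reason(skill_name: str) -> Optional[str]:
    """Return the allowlist category a skill belongs to, or None if not listed."""
    for reason, names in ALLOWLIST.items():
        if skill_name in names:
            return reason
    return None
```

A batch scanner could use the returned category to decide which kinds of findings to suppress for that skill, rather than skipping the skill outright.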

Pattern Reference

See references/threat-patterns.md for full documentation of all detected patterns, organized by category with explanations of why each is dangerous.

Important Notes

  • No external dependencies — standard library only (Python 3.9+)
  • Fast — under 1 second per skill, ~30 seconds for a full batch of 30+ skills
  • Deterministic pattern matching (Layer 2 defense), not LLM-based
  • False positives are possible — the allowlist and --exclude flag help
  • The scanner will flag itself if scanned without the allowlist — this is expected

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
