canary-deploy

Safe system changes with automatic baseline capture, canary testing, and rollback for critical infrastructure modifications. Use when making changes to SSH config, firewall rules, network settings, systemd services, kernel parameters, or any system change that could break remote access. Prevents lockouts by validating connectivity before and after changes. Born from a real incident where AllowTcpForwarding=no killed VPN tunnel access.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "canary-deploy" with this command: npx skills add lolaopenclaw/canary-deploy

Canary Deploy

Safe system changes with pre-flight checks, validation, and automatic rollback.

The Problem

System changes can lock you out:

  • SSH hardening breaks remote access
  • Firewall rules block needed ports
  • Kernel parameters cause instability
  • Service restarts break dependencies

Recovery without physical access is painful or impossible.

Quick Start

Before any critical change

# Capture baseline (connectivity, services, ports)
bash scripts/canary-test.sh baseline

# Make your change, then check the syntax and reload so it takes effect
sudo nano /etc/ssh/sshd_config
sudo sshd -t && sudo systemctl reload sshd

# Validate change didn't break anything
bash scripts/canary-test.sh validate

# If validation fails:
bash scripts/canary-test.sh rollback

For automated changes

# Full pipeline: baseline → apply → validate → rollback-if-failed
bash scripts/critical-update.sh \
  --name "SSH hardening" \
  --backup "/etc/ssh/sshd_config" \
  --command "sudo sed -i 's/PermitRootLogin yes/PermitRootLogin no/' /etc/ssh/sshd_config && sudo sshd -t && sudo systemctl reload sshd" \
  --validate "ssh -o BatchMode=yes -o ConnectTimeout=5 localhost echo ok"
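
The baseline → apply → validate → rollback-if-failed flow can be sketched in a few lines of shell. This is only an illustration of the idea, not the contents of scripts/critical-update.sh: the function name apply_with_rollback, the .canary.bak suffix, and the return codes are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper, NOT the skill's actual critical-update.sh.
# Usage: apply_with_rollback <file> <apply-command> <validate-command>
apply_with_rollback() {
    target=$1; apply_cmd=$2; validate_cmd=$3
    backup="${target}.canary.bak"   # assumed backup location

    # 1. Baseline: keep a copy of the file, preserving mode and timestamps.
    cp -p "$target" "$backup" || return 2

    # 2. Apply, then 3. validate. If either fails we fall through to rollback.
    if sh -c "$apply_cmd" && sh -c "$validate_cmd"; then
        rm -f "$backup"
        echo "OK: change kept"
        return 0
    fi

    # 4. Rollback: restore the saved copy.
    cp -p "$backup" "$target" && rm -f "$backup"
    echo "FAIL: rolled back $target"
    return 1
}
```

For a service such as sshd, a real rollback would presumably also need to reload the service after restoring the file; the sketch leaves that out to stay short.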

Protocol A+B (Manual Workflow)

For interactive sessions where you want a human in the loop:

Protocol A: Test interactively

  1. Tell the human: "Open a second SSH session as backup"
  2. Apply change in the first session
  3. Ask: "Test connectivity from the second session"
  4. If it works → confirm
  5. If it fails → rollback from the backup session
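
Steps 3 to 5 amount to a confirm-or-rollback decision. A minimal sketch, meant to be run from the backup session, might look like the following. The helper name confirm_or_rollback and the idea of passing the backup path explicitly are assumptions for illustration; the skill itself may handle this differently.

```shell
#!/bin/sh
# Hypothetical helper for Protocol A, steps 3-5.
# Reads the operator's answer from stdin; anything other than "yes" restores the backup.
# Usage: confirm_or_rollback <file> <backup>
confirm_or_rollback() {
    target=$1; backup=$2
    printf 'Test connectivity from the second session. Did it work? (yes/no) ' >&2
    read -r answer || answer=""
    case "$answer" in
        yes|y|Y|YES)
            echo "confirmed"
            return 0
            ;;
    esac
    # No explicit confirmation: treat it as a failure and restore the old file.
    cp -p "$backup" "$target" || return 2
    echo "rolled back"
    return 1
}
```

Defaulting to rollback on any answer other than an explicit yes is a deliberate choice here: an ambiguous or missing reply is treated as a failed test.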

Protocol B: Backup first

  1. Run bash scripts/canary-test.sh baseline
  2. Verify backup is valid
  3. Apply change
  4. Run bash scripts/canary-test.sh validate
  5. If validation fails → bash scripts/canary-test.sh rollback
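
Step 2, "Verify backup is valid", is not spelled out above. One plausible reading is that the backup must exist, be non-empty, and match the live file byte for byte before any change is made. The sketch below assumes that reading; verify_backup is a hypothetical name, and where canary-test.sh actually stores its backups is not stated here, so both paths are passed in.

```shell
#!/bin/sh
# Hypothetical helper for Protocol B, step 2.
# Usage: verify_backup <original> <backup>
verify_backup() {
    original=$1; backup=$2
    # The backup must exist and be non-empty...
    if [ ! -s "$backup" ]; then
        echo "backup missing or empty: $backup" >&2
        return 1
    fi
    # ...and be byte-for-byte identical to the file it protects.
    if ! cmp -s "$original" "$backup"; then
        echo "backup differs from $original" >&2
        return 1
    fi
    echo "backup ok"
}
```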

Always use both A + B together for maximum safety.

What Gets Checked

Baseline capture

  • SSH connectivity (local + remote)
  • Open ports (ss -tlnp)
  • Running services (systemctl)
  • Firewall rules (ufw/iptables)
  • Network routes
  • DNS resolution
  • Config file checksums
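
A snapshot covering several of the items above could be captured roughly as follows. This is a sketch under stated assumptions, not the logic of canary-test.sh: the function names, the one-file-per-check layout, and the choice of commands are illustrative. It uses ss -tln rather than ss -tlnp so that changing process IDs do not show up as differences later, and it skips any tool that is not installed.

```shell
#!/bin/sh
# Hypothetical snapshot helpers; the output layout is an assumption.

# Record a command's output under <dir>/<name>.txt, but only if the tool exists.
# Usage: record_cmd <dir> <name> <command> [args...]
record_cmd() {
    dir=$1; name=$2; shift 2
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$dir/$name.txt" 2>/dev/null || true
    fi
}

# Usage: capture_snapshot <output-dir> <config-file>...
capture_snapshot() {
    out=$1; shift
    mkdir -p "$out" || return 1

    record_cmd "$out" ports    ss -tln
    record_cmd "$out" services systemctl list-units --type=service --state=running --no-legend --plain
    record_cmd "$out" routes   ip route show
    record_cmd "$out" firewall iptables -S   # usually needs root

    # Checksums of the config files passed as arguments (none given: skip).
    [ "$#" -gt 0 ] || return 0
    sha256sum "$@" > "$out/checksums.txt"
}
```

A call such as capture_snapshot /tmp/canary/baseline /etc/ssh/sshd_config would leave one text file per check in the output directory (the path is an example only).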

Validation

  • All baseline checks re-run
  • Diff against baseline
  • Any regression = FAIL
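
The "diff against baseline" step can be illustrated with a recursive diff of two snapshot directories. The sketch assumes each snapshot is a directory of text files, which is an assumption about layout rather than a description of canary-test.sh, and compare_snapshots is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical comparison helper.
# Usage: compare_snapshots <baseline-dir> <current-dir>
compare_snapshots() {
    baseline=$1; current=$2
    # diff -r exits 0 only when both trees match; it exits non-zero when
    # any file differs or exists on only one side.
    if diff -r "$baseline" "$current"; then
        echo "PASS"
        return 0
    fi
    echo "FAIL"
    return 1
}
```

Note that a plain diff flags every difference, including the intended one (for example the new checksum of an edited config file), so a real validator presumably has to separate expected changes from regressions.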

Critical Change Categories

| Category | Risk | Example | Recovery |
|---|---|---|---|
| SSH config | 🔴 HIGH | sshd_config changes | Backup session |
| Firewall | 🔴 HIGH | UFW/iptables rules | Pre-change snapshot |
| Network | 🔴 HIGH | Interface/routing changes | Console access |
| Services | 🟡 MEDIUM | systemd unit changes | systemctl restart |
| Kernel params | 🟡 MEDIUM | sysctl changes | Reboot to defaults |
| Packages | 🟢 LOW | apt install/upgrade | Reinstall previous version (apt install <pkg>=<version>) |

References

See references/incident-report.md for the real incident that inspired this skill.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
