openclaw-update-runbook

Use when updating OpenClaw or debugging an OpenClaw instance after an update. This skill acts as a structured update runbook with emphasis on gateway startup, service-manager state, plugin registry and install drift, bundled-vs-npm/clawhub plugin confusion, stale config carried across upgrades, channel health, task ledger corruption, and logs that explain why the updated system is slow, disconnected, or half-broken.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "openclaw-update-runbook" with this command: npx skills add bkf-gitty/claw-update-runbook

OpenClaw Update Runbook

Use this skill when an OpenClaw host was just updated, is about to be updated, or is behaving strangely after an update. It is a generic operator runbook, not a release-specific checklist.

This skill is meant to be installed as a folder, not copied as a single file. It expects references/failure-patterns.md to exist locally beside SKILL.md inside the same skill bundle.

The goal is not only to get the host running again, but to prove which layer is broken:

  • service lifecycle and service-manager state
  • host package version
  • plugin/package compatibility
  • config drift
  • model/provider runtime routing
  • channel health
  • task ledger health
  • runtime performance
  • command-path and update-channel assumptions

Quick workflow

  1. Establish the real starting state. If you are connected over non-interactive SSH, do not assume the login-shell PATH is available. First locate the binary by checking common install paths, such as a package-manager prefix or ~/.local/bin/openclaw, then export the correct PATH for the audit session.

    Check:

    • openclaw --version
    • openclaw status --deep
    • openclaw doctor --non-interactive --no-workspace-suggestions
    • openclaw channels status --deep
    • openclaw tasks audit
    • current model routing: agent defaults, agent-level model maps, fallback chains, and cron payload models
    • recent successful sessions for the primary model and runtime, not just the display model name
  2. Verify the gateway is actually managed correctly. Look at service-manager state, running PID, and /health. Derive the service label/name and gateway port from openclaw status --deep and/or the service definition instead of guessing them. Do not trust only one of:

    • the host's service manager
    • process list
    • health endpoint

    It is common to have:

    • a service definition present but not loaded
    • a detached gateway process still serving traffic
    • the service manager and the live process disagreeing
  3. Separate bundled plugins from globally installed plugins. First inspect plugin health:

    • openclaw plugins doctor
    • openclaw plugins list --json
    • openclaw plugins inspect <id>

    Important rules:

    • If a capability is supposed to be bundled, verify whether a stale global npm install is shadowing it.
    • If a capability is not bundled, check npm and ClawHub before assuming config is wrong.
    • For special runtime plugins such as codex, compare plugins inspect <id> with plugins list --json; inspect can report a runtime as loaded while raw plugin metadata still says disabled.
  4. Check for config carried across the upgrade that no longer validates. Pay attention to:

    • tools.web.search.provider
    • plugins.allow
    • plugins.entries.*
    • model aliases and fallback chains
    • runtime mappings for openai/*, openai-codex/*, codex, and pi
    • cron job payload model refs, which can be normalized separately from agent defaults
    • update channel metadata

    If doctor says a provider or plugin is unknown, inspect the actual config file and do not assume doctor --fix fully cleaned it.

  5. Compare plugin install records to what exists on disk. Inspect:

    • ~/.openclaw/plugins/installs.json
    • ~/.openclaw/npm/node_modules/@openclaw/...
    • ~/.openclaw/extensions/...

    Look for:

    • recorded install paths that do not exist
    • recorded versions drifting from installed versions
    • package specs rewritten or preserved during openclaw update --channel ...
    • external plugins that lack a release for the selected channel and were installed from a fallback tag such as @latest
    • source-only TypeScript plugin packages with no compiled dist/
    • plugin runtime deps removed from third-party plugin directories
  6. Inspect recent gateway logs before changing too much. Read:

    • ~/.openclaw/logs/gateway.log
    • ~/.openclaw/logs/gateway.err.log
    • /tmp/openclaw/openclaw-YYYY-MM-DD.log

    Prioritize recent startup lines and warnings involving:

    • plugin load failures
    • config validation
    • provider fallback attempts and primary-route auth or module failures
    • update lifecycle messages such as service stop fallbacks, config overwrites/backups, and service reload timing
    • channel auth: if a channel returns 401 or another auth failure after the update, inspect ~/.openclaw/service-env/*.env for token-line quote corruption (see Pattern #23) before assuming the upstream credential was rotated
    • context-engine fallback
    • active-memory timeouts
    • event loop degradation
    • task restart blocking
    • transient post-restart UI/websocket scope errors that clear after the gateway is ready
  7. Audit runtime/task health after the upgrade. Check for:

    • stale running tasks
    • lost tasks
    • delivery failures
    • timestamp inconsistencies

    A successful package update can still leave the system unhealthy if stale tasks block restarts or keep the audit red.

  8. Prove the primary model route, not just overall agent success. Run a narrow direct agent smoke test with a fresh session id and inspect the returned metadata:

    • final provider and model
    • runtime or harness id
    • fallbackAttempts
    • provider auth errors
    • module load errors
    • schema validation errors

    Treat status: ok as insufficient if the primary model failed and a fallback provider completed the run. Treat a clean plugins doctor as insufficient for runtime plugins until a fresh direct agent run proves that the intended harness can load and execute.

  9. Test at least one representative cron path. Check:

    • cron payload model counts
    • named or high-value cron job status
    • manual cron run behavior
    • whether --expect-final actually waits for final completion on the current build

    If cron verification only proves enqueue, state that clearly in the handoff notes.

  10. Re-run the narrowest fix, then verify again. Common fix sequence:

    • stop gateway cleanly
    • update host package
    • refresh plugin registry if needed
    • repair or update broken plugin installs
    • restart gateway
    • re-run doctor, plugins doctor, status --deep, channels status --deep, and tasks audit
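
The binary lookup from step 1 and the re-verification pass from step 10 can be sketched as a small script. This is a minimal sketch, not part of the skill bundle: the function names are hypothetical, the subcommands are the ones listed in the workflow above, and treating the exit code as a health signal is an assumption that should be confirmed against the current build.

```python
import shutil
import subprocess
from pathlib import Path

# Subcommands copied from the workflow above; arguments are passed verbatim.
VERIFY_COMMANDS = [
    ["--version"],
    ["status", "--deep"],
    ["doctor", "--non-interactive", "--no-workspace-suggestions"],
    ["plugins", "doctor"],
    ["channels", "status", "--deep"],
    ["tasks", "audit"],
]


def find_openclaw(candidates=None):
    """Return a path to the openclaw binary, or None if it cannot be found."""
    found = shutil.which("openclaw")
    if found:
        return found
    # Non-interactive SSH may lack the login-shell PATH, so probe known locations.
    for candidate in candidates or [Path.home() / ".local/bin/openclaw"]:
        if Path(candidate).is_file():
            return str(candidate)
    return None


def run_checks(binary, runner=subprocess.run):
    """Run each command and collect (command, exit code, last output line)."""
    results = []
    for args in VERIFY_COMMANDS:
        proc = runner([binary, *args], capture_output=True, text=True)
        lines = (proc.stdout or proc.stderr or "").strip().splitlines()
        results.append((" ".join(args), proc.returncode, lines[-1] if lines else ""))
    return results


def main():
    """Print one summary line per command; meant to be run on the host itself."""
    binary = find_openclaw()
    if binary is None:
        raise SystemExit("openclaw not found on PATH or in ~/.local/bin")
    for command, code, tail in run_checks(binary):
        # Assumption: a non-zero exit code marks a failed check.
        print(f"[{code}] openclaw {command}: {tail}")
```

Calling main() before and after a fix gives two comparable snapshots. The runner parameter exists only so the loop can be exercised without a real installation.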

Where to look first

Use this order when diagnosing post-update failures:

  • Service state: service manager, PID, /health
  • Host version: openclaw --version
  • Plugin mismatch: openclaw plugins doctor
  • Config drift: openclaw doctor
  • Channel reality: openclaw channels status --deep
  • Task ledger: openclaw tasks audit
  • Model/runtime route reality: direct smoke metadata and fallback attempts
  • Runtime symptoms: gateway logs
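
The first item in this order combines three signals that can disagree. As an illustration only, the helper below (a hypothetical name, not an OpenClaw API) turns the three observations into a short diagnosis. How each boolean is gathered depends on the host's service manager and is left to the operator.

```python
def classify_gateway(service_loaded: bool, process_running: bool, health_ok: bool) -> str:
    """Summarize service-manager state, process list, and /health in one line."""
    if health_ok and not process_running:
        # Something answers /health, but no gateway process was found.
        return "unexplained: /health responds but no gateway process was found; re-check port and PID"
    if process_running and not health_ok:
        return "unhealthy: a gateway process exists but /health is failing"
    if not process_running:
        if service_loaded:
            return "down: service is loaded but no gateway process is running"
        return "down: service definition is not loaded and nothing is running"
    if service_loaded:
        return "consistent: the managed gateway is serving traffic"
    return "detached: a gateway process is serving traffic outside the service manager"
```

For example, classify_gateway(False, True, True) reports the detached case, which is the situation where a restart through the service manager would not touch the process that is actually serving traffic.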

When to open references

Start with this file.

Open references/failure-patterns.md when:

  • doctor or plugins doctor points to a known-looking regression
  • channels status or logs disagree with the apparent service health
  • plugin installs, install records, or config state do not match what is on disk
  • the update completed, but the host is still slow, disconnected, noisy, or half-broken

Use the reference file for symptom matching and concrete examples after the main workflow has narrowed the likely failure area.
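
Before opening the reference file for an install-record mismatch, it can help to list the drift mechanically. The sketch below assumes each install record has already been reduced to a dict with the keys id, path, and version. Those key names are hypothetical, because the real layout of installs.json is not described here and must be inspected first. The only external convention relied on is that an npm package keeps its version in package.json.

```python
import json
from pathlib import Path


def find_install_drift(records):
    """Compare install records to disk and return (plugin id, problem) pairs.

    records: iterable of dicts with hypothetical keys 'id', 'path', 'version'.
    """
    problems = []
    for record in records:
        plugin_id = record.get("id", "<unknown>")
        raw_path = record.get("path")
        install_path = Path(raw_path).expanduser() if raw_path else None
        if install_path is None or not install_path.is_dir():
            problems.append((plugin_id, "recorded install path does not exist"))
            continue
        manifest = install_path / "package.json"
        if not manifest.is_file():
            problems.append((plugin_id, "no package.json at install path"))
            continue
        installed = json.loads(manifest.read_text(encoding="utf-8")).get("version")
        if installed != record.get("version"):
            problems.append(
                (plugin_id, f"recorded {record.get('version')} but installed {installed}")
            )
        # A missing dist/ is only a hint: it may indicate a source-only package.
        if not (install_path / "dist").is_dir():
            problems.append((plugin_id, "no compiled dist/ directory"))
    return problems
```

An empty result does not prove the plugin loads. It only rules out the path, version, and dist/ mismatches listed in step 5.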

Bundled vs external plugin rule

Do not assume a broken plugin means "plugin missing."

There are three common cases:

  • Bundled plugin exists in the host package, but stale config still points at an old provider/plugin id.
  • Bundled plugin exists, but a globally installed npm plugin shadows it and is on the wrong version.
  • Plugin is not bundled, so the fix is to inspect npm or ClawHub and reconcile install records.

A channel plugin is a good example of the second case: a host can upgrade correctly while still loading an older globally installed plugin package.

If the feature is not bundled, check npm and ClawHub before rewriting config.
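
The three cases can be written down as a small decision function. This is a sketch under stated assumptions: the inputs (a set of bundled ids and a map of globally installed versions) are hypothetical structures the operator fills in by hand from plugins list --json and the on-disk installs, and the function name is illustrative rather than part of OpenClaw.

```python
def classify_plugin(plugin_id, bundled_ids, global_versions, expected_version=None):
    """Map a plugin to one of the cases described above.

    bundled_ids: set of ids shipped inside the host package (hypothetical input).
    global_versions: dict of id -> globally installed version (hypothetical input).
    expected_version: version the current host should load, if known.
    """
    bundled = plugin_id in bundled_ids
    global_version = global_versions.get(plugin_id)
    if bundled and global_version is None:
        return "bundled: check config for a stale provider or plugin id"
    if bundled:
        if expected_version is not None and global_version != expected_version:
            return (
                f"shadowed: global install {global_version} "
                f"differs from expected {expected_version}"
            )
        return "shadowed: global install present, confirm which copy is loaded"
    if global_version is None:
        return "external and not installed: check npm and ClawHub"
    return "external: reconcile install records with the installed package"
```

The point of the ordering is that shadowing is checked before anything is reinstalled, so a correct host upgrade is not blamed for a stale global package.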

Fixing mindset

Prefer the smallest fix that makes state consistent again:

  • refresh registry before reinstalling everything
  • update one stale plugin before removing all plugins
  • inspect the actual config file when helper commands appear to succeed but warnings remain
  • verify whether a third-party plugin needs local runtime deps before deleting plugin-side node_modules

Do not stop at "service is up." A good finish means:

  • the right version is installed
  • the gateway is managed correctly
  • channels are connected
  • the intended primary model route succeeds without an unexpected fallback
  • cron payload models and representative cron jobs are healthy
  • plugins doctor is clean, or every remaining warning is explained
  • tasks audit is not carrying a fresh blocking error
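
The model-route criterion is the easiest one to misread, because a run can report success after falling back. A minimal sketch of the check follows. It assumes the smoke-test metadata has been loaded into a dict with the keys status, provider, model, and fallbackAttempts. Only fallbackAttempts is named in this runbook, so the other key names are assumptions to adjust to the real output.

```python
def primary_route_proven(metadata, intended_provider, intended_model):
    """Return (proven, reasons) for a direct smoke-test result.

    metadata: dict with assumed keys 'status', 'provider', 'model', 'fallbackAttempts'.
    """
    reasons = []
    if metadata.get("status") != "ok":
        reasons.append(f"status is {metadata.get('status')!r}, not 'ok'")
    # Truthiness covers both a non-empty list and a non-zero count.
    if metadata.get("fallbackAttempts"):
        reasons.append("fallback attempts were recorded")
    final_route = (metadata.get("provider"), metadata.get("model"))
    if final_route != (intended_provider, intended_model):
        reasons.append(f"final route was {final_route[0]}/{final_route[1]}")
    return (not reasons, reasons)
```

A result of (False, [...]) with status ok is exactly the case described above: the agent answered, but not through the intended primary route.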

Handoff notes

If the upgrade exposed an OpenClaw bug rather than local drift, collect enough information for the next operator or project/support contact. Do not assume the user has any particular external account or wants a public report created.

  • exact version before and after
  • relevant config keys
  • primary model route before and after, including runtime id
  • direct smoke result metadata, especially fallbackAttempts
  • plugin source path actually loaded
  • installed package version and file layout for any failing npm plugin
  • whether the plugin was bundled or globally installed
  • exact update command and selected channel
  • whether external plugins used channel-specific versions or fallbacks
  • service stop/restart messages, especially if the service manager needed a fallback stop/unload path
  • doctor/plugins doctor warning text
  • the specific log lines around startup failure or restart
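
To record the relevant config keys without pasting the whole file, a small extractor is enough. The sketch below assumes the config has already been parsed into nested dicts and that dotted names such as plugins.entries.* correspond to that nesting. Both are assumptions to verify against the actual config file. The function name is hypothetical.

```python
def extract_keys(config, dotted_paths):
    """Return {dotted path: value} for each path present in a nested dict.

    A '*' segment matches every key at that level; missing paths are skipped.
    """
    found = {}

    def walk(node, parts, prefix):
        if not parts:
            found[".".join(prefix)] = node
            return
        if not isinstance(node, dict):
            return
        head, rest = parts[0], parts[1:]
        if head == "*":
            for key, value in node.items():
                walk(value, rest, prefix + [key])
        elif head in node:
            walk(node[head], rest, prefix + [head])

    for path in dotted_paths:
        walk(config, path.split("."), [])
    return found
```

Running it with the keys listed in step 4 before and after the update gives two small dicts that can be compared directly in the handoff notes.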

Sanitize handoff notes before sharing externally:

  • remove hostnames, usernames, IPs, machine names, tokens, account ids, channel ids, and personal job names
  • replace local paths with placeholders such as <state>, <global-openclaw>, and ~/.openclaw
  • summarize private prompt/session contents instead of quoting them
  • keep exact version numbers, package names, model ids, runtime ids, and error classes when they are needed to reproduce the bug
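
A first mechanical pass over the notes can catch the most common leaks before a manual review. This is a rough sketch, not a complete scrubber: the patterns are assumptions about what typical logs contain, and the function name is hypothetical. Hostnames, usernames, and job names have to be supplied explicitly because they cannot be guessed from the text.

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
HOME_PATH = re.compile(r"(?:/home|/Users)/[^/\s]+")
SECRET = re.compile(r"(?i)(token|api[_-]?key|secret|password)(\s*[=:]\s*)\S+")


def sanitize(text, extra_terms=()):
    """Redact home directories, IPv4 addresses, secret values, and named terms."""
    text = HOME_PATH.sub("~", text)
    # Caveat: a four-part version string such as 1.2.3.4 would also match here.
    text = IPV4.sub("<ip>", text)
    text = SECRET.sub(lambda m: f"{m.group(1)}{m.group(2)}<redacted>", text)
    for term in extra_terms:
        if term:
            text = re.sub(re.escape(term), "<redacted>", text, flags=re.IGNORECASE)
    return text
```

For instance, sanitize("/home/alice/.openclaw/logs/gateway.log", extra_terms=["alice"]) yields "~/.openclaw/logs/gateway.log". The output still needs a human read-through, since session contents and channel ids have no fixed shape.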

For concrete regression patterns and example symptoms, read references/failure-patterns.md.

Updating this skill

When another operator or agent learns something new from a different OpenClaw host:

  • do not delete existing workflow steps unless they are clearly wrong
  • do not replace an existing failure pattern with a narrower one
  • prefer additive updates over rewrites
  • add new regression patterns to references/failure-patterns.md
  • only tighten the main workflow in this file if the new lesson changes the recommended audit order for most hosts

If a new finding is host-specific or uncertain, add it as a new failure pattern with:

  • symptom
  • what to inspect
  • why it matters

Do not silently erase older patterns just because the current host did not hit them.

