WebChat Voice Full Stack

Meta-installer that orchestrates three standalone skills in the correct order:

faster-whisper-local-service — local STT backend (HTTP on 127.0.0.1:18790)
webchat-https-proxy — HTTPS/WSS reverse proxy for Control UI + WebSocket + transcription
webchat-voice-gui — mic button, VU meter, keyboard shortcuts, i18n for WebChat

Prerequisites

All three skills must be installed before running this meta-installer:

npx clawhub install faster-whisper-local-service
npx clawhub install webchat-https-proxy
npx clawhub install webchat-voice-gui

Additionally required on the system:

Python 3.10+
gst-launch-1.0 (GStreamer, from OS packages)
Internet access on first run (model download ~1.5 GB for medium)

Deploy

bash scripts/deploy.sh

Optional overrides (passed through to downstream scripts):

VOICE_HOST=10.0.0.42 VOICE_HTTPS_PORT=8443 TRANSCRIBE_PORT=18790 WHISPER_LANGUAGE=auto bash scripts/deploy.sh

What this does (via downstream scripts)

This skill does not contain deployment logic itself. It calls deploy.sh from each sub-skill:

Step 1: faster-whisper-local-service

Creates Python venv, installs faster-whisper==1.1.1
Writes transcribe-server.py with input validation (magic-byte check, size limit)
Creates systemd user service openclaw-transcribe.service
Downloads model weights on first run (~1.5 GB for medium)

Step 2: webchat-https-proxy

Copies https-server.py to workspace
Adds HTTPS origin to gateway.controlUi.allowedOrigins
Creates systemd user service openclaw-voice-https.service
Auto-generates self-signed TLS cert (TLS 1.2+ enforced)

Step 3: webchat-voice-gui

Copies voice-input.js and injects <script> tag into Control UI
Installs gateway startup hook for update safety
Optional interactive language selection

For full details, security notes, and uninstall instructions, see each skill's SKILL.md.

Security posture (why these changes are expected)

This is a meta-installer, so it coordinates downstream skills and applies only the minimum required local changes:

Persistence: creates user-level systemd services so STT/proxy survive reboot (openclaw-transcribe, openclaw-voice-https)
UI enablement: injects one explicit <script> tag for voice-input.js in Control UI
Gateway compatibility: appends one HTTPS origin to gateway.controlUi.allowedOrigins

Safety characteristics:

all changes are documented and reversible via uninstall scripts
no root/sudo required (user scope only)
no hidden background tasks beyond documented services
no outbound telemetry or data exfiltration behavior

Integrity verification

Before executing any sub-skill script, deploy.sh verifies SHA256 checksums of all sub-skill scripts against scripts/checksums.sha256. If any script was modified after installation (e.g. by a registry update or tampering), deployment aborts with a clear error.

Workflow:

npx clawhub install <sub-skill> — fetch from registry
Audit the scripts manually or via code review
bash scripts/rehash.sh — record trusted checksums
bash scripts/deploy.sh — verify checksums, then deploy

Dry-run verification (no deployment):

VERIFY_ONLY=true bash scripts/deploy.sh

After a sub-skill update:

Review the changed scripts
bash scripts/rehash.sh to update the trusted baseline
Commit the updated checksums.sha256

Verify

bash scripts/status.sh

Uninstall

Uninstall each skill separately (in reverse order):

# 1. Voice GUI (hook, UI injection, workspace files)
bash skills/webchat-voice-gui/scripts/uninstall.sh

# 2. HTTPS Proxy (service, gateway config, certs)
bash skills/webchat-https-proxy/scripts/uninstall.sh

# 3. STT Backend (service, venv)
systemctl --user stop openclaw-transcribe.service
systemctl --user disable openclaw-transcribe.service
rm -f ~/.config/systemd/user/openclaw-transcribe.service
systemctl --user daemon-reload

Notes

This meta-skill is a convenience wrapper. All actual logic lives in the three sub-skills.
Review each sub-skill's scripts and security notes before running.
The WORKSPACE and SKILLS_DIR paths are configurable via environment variables.