webchat-voice-gui

Voice input and microphone button for the OpenClaw WebChat Control UI. Adds a mic button to the chat, records audio via the browser's MediaRecorder, transcribes it locally via faster-whisper, and injects the text into the conversation. A gateway startup hook re-injects the script after OpenClaw updates. A real-time VU meter shows voice activity. Supports Push-to-Talk (hold to speak) and Toggle mode (click to start and stop), switchable via double-click. Keyboard shortcuts: Ctrl+Space for PTT, Ctrl+Shift+M for continuous recording, Ctrl+Shift+B for live transcription [beta]. Localized UI (English, German, and Chinese built in; extensible). Speech-to-text is fully local, with no API costs.

Keywords: voice input, microphone, WebChat, Control UI, speech to text, STT, local transcription, MediaRecorder, voice button, mic button, push-to-talk, PTT, keyboard shortcut, i18n, localization.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "webchat-voice-gui" with this command: npx skills add neldar/webchat-voice-gui

WebChat Voice GUI

Voice input GUI for OpenClaw WebChat Control UI:

  • Mic button with idle/recording/processing states
  • Real-time VU meter: button shadow/scale reacts to voice level
  • Push-to-Talk: hold mic button to record, release to send (default mode)
  • Toggle mode: click to start, click to stop (switch via double-click on mic button)
  • Keyboard shortcuts: Ctrl+Space Push-to-Talk, Ctrl+Shift+M start/stop continuous recording, Ctrl+Shift+B live transcription [beta]
  • Localized UI: auto-detects browser language (English, German, Chinese built-in), customizable
  • Gateway startup hook re-injects script after openclaw update
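
As a rough illustration of the VU meter item above, the sketch below computes a root-mean-square level from a block of audio samples and maps it to a button scale factor. The function names, the gain, and the maximum scale are assumptions made for this example; they are not taken from voice-input.js. In a browser, the samples could come from the Web Audio `AnalyserNode.getFloatTimeDomainData()` method.

```javascript
// Hypothetical helpers for illustration; not the skill's actual code.

// Root-mean-square level of a block of samples in the range [-1, 1].
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length);
}

// Map a level to a scale factor clamped to [1, maxScale].
// gain and maxScale are assumed values, chosen only for this sketch.
function levelToScale(level, maxScale = 1.3, gain = 4) {
  const boosted = Math.min(1, Math.max(0, level * gain));
  return 1 + boosted * (maxScale - 1);
}
```

Silence keeps the button at its normal size (`levelToScale(0)` is 1), and louder input grows it up to the assumed maximum.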

Prerequisites

  1. webchat-https-proxy — HTTPS/WSS reverse proxy must be deployed and running.
  2. faster-whisper-local-service — Local STT backend on port 18790.

Verify:

systemctl --user is-active openclaw-voice-https.service
systemctl --user is-active openclaw-transcribe.service

Deploy

bash scripts/deploy.sh

With language override:

VOICE_LANG=de bash scripts/deploy.sh

When run interactively without VOICE_LANG, the script will ask you to choose a UI language.

This script is idempotent.

Quick verify

bash scripts/status.sh

Security Notes

Client-side JS (voice-input.js)

  • No dynamic code execution: No eval(), new Function(), or innerHTML with user data.
  • HTTPS-first: Transcription requests use same-origin /transcribe when served over HTTPS. Only falls back to http://127.0.0.1:18790 in local dev.
  • No external servers: Audio is never sent outside the local machine.
  • Auth forwarding: Bearer token from Control UI is forwarded to /transcribe proxy.
  • Uses textContent for all toast messages (no XSS vector).
  • Bounded memory: Continuous recording mode enforces a 120-chunk limit (~2 minutes), preventing unbounded memory growth.
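
The endpoint selection, auth forwarding, and chunk limit described above could look roughly like the following. This is a minimal sketch under stated assumptions: all function names are hypothetical, the source does not say which path is used on the fallback host, and it does not say what happens when the 120-chunk limit is reached (this sketch drops the oldest chunk).

```javascript
// Hypothetical helpers for illustration; not the skill's actual code.
const LOCAL_FALLBACK = "http://127.0.0.1:18790"; // address from the notes above
const MAX_CHUNKS = 120;                          // limit from the notes above

// Use the same-origin proxy over HTTPS, otherwise the local backend.
// The path on the fallback host is not stated in the source, so none is added.
function transcribeTarget(pageProtocol) {
  return pageProtocol === "https:" ? "/transcribe" : LOCAL_FALLBACK;
}

// Build request headers that forward a bearer token when one is present.
function authHeaders(token) {
  return token ? { Authorization: `Bearer ${token}` } : {};
}

// Append a chunk and keep at most `limit` entries.
// Dropping the oldest chunk is an assumption of this sketch.
function pushChunk(chunks, chunk, limit = MAX_CHUNKS) {
  chunks.push(chunk);
  while (chunks.length > limit) chunks.shift();
  return chunks;
}
```

In a browser, `transcribeTarget(window.location.protocol)` would pick the target for the current page.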

Deployment scripts

  • Language input validated: VOICE_LANG must match ^([a-zA-Z]{2,5}(-[a-zA-Z]{2,5})?|auto)$ — prevents injection via sed.
  • Robust path detection: All scripts validate Control UI directory exists before modifying files.
  • Gateway hook: Uses execFileSync with array args — no shell interpolation. Script path derived from __dirname, not user input.
  • Idempotent: All scripts safe to run repeatedly.
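
To illustrate the first and third points above, the sketch below checks a language value against the quoted pattern and runs a program through Node's `execFileSync` with an argument array. The regular expression is copied from the note above; the function names are hypothetical and the code is not taken from the skill's scripts or hook.

```javascript
const { execFileSync } = require("node:child_process");

// Pattern copied from the note above.
const VOICE_LANG_PATTERN = /^([a-zA-Z]{2,5}(-[a-zA-Z]{2,5})?|auto)$/;

// Hypothetical helper: accept only values that match the pattern exactly.
function isValidVoiceLang(value) {
  return typeof value === "string" && VOICE_LANG_PATTERN.test(value);
}

// Hypothetical helper: run a program with an argument array.
// execFileSync does not spawn a shell by default, so characters such as
// ';' or '$' inside an argument are passed through literally.
function runWithoutShell(file, args) {
  return execFileSync(file, args, { encoding: "utf8" });
}
```

Values such as `de`, `en-US`, and `auto` pass the check, while a string like `de; rm -rf /` is rejected before it could reach `sed`. As a usage example, `runWithoutShell(process.execPath, ["--version"])` returns Node's version string.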

No data exfiltration

  • No network calls to external hosts from the JS or the scripts; transcription requests stay on the local machine.
  • No telemetry, analytics, or tracking.

What this skill modifies

| What | Path | Action |
|------|------|--------|
| Control UI HTML | <npm-global>/openclaw/dist/control-ui/index.html | Adds <script> tag for voice-input.js |
| Control UI asset | <npm-global>/openclaw/dist/control-ui/assets/voice-input.js | Copies mic button JS |
| Gateway hook | ~/.openclaw/hooks/voice-input-inject/ | Installs startup hook that re-injects JS after updates |
| Workspace files | ~/.openclaw/workspace/voice-input/ | Copies voice-input.js, i18n.json |

Mic Button Controls

| Action | Effect |
|--------|--------|
| Hold (PTT mode) | Record while held, transcribe on release |
| Click (Toggle mode) | Start recording / stop and transcribe |
| Double-click | Switch between PTT and Toggle mode |
| Right-click | Toggle beep sound on/off |
| Ctrl+Space (hold) | Push-to-Talk via keyboard |
| Ctrl+Shift+M | Start/stop recording |
| Ctrl+Shift+B | Start/stop live transcription [beta] |
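
The mouse controls listed above can be modeled as a small state machine. The sketch below is one possible way to express it, assuming a hypothetical state shape of `{ mode, recording }` and simplified event names. How the real script tells a click apart from a double-click is not described in the source, and keyboard shortcuts and the beep toggle are left out.

```javascript
// Hypothetical model for illustration; not the skill's actual code.
// State shape (assumed): { mode: "ptt" | "toggle", recording: boolean }

function initialState() {
  return { mode: "ptt", recording: false }; // PTT is the default mode
}

// Return the next state for a simplified event name.
function nextState(state, event) {
  switch (event) {
    case "dblclick": // switch between PTT and Toggle mode
      return { mode: state.mode === "ptt" ? "toggle" : "ptt", recording: false };
    case "pointerdown": // PTT: record while held
      return state.mode === "ptt" ? { ...state, recording: true } : state;
    case "pointerup": // PTT: stop on release
      return state.mode === "ptt" ? { ...state, recording: false } : state;
    case "click": // Toggle: start or stop
      return state.mode === "toggle" ? { ...state, recording: !state.recording } : state;
    default:
      return state;
  }
}
```

Keeping the transitions in a pure function makes each row of the controls table easy to check in isolation.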

Language / i18n

Auto-detects browser language. Built-in: English (en), German (de), Chinese (zh).

Override in browser console:

localStorage.setItem('oc-voice-lang', 'de');  // force German
localStorage.removeItem('oc-voice-lang');      // back to auto-detect

See assets/i18n.json for all translation keys.
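
The override-then-auto-detect behavior described above might be resolved along these lines. This is a sketch only: the function name is hypothetical, and falling back to English for unsupported languages is an assumption, since the source does not state the fallback.

```javascript
// Hypothetical helper for illustration; not the skill's actual code.
const BUILT_IN = ["en", "de", "zh"]; // built-in languages listed above

// Prefer a stored override, then the browser language, then English.
// The English fallback is an assumption of this sketch.
function resolveLanguage(storedOverride, browserLanguage, supported = BUILT_IN) {
  for (const candidate of [storedOverride, browserLanguage]) {
    if (!candidate) continue;
    // Reduce tags such as "de-DE" to their primary subtag "de".
    const primary = candidate.toLowerCase().split("-")[0];
    if (supported.includes(primary)) return primary;
  }
  return "en";
}
```

In a browser, this could be called as `resolveLanguage(localStorage.getItem('oc-voice-lang'), navigator.language)`.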

Uninstall

bash scripts/uninstall.sh

This removes the UI injection, the hook, and the workspace files. It does not touch the HTTPS proxy or the faster-whisper backend; uninstall those separately.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
