voiceclaw-jp

Voice conversation interface for OpenClaw using wake word detection, streaming LLM responses, and text-to-speech. Use when a user wants to talk to their OpenClaw agent by voice, set up a voice assistant, or add speech input/output to OpenClaw. Supports configurable wake words, VOICEVOX TTS, and sentence-level streaming for low-latency responses.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "voiceclaw-jp" with this command: npx skills add kentoku24/voiceclaw-jp

voiceclaw

Voice conversation skill for OpenClaw: wake word → STT → LLM (streaming) → TTS → playback.

Requirements

  • OpenClaw running locally (gateway with chatCompletions enabled)
  • Node.js 18+
  • VOICEVOX running on localhost:50021 (downloaded and installed separately)
  • Chrome/Edge (Web Speech API for STT)
  • HTTPS for remote mic access (localhost works without HTTPS)

Quick Start

# Install
git clone https://github.com/kentoku24/voiceclaw.git
cd voiceclaw
npm install

# Start (no .env needed if OpenClaw is running locally)
npm start
# → [voiceclaw] OpenClaw config loaded from ~/.openclaw/openclaw.json
# → [voiceclaw] listening on http://127.0.0.1:8788

# Open browser (the open command is macOS-specific; on Linux use xdg-open)
open http://127.0.0.1:8788

Press 開始 ("Start"), say the wake word (default: アリス, "Alice"), then speak your command.
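The wake-word step can be sketched as a small pure function. This is only an illustration of the idea, not the voiceclaw implementation: the function name extractCommand and the set of punctuation it strips are assumptions. It finds the earliest wake word in a transcript and returns whatever was said after it.

```typescript
// Hypothetical helper, not taken from the voiceclaw source.
// Returns the text spoken after the first wake word, or null if none is present.
function extractCommand(transcript: string, wakeWords: string[]): string | null {
  // Check longer wake words first so "アリスちゃん" wins over its prefix "アリス".
  const ordered = wakeWords.slice().sort((a, b) => b.length - a.length);
  let best: { index: number; length: number } | null = null;
  for (const word of ordered) {
    if (word.length === 0) continue;
    const index = transcript.indexOf(word);
    if (index === -1) continue;
    // Keep the earliest match; on a tie the longer word (seen first) is kept.
    if (best === null || index < best.index) {
      best = { index, length: word.length };
    }
  }
  if (best === null) return null;
  // Drop the wake word plus any leading whitespace or punctuation (assumed set).
  return transcript
    .slice(best.index + best.length)
    .replace(/^[\s、,。.!！?？]+/, "")
    .trim();
}
```

A caller would typically ignore a null result and keep listening, and treat an empty string as "wake word heard, command still to come".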

Configuration

All settings are optional. Set in .env or environment variables:

| Variable | Default | Description |
|---|---|---|
| WAKE_WORDS | アリスちゃん,アリス,... | Comma-separated wake words |
| STT_LANG | ja-JP | Speech recognition language |
| OPENCLAW_MODEL | openclaw | LLM model name |
| VOICEVOX_URL | http://127.0.0.1:50021 | VOICEVOX endpoint |
| VOICEVOX_SPEAKER | 1 | VOICEVOX speaker ID |
| HOST | 127.0.0.1 | Server bind address |
| PORT | 8788 | Server port |

Gateway token is auto-detected from ~/.openclaw/openclaw.json. Override with OPENCLAW_GATEWAY_TOKEN if needed.
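As a rough sketch of how these settings might be resolved, the function below reads an environment-like object and falls back to the documented defaults. The names VoiceclawConfig, loadConfig, and toInt are hypothetical, and the WAKE_WORDS fallback contains only the two defaults shown above (the full default list is elided in the source).

```typescript
// Hypothetical config shape; field names are illustrative, not from voiceclaw.
interface VoiceclawConfig {
  wakeWords: string[];
  sttLang: string;
  model: string;
  voicevoxUrl: string;
  voicevoxSpeaker: number;
  host: string;
  port: number;
}

// Parse an integer setting, falling back when it is missing or not numeric.
function toInt(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value ?? "", 10);
  return isNaN(parsed) ? fallback : parsed;
}

// Resolve settings from an env-like record, using the documented defaults.
function loadConfig(env: Record<string, string | undefined>): VoiceclawConfig {
  // Assumption: only the two wake words visible in the docs are used as fallback.
  const rawWakeWords = env["WAKE_WORDS"] ?? "アリスちゃん,アリス";
  return {
    wakeWords: rawWakeWords
      .split(",")
      .map((w) => w.trim())
      .filter((w) => w.length > 0),
    sttLang: env["STT_LANG"] ?? "ja-JP",
    model: env["OPENCLAW_MODEL"] ?? "openclaw",
    voicevoxUrl: env["VOICEVOX_URL"] ?? "http://127.0.0.1:50021",
    voicevoxSpeaker: toInt(env["VOICEVOX_SPEAKER"], 1),
    host: env["HOST"] ?? "127.0.0.1",
    port: toInt(env["PORT"], 8788),
  };
}
```

In a Node.js process this would be called as loadConfig(process.env); passing the record in explicitly keeps the function easy to test.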

Architecture

Wake word (browser STT) → voiceclaw server → OpenClaw Gateway (streaming)
                                           → sentence-level TTS (VOICEVOX)
                                           → audio playback (Web Audio API)

See docs/architecture.md for the full sequence diagram.
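The sentence-level step is what keeps latency low: each sentence can be sent to TTS as soon as it is complete, without waiting for the whole LLM reply. A minimal sketch of that buffering follows. The class name SentenceBuffer and the choice of sentence terminators are assumptions; the actual splitting rules in voiceclaw may differ.

```typescript
// Hypothetical sketch of sentence-level buffering for streamed LLM text.
class SentenceBuffer {
  private pending = "";

  // Append a streamed chunk and return every sentence completed so far.
  push(chunk: string): string[] {
    this.pending += chunk;
    const sentences: string[] = [];
    // Assumed terminators: Japanese and ASCII sentence-ending marks, plus newline.
    const boundary = /[。！？!?\n]/;
    let match = boundary.exec(this.pending);
    while (match !== null) {
      const end = match.index + 1;
      const sentence = this.pending.slice(0, end).trim();
      if (sentence.length > 0) sentences.push(sentence);
      this.pending = this.pending.slice(end);
      match = boundary.exec(this.pending);
    }
    return sentences;
  }

  // Return any trailing text once the stream ends, or null if nothing is left.
  flush(): string | null {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? rest : null;
  }
}
```

Each string returned by push would be handed to TTS immediately, and flush covers a final sentence that arrives without a terminator.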

API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /api/config | Client-safe settings (wake words, STT lang) |
| POST | /api/chat-stream | Streaming LLM → sentence-level SSE |
| POST | /api/chat | Non-streaming LLM (fallback) |
| POST | /api/tts | Text → VOICEVOX → WAV audio |
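Since /api/chat-stream responds with server-sent events, a client has to split the incoming text into events. The sketch below handles only generic SSE framing (blank-line-separated events, data: lines, comment lines ignored). The request body and the payload format inside each event are not documented here, so the payload is treated as an opaque string. The function name parseSseEvents is hypothetical.

```typescript
// Hypothetical helper: splits buffered SSE text into complete event payloads.
// "rest" holds the trailing incomplete event to prepend to the next chunk.
function parseSseEvents(buffer: string): { events: string[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, "\n");
  // Events are separated by a blank line; the last piece may be incomplete.
  const blocks = normalized.split("\n\n");
  const rest = blocks.pop() ?? "";
  const events: string[] = [];
  for (const block of blocks) {
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      // Only "data:" fields are collected; comments (":") and other fields are skipped.
      if (line.slice(0, 5) === "data:") {
        const value = line.slice(5);
        // A single leading space after the colon is not part of the value.
        dataLines.push(value.charAt(0) === " " ? value.slice(1) : value);
      }
    }
    // Multiple data lines in one event are joined with newlines.
    if (dataLines.length > 0) events.push(dataLines.join("\n"));
  }
  return { events, rest };
}
```

A streaming client would append each decoded network chunk to rest, call parseSseEvents again, and process the returned events in order.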
