asr-claw

Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs.

Install skill "asr-claw" with this command: npx skills add dionren/asr-claw

Speech recognition CLI for AI agent automation. Transcribe audio streams from stdin, files, or URLs with multiple ASR engines — local and cloud.

Triggers

  • User wants to transcribe audio, speech, or voice to text
  • User needs speech recognition or ASR
  • User wants to convert audio/voice recordings to text
  • User wants to monitor live audio / livestream speech
  • User asks in Chinese about speech recognition, speech-to-text, transcription, or livestream speech (语音识别, 语音转文字, 转写, 直播语音)
  • adb-claw audio capture output needs to be transcribed
  • User wants subtitles (SRT/VTT) generated from audio

Binary

The asr-claw binary is located at ${CLAUDE_PLUGIN_ROOT}/bin/asr-claw.

If it does not exist, the SessionStart hook will build or download it automatically.

Setup

Quick Start (Mac)

# Install the qwen-asr engine (builds C binary + downloads 0.6B model ~1.9GB)
asr-claw engines install qwen-asr

# Verify
asr-claw engines list
asr-claw doctor

OpenClaw Setup

After installing the skill via ClawHub, configure settings:

# Set default language (default: zh)
claw config set asr-claw.default_lang en

# Use a larger model
claw config set asr-claw.model Qwen/Qwen3-ASR-1.7B

# For China users — set HuggingFace mirror
claw config set asr-claw.hf_mirror https://hf-mirror.com

# Custom model path (e.g., shared NAS)
claw config set asr-claw.model_path /mnt/models/Qwen3-ASR-0.6B

# Re-run install after changing model settings
asr-claw engines install qwen-asr

Settings are stored in ~/.asr-claw/config.yaml:

default:
  engine: qwen-asr
  lang: zh
  format: json

engines:
  qwen-asr:
    binary: ~/.asr-claw/bin/qwen-asr
    model_path: ~/.asr-claw/models/Qwen3-ASR-0.6B

Cloud Engines (no local model needed)

# OpenAI Whisper API
export OPENAI_API_KEY=sk-...
asr-claw transcribe --file audio.wav --engine openai

# Volcengine Doubao (火山引擎)
export DOUBAO_API_KEY=...
asr-claw transcribe --file audio.wav --engine doubao

# Deepgram (native streaming)
export DEEPGRAM_API_KEY=...
asr-claw transcribe --file audio.wav --engine deepgram
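When an agent has to decide which cloud engine it can use, one option is to check which of the API key variables above are set. The following is a minimal sketch, not part of asr-claw itself: the helper name `configured_cloud_engines` and the mapping constant are hypothetical, and only the engine names and environment variable names come from the examples above.

```python
import os
from typing import List, Mapping

# Engine name -> environment variable, taken from the examples above.
CLOUD_ENGINE_KEYS = {
    "openai": "OPENAI_API_KEY",
    "doubao": "DOUBAO_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
}


def configured_cloud_engines(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the cloud engines whose API key variable is set and non-empty.

    Hypothetical helper: it only inspects the environment and does not
    verify that a key is valid.
    """
    return [engine for engine, var in CLOUD_ENGINE_KEYS.items() if env.get(var)]
```

An empty result suggests falling back to a local engine; `asr-claw doctor` remains the documented way to check the environment.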

Commands

transcribe — Core: audio to text

# File transcription
asr-claw transcribe --file meeting.wav --lang zh

# Pipe from stdin
cat audio.wav | asr-claw transcribe --lang zh

# Streaming (real-time, from adb-claw or ffmpeg)
adb-claw audio capture --stream --duration 60000 | asr-claw transcribe --stream --lang zh

# Subtitle output
asr-claw transcribe --file lecture.wav --format srt > lecture.srt
asr-claw transcribe --file lecture.wav --format vtt > lecture.vtt

# Specify engine
asr-claw transcribe --file audio.wav --engine whisper --lang en

Flags:

| Flag | Default | Description |
|---|---|---|
| --file <path> | stdin | Input audio file |
| --stream | false | Streaming mode (real-time) |
| --lang <code> | zh | Language code |
| --engine <name> | auto | ASR engine |
| --format <fmt> | json | Output: json, text, srt, vtt |
| --chunk <sec> | 0 | Fixed-time chunking (disables VAD) |
| --rate <hz> | 16000 | Sample rate for raw PCM input |
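For callers that launch asr-claw from a script, the flags above can be assembled into an argument list. This is a rough sketch under a few assumptions: the function name `build_transcribe_cmd` is hypothetical, the binary is assumed to be reachable as `asr-claw` on PATH (or passed explicitly), and only flags listed in the table are used.

```python
from typing import List, Optional

# Output formats listed in the flags table.
VALID_FORMATS = {"json", "text", "srt", "vtt"}


def build_transcribe_cmd(
    file: Optional[str] = None,
    lang: str = "zh",
    engine: Optional[str] = None,
    fmt: str = "json",
    stream: bool = False,
    binary: str = "asr-claw",  # assumption: binary is on PATH
) -> List[str]:
    """Build an argv list for `asr-claw transcribe` from documented flags."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = [binary, "transcribe", "--lang", lang, "--format", fmt]
    if file is not None:
        # Without --file, asr-claw reads audio from stdin.
        cmd += ["--file", file]
    if engine is not None:
        # Omitting --engine leaves the documented default ("auto").
        cmd += ["--engine", engine]
    if stream:
        cmd.append("--stream")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.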

engines — Manage ASR engines

asr-claw engines list                    # List all engines + status
asr-claw engines install qwen-asr       # Install local engine (Mac)
asr-claw engines info qwen-asr          # Engine details
asr-claw engines start qwen3-asr        # Start vLLM service engine
asr-claw engines stop qwen3-asr         # Stop service engine
asr-claw engines status                  # Running engines

doctor — Environment check

asr-claw doctor    # Check platform, engines, dependencies

Engine Matrix

| Engine | Type | Mac | GPU | Streaming | Install |
|---|---|---|---|---|---|
| qwen-asr | Local CLI | Yes | No (Accelerate) | VAD | engines install qwen-asr |
| qwen3-asr | vLLM Service | No | Yes (CUDA) | Native | engines start qwen3-asr |
| whisper | Local CLI | Yes | No | VAD | Manual |
| doubao | Cloud API | Yes | | No | Set DOUBAO_API_KEY |
| openai | Cloud API | Yes | | No | Set OPENAI_API_KEY |
| deepgram | Cloud API | Yes | | Native | Set DEEPGRAM_API_KEY |

Output Format

All commands output a JSON envelope:

{
  "ok": true,
  "command": "transcribe",
  "data": {
    "segments": [{"index": 0, "start": 0.0, "end": 2.5, "text": "..."}],
    "full_text": "...",
    "engine": "qwen-asr",
    "audio_duration_sec": 5.5
  },
  "duration_ms": 1230,
  "timestamp": "2026-03-13T10:00:00Z"
}
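A script consuming this envelope would typically check `ok` before reading `data`. The sketch below shows one way to do that with the standard library. The function name `extract_full_text` is hypothetical, and the failure shape is not documented here, so the `error` key it looks up is an assumption.

```python
import json


def extract_full_text(raw: str) -> str:
    """Parse an asr-claw JSON envelope and return data.full_text."""
    envelope = json.loads(raw)
    if not envelope.get("ok"):
        # Assumption: a failed call sets "ok" to false; the "error" key
        # is hypothetical and not confirmed by the documentation.
        detail = envelope.get("error", "unknown error")
        raise RuntimeError(f"asr-claw reported failure: {detail}")
    return envelope["data"]["full_text"]


# Sample envelope in the shape shown above (text values are placeholders).
SAMPLE = """
{
  "ok": true,
  "command": "transcribe",
  "data": {
    "segments": [{"index": 0, "start": 0.0, "end": 2.5, "text": "hello world"}],
    "full_text": "hello world",
    "engine": "qwen-asr",
    "audio_duration_sec": 5.5
  },
  "duration_ms": 1230,
  "timestamp": "2026-03-13T10:00:00Z"
}
"""

print(extract_full_text(SAMPLE))
```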

Use -o text for plain-text output, or -o quiet to suppress output.
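The `start` and `end` fields of each segment in the JSON envelope are in seconds, while SRT cues use `HH:MM:SS,mmm` timestamps and are numbered from 1. asr-claw already produces subtitles via --format srt; the sketch below only illustrates how the segment fields map onto cues if the JSON has to be post-processed. The function names are hypothetical, and the output may differ in detail from what asr-claw itself writes.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format a non-negative time in seconds as an SRT timestamp."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Render envelope segments as SRT cues separated by blank lines."""
    cues = []
    for number, seg in enumerate(segments, start=1):
        # SRT cue numbers start at 1, whereas segment "index" starts at 0.
        cues.append(
            f"{number}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text']}\n"
        )
    return "\n".join(cues)
```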

With adb-claw

# Real-time transcription from Android device
adb-claw audio capture --stream --duration 60000 | asr-claw transcribe --stream --lang zh

# Record then transcribe
adb-claw audio capture --duration 30000 --file recording.wav
asr-claw transcribe --file recording.wav --lang zh

# Save audio + transcribe simultaneously
adb-claw audio capture --stream --duration 0 | tee backup.wav | asr-claw transcribe --stream
