openclaw-mlx-audio

Local TTS/STT integration for OpenClaw using mlx-audio - Zero API keys, Zero cloud dependency

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "openclaw-mlx-audio" with this command: npx skills add gandli-2025/openclaw-mlx-audio

OpenClaw MLX Audio

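Local multilingual text-to-speech (TTS) and speech-to-text (STT) that runs entirely on Apple Silicon devices. No cloud service is involved, so your data stays private.

Features

  • 🗣️ TTS (text-to-speech): supports Chinese, English, and other languages
  • 🎤 STT (speech-to-text): high-accuracy speech recognition
  • 🎭 Voice cloning: clone a voice from a reference audio clip
  • 🔒 Fully local: no API key required, and data never leaves the device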

Installation

# Install dependencies
brew install ffmpeg uv
uv tool install mlx-audio --prerelease=allow

# Install the plugin
cp -r openclaw-mlx-audio ~/.openclaw/extensions/

# Restart OpenClaw
openclaw gateway restart

Usage

TTS commands

# Check status
/mlx-tts status

# Generate a test clip
/mlx-tts test "Hello, this is a test clip"

# List models
/mlx-tts models

STT commands

# Check status
/mlx-stt status

# Transcribe an audio file
/mlx-stt transcribe /path/to/audio.wav

# List models
/mlx-stt models

Tool calls

TTS:

{
  "tool": "mlx_tts",
  "parameters": {
    "action": "generate",
    "text": "Hello World",
    "outputPath": "/tmp/speech.mp3"
  }
}

STT:

{
  "tool": "mlx_stt",
  "parameters": {
    "action": "transcribe",
    "audioPath": "/tmp/audio.wav",
    "language": "zh"
  }
}
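
The two tool calls above share the same envelope, so a caller could build them with small helpers. The following Python sketch only constructs and serializes the JSON shown above. The helper names `tts_call` and `stt_call` are hypothetical, and how the payload is delivered to OpenClaw is not covered here.

```python
import json


def tts_call(text: str, output_path: str) -> dict:
    """Build an mlx_tts "generate" call mirroring the TTS example above."""
    return {
        "tool": "mlx_tts",
        "parameters": {
            "action": "generate",
            "text": text,
            "outputPath": output_path,
        },
    }


def stt_call(audio_path: str, language: str = "zh") -> dict:
    """Build an mlx_stt "transcribe" call mirroring the STT example above."""
    return {
        "tool": "mlx_stt",
        "parameters": {
            "action": "transcribe",
            "audioPath": audio_path,
            "language": language,
        },
    }


if __name__ == "__main__":
    # Print both payloads as formatted JSON for inspection.
    print(json.dumps(tts_call("Hello World", "/tmp/speech.mp3"), indent=2))
    print(json.dumps(stt_call("/tmp/audio.wav"), indent=2))
```

Keeping the payloads as plain dictionaries makes them easy to compare against the JSON examples before sending them anywhere.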

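Supported models

TTS models

| Model | Languages | Speed | Quality |
|---|---|---|---|
| mlx-community/Kokoro-82M-bf16 | 8+ | ⚡⚡⚡ | Good |
| mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16 | ZH/EN/JA/KO | ⚡⚡ | Better |
| mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16 | ZH/EN/JA/KO | (not listed) | Best |

STT models

| Model | Languages | Speed | Accuracy |
|---|---|---|---|
| mlx-community/whisper-large-v3-turbo-asr-fp16 | 99+ | ⚡⚡⚡ | Good |
| mlx-community/Qwen3-ASR-1.7B-8bit | ZH/EN/JA/KO | ⚡⚡ | Better |
| mlx-community/whisper-large-v3 | 99+ | ⚡⚡ | Best |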
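
One way to use the model lists above is to shortlist models by language. The Python sketch below copies the model IDs and language columns from the lists. The `candidates` helper is hypothetical and not part of the plugin. Because entries such as "8+" and "99+" give only a count, those models are returned separately as unverified instead of being assumed to cover a given language.

```python
# Each entry: (model ID, explicitly listed language codes, or None when the
# list above only gives a count such as "8+" or "99+").
QWEN_LANGS = {"zh", "en", "ja", "ko"}

TTS_MODELS = [
    ("mlx-community/Kokoro-82M-bf16", None),
    ("mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16", QWEN_LANGS),
    ("mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16", QWEN_LANGS),
]

STT_MODELS = [
    ("mlx-community/whisper-large-v3-turbo-asr-fp16", None),
    ("mlx-community/Qwen3-ASR-1.7B-8bit", QWEN_LANGS),
    ("mlx-community/whisper-large-v3", None),
]


def candidates(models, lang: str) -> tuple[list[str], list[str]]:
    """Split models into (confirmed, unverified) for a language code.

    confirmed: the language is explicitly listed for the model.
    unverified: only a language count is given, so check the model card.
    """
    lang = lang.lower()
    confirmed = [m for m, langs in models if langs is not None and lang in langs]
    unverified = [m for m, langs in models if langs is None]
    return confirmed, unverified


if __name__ == "__main__":
    print(candidates(TTS_MODELS, "zh"))
    print(candidates(STT_MODELS, "fr"))
```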

Testing

  • Automated tests: 17 (100% passing)
  • Human testing: 11 tests on Discord
  • Overall rating: ⭐⭐⭐⭐ (3.85/5.0)

Run the tests:

bash test/run_tests.sh

Configuration

Add the following to openclaw.json:

{
  "plugins": {
    "allow": ["@openclaw/mlx-audio"],
    "entries": {
      "@openclaw/mlx-audio": {
        "enabled": true,
        "config": {
          "tts": {
            "enabled": true,
            "model": "mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16",
            "langCode": "zh"
          },
          "stt": {
            "enabled": true,
            "model": "mlx-community/Qwen3-ASR-1.7B-8bit",
            "language": "zh"
          }
        }
      }
    }
  }
}
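
If openclaw.json already configures other plugins, the snippet above has to be merged into the file, not pasted over it. The following Python sketch shows one way to do that. The `merge_plugin_config` and `update_file` helpers are hypothetical, the file path is supplied by the caller because the location of openclaw.json is not stated here, and the sketch assumes the file is plain JSON without comments.

```python
import copy
import json
from pathlib import Path

PLUGIN_ID = "@openclaw/mlx-audio"

# Same values as the configuration example above.
PLUGIN_ENTRY = {
    "enabled": True,
    "config": {
        "tts": {
            "enabled": True,
            "model": "mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16",
            "langCode": "zh",
        },
        "stt": {
            "enabled": True,
            "model": "mlx-community/Qwen3-ASR-1.7B-8bit",
            "language": "zh",
        },
    },
}


def merge_plugin_config(config: dict) -> dict:
    """Return a copy of config with the plugin allowed and configured."""
    merged = copy.deepcopy(config)
    plugins = merged.setdefault("plugins", {})
    allow = plugins.setdefault("allow", [])
    if PLUGIN_ID not in allow:  # avoid duplicate entries on repeated runs
        allow.append(PLUGIN_ID)
    plugins.setdefault("entries", {})[PLUGIN_ID] = copy.deepcopy(PLUGIN_ENTRY)
    return merged


def update_file(path: Path) -> None:
    """Read the JSON file at path (if present), merge, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = merge_plugin_config(config)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Calling `update_file(Path("/path/to/openclaw.json"))` leaves unrelated keys and other plugin entries untouched. Back up the file first, since this overwrites any existing `@openclaw/mlx-audio` entry.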

System requirements

  • macOS on Apple Silicon (M1/M2/M3)
  • Node.js 18+
  • Python 3.10+
  • ffmpeg
  • uv
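
The requirements above can be checked before installation. The following Python sketch is one possible preflight check, not something shipped with the plugin, and the `check_requirements` helper is hypothetical. It only verifies that `node`, `ffmpeg`, and `uv` are on PATH and does not check the Node.js version. It also assumes a native arm64 Python, since a Python running under Rosetta reports `x86_64`.

```python
import platform
import shutil
import sys


def check_requirements(system=None, machine=None, py_version=None,
                       which=shutil.which) -> list[str]:
    """Return a list of problems; an empty list means every check passed.

    The parameters default to the current machine and exist so the
    function can be exercised with fake values.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    py_version = tuple(py_version or sys.version_info[:2])

    problems = []
    if system != "Darwin" or machine != "arm64":
        problems.append(
            f"needs macOS on Apple Silicon, found {system}/{machine}")
    if py_version < (3, 10):
        problems.append(
            f"needs Python 3.10+, found {py_version[0]}.{py_version[1]}")
    # Presence check only; versions of these tools are not inspected.
    for tool in ("node", "ffmpeg", "uv"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    for issue in issues:
        print("missing:", issue)
    sys.exit(1 if issues else 0)
```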

License

MIT

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
