mlx-audio-server

Local 24x7 OpenAI-compatible API server for STT/TTS, powered by MLX on your Mac.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "mlx-audio-server" with this command: npx skills add guoqiao/mlx-audio-server

MLX Audio Server

mlx-audio: An audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon.

guoqiao/tap/mlx-audio-server: Homebrew formula that installs mlx-audio with brew and runs mlx_audio.server as a LaunchAgent service on macOS.

Requirements

mlx: requires macOS on Apple Silicon
brew: Homebrew, used to install any missing dependencies

Installation

bash ${baseDir}/install.sh

This script will:

Install ffmpeg and jq with brew if they are missing.
Install the Homebrew formula mlx-audio-server from guoqiao/tap.
Start the brew service for mlx-audio-server.
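
The three steps above can be sketched roughly as follows. This is only an illustration of the described behavior, not the contents of the real install.sh: the function names `missing_commands` and `install_mlx_audio_server` are hypothetical, and the actual script may check or install things differently.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the steps listed above; the real install.sh may differ.

# Print each command name from the arguments that is NOT found on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Install missing dependencies, then install and start the service.
# (Hypothetical helper; defined here but not called automatically.)
install_mlx_audio_server() {
  local dep
  for dep in $(missing_commands ffmpeg jq); do
    brew install "$dep"
  done
  # A fully qualified formula name lets brew tap guoqiao/tap on demand.
  brew install guoqiao/tap/mlx-audio-server
  brew services start mlx-audio-server
}
```

Calling `install_mlx_audio_server` runs the whole sequence; `missing_commands ffmpeg jq` alone only reports what would be installed.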

Usage

STT / Speech-To-Text (default model: mlx-community/glm-asr-nano-2512-8bit):

# Input is converted to WAV with ffmpeg if it is not already WAV.
# Output is the transcript text only.
bash ${baseDir}/run_stt.sh <audio_or_video_path>
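
For reference, a script like this could talk to the local server along the following lines. This is a minimal sketch under several assumptions that the text above does not confirm: that the server listens on `http://localhost:8000`, that it exposes the OpenAI-style `/v1/audio/transcriptions` route with `file` and `model` form fields, and that the response is JSON with a `text` field. The names `STT_URL`, `STT_MODEL`, `is_wav`, and `transcribe` are hypothetical and not taken from run_stt.sh.

```shell
#!/usr/bin/env bash
# Sketch only: URL, route, and response shape are assumptions (see lead-in).
STT_URL="${STT_URL:-http://localhost:8000/v1/audio/transcriptions}"
STT_MODEL="${STT_MODEL:-mlx-community/glm-asr-nano-2512-8bit}"

# Succeed if the path already has a .wav extension.
is_wav() {
  case "$1" in
    *.wav|*.WAV) return 0 ;;
    *) return 1 ;;
  esac
}

# Convert the input to WAV if needed, upload it, and print the transcript.
transcribe() {
  local src="$1" wav="$1" tmp
  if ! is_wav "$src"; then
    tmp="$(mktemp -d)"
    wav="$tmp/input.wav"
    ffmpeg -loglevel error -y -i "$src" "$wav" || return 1
  fi
  # Multipart upload in the OpenAI style; jq extracts the assumed "text" field.
  curl -sS -f "$STT_URL" -F "file=@$wav" -F "model=$STT_MODEL" | jq -r '.text'
}
```

With the service running, `transcribe ./meeting.mp4` would print only the transcript, matching the behavior described for run_stt.sh.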

TTS / Text-To-Speech (default model: mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16):

# Audio is saved to a tmp dir as `speech.wav` by default, and its path is printed to stdout.
bash ${baseDir}/run_tts.sh "Hello, Human!"
# Or you can specify an output dir.
bash ${baseDir}/run_tts.sh "Hello, Human!" ./output
# Output is the audio path only.

You can run both scripts directly, or use them as a reference for your own integration.
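
If you adapt the TTS script, the core request might look like the sketch below. It assumes, without confirmation from the text above, that the server listens on `http://localhost:8000`, exposes the OpenAI-style `/v1/audio/speech` route, accepts a JSON body with `model`, `input`, and `response_format`, and returns raw audio bytes. The names `TTS_URL`, `TTS_MODEL`, `tts_output_path`, and `speak` are hypothetical and not taken from run_tts.sh.

```shell
#!/usr/bin/env bash
# Sketch only: URL, route, and request fields are assumptions (see lead-in).
TTS_URL="${TTS_URL:-http://localhost:8000/v1/audio/speech}"
TTS_MODEL="${TTS_MODEL:-mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16}"

# Print where speech.wav should go: the given dir, or a fresh temp dir.
tts_output_path() {
  local dir="${1:-}"
  if [ -z "$dir" ]; then
    dir="$(mktemp -d)" || return 1
  else
    mkdir -p "$dir" || return 1
  fi
  printf '%s/speech.wav\n' "$dir"
}

# Synthesize the text, save the audio, and print only the audio path.
speak() {
  local text="$1" out
  out="$(tts_output_path "${2:-}")" || return 1
  # jq builds the JSON body so quotes in the text are escaped safely.
  jq -n --arg model "$TTS_MODEL" --arg input "$text" \
    '{model: $model, input: $input, response_format: "wav"}' |
    curl -sS -f "$TTS_URL" -H 'Content-Type: application/json' \
      --data-binary @- -o "$out" || return 1
  printf '%s\n' "$out"
}
```

For example, `speak "Hello, Human!" ./output` would write `./output/speech.wav` and print that path, mirroring the second invocation above.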

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
