speakturbo-tts

Give your agent the ability to speak to you in real time. Talk to your Claude! Ultra-fast text-to-speech (TTS) voice synthesis with ~90ms latency to first audio and 8 built-in voices for instant voice responses. For voice cloning, use the speak skill.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "speakturbo-tts" with this command: npx skills add emzod/speak-turbo/emzod-speak-turbo-speakturbo

speakturbo - Talk to your Claude!

Give your agent the ability to speak to you in real time. Ultra-fast text-to-speech with ~90ms latency and 8 built-in voices.

Quick Start

# Play immediately - you should hear "Hello world" through your speakers
speakturbo "Hello world"
# Output: ⚡ 92ms → ▶ 93ms → ✓ 1245ms

# Verify it's working by saving to file
speakturbo "Hello world" -o test.wav
ls -lh test.wav  # Should show ~50-100KB file

Output explained: ⚡ = first audio received, ▶ = playback started, ✓ = done
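If an agent wants to log these timings, the status line can be parsed. The following is a minimal sketch, assuming the line always has the shape shown in the sample output (three millisecond values in the order first audio, playback start, done). `parse_status` and its field names are hypothetical helpers, not part of speakturbo.

```python
import re

# Hypothetical field names; the line format is an assumption based only on
# the sample output "⚡ 92ms → ▶ 93ms → ✓ 1245ms".
STATUS_FIELDS = ("first_audio_ms", "playback_start_ms", "done_ms")


def parse_status(line: str) -> dict:
    """Extract the three millisecond timings from a speakturbo status line."""
    values = [int(v) for v in re.findall(r"(\d+)\s*ms", line)]
    if len(values) != len(STATUS_FIELDS):
        raise ValueError(f"unexpected status line: {line!r}")
    return dict(zip(STATUS_FIELDS, values))


print(parse_status("⚡ 92ms → ▶ 93ms → ✓ 1245ms"))
# {'first_audio_ms': 92, 'playback_start_ms': 93, 'done_ms': 1245}
```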

First Run

The first execution takes 2-5 seconds while the daemon starts and loads the model into memory. Subsequent calls are ~90ms to first sound.

# First run (slow - daemon starting)
speakturbo "Starting up"  # ~2-5 seconds

# Second run (fast - daemon already running)
speakturbo "Now I'm fast"  # ~90ms
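One way to hide the startup delay is to trigger it silently at the start of a session. The sketch below is an illustration only: it relies on the `-o` flag (described under Usage) saving to a file instead of playing audio, and it writes into the system temp directory, which the Output Path Security section lists as allowed by default. `warm_up` is a hypothetical helper name.

```python
import os
import shutil
import subprocess
import tempfile


def warm_up(binary: str = "speakturbo") -> bool:
    """Start the daemon without audible output by synthesizing to a temp file."""
    if shutil.which(binary) is None:
        return False  # CLI is not installed or not on PATH
    out_path = os.path.join(tempfile.gettempdir(), "speakturbo-warmup.wav")
    # -o saves to a file with no playback, so the 2-5 s model load happens quietly.
    result = subprocess.run([binary, "warming up", "-o", out_path])
    return result.returncode == 0
```

Calling `warm_up()` once before the first spoken reply should leave the daemon warm for later calls, as long as it has not hit its idle shutdown.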

Usage

# Basic - plays immediately (default voice: alba)
speakturbo "Hello world"

# Save to file (no audio playback)
speakturbo "Hello" -o output.wav

# Save to specific file
speakturbo "Goodbye" -o goodbye.wav

# Quiet mode (suppress status messages, still plays audio)
speakturbo "Hello" -q

# List available voices
speakturbo --list-voices

Available Voices

| Voice | Type |
| --- | --- |
| alba | Female (default) |
| marius | Male |
| javert | Male |
| jean | Male |
| fantine | Female |
| cosette | Female |
| eponine | Female |
| azelma | Female |

Performance

| Metric | Value |
| --- | --- |
| Time to first sound | ~90ms (daemon warm) |
| First run | 2-5s (daemon startup) |
| Real-time factor | ~4x faster |
| Sample rate | 24kHz mono |
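To relate these figures to a saved file, the duration of a WAV can be read with Python's standard `wave` module and divided by the real-time factor. This is a rough sketch that reads "~4x faster" as "about four times faster than real time", which is an interpretation rather than something the source states outright. Both function names are hypothetical.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the playback length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def estimated_generation_seconds(audio_seconds: float, realtime_factor: float = 4.0) -> float:
    """Estimate synthesis time, assuming audio is produced realtime_factor times faster than it plays."""
    return audio_seconds / realtime_factor


# Under that assumption, 10 s of speech would take roughly 2.5 s to generate.
print(estimated_generation_seconds(10.0))
```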

Architecture

speakturbo (Rust CLI, 2.2MB)
    │
    │ HTTP streaming (port 7125)
    ▼
speakturbo-daemon (Python + pocket-tts)
    │
    │ Model in memory, auto-shutdown after 1hr idle
    ▼
Audio playback (rodio)

Text Input

  • Encoding: UTF-8
  • Quotes in text: Use escaping: speakturbo "She said \"hello\""
  • Long text: Supported, streams as it generates
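When the text comes from a program rather than a hand-typed command, quoting is easier to get right in code. A small sketch using only the Python standard library: passing an argument list avoids the shell entirely, and `shlex.quote` covers the case where a single command string is required.

```python
import shlex

text = 'She said "hello"'

# As an argument list there is no shell involved, so no escaping is needed.
argv = ["speakturbo", text]

# If one shell command string is required, shlex.quote does the escaping.
command = f"speakturbo {shlex.quote(text)}"
print(command)
# speakturbo 'She said "hello"'
```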

Output Path Security

The -o flag only writes to directories that are on the allowlist. By default, these are:

  • /tmp and system temp directories
  • Your current working directory
  • ~/.speakturbo/

If you need to write elsewhere, use --allow-dir:

speakturbo "Hello" -o /custom/path/audio.wav --allow-dir /custom/path

To permanently allow a directory, add it to ~/.speakturbo/config:

mkdir -p ~/.speakturbo && echo "/custom/path" >> ~/.speakturbo/config

The config file is one directory per line. Lines starting with # are comments.
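The allowlist idea can be illustrated in a few lines. This sketch is not speakturbo's actual implementation: it assumes blank lines are skipped and that paths are compared after resolving symlinks, neither of which the source confirms. `parse_allowlist` and `is_allowed` are hypothetical names.

```python
from pathlib import Path


def parse_allowlist(config_text: str) -> list[Path]:
    """Parse config text: one directory per line, '#' lines are comments."""
    dirs = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skipping blank lines is an assumption
            continue
        dirs.append(Path(line).expanduser())
    return dirs


def is_allowed(output: str, allowed_dirs: list[Path]) -> bool:
    """Return True if the output path sits inside one of the allowed directories."""
    target = Path(output).expanduser().resolve()
    return any(target.is_relative_to(d.resolve()) for d in allowed_dirs)


dirs = parse_allowlist("# extra output locations\n/custom/path\n")
print(is_allowed("/custom/path/audio.wav", dirs))
```

Comparing whole path components (rather than string prefixes) means a directory such as `/custom/pathology` is not mistaken for `/custom/path`.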

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success (audio played/saved) |
| 1 | Error (daemon connection failed, invalid args) |
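A caller can branch on these codes instead of parsing output. The wrapper below is a minimal sketch: the meanings are copied from the table, any other code is reported as unknown, and `say` and `describe_exit` are hypothetical helper names.

```python
import subprocess

# Meanings copied from the exit code table; other codes are treated as unknown.
EXIT_MEANINGS = {
    0: "success (audio played/saved)",
    1: "error (daemon connection failed, invalid args)",
}


def describe_exit(code: int) -> str:
    """Map an exit code to the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def say(text: str, binary: str = "speakturbo") -> tuple[bool, str]:
    """Run the CLI once and return (ok, human-readable outcome)."""
    try:
        code = subprocess.run([binary, text]).returncode
    except FileNotFoundError:
        return False, f"{binary} not found on PATH"
    return code == 0, describe_exit(code)
```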

When to Use

Use speakturbo when:

  • You need instant audio feedback (~90ms)
  • Speed matters more than voice variety
  • Built-in voices are sufficient

Use speak instead when:

  • You need custom voice cloning (Morgan Freeman, etc.) → speak "text" --voice ~/.chatter/voices/morgan_freeman.wav
  • You need emotion tags like [laugh], [sigh]
  • Quality/variety matters more than speed

See the speak skill documentation for full usage.

Troubleshooting

No audio plays:

# Check daemon is running
curl http://127.0.0.1:7125/health
# Expected: {"status":"ready","voices":["alba","marius",...]}

# Verify by saving to file and playing manually
speakturbo "test" -o /tmp/test.wav
afplay /tmp/test.wav  # macOS
aplay /tmp/test.wav   # Linux
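The same health check can be scripted. This sketch assumes the response is JSON with the `status` and `voices` keys shown in the expected output above; any other keys or status values are unknown here, so anything other than `"ready"` is treated as not ready. `parse_health` and `fetch_health` are hypothetical names.

```python
import json
import urllib.request

HEALTH_URL = "http://127.0.0.1:7125/health"  # endpoint used by the curl check


def parse_health(body: str) -> tuple[bool, list[str]]:
    """Return (ready, voices) from a health response body."""
    data = json.loads(body)
    return data.get("status") == "ready", list(data.get("voices", []))


def fetch_health(url: str = HEALTH_URL, timeout: float = 1.0) -> tuple[bool, list[str]]:
    """Return (ready, voices), or (False, []) if the daemon is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except (OSError, ValueError):  # connection errors, timeouts, malformed JSON
        return False, []
```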

Daemon won't start:

# Check port availability
lsof -i :7125

# Manually kill and restart
pkill -f "daemon_streaming"
speakturbo "test"  # Auto-restarts daemon
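Where `lsof` is not available, a plain TCP connection attempt answers the same question. A minimal sketch with the standard `socket` module; it only tells you that something is listening on the port, not that the listener is the speakturbo daemon.

```python
import socket


def port_in_use(port: int = 7125, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


print(port_in_use())
```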

First run is slow: This is expected. The daemon needs to load the ~100MB model into memory. Subsequent calls will be fast (~90ms).

Daemon Management

The daemon auto-starts on first use and auto-shuts down after 1 hour idle.

# Check status
curl http://127.0.0.1:7125/health

# Manual stop
pkill -f "daemon_streaming"

# View logs
cat /tmp/speakturbo.log
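For a long-running daemon the full log can get noisy, so reading only the last few lines is often enough. A small sketch using the log path from above; `tail_log` is a hypothetical helper and makes no assumption about the log's format.

```python
from collections import deque


def tail_log(path: str = "/tmp/speakturbo.log", lines: int = 20) -> list[str]:
    """Return the last `lines` lines of the log, or [] if the file does not exist."""
    try:
        with open(path, encoding="utf-8", errors="replace") as fh:
            # A bounded deque keeps only the most recent lines while streaming the file.
            return [line.rstrip("\n") for line in deque(fh, maxlen=lines)]
    except FileNotFoundError:
        return []


for entry in tail_log():
    print(entry)
```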

Comparison with speak

| Feature | speakturbo | speak |
| --- | --- | --- |
| Time to first sound | ~90ms | ~4-8s |
| Voice cloning | No | Yes |
| Emotion tags | No | Yes |
| Voices | 8 built-in | Custom wav files |
| Engine | pocket-tts | Chatterbox |

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
