LeanVox TTS/STT API Skill
Authentication
Set the LEANVOX_API_KEY environment variable. Get a key at https://api.leanvox.com (signup includes $1.00 of free credit).
export LEANVOX_API_KEY="your-key-here"
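A wrapper script may want to fail early when the key is missing, rather than find out from a rejected request. The sketch below is one way to do that; require_api_key is a hypothetical helper name, not part of the LeanVox scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with LeanVox): returns non-zero with a
# message on stderr when LEANVOX_API_KEY is unset or empty.
require_api_key() {
  if [ -z "${LEANVOX_API_KEY:-}" ]; then
    echo "LEANVOX_API_KEY is not set; export it before calling the scripts" >&2
    return 1
  fi
}
```

Calling require_api_key || exit 1 at the top of a script stops it before any request is made.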
Model Tiers — Pick the Cheapest That Works
| Tier | Cost | Best For | Voice Support |
|---|---|---|---|
| Standard | $5/1M chars | Fast narration, notifications, bulk | 2 built-in voices (af_heart, am_michael) |
| Pro | $10/1M chars | Expressive, natural, podcasts | 238+ curated voices, plus voice cloning |
| Max | $30/1M chars | Creative, instruction-driven | Describe voice via text prompt |
Default to Standard unless the user needs a specific voice (→ Pro) or voice design from a text description (→ Max).
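The selection rule and the per-tier prices can be turned into a small estimator. This is a minimal sketch: pick_tier, rate_microusd, and estimate_cost are hypothetical helper names, the rates are copied from the table above, and the 100-character floor is the billing minimum listed under Key Constraints. Since $5 per 1M characters is 5 micro-dollars per character, integer arithmetic is enough.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; prices come from the tier table, nothing here calls the API.

# pick_tier NEEDS_SPECIFIC_VOICE NEEDS_VOICE_DESIGN   (each "yes" or "no")
pick_tier() {
  local needs_voice="$1" needs_design="$2"
  if [ "$needs_design" = "yes" ]; then
    echo "max"
  elif [ "$needs_voice" = "yes" ]; then
    echo "pro"
  else
    echo "standard"
  fi
}

# rate_microusd TIER: price per character in micro-dollars ($5/1M chars == 5)
rate_microusd() {
  case "$1" in
    standard) echo 5 ;;
    pro)      echo 10 ;;
    max)      echo 30 ;;
    *) echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

# estimate_cost TIER CHARS: prints the estimated cost in USD with 6 decimals
estimate_cost() {
  local rate micro chars="$2"
  rate="$(rate_microusd "$1")" || return 1
  # Requests shorter than 100 characters are billed as 100.
  if [ "$chars" -lt 100 ]; then
    chars=100
  fi
  micro=$(( rate * chars ))
  printf '%d.%06d\n' $(( micro / 1000000 )) $(( micro % 1000000 ))
}
```

For example, estimate_cost "$(pick_tier no no)" 2500 prices a 2,500-character request on the Standard tier.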
Quick Reference
Generate Speech
scripts/tts.sh "Hello world" --model standard --voice af_heart --output hello.mp3
Transcribe Audio
scripts/stt.sh audio.mp3 # sync (< 5 min)
scripts/stt.sh audio.mp3 --async # async (> 5 min)
Multi-Speaker Dialogue
scripts/dialogue.sh dialogue.json --output conversation.mp3
Voice-Over (transcribe → edit → re-voice)
scripts/voiceover.sh input.mp3 --voice podcast_conversational_female
Browse Voices
scripts/voices.sh --category podcast --gender female
Clone a Voice
scripts/clone.sh reference.wav "Text to speak in cloned voice" --output cloned.mp3
Endpoint Details
For full API reference including all parameters, see references/api-reference.md.
For the complete curated voice catalog, see references/voice-catalog.md.
Key Constraints
- Max text length: 10,000 Unicode characters per request
- Async threshold: Use async for text longer than 5,000 characters or audio files longer than 5 minutes
- Billing minimum: 100 characters (shorter text billed as 100)
- Audio format: MP3, returned as a presigned URL (download it in a separate request; no auth header needed)
- Rate limits: 60 RPM (free), 1,000 RPM (paid)
- Duration estimate: 1,000 characters ≈ 1 minute of audio output
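Several of these limits can be checked locally before a request is sent. The sketch below assumes nothing about the API beyond the numbers listed above; tts_mode, stt_mode, estimate_audio_secs, and download_audio are hypothetical helper names, and the handling of exactly 5,000 characters or exactly 5 minutes (treated as sync) is an assumption, since the source only states the "greater than" case.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers built from the constraints listed above.

MAX_CHARS=10000       # hard limit per request
ASYNC_CHARS=5000      # text longer than this should go async
ASYNC_AUDIO_SECS=300  # audio longer than 5 minutes should go async

# tts_mode CHARS: prints "sync" or "async"; fails if the text exceeds the limit
tts_mode() {
  local chars="$1"
  if [ "$chars" -gt "$MAX_CHARS" ]; then
    echo "text too long: $chars > $MAX_CHARS characters; split it first" >&2
    return 1
  elif [ "$chars" -gt "$ASYNC_CHARS" ]; then
    echo "async"
  else
    echo "sync"
  fi
}

# stt_mode DURATION_SECS: prints "sync" or "async" for a transcription job
stt_mode() {
  if [ "$1" -gt "$ASYNC_AUDIO_SECS" ]; then
    echo "async"
  else
    echo "sync"
  fi
}

# estimate_audio_secs CHARS: rough output length, using 1,000 chars ≈ 60 s
estimate_audio_secs() {
  echo $(( $1 * 60 / 1000 ))
}

# download_audio URL OUTPUT_FILE: fetch the MP3 from the presigned URL.
# No Authorization header is sent, matching the "no auth header" note.
download_audio() {
  curl -fsSL -o "$2" "$1"
}
```

With text in a variable, tts_mode "${#text}" gives the mode to use. Note that bash counts ${#text} in characters only under a UTF-8 locale; under the C locale it counts bytes, which overstates the length of non-ASCII text.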