# OATDA Speech Generation
Generate spoken audio from text through OATDA's unified audio API.
## API Key Resolution

All commands need the OATDA API key. Resolve it inline for each exec call:

```shell
export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}"
```

If the key is empty or `null`, tell the user to get one at https://oatda.com and configure it.

**Security:** Never print the full API key. Only verify that it exists, or show at most the first 8 characters.
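The existence check above can be sketched as a small shell helper; the `check_oatda_key` name is illustrative, not part of the OATDA tooling:

```shell
# Illustrative helper: report whether a resolved key is usable without
# ever echoing more than its first 8 characters.
check_oatda_key() {
  local key="$1"
  if [ -z "$key" ] || [ "$key" = "null" ]; then
    echo "missing"
  else
    echo "present: ${key:0:8}..."
  fi
}

check_oatda_key ""                       # -> missing
check_oatda_key "sk-test-abcdefgh12345"  # -> present: sk-test-...
```

The literal string `null` is treated like an empty key because `jq -r` prints `null` when the profile key is absent.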
## Model Mapping
| User says | Provider | Model |
|---|---|---|
| tts, tts-1, openai tts (default) | openai | tts-1 |
| tts hd, tts-1-hd | openai | tts-1-hd |
| gpt tts, gpt-4o mini tts | openai | gpt-4o-mini-tts |
Default: `openai` / `tts-1` if no model is specified.

If the user provides `provider/model` format directly (for example `openai/tts-1`), split on `/`.

Common OpenAI voices include `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `nova`, `onyx`, `sage`, and `shimmer`. Use `alloy` if the user does not specify a voice.
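The split-on-`/` rule and the `openai`/`tts-1` default can be sketched as follows; `parse_model` is a hypothetical helper name:

```shell
# Illustrative: resolve a user-supplied model spec into PROVIDER and MODEL.
# "provider/model" splits on the first "/"; a bare model name is assumed to
# be OpenAI's, and an empty spec falls back to the default openai/tts-1.
parse_model() {
  local spec="${1:-openai/tts-1}"
  if [[ "$spec" == */* ]]; then
    PROVIDER="${spec%%/*}"
    MODEL="${spec#*/}"
  else
    PROVIDER="openai"
    MODEL="$spec"
  fi
}

parse_model "openai/tts-1-hd"
echo "$PROVIDER/$MODEL"   # -> openai/tts-1-hd
```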
⚠️ Models change over time. If a model ID fails, query `oatda-list-models` with `?type=audio` first.
## Discovering Audio Model Parameters

Query available audio models and inspect `supported_params` before sending optional fields:

```shell
export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X GET "https://oatda.com/api/v1/llm/models?type=audio" \
  -H "Authorization: Bearer $OATDA_API_KEY" | jq '.audio_models[] | {id, supported_params}'
```

Look for:

- `audio_modes` containing `tts`
- supported `voice` values
- allowed `response_format` values
- optional fields like `instructions` or `language`
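The check can be done with a small `jq` filter. The sample JSON below is a minimal sketch of the response shape implied above; treat the exact structure as an assumption and inspect the real response first:

```shell
# Illustrative: test whether a model lists a parameter in supported_params.
# models_json stands in for the real ?type=audio response body.
models_json='{"audio_models":[{"id":"tts-1","supported_params":["voice","response_format","speed"]}]}'

supports_param() {
  local model="$1" param="$2"
  echo "$models_json" | jq -e --arg m "$model" --arg p "$param" \
    '.audio_models[] | select(.id == $m) | .supported_params | index($p) != null' \
    >/dev/null
}

supports_param "tts-1" "speed"        && echo "speed: send it"
supports_param "tts-1" "instructions" || echo "instructions: omit it"
```

`jq -e` sets the exit status from the filter's result, so the function can be used directly in shell conditionals.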
## API Call

The speech endpoint returns binary audio, not JSON. Always save the response to a file.

```shell
export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "<PROVIDER>",
    "model": "<MODEL>",
    "input": "<TEXT_TO_SPEAK>",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0
  }' \
  --output speech.mp3
```
## Common Parameters

- `input`: text to convert to speech, max 15000 characters
- `voice`: voice name, e.g. `alloy`, `nova`, `shimmer`
- `response_format`: `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`, `mulaw`, or `alaw`
- `speed`: 0.25 to 4.0, default 1.0
- `instructions`: optional tone/style guidance for supported models
- `language`: optional language code for supported models
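Because `input` is arbitrary user text, building the JSON body with `jq -n` avoids escaping bugs in hand-written `-d` strings; the `build_payload` helper below is an illustrative sketch that hard-codes the default model:

```shell
# Illustrative: build a speech request body with proper JSON escaping.
# Quotes and newlines in the input text are handled by jq, not by hand.
build_payload() {
  jq -n --arg input "$1" --arg voice "${2:-alloy}" \
    '{provider: "openai", model: "tts-1", input: $input, voice: $voice,
      response_format: "mp3", speed: 1.0}'
}

build_payload 'She said "hello" to OATDA.' | jq -r '.input'
# -> She said "hello" to OATDA.
```

The result can be passed to curl as `-d "$(build_payload "$text")"` in place of the inline `-d '{...}'` body.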
## Success Handling

If the request succeeds, tell the user where the file was saved, for example:

```
Speech generated successfully:
speech.mp3
```

If headers matter, use `curl -D headers.txt` while still saving the audio body with `--output`.
## Error Handling
| HTTP Status | Meaning | Action |
|---|---|---|
| 401 | Invalid API key | Tell user to check their key |
| 402 | Insufficient credits | Tell user to check balance |
| 400 | Bad request / model not supported | Check model format and query oatda-list-models with type=audio |
| 429 | Rate limited or monthly cap | Wait briefly and retry once |
| 500 | Provider error | Show the error message if returned |
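The table above can be wired into a small dispatcher on the status code (obtainable via `curl -w '%{http_code}'`); `handle_status` and its messages are an illustrative sketch:

```shell
# Illustrative: map an HTTP status code to the action from the table above.
handle_status() {
  case "$1" in
    200) echo "Speech generated successfully: speech.mp3" ;;
    400) echo "Bad request: check model format and query oatda-list-models with type=audio" ;;
    401) echo "Invalid API key: tell the user to check their key" ;;
    402) echo "Insufficient credits: tell the user to check their balance" ;;
    429) echo "Rate limited or monthly cap: wait briefly and retry once" ;;
    500) echo "Provider error: show the error message if returned" ;;
    *)   echo "Unexpected status $1" ;;
  esac
}

handle_status 429   # -> Rate limited or monthly cap: wait briefly and retry once
```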
## Example

```shell
export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "openai",
    "model": "tts-1",
    "input": "Welcome to OATDA, one API to direct all.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0
  }' \
  --output speech.mp3
```
## Notes

- Endpoint: `/api/v1/llm/speech`
- Use `input`, not `prompt`, for TTS requests
- Always save the response with `--output`
- Use `oatda-list-models` to discover available audio models
- Equivalent capability name: `generate_speech`
- Related skills: `oatda-list-models`, `oatda-transcribe-audio`, `oatda-translate-audio`