oatda-generate-speech

Generate speech or audio from text using OATDA's unified audio API. Triggers when the user wants to convert text to speech, create narration, voiceovers, accessibility audio, or use TTS models such as OpenAI tts-1 through OATDA.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "oatda-generate-speech" with this command: npx skills add devcsde/oatda-generate-speech

OATDA Speech Generation

Generate spoken audio from text through OATDA's unified audio API.

API Key Resolution

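Every command needs the OATDA API key. Resolve it inline in each exec call, falling back to the default profile in ~/.oatda/credentials.json when the environment variable is unset: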

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}"

If the key is empty or null, tell the user to get one at https://oatda.com and configure it.

Security: Never print the full API key. Only verify existence or show first 8 chars.
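
The masking rule above can be sketched as a small shell helper. `mask_key` is a hypothetical name used for illustration, not part of OATDA or this skill; it prints at most the first 8 characters of the key and reports an empty or `null` value as missing.

```shell
# Hypothetical helper (not part of OATDA): reveal only the first 8 characters of a key.
mask_key() {
  local key="$1"
  # An empty value or the literal string "null" (jq's output for a missing field) counts as missing.
  if [ -z "$key" ] || [ "$key" = "null" ]; then
    echo "missing"
    return 1
  fi
  # The %.8s precision truncates the string to 8 characters.
  printf '%.8s...\n' "$key"
}
```

Calling `mask_key "$OATDA_API_KEY"` after the resolution command gives a value that is safe to show when confirming that a key is configured.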

Model Mapping

| User says | Provider | Model |
|---|---|---|
| tts, tts-1, openai tts (default) | openai | tts-1 |
| tts hd, tts-1-hd | openai | tts-1-hd |
| gpt tts, gpt-4o mini tts | openai | gpt-4o-mini-tts |

Default: openai / tts-1 if no model specified.

If the user provides provider/model format directly (for example openai/tts-1), split on /.
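
As a rough sketch of the mapping, default, and split rules above, the hypothetical `resolve_model` function below prints the provider and model separated by a space. The function name and the choice to reject unrecognized phrases are assumptions made for illustration; the skill itself does not define them.

```shell
# Hypothetical helper: turn a user phrase or "provider/model" string into "provider model".
resolve_model() {
  local spec="$1"
  case "$spec" in
    # Explicit provider/model form: split on the first slash.
    */*)
      echo "${spec%%/*} ${spec#*/}" ;;
    "tts hd"|"tts-1-hd")
      echo "openai tts-1-hd" ;;
    "gpt tts"|"gpt-4o mini tts")
      echo "openai gpt-4o-mini-tts" ;;
    # No model specified, or one of the default phrases.
    ""|"tts"|"tts-1"|"openai tts")
      echo "openai tts-1" ;;
    # Assumption: anything else is reported rather than silently defaulted.
    *)
      echo "unknown model phrase: $spec" >&2
      return 1 ;;
  esac
}
```

For example, `resolve_model "openai/tts-1"` prints `openai tts-1`, and `resolve_model ""` falls back to the same default pair.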

Common OpenAI voices include alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer. Use alloy if the user does not specify a voice.

⚠️ Models change over time. If a model ID fails, query oatda-list-models with ?type=audio before retrying.

Discovering Audio Model Parameters

Query available audio models and inspect supported_params before sending optional fields:

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X GET "https://oatda.com/api/v1/llm/models?type=audio" \
  -H "Authorization: Bearer $OATDA_API_KEY" | jq '.audio_models[] | {id, supported_params}'

Look for:

  • audio_modes containing tts
  • supported voice values
  • allowed response_format values
  • optional fields like instructions or language

API Call

The speech endpoint returns binary audio, not JSON. Always save the response to a file.

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "<PROVIDER>",
    "model": "<MODEL>",
    "input": "<TEXT_TO_SPEAK>",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0
  }' \
  --output speech.mp3
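
Text that contains double quotes or newlines would break the hand-written JSON body above. One possible way around this, sketched here, is to let jq build the body so the text is escaped. `build_payload` is a hypothetical helper name, and the fallback values for voice, format, and speed simply mirror the defaults mentioned in this document.

```shell
# Hypothetical helper: build the request body with jq so the input text is JSON-escaped.
build_payload() {
  # Arguments: provider, model, input text, voice, response_format, speed
  jq -n \
    --arg provider "$1" \
    --arg model "$2" \
    --arg input "$3" \
    --arg voice "${4:-alloy}" \
    --arg response_format "${5:-mp3}" \
    --argjson speed "${6:-1.0}" \
    '{provider: $provider, model: $model, input: $input, voice: $voice,
      response_format: $response_format, speed: $speed}'
}
```

The result can be passed to the same curl call with `-d "$(build_payload openai tts-1 "$TEXT")"`, where `$TEXT` is whatever variable holds the text to speak.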

Common Parameters

  • input: Text to convert to speech, max 15000 characters
  • voice: Voice name, e.g. alloy, nova, shimmer
  • response_format: mp3, opus, aac, flac, wav, pcm, mulaw, or alaw
  • speed: 0.25 to 4.0, default 1.0
  • instructions: Optional tone/style guidance for supported models
  • language: Optional language code for supported models
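
The limits listed above can be checked locally before a request is sent. The sketch below uses a hypothetical `validate_tts_params` function; rejecting empty input is an assumption (the list only states a maximum), and `${#input}` counts characters according to the current locale.

```shell
# Hypothetical helper: check TTS parameters against the limits listed above.
validate_tts_params() {
  local input="$1"
  local format="${2:-mp3}"
  local speed="${3:-1.0}"
  # Assumption: empty input is rejected; the documented limit is only the 15000 maximum.
  if [ "${#input}" -eq 0 ] || [ "${#input}" -gt 15000 ]; then
    echo "input must be 1 to 15000 characters" >&2
    return 1
  fi
  case "$format" in
    mp3|opus|aac|flac|wav|pcm|mulaw|alaw) ;;
    *) echo "unsupported response_format: $format" >&2
       return 1 ;;
  esac
  # awk performs the floating-point range check that the shell's test command cannot.
  if ! awk -v s="$speed" 'BEGIN { exit !(s ~ /^[0-9]+(\.[0-9]+)?$/ && s >= 0.25 && s <= 4.0) }'; then
    echo "speed must be between 0.25 and 4.0" >&2
    return 1
  fi
}
```

A call such as `validate_tts_params "$TEXT" mp3 1.0 || exit 1` stops early instead of spending a request on values the API is expected to reject.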

Success Handling

If the request succeeds, tell the user where the file was saved, for example:

Speech generated successfully: speech.mp3

If headers matter, use curl -D headers.txt while still saving the audio body with --output.

Error Handling

| HTTP Status | Meaning | Action |
|---|---|---|
| 401 | Invalid API key | Tell user to check their key |
| 402 | Insufficient credits | Tell user to check balance |
| 400 | Bad request / model not supported | Check model format and query oatda-list-models with type=audio |
| 429 | Rate limited or monthly cap | Wait briefly and retry once |
| 500 | Provider error | Show the error message if returned |
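
With `curl -s` and `--output`, the HTTP status is not shown, so a failed request can go unnoticed. curl's `-w '%{http_code}'` option prints the status code to stdout while the body still goes to the output file. The sketch below maps that code to the actions in the table; `status_action` is a hypothetical helper, and treating any 2xx code as success is an assumption.

```shell
# Hypothetical helper: map an HTTP status code to the action from the error table.
status_action() {
  case "$1" in
    # Assumption: any 2xx status means the audio was written successfully.
    2??) echo "Speech generated successfully" ;;
    401) echo "Invalid API key: check your key" ;;
    402) echo "Insufficient credits: check your balance" ;;
    400) echo "Bad request or unsupported model: check model format and list audio models" ;;
    429) echo "Rate limited or monthly cap reached: wait briefly and retry once" ;;
    500) echo "Provider error: show the returned error message" ;;
    *)   echo "Unexpected status $1" ;;
  esac
}

# Usage sketch (needs network access and a valid key, so it is left commented out):
# status=$(curl -s -o speech.mp3 -w '%{http_code}' -X POST "https://oatda.com/api/v1/llm/speech" \
#   -H "Content-Type: application/json" \
#   -H "Authorization: Bearer $OATDA_API_KEY" \
#   -d "$payload")
# status_action "$status"
```

On a non-2xx status the saved file probably holds an error message rather than audio, so it is worth inspecting it as text before reporting success.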

Example

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "openai",
    "model": "tts-1",
    "input": "Welcome to OATDA, one API to direct all.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0
  }' \
  --output speech.mp3

Notes

  • Endpoint: /api/v1/llm/speech
  • Use input, not prompt, for TTS requests
  • Always save the response with --output
  • Use oatda-list-models to discover available audio models
  • Equivalent capability name: generate_speech
  • Related skills: oatda-list-models, oatda-transcribe-audio, oatda-translate-audio

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
