coze-tts

Text-to-speech (TTS) using the Coze API. Converts text to natural-sounding speech audio files, with multiple voices and four output formats (mp3, ogg_opus, wav, pcm).

Install the "coze-tts" skill:

npx skills add franklu0819-lang/coze-tts

Coze Text-to-Speech (TTS)

Convert text to natural-sounding speech using the Coze API.

Setup

1. Get an API key from the Coze Platform.

2. Set it in your environment:

export COZE_API_KEY="your-key-here"
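
Before calling the script, it can help to confirm that the key is actually exported in the current shell. The following is a minimal sketch; the helper name require_coze_key is hypothetical and not part of the skill.

```shell
# Hypothetical helper (not part of the skill): succeed only if
# COZE_API_KEY is set to a non-empty value.
require_coze_key() {
    if [ -z "${COZE_API_KEY:-}" ]; then
        echo "COZE_API_KEY is not set; run: export COZE_API_KEY=\"your-key-here\"" >&2
        return 1
    fi
}
```

For example, require_coze_key && bash scripts/text_to_speech.sh "你好" would skip the call when the variable is missing. The check only confirms the variable is non-empty; it cannot tell whether the key is valid.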

Supported Output Formats

  • MP3 - Default format, widely compatible
  • OGG_OPUS - Optimized for streaming and messaging
  • WAV - Uncompressed audio
  • PCM - Raw audio data
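
When choosing an output file name, the format name and the file extension are not always identical. The sketch below maps each supported format to an extension. The function name ext_for_format is hypothetical; the .ogg pairing follows the messaging example in this document, while .wav and .pcm are assumed conventions.

```shell
# Hypothetical helper: map a supported format name to a file extension.
# ogg_opus -> ogg follows this document's examples; wav and pcm
# extensions are assumed conventions, not confirmed by the skill.
ext_for_format() {
    case "$1" in
        mp3)      echo "mp3" ;;
        ogg_opus) echo "ogg" ;;
        wav)      echo "wav" ;;
        pcm)      echo "pcm" ;;
        *)        echo "unsupported format: $1" >&2; return 1 ;;
    esac
}
```

A caller could then build a name such as "greeting.$(ext_for_format ogg_opus)" and reject unknown formats before any request is made.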

Usage

Basic TTS

Convert text to speech with default settings:

bash scripts/text_to_speech.sh "你好,这是测试语音"

Save to Specific File

bash scripts/text_to_speech.sh "你好世界" -o output.mp3

Use Different Voice

bash scripts/text_to_speech.sh "你好" -v 2

Change Output Format

bash scripts/text_to_speech.sh "你好" -f ogg_opus

Full Options

bash scripts/text_to_speech.sh "要转换的文本" -o output.mp3 -v 1 -f mp3

Parameters:

  • text (required): Text to convert to speech
  • -o, --output (optional): Output file path (default: auto-generated)
  • -v, --voice (optional): Voice ID (default: 1)
  • -f, --format (optional): Output format - mp3/ogg_opus/wav/pcm (default: mp3)
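
To make the option handling concrete, here is one way a script could parse the parameters listed above. This is a sketch only and is not the actual source of text_to_speech.sh; the function name parse_tts_args and the variable names are hypothetical, and an empty OUTPUT stands in for "auto-generated".

```shell
# Sketch only: NOT the actual text_to_speech.sh implementation.
# Parses: TEXT [-o|--output FILE] [-v|--voice ID] [-f|--format FMT]
parse_tts_args() {
    TEXT="" OUTPUT="" VOICE="1" FORMAT="mp3"   # documented defaults
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -o|--output|-v|--voice|-f|--format)
                # Every option takes exactly one value.
                [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
                case "$1" in
                    -o|--output) OUTPUT="$2" ;;
                    -v|--voice)  VOICE="$2" ;;
                    -f|--format) FORMAT="$2" ;;
                esac
                shift 2 ;;
            -*) echo "unknown option: $1" >&2; return 1 ;;
            *)  TEXT="$1"; shift ;;
        esac
    done
    # The text argument is the only required parameter.
    [ -n "$TEXT" ] || { echo "text is required" >&2; return 1; }
}
```

After parse_tts_args "你好" -v 2 -f wav, the variables would hold TEXT=你好, VOICE=2, FORMAT=wav, and an empty OUTPUT.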

Output

The script saves the audio file and outputs:

  • File path
  • File size
  • Audio duration (if ffprobe is available)

Example output:

✓ Audio saved: coze_tts_20260324_235030_a1b2c3d4.mp3
  Size: 25.3 KB
  Duration: ~3 seconds
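
The duration line depends on ffprobe being installed. The sketch below shows one way to read a duration and fall back gracefully when ffprobe is missing or cannot read the file. The function name audio_duration is hypothetical, and the skill's own script may format its output differently.

```shell
# Sketch: print a file's duration in seconds, or "unknown" when
# ffprobe is not installed or cannot determine the duration.
audio_duration() {
    local file="$1" seconds
    if command -v ffprobe >/dev/null 2>&1 &&
       seconds=$(ffprobe -v error -show_entries format=duration \
                     -of default=noprint_wrappers=1:nokey=1 "$file" 2>/dev/null) &&
       [ -n "$seconds" ]; then
        echo "$seconds"
    else
        echo "unknown"
    fi
}
```

Raw PCM files carry no header, so ffprobe generally cannot report their duration without extra format options; in that case this helper would print "unknown".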

Workflow Examples

Generate Notification Audio

bash scripts/text_to_speech.sh "您有一条新消息" -o notification.mp3

Create Voice Greeting

bash scripts/text_to_speech.sh "欢迎使用 Coze 语音服务" -v 2 -o greeting.mp3

Generate OGG for Messaging

bash scripts/text_to_speech.sh "你好" -f ogg_opus -o message.ogg

Batch Generate

for text in "你好" "谢谢" "再见"; do
    bash scripts/text_to_speech.sh "$text" -o "${text}.mp3"
done

Integration with Other Skills

Combine with coze-asr for voice conversation:

# 1. User speaks -> ASR converts to text
bash coze-asr/scripts/speech_to_text.sh input.ogg

# 2. Process text with AI...

# 3. AI response -> TTS converts to speech
bash coze-tts/scripts/text_to_speech.sh "AI的回复" -o response.mp3

Troubleshooting

Authentication Error:

  • Check COZE_API_KEY is set correctly
  • Verify API key has TTS permissions

Invalid Voice ID:

  • Voice ID should be a number (int64 format)
  • Try the default voice ID, 1 (-v 1)
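
A quick local check can catch malformed voice IDs before a request is sent. The helper below is a hypothetical sketch: it accepts only non-empty strings of digits and does not verify the int64 range or whether the voice exists.

```shell
# Hypothetical pre-check: accept only non-empty strings of digits.
# Does not check the int64 range or that the voice actually exists.
is_valid_voice_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}
```

For example, is_valid_voice_id "2" succeeds, while is_valid_voice_id "two" and is_valid_voice_id "-2" both fail.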

File Not Created:

  • Check write permissions in output directory
  • Ensure sufficient disk space
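
Both checks can be run before generating audio. The following sketch uses a hypothetical helper, check_output_dir, which confirms that the target directory exists and is writable and then prints the available space reported by df.

```shell
# Hypothetical pre-check: verify the directory of the given output path
# exists and is writable, then print the available space in KB.
check_output_dir() {
    local dir
    dir=$(dirname -- "$1")
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "cannot write to directory: $dir" >&2
        return 1
    fi
    # df -P prints one line per filesystem; column 4 is available 1K blocks.
    df -Pk "$dir" | awk 'NR == 2 { print $4 " KB available" }'
}
```

For example, check_output_dir ./audio/notification.mp3 fails with a message if ./audio is missing or read-only.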

Limitations

  • Text length limits apply (check Coze documentation)
  • Rate limits may apply based on your plan
  • Some voices may not support all output formats
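
If rate limits cause intermittent failures, retrying with a growing delay is a common workaround. The sketch below is a generic retry wrapper, not part of the skill. The names retry, MAX_TRIES, and RETRY_DELAY are hypothetical, and it assumes the wrapped command exits non-zero on failure, which this document does not state for text_to_speech.sh.

```shell
# Sketch: run a command up to MAX_TRIES times (default 3), doubling the
# wait between attempts, starting at RETRY_DELAY seconds (default 2).
retry() {
    local max_tries="${MAX_TRIES:-3}" delay="${RETRY_DELAY:-2}" attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}
```

For example, retry bash scripts/text_to_speech.sh "你好" -o hello.mp3 would make up to three attempts, waiting 2 and then 4 seconds between them.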

API Reference

  • Endpoint: POST https://api.coze.cn/v1/audio/speech
  • Authentication: Bearer token (COZE_API_KEY)
  • Content-Type: application/json
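
The request described above can be sketched with curl and jq. The endpoint, bearer authentication, and content type come from this section; the JSON field names (input, voice_id, response_format) and the function names are assumptions that should be confirmed against the Coze API documentation. This is not the skill's own implementation.

```shell
# Sketch only. Field names input, voice_id, and response_format are
# ASSUMPTIONS; confirm them against the Coze API documentation.
build_tts_payload() {
    # Usage: build_tts_payload TEXT VOICE_ID FORMAT
    # jq handles JSON escaping of the text. voice_id is sent as a string
    # here, which is also an assumption.
    jq -n --arg input "$1" --arg voice "$2" --arg format "$3" \
        '{input: $input, voice_id: $voice, response_format: $format}'
}

coze_tts_request() {
    # Usage: coze_tts_request TEXT VOICE_ID FORMAT OUTPUT_FILE
    # Pipes the payload to curl on stdin and writes the response body
    # to OUTPUT_FILE; --fail makes curl exit non-zero on HTTP errors.
    build_tts_payload "$1" "$2" "$3" |
        curl --fail --silent --show-error \
            -X POST "https://api.coze.cn/v1/audio/speech" \
            -H "Authorization: Bearer ${COZE_API_KEY}" \
            -H "Content-Type: application/json" \
            --data-binary @- \
            --output "$4"
}
```

For example, coze_tts_request "你好" 1 mp3 hello.mp3 would write the response body to hello.mp3, assuming the endpoint returns the audio bytes directly.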

Required Environment Variables

Variable        Description                    Required
COZE_API_KEY    Coze API authentication key    Yes

Required Tools

Tool       Purpose                     Required
jq         JSON processing             Yes
ffprobe    Audio duration detection    Optional

License

MIT
