senseaudio-tts

SenseAudio Text-to-Speech (TTS) API for converting text to natural speech. Supports synchronous and SSE streaming modes, multiple voices, emotion control, speed/pitch/volume adjustment, and multi-language (Chinese/English).

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "senseaudio-tts" with this command: npx skills add scikkk/text2speech

SenseAudio Text-to-Speech (TTS)

SenseAudio TTS converts text to natural, emotionally rich speech using a large model at the hundred-billion-parameter scale. It supports more than 10 emotions, streaming output over SSE, and fine-grained voice control.

Endpoint: POST https://api.senseaudio.cn/v1/t2a_v2
Auth: Authorization: Bearer $SENSEAUDIO_API_KEY
Max text length: 10,000 characters
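Because each request is limited to 10,000 characters, longer input has to be split before it is sent. The following is a minimal sketch of one way to do that. The helper name split_text is hypothetical, and it assumes the service counts characters the same way Python's len() does, which the documentation above does not confirm.

```python
MAX_CHARS = 10_000  # per-request limit stated above


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into pieces of at most `limit` characters.

    Prefers to cut right after sentence-ending punctuation and
    falls back to a hard cut when none is found in the window.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    enders = "。！？.!?\n"
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            # Find the last sentence ender inside the current window.
            cut = max(text.rfind(ch, start, end) for ch in enders)
            if cut > start:
                end = cut + 1
        chunks.append(text[start:end])
        start = end
    return chunks
```

Each returned piece can then be sent as the text of a separate request. Note that a hard cut could land inside a <break> tag, so input that uses pause tags may need extra care.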


Request Parameters

Headers

| Header | Required | Value |
| --- | --- | --- |
| Authorization | yes | Bearer YOUR_API_KEY |
| Content-Type | yes | application/json |

Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | SenseAudio-TTS-1.0 |
| text | string | yes | Text to synthesize. Supports <break time=500> pause tags |
| stream | boolean | yes | false = sync response; true = SSE streaming |
| voice_setting | object | yes | Voice configuration (see below) |
| audio_setting | object | no | Audio format configuration (see below) |
| dictionary | array | no | Polyphonic character corrections (cloned voices + TTS-1.5 only) |

voice_setting

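| Parameter | Type | Default | Range | Description |
| --- | --- | --- | --- | --- |
| voice_id | string | - | - | Voice ID (system or cloned) |
| speed | float | 1.0 | [0.5, 2.0] | Speech speed |
| vol | float | 1.0 | [0, 10] | Volume |
| pitch | int | 0 | [-12, 12] | Pitch adjustment |
| latex_read | boolean | false | - | Read LaTeX/MathML formulas aloud |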
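The ranges above can be checked on the client before a request is sent. This is a minimal sketch, not part of any SenseAudio SDK: build_voice_setting is a hypothetical helper, and how the server reacts to out-of-range values is not described here.

```python
def build_voice_setting(voice_id, speed=1.0, vol=1.0, pitch=0, latex_read=False):
    """Return a voice_setting dict after checking the documented ranges."""
    if not voice_id:
        raise ValueError("voice_id is required")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be within [0.5, 2.0]")
    if not 0 <= vol <= 10:
        raise ValueError("vol must be within [0, 10]")
    if not isinstance(pitch, int) or not -12 <= pitch <= 12:
        raise ValueError("pitch must be an integer within [-12, 12]")
    return {
        "voice_id": voice_id,
        "speed": speed,
        "vol": vol,
        "pitch": pitch,
        "latex_read": latex_read,
    }
```

For example, build_voice_setting("male_0004_a", speed=1.2, pitch=-2) produces a dict that can be passed as the voice_setting field of the request body.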

audio_setting

| Parameter | Type | Default | Options |
| --- | --- | --- | --- |
| format | string | mp3 | mp3, wav, pcm, flac |
| sample_rate | int | 32000 | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | int | 128000 | 32000, 64000, 128000, 256000 (MP3 only) |
| channel | int | 2 | 1 (mono), 2 (stereo) |
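The option lists above lend themselves to a small client-side check. The sketch below uses a hypothetical helper, build_audio_setting. It reads "(MP3 only)" as meaning bitrate should be sent only when format is mp3, which is an interpretation of the table rather than documented server behavior.

```python
FORMATS = {"mp3", "wav", "pcm", "flac"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100}
BITRATES = {32000, 64000, 128000, 256000}


def build_audio_setting(fmt="mp3", sample_rate=32000, bitrate=None, channel=2):
    """Return an audio_setting dict restricted to the documented options."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if channel not in (1, 2):
        raise ValueError("channel must be 1 (mono) or 2 (stereo)")
    setting = {"format": fmt, "sample_rate": sample_rate, "channel": channel}
    if bitrate is not None:
        # Assumption: bitrate is only meaningful for MP3 output.
        if fmt != "mp3":
            raise ValueError("bitrate applies to mp3 only")
        if bitrate not in BITRATES:
            raise ValueError(f"unsupported bitrate: {bitrate}")
        setting["bitrate"] = bitrate
    return setting
```

Leaving bitrate unset omits the field, so the service default applies.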

<break> Pause Tag

Insert pauses in text:

你好<break time=500>欢迎使用我们的服务
  • time is given in milliseconds; the minimum is 100 ms
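When text is assembled from several segments, the pause tag can be generated rather than typed by hand. This is a minimal sketch with hypothetical helper names (break_tag, join_with_pauses). It enforces only the 100 ms minimum stated above, since no maximum is documented.

```python
MIN_BREAK_MS = 100  # minimum pause stated above


def break_tag(ms: int) -> str:
    """Return a <break> tag for a pause of `ms` milliseconds."""
    if ms < MIN_BREAK_MS:
        raise ValueError(f"pause must be at least {MIN_BREAK_MS} ms")
    return f"<break time={ms}>"


def join_with_pauses(segments, ms=500):
    """Join text segments with the same pause between each pair."""
    return break_tag(ms).join(segments)


print(join_with_pauses(["你好", "欢迎使用我们的服务"]))
# prints: 你好<break time=500>欢迎使用我们的服务
```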

Non-Streaming Response

{
  "data": {
    "audio": "hex-encoded audio data...",
    "status": 2
  },
  "extra_info": {
    "audio_length": 3500,
    "audio_sample_rate": 32000,
    "audio_size": 56000,
    "bitrate": 128000,
    "audio_format": "mp3",
    "audio_channel": 1,
    "word_count": 24,
    "usage_characters": 30
  },
  "base_resp": {"status_code": 0, "status_msg": "success"}
}

data.audio is a hex-encoded string. In Python, decode it with bytes.fromhex(audio_hex).
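The response shape above can be handled in one small function. This is a sketch under one assumption that the documentation does not state: a base_resp.status_code other than 0 is treated as a failure, because only the success value is shown. The function name decode_tts_response is hypothetical.

```python
def decode_tts_response(payload: dict) -> tuple[bytes, dict]:
    """Return (audio_bytes, extra_info) from a non-streaming response."""
    base = payload.get("base_resp") or {}
    # Assumption: any status_code other than 0 signals an error.
    if base.get("status_code") != 0:
        raise RuntimeError(
            f"TTS request failed: {base.get('status_code')} {base.get('status_msg')}"
        )
    audio_hex = (payload.get("data") or {}).get("audio")
    if not audio_hex:
        raise RuntimeError("response contains no audio")
    return bytes.fromhex(audio_hex), payload.get("extra_info") or {}
```

With a requests response object, this would be called as decode_tts_response(resp.json()), and extra_info["audio_format"] can then be used to pick the output file extension.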


SSE Streaming Response

Each chunk arrives as a line of the form: data: {"data":{"audio":"hex...","status":1},...}

The final chunk has status: 2 and includes the complete extra_info.
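The chunk format above can be parsed independently of the HTTP client, which makes it easy to test. The sketch below uses a hypothetical generator, iter_audio_chunks, that accepts any iterable of raw byte lines, such as the result of iter_lines() in requests. It assumes that audio carried by the final chunk, if any, is new data rather than a repeat of earlier chunks, which the description above does not specify.

```python
import json
from typing import Iterable, Iterator


def iter_audio_chunks(lines: Iterable[bytes]) -> Iterator[bytes]:
    """Yield decoded audio bytes from SSE lines, stopping after status 2."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        event = json.loads(line[len("data:"):].strip())
        data = event.get("data") or {}
        if data.get("audio"):
            yield bytes.fromhex(data["audio"])
        if data.get("status") == 2:
            break  # final chunk reached
```

Writing the yielded pieces to a file in order reproduces the audio stream, for example with f.write(piece) inside a loop over iter_audio_chunks(r.iter_lines()).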


Code Examples

curl (non-streaming)

curl -X POST https://api.senseaudio.cn/v1/t2a_v2 \
  -H "Authorization: Bearer $SENSEAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "SenseAudio-TTS-1.0",
    "text": "道可道,非常道。名可名,非常名。",
    "stream": false,
    "voice_setting": {"voice_id": "male_0004_a"}
  }' -o response.json

jq -r '.data.audio' response.json | xxd -r -p > output.mp3

Python (non-streaming)

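import os

import requests

# Read the key from the environment instead of hard-coding it.
api_key = os.environ["SENSEAUDIO_API_KEY"]

resp = requests.post(
    "https://api.senseaudio.cn/v1/t2a_v2",
    headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    json={
        "model": "SenseAudio-TTS-1.0",
        "text": "道可道,非常道。",
        "stream": False,
        "voice_setting": {"voice_id": "male_0004_a"},
    },
    timeout=60,
)
resp.raise_for_status()  # surface HTTP-level errors early
result = resp.json()
# data.audio is a hex string; convert it back to raw bytes before saving.
audio_bytes = bytes.fromhex(result["data"]["audio"])
with open("output.mp3", "wb") as f:
    f.write(audio_bytes)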

Python (SSE streaming)

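import json
import os

import requests

# Read the key from the environment instead of hard-coding it.
api_key = os.environ["SENSEAUDIO_API_KEY"]

with requests.post(
    "https://api.senseaudio.cn/v1/t2a_v2",
    headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    json={"model": "SenseAudio-TTS-1.0", "text": "这是流式输出示例。", "stream": True,
          "voice_setting": {"voice_id": "male_0004_a"}},
    stream=True,  # let requests hand over the body incrementally
    timeout=60,
) as r:
    r.raise_for_status()
    with open("output.mp3", "wb") as f:
        for line in r.iter_lines():
            if not line:
                continue  # skip blank separator lines
            line_str = line.decode("utf-8")
            if line_str.startswith("data: "):
                chunk = json.loads(line_str[6:])
                audio_hex = chunk.get("data", {}).get("audio")
                if audio_hex:
                    # Append each decoded piece as it arrives.
                    f.write(bytes.fromhex(audio_hex))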

