senseaudio-video-narrator

Generate SenseAudio TTS narration tracks for videos, including timestamped segments, style variants, and editor-ready voiceover exports. Use when users need voiceovers, video narration, timed commentary, or accessibility narration.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "senseaudio-video-narrator" with this command: npx skills add scikkk/video-narrator

SenseAudio Video Narrator

Create professional narration audio for videos with timing-aware segmentation, natural delivery, and editor-friendly exports.

What This Skill Does

  • Generate narration audio synchronized to script timestamps
  • Match narration style to video genre such as documentary or tutorial
  • Control pacing with official TTS parameters and text break markers
  • Create multiple narration takes with different voices or styles
  • Export audio segments and merged narration tracks for editing workflows

Credential and Dependency Rules

  • Read the API key from SENSEAUDIO_API_KEY.
  • Send auth only as Authorization: Bearer <API_KEY>.
  • Do not place API keys in query parameters, logs, or saved examples.
  • If Python helpers are used, this skill expects python3, requests, and pydub.
  • pydub is used only for optional local audio assembly and mixing.
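
The credential rules above can be sketched as a small helper. This is an illustration rather than part of the skill: `build_auth_headers` and `redact` are hypothetical names, and the only facts taken from the rules are the `SENSEAUDIO_API_KEY` variable and the `Authorization: Bearer <API_KEY>` header.

```python
import os


def build_auth_headers(env_var="SENSEAUDIO_API_KEY"):
    """Return request headers with the bearer token read from the environment."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Name the variable in the error, never the key value itself.
        raise RuntimeError(f"{env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def redact(headers):
    """Return a copy of the headers that is safe to log or save."""
    safe = dict(headers)
    if "Authorization" in safe:
        safe["Authorization"] = "Bearer ***"
    return safe
```

Logging `redact(headers)` instead of `headers` keeps the key out of logs and saved examples while still showing that auth was attached.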

Official TTS Constraints

Use the official SenseAudio TTS rules summarized below:

  • HTTP endpoint: POST https://api.senseaudio.cn/v1/t2a_v2
  • Model: SenseAudio-TTS-1.0
  • Max text length per request: 10000 characters
  • voice_setting.voice_id is required
  • voice_setting.speed range: 0.5-2.0
  • voice_setting.pitch range: -12 to 12
  • Optional audio formats: mp3, wav, pcm, flac
  • Optional sample rates: 8000, 16000, 22050, 24000, 32000, 44100
  • Optional MP3 bitrates: 32000, 64000, 128000, 256000
  • Optional channels: 1 or 2
  • extra_info.audio_length returns segment duration in milliseconds
  • Inline break markup such as <break time=500> is supported in text
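
A local pre-flight check can catch constraint violations before a request is sent. The sketch below encodes only the limits listed above. `validate_tts_request` is a hypothetical helper, and counting characters with `len()` is an assumption about how the service measures text length.

```python
ALLOWED_FORMATS = {"mp3", "wav", "pcm", "flac"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100}
ALLOWED_MP3_BITRATES = {32000, 64000, 128000, 256000}
MAX_TEXT_CHARS = 10000


def validate_tts_request(text, voice_id, speed=1.0, pitch=0,
                         audio_format="mp3", sample_rate=32000,
                         bitrate=128000, channel=2):
    """Return a list of constraint violations; an empty list means none found."""
    problems = []
    if not voice_id:
        problems.append("voice_setting.voice_id is required")
    # Assumption: the limit is counted in Python characters, markup included.
    if len(text) > MAX_TEXT_CHARS:
        problems.append(f"text has {len(text)} characters, limit is {MAX_TEXT_CHARS}")
    if not 0.5 <= speed <= 2.0:
        problems.append(f"speed {speed} is outside 0.5-2.0")
    if not -12 <= pitch <= 12:
        problems.append(f"pitch {pitch} is outside -12 to 12")
    if audio_format not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {audio_format}")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate}")
    # The bitrate list applies to MP3 output only.
    if audio_format == "mp3" and bitrate not in ALLOWED_MP3_BITRATES:
        problems.append(f"unsupported MP3 bitrate: {bitrate}")
    if channel not in (1, 2):
        problems.append(f"channel must be 1 or 2, got {channel}")
    return problems
```

Returning a list instead of raising on the first error lets a batch job report every problem in a segment at once.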

Recommended Workflow

  1. Prepare the script:
     • Split narration into timestamped segments.
     • Keep each segment comfortably below the 10000 character limit.
  2. Choose a voice and pacing profile:
     • Pick a voice_id and tune speed, pitch, and optional vol.
     • Use shorter segments when timing precision matters.
  3. Generate audio segments:
     • Call the TTS API for each segment.
     • Decode data.audio from hex before saving.
     • Capture extra_info.audio_length for timeline metadata.
  4. Assemble the narration track locally:
     • Use pydub to position clips on a silent master track.
     • Keep per-segment files for easier editor import and retiming.
  5. Validate timing against the video:
     • Leave small gaps when natural pacing is needed.
     • Adjust segment boundaries instead of overusing extreme speed values.
Minimal Timed Narration Helper

import binascii
import os
import re

import requests

API_KEY = os.environ["SENSEAUDIO_API_KEY"]
API_URL = "https://api.senseaudio.cn/v1/t2a_v2"


def parse_timed_script(script):
    """Parse "[HH:MM:SS] text" blocks into segments with millisecond timestamps."""
    # Each segment's text runs until the next line that starts with "[" or the end of input.
    pattern = r"\[(\d{2}):(\d{2}):(\d{2})\]\s*(.+?)(?=\n\[|\Z)"
    segments = []
    for match in re.finditer(pattern, script, re.DOTALL):
        hours, minutes, seconds, text = match.groups()
        timestamp_ms = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000
        segments.append({"timestamp": timestamp_ms, "text": text.strip()})
    return segments


def synthesize_segment(text, voice_id, speed=1.0, pitch=0, vol=1.0):
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json={
            "model": "SenseAudio-TTS-1.0",
            "text": text,
            "stream": False,
            "voice_setting": {
                "voice_id": voice_id,
                "speed": speed,
                "pitch": pitch,
                "vol": vol,
            },
            "audio_setting": {
                "format": "mp3",
                "sample_rate": 32000,
                "bitrate": 128000,
                "channel": 2,
            },
        },
        timeout=60,
    )
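    response.raise_for_status()
    data = response.json()
    return {
        # data.audio is hex-encoded; decode it to raw bytes before saving.
        "audio_bytes": binascii.unhexlify(data["data"]["audio"]),
        # Segment duration in milliseconds, used for timeline metadata.
        "duration_ms": data["extra_info"]["audio_length"],
        "trace_id": data.get("trace_id"),
    }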

Local Assembly Pattern

from pydub import AudioSegment


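def create_synced_narration(audio_segments, video_duration_ms):
    """Overlay each clip onto a silent track at its timestamp (milliseconds)."""
    # overlay() keeps the base track's length, so audio past video_duration_ms is cut off.
    narration_track = AudioSegment.silent(duration=video_duration_ms)
    for segment in audio_segments:
        clip = AudioSegment.from_file(segment["file"])
        narration_track = narration_track.overlay(clip, position=segment["timestamp"])
    return narration_track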

Style Presets

  • Documentary: slower speed such as 0.95, neutral pitch
  • Tutorial: speed near 1.0, slightly warmer pitch
  • Commercial: modestly faster speed, slightly higher pitch

Prefer conservative tuning and script editing over extreme voice parameter changes.
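
One way to encode the presets is a small lookup table that clamps values to the official ranges. Only the documentary speed of 0.95 and its neutral pitch come from the list above. The other numbers are illustrative assumptions (including reading "warmer" as a slightly lower pitch), and `voice_setting_for` is a hypothetical helper.

```python
# Only documentary's values come from the preset list; the rest are assumed.
STYLE_PRESETS = {
    "documentary": {"speed": 0.95, "pitch": 0},
    "tutorial": {"speed": 1.0, "pitch": -1},    # assumed: "warmer" = slightly lower
    "commercial": {"speed": 1.1, "pitch": 1},   # assumed: modest increases
}


def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def voice_setting_for(style, voice_id, speed_offset=0.0):
    """Build a voice_setting dict for a style, kept inside the official ranges."""
    if style not in STYLE_PRESETS:
        raise ValueError(f"unknown style: {style}")
    preset = STYLE_PRESETS[style]
    return {
        "voice_id": voice_id,
        "speed": clamp(preset["speed"] + speed_offset, 0.5, 2.0),
        "pitch": clamp(preset["pitch"], -12, 12),
    }
```

The resulting dict has the same shape as the `voice_setting` object in the request body, so alternate takes reduce to calling the helper with a different style name.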

Output Options

  • Per-segment narration clips in mp3 or wav
  • Timing metadata in json
  • Merged narration track for video editors
  • Optional alternate takes with different styles
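
The timing metadata export could look like the following sketch. The JSON field names (`start_ms`, `end_ms`, and so on) and the `build_timing_metadata` helper are assumptions chosen for illustration, not a format defined by SenseAudio or any editor.

```python
import json


def build_timing_metadata(segments):
    """Serialize per-segment timing into a JSON string for editor import."""
    entries = []
    for index, seg in enumerate(segments, start=1):
        entries.append({
            "index": index,
            "file": seg["file"],
            "start_ms": seg["timestamp"],
            "duration_ms": seg["duration_ms"],
            "end_ms": seg["timestamp"] + seg["duration_ms"],
            "text": seg["text"],
        })
    # ensure_ascii=False keeps non-ASCII narration text readable in the file.
    return json.dumps({"segments": entries}, ensure_ascii=False, indent=2)
```

Writing this string next to the per-segment clips gives an editor or a later retiming pass enough information to rebuild the timeline without re-synthesizing audio.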

Safety Notes

  • Do not hardcode credentials.
  • Do not assume local media tooling exists beyond what is declared here.
  • Treat returned trace_id and generated narration assets as potentially sensitive production data.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
