Caption Format Conversion

Convert between 30+ caption/caption formats using lattifai-captions.

⚡ YouTube Workflow

# 1. Transcribe YouTube video directly
omnicaptions transcribe "https://youtube.com/watch?v=VIDEO_ID" -o transcript.md

# 2. Convert to any format
omnicaptions convert transcript.md -o output.srt
omnicaptions convert transcript.md -o output.ass
omnicaptions convert transcript.md -o output.vtt

When to Use

Converting SRT to VTT, ASS, TTML, etc.
Converting Gemini markdown transcript to standard caption formats
Converting YouTube VTT (with word-level timestamps) to other formats
Batch format conversion

When NOT to Use

Need transcription (use /omnicaptions:transcribe)
Need translation (use /omnicaptions:translate)

Setup

pip install omni-captions-skills --extra-index-url https://lattifai.github.io/pypi/simple/

Quick Reference

Format	Extension	Read	Write
SRT	`.srt`	✓	✓
VTT	`.vtt`	✓	✓
ASS/SSA	`.ass`	✓	✓
TTML	`.ttml`	✓	✓
Gemini MD	`.md`	✓	✓
JSON	`.json`	✓	✓
TXT	`.txt`	✓	✓

Full list: SRT, VTT, ASS, SSA, TTML, DFXP, SBV, SUB, LRC, JSON, TXT, TSV, Audacity, Audition, FCPXML, EDL, and more.

CLI Usage

# Convert (auto-output to same directory, only changes extension)
omnicaptions convert input.srt -t vtt           # → ./input.vtt
omnicaptions convert transcript.md              # → ./transcript.srt

# Specify output file or directory
omnicaptions convert input.srt -o output/       # → output/input.srt
omnicaptions convert input.srt -o output.vtt    # → output.vtt

# Specify format explicitly
omnicaptions convert input.txt -o out.srt -f txt -t srt

ASS Style Presets

When converting to ASS format, use --style to apply preset styles:

omnicaptions convert input.srt -o output.ass --style default    # White text, bottom
omnicaptions convert input.srt -o output.ass --style top        # White text, top
omnicaptions convert input.srt -o output.ass --style bilingual  # White + Yellow (for bilingual)
omnicaptions convert input.srt -o output.ass --style yellow     # Yellow text, bottom

Preset	Position	Line 1	Line 2	Use Case
`default`	Bottom	White	White	Standard captions
`top`	Top	White	White	When bottom is occupied
`bilingual`	Bottom	White	Yellow	Bilingual captions (原文 + 译文)
`yellow`	Bottom	Yellow	Yellow	High visibility

Bilingual Example

If your SRT has two-line captions like:

1
00:00:01,000 --> 00:00:03,000
Hello World
你好世界

Use --style bilingual or custom colors:

# Preset: white + yellow
omnicaptions convert bilingual.srt -o output.ass --style bilingual

# Custom colors: green English + yellow Chinese
omnicaptions convert bilingual.srt -o output.ass --line1-color "#00FF00" --line2-color "#FFFF00"

# Mix preset with custom line2 color
omnicaptions convert bilingual.srt -o output.ass --style default --line2-color "#FF6600"

Custom Color Options

Option	Description
`--line1-color "#RRGGBB"`	First line (original) color
`--line2-color "#RRGGBB"`	Second line (translation) color

Common colors: #FFFFFF (white), #FFFF00 (yellow), #00FF00 (green), #00FFFF (cyan), #FF6600 (orange)

Font Size and Resolution

Font size is auto-calculated based on video resolution. Resolution is detected from (priority order):

--resolution argument (e.g., 1080p, 4k, 1920x1080)
--video argument (uses ffprobe to detect)
.meta.json file (saved by omnicaptions download)
Default: 1080p

# Auto-detect from .meta.json (saved by download command)
omnicaptions convert abc123.en.srt -o abc123.en.ass --karaoke

# Specify resolution directly
omnicaptions convert input.srt -o output.ass --resolution 4k
omnicaptions convert input.srt -o output.ass --resolution 720p
omnicaptions convert input.srt -o output.ass --resolution 1920x1080

# Detect from video file (uses ffprobe)
omnicaptions convert input.srt -o output.ass --video video.mp4

# Override auto-calculated fontsize
omnicaptions convert input.srt -o output.ass --resolution 4k --fontsize 80

Resolution	PlayRes	Auto FontSize
480p	854×480	24
720p	1280×720	32
1080p	1920×1080	48 (default)
2K	2560×1440	64
4K	3840×2160	96

Karaoke Mode

Generate karaoke subtitles with word-level highlighting. Requires word-level timing (use LaiCut alignment first).

# Basic karaoke (sweep effect - gradual fill)
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke

# Different effects
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke sweep    # Gradual fill (default)
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke instant  # Instant highlight
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke outline  # Outline then fill

# LRC karaoke (enhanced word timestamps)
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.lrc --karaoke

Effect	ASS Tag	Description
`sweep`	`\kf`	Gradual fill from left to right (default)
`instant`	`\k`	Instant word highlight
`outline`	`\ko`	Outline fills, then text fills

Karaoke Workflow

# 1. Align with LaiCut (get word-level timing in JSON)
omnicaptions LaiCut audio.mp3 lyrics.txt

# 2. Convert to karaoke ASS
omnicaptions convert lyrics_LaiCut.json -o karaoke.ass --karaoke

# Or combine with style
omnicaptions convert lyrics_LaiCut.json -o karaoke.ass --karaoke --style yellow

Python Usage

from omnicaptions import Caption

# Load any format
cap = Caption.read("input.srt")

# Write to any format
cap.write("output.vtt")
cap.write("output.ass")
cap.write("output.ttml")

Common Mistakes

Mistake	Fix
Format not detected	Use `--from` / `--to` flags
Missing timestamps	Source format must have timing info
Encoding error	Specify `encoding="utf-8"`

Related Skills

Skill	Use When
`/omnicaptions:transcribe`	Need transcript from audio/video
`/omnicaptions:translate`	Translate with Gemini API
`/omnicaptions:translate`	Translate with Claude (no API key)
`/omnicaptions:download`	Download video/captions first

Workflow Examples

# Transcribe → Convert → Translate (with Claude)
/omnicaptions:transcribe video.mp4
/omnicaptions:convert video_GeminiUnd.md -o video.srt
/omnicaptions:translate video.srt -l zh --bilingual

omnicaptions-convert

Safety Notice

Copy this and send it to your AI assistant to learn

Caption Format Conversion

⚡ YouTube Workflow

When to Use

When NOT to Use

Setup

Quick Reference

CLI Usage

ASS Style Presets

Bilingual Example

Custom Color Options

Font Size and Resolution

Karaoke Mode

Karaoke Workflow

Python Usage

Common Mistakes

Related Skills

Workflow Examples

Source Transparency

Related Skills

omnicaptions-translate

omnicaptions-laicut

omnicaptions-download

Csv Analyzer Cn