omnicaptions-laicut

LattifAI's audio-text processing toolkit. Currently supports forced alignment, with translate and speaker diarization coming soon.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "omnicaptions-laicut" with this command: npx skills add lattifai/omni-captions-skills/lattifai-omni-captions-skills-omnicaptions-laicut

LaiCut

LattifAI's audio-text processing toolkit. Currently supports forced alignment, with translate and speaker diarization coming soon.

Requires LattifAI API Key - Get from https://lattifai.com/dashboard/api-keys

When to Use

  • Accurate/precise timing needed - When user requests accurate timestamps or precise alignment

  • Sync misaligned captions - Fix timing drift in downloaded captions

  • Align manual transcripts - Match text to speech precisely

  • Post-transcription alignment - Improve timing from auto-generated captions

  • Multi-format support - SRT, VTT, ASS, LRC, TXT, MD

When NOT to Use

  • Need full transcription (use /omnicaptions:transcribe )

  • No existing caption/transcript (nothing to align)

  • Very short clips (<5 seconds)

Setup

pip install "omni-captions-skills[laicut]" --extra-index-url https://lattifai.github.io/pypi/simple/

API Key

Priority: LATTIFAI_API_KEY env → .env file → ~/.config/omnicaptions/config.json

If not set, ask user: Please enter your LattifAI API key (get from https://lattifai.com/dashboard/api-keys):

Then run with -k <key> . Key will be saved to config file automatically.

CLI Usage

Basic alignment (default: JSON with word-level timing, RECOMMENDED)

omnicaptions LaiCut audio.mp3 caption.srt

→ caption_LaiCut.json

Then convert to desired format (preserves word timing in JSON for future use)

omnicaptions convert caption_LaiCut.json -o caption_LaiCut.srt

Smart sentence segmentation (for word-level captions like YouTube VTT)

omnicaptions LaiCut video.mp4 caption.vtt --split-sentence

Option Description

-o, --output

Output file (default: <caption>_LaiCut.json )

-f, --format

Output format (default: json)

-k, --api-key

LattifAI API key

--split-sentence

AI-powered semantic sentence segmentation

-v, --verbose

Show progress

JSON Output (Recommended)

JSON output preserves word-level timing for downstream tasks:

[ { "text": "Hello world", "start": 0.0, "end": 2.5, "words": [ {"word": "Hello", "start": 0.0, "end": 0.5}, {"word": "world", "start": 0.6, "end": 2.5} ] } ]

Convert JSON to other formats:

For playback/editing

omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt

For bilingual ASS

omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.ass --style bilingual

For karaoke

omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke

For translation: Convert to SRT first (JSON is too large for Claude to read):

1. JSON → SRT

omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt

2. Claude 翻译 → video.en_LaiCut_Claude_zh.srt

3. 转换为带颜色 ASS

omnicaptions convert video.en_LaiCut_Claude_zh.srt -o video.en_LaiCut_Claude_zh_Color.ass
--line1-color "#00FF00" --line2-color "#FFFF00"

Ask User: Enable Smart Sentence Segmentation?

When input captions are word-level or poorly segmented (e.g., YouTube VTT), ask user whether to enable --split-sentence :

Is the caption word-level or poorly segmented (e.g., YouTube VTT)?

  • Yes → Add --split-sentence (AI re-segments into natural sentences)
  • No → Keep original segmentation

Use cases: YouTube VTT, word-aligned captions, messy auto-generated captions

LattifAI API Key Error Handling

LaiCut requires LattifAI API key (NOT Gemini API key)

Error example:

API KEY verification error: API KEY is invalid or expired.

Correct handling:

  • Tell user that LaiCut requires a valid LattifAI API key

  • Ask user if they want to provide a new key

  • Direct user to https://lattifai.com/dashboard/api-keys

  • NEVER skip alignment step claiming "timing is already accurate enough"

Common Mistakes

Mistake Fix

API key invalid/expired Ask user for new key, do NOT skip

No API key Set LATTIFAI_API_KEY or use -k

Audio format error Convert to WAV/MP3/M4A first

Empty output Check caption has text content

Related Skills

Skill Use When

/omnicaptions:transcribe

Generate transcript first

/omnicaptions:convert

Convert caption formats

/omnicaptions:translate

Translate after alignment

Workflow Examples

Important:

  • LaiCut outputs JSON by default (preserves word-level timing)

  • Convert to SRT/ASS when needed for playback or translation

  • Generate bilingual captions AFTER alignment

No caption: transcribe → align (JSON) → convert → translate

omnicaptions transcribe video.mp4

→ video_GeminiUnd.md

omnicaptions LaiCut video.mp4 video_GeminiUnd.md

→ video_GeminiUnd_LaiCut.json

omnicaptions convert video_GeminiUnd_LaiCut.json -o video_GeminiUnd_LaiCut.srt

→ video_GeminiUnd_LaiCut_Claude_zh.srt (after translate)

Has caption: download → align (JSON) → convert → translate

omnicaptions download "https://youtu.be/xxx"

→ xxx.en.vtt

omnicaptions LaiCut xxx.mp4 xxx.en.vtt

→ xxx.en_LaiCut.json

omnicaptions convert xxx.en_LaiCut.json -o xxx.en_LaiCut.srt

→ xxx.en_LaiCut_Claude_zh.srt (after translate)

→ xxx.en_LaiCut_Claude_zh_Color.ass (after convert with --style)

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

omnicaptions-translate

No summary provided by upstream source.

Repository SourceNeeds Review
General

omnicaptions-transcribe

No summary provided by upstream source.

Repository SourceNeeds Review
General

omnicaptions-convert

No summary provided by upstream source.

Repository SourceNeeds Review
General

omnicaptions-download

No summary provided by upstream source.

Repository SourceNeeds Review