caption-clip

Download and caption YouTube clips. Use when the user asks to "caption a YouTube video", "download and add subtitles", "create captioned clips", "transcribe and caption a video", or mentions yt-dlp captioning workflows.

Install skill "caption-clip" with this command: npx skills add kwindla/skill-caption-clip/kwindla-skill-caption-clip-caption-clip

Caption YouTube Clips

Download a timestamped section of a YouTube video, transcribe it with Deepgram, and burn in styled captions.

Prerequisites

  • yt-dlp installed
  • ffmpeg installed
  • DEEPGRAM_API_KEY in .env file (or environment variable)
  • Avenir Next font (falls back to Arial if unavailable)

Workflow

Step 1: Download Clip

Download a specific section of a YouTube video using yt-dlp:

yt-dlp --download-sections "*START-END" -o "clip_name.%(ext)s" "YOUTUBE_URL"

Example with timestamps 02:53-04:15:

yt-dlp --download-sections "*02:53-04:15" -o "clip_intro.%(ext)s" "https://www.youtube.com/watch?v=VIDEO_ID"
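
When the timestamps come from a user request, it can help to assemble the command programmatically. The following is a minimal sketch, not part of the skill itself: the function name `download_section_command` is hypothetical, and it only builds the argument list shown above without running yt-dlp.

```python
def download_section_command(url, start, end, name):
    """Build the yt-dlp argument list for one clip (nothing is downloaded here)."""
    section = f"*{start}-{end}"  # same "*START-END" form as the command above
    return [
        "yt-dlp",
        "--download-sections", section,
        "-o", f"{name}.%(ext)s",  # yt-dlp fills in the container extension
        url,
    ]


cmd = download_section_command(
    "https://www.youtube.com/watch?v=VIDEO_ID", "02:53", "04:15", "clip_intro"
)
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the download without any shell quoting concerns.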

Step 2: Convert to MP4

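Re-encode the download to MP4 with ffmpeg's default settings so later steps see a consistent format. Replace .webm with whatever extension yt-dlp actually produced: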

ffmpeg -i clip_name.webm -y clip_name.mp4
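
Because yt-dlp chooses the container, the downloaded file is not always a .webm. A small sketch of how the conversion command might be derived from the real filename follows. The helper `convert_command` is hypothetical, and skipping files that are already .mp4 is an assumption of this sketch (ffmpeg refuses to write an output onto its own input), not something the skill specifies.

```python
from pathlib import Path


def convert_command(downloaded):
    """Return the ffmpeg argument list for Step 2, or None if the file is already MP4."""
    src = Path(downloaded)
    if src.suffix.lower() == ".mp4":
        # Assumption: leave an existing MP4 alone, since input and output
        # would otherwise be the same path.
        return None
    return ["ffmpeg", "-i", str(src), "-y", str(src.with_suffix(".mp4"))]


print(convert_command("clip_name.webm"))
print(convert_command("clip_name.mp4"))
```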

Step 3: Transcribe with Deepgram

Load the API key from .env and send the clip to the Deepgram API:

DEEPGRAM_API_KEY=$(grep '^DEEPGRAM_API_KEY=' .env | cut -d'=' -f2- | tr -d '"' | tr -d "'")

curl -s --request POST \
  --url 'https://api.deepgram.com/v1/listen?model=nova-2&smart_format=true&punctuate=true&utterances=true' \
  --header "Authorization: Token $DEEPGRAM_API_KEY" \
  --header 'Content-Type: audio/mp4' \
  --data-binary @clip_name.mp4 > clip_transcript.json
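
The same two steps can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than the skill's own code: `read_env_value` and `build_request` are hypothetical helper names, the URL and headers are copied from the curl call above, and the request is only constructed here, not sent.

```python
import urllib.request

DEEPGRAM_URL = (
    "https://api.deepgram.com/v1/listen"
    "?model=nova-2&smart_format=true&punctuate=true&utterances=true"
)


def read_env_value(env_text, name="DEEPGRAM_API_KEY"):
    """Return the value of NAME=... from .env text, with surrounding quotes removed."""
    for line in env_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == name:
            return value.strip().strip("\"'")
    return None


def build_request(api_key, media_bytes):
    """Create (but do not send) the POST request used in Step 3."""
    return urllib.request.Request(
        DEEPGRAM_URL,
        data=media_bytes,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/mp4",
        },
        method="POST",
    )
```

Sending it would then be a matter of calling `urllib.request.urlopen(request)` and writing the response body to clip_transcript.json.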

Step 4: Generate SRT

Convert the Deepgram JSON to SRT format using the bundled script:

python3 ~/.claude/skills/caption-clip/scripts/json-to-srt.py clip_transcript.json clip.srt
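
The bundled script's contents are not shown here, so the following is only a sketch of what such a conversion could look like, not the actual json-to-srt.py. It assumes the response carries a `results.utterances` list whose items have `start`, `end`, and `transcript` fields, which is what a request with `utterances=true` is expected to return. The function names are hypothetical.

```python
def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)  # round first so 59.9996 s cannot become ":60"
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def utterances_to_srt(response):
    """Turn each utterance into one numbered SRT cue, separated by blank lines."""
    cues = []
    for index, utt in enumerate(response["results"]["utterances"], start=1):
        cues.append(
            f"{index}\n"
            f"{srt_time(utt['start'])} --> {srt_time(utt['end'])}\n"
            f"{utt['transcript'].strip()}\n"
        )
    return "\n".join(cues)


sample = {"results": {"utterances": [
    {"start": 0.0, "end": 1.25, "transcript": "Hello there."},
    {"start": 1.5, "end": 3.0, "transcript": "Welcome back."},
]}}
print(utterances_to_srt(sample))
```

Rounding to whole milliseconds before splitting into fields is one way to avoid the invalid `00:00:60` style of timestamp that Step 5 asks you to look for.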

Step 5: Clean Up Transcription

Read the generated SRT file and create a cleaned version (clip_clean.srt):

  1. Fix transcription errors - Correct misheard words
  2. Remove filler words - Delete: "like", "you know", "I'd say", "yeah", "um", "uh", "sort of", "kind of"
  3. Consolidate fragments - Merge short utterances into complete sentences
  4. Improve readability - Adjust phrasing to flow better when read vs. heard
  5. Fix timestamps - Ensure no field overflows in the HH:MM:SS,mmm timestamps (e.g., 00:00:60,000 should be 00:01:00,000)

Write the cleaned content to clip_clean.srt.
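
Most of this step is editorial judgment, but the timestamp check is mechanical and can be sketched in code. The helper below is hypothetical and assumes standard `HH:MM:SS,mmm` timestamps. It recomputes each timestamp from its total duration so that overflowing seconds or minutes are carried into the next field.

```python
import re

TIMESTAMP = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def normalize_timestamp(match):
    """Carry overflowing seconds and minutes, e.g. 00:00:60,000 -> 00:01:00,000."""
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def fix_srt_timestamps(srt_text):
    """Rewrite every timestamp in the SRT text; valid ones come back unchanged."""
    return TIMESTAMP.sub(normalize_timestamp, srt_text)


print(fix_srt_timestamps("00:00:59,000 --> 00:00:60,000"))
```

Fixing misheard words, removing fillers, and merging fragments still need a human or model reading the text, since words such as "like" are often legitimate.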

Step 6: Burn Captions

Apply styled subtitles with semi-transparent background:

ffmpeg -i clip_name.mp4 \
  -vf "subtitles=clip_clean.srt:force_style='FontSize=24,FontName=Avenir Next Medium,PrimaryColour=&H00FFFFFF,BackColour=&H80000000,BorderStyle=4,Outline=0,Shadow=0,MarginV=30,MarginL=40,MarginR=40'" \
  -c:a copy -y clip_captioned.mp4
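
The long `force_style` string is easier to adjust when it is generated from a mapping. The sketch below is illustrative only: `STYLE`, `subtitles_filter`, and `burn_command` are hypothetical names, and it assumes an SRT filename without characters that need escaping in an ffmpeg filter (colons, quotes, backslashes).

```python
STYLE = {
    "FontSize": 24,
    "FontName": "Avenir Next Medium",
    "PrimaryColour": "&H00FFFFFF",
    "BackColour": "&H80000000",
    "BorderStyle": 4,
    "Outline": 0,
    "Shadow": 0,
    "MarginV": 30,
    "MarginL": 40,
    "MarginR": 40,
}


def subtitles_filter(srt_path, style):
    """Build the -vf value: subtitles=<file>:force_style='Key=Value,...'."""
    force_style = ",".join(f"{key}={value}" for key, value in style.items())
    return f"subtitles={srt_path}:force_style='{force_style}'"


def burn_command(video, srt_path, output, style=STYLE):
    """Return the ffmpeg argument list for Step 6 (audio is copied, not re-encoded)."""
    return ["ffmpeg", "-i", video, "-vf", subtitles_filter(srt_path, style),
            "-c:a", "copy", "-y", output]


print(burn_command("clip_name.mp4", "clip_clean.srt", "clip_captioned.mp4"))
```

Swapping the font or margins then means changing one dictionary entry, for example `dict(STYLE, FontName="Arial")`.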

Caption Styling Reference

| Parameter | Value | Description |
|---|---|---|
| FontSize | 24 | Point size |
| FontName | Avenir Next Medium | Clean sans-serif (use Arial as fallback) |
| PrimaryColour | &H00FFFFFF | White text (AABBGGRR format) |
| BackColour | &H80000000 | 50% transparent black background |
| BorderStyle | 4 | Opaque box using BackColour |
| MarginV | 30 | Vertical margin from bottom |
| MarginL/R | 40 | Horizontal margins |
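
The AABBGGRR colour values are easy to get backwards, because the byte order is reversed relative to the usual RGB hex notation and the leading byte is transparency rather than opacity. A small hypothetical helper, `ass_colour`, illustrates how the two values in the table can be derived:

```python
def ass_colour(red, green, blue, transparency=0.0):
    """Return a colour in &HAABBGGRR form.

    red, green, blue: integers 0-255.
    transparency: 0.0 is fully opaque, 1.0 is fully transparent.
    """
    alpha = round(transparency * 255)
    # Note the reversed channel order: alpha, then blue, green, red.
    return f"&H{alpha:02X}{blue:02X}{green:02X}{red:02X}"


print(ass_colour(255, 255, 255))     # white, opaque
print(ass_colour(0, 0, 0, 0.5))      # black, half transparent
print(ass_colour(255, 0, 0))         # pure red ends up in the last byte
```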

Alternative Fonts

If Avenir Next is unavailable, these work well for captions:

  • SF Pro Display
  • Helvetica Neue
  • Futura
  • Arial (universal fallback)

Check available fonts:

fc-list : family | sort -u | grep -iE '(avenir|sf pro|helvetica|futura)'
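
The result of that check can feed an automatic fallback. The sketch below is an assumption-laden illustration: it takes the text printed by `fc-list : family` (one or more comma-separated family names per line) and returns the first installed family from the preference order listed above, defaulting to Arial. The names `PREFERRED`, `families_from_fc_list`, and `pick_font` are hypothetical.

```python
PREFERRED = ["Avenir Next", "SF Pro Display", "Helvetica Neue", "Futura", "Arial"]


def families_from_fc_list(output):
    """Collect lower-cased family names from `fc-list : family` output."""
    families = set()
    for line in output.splitlines():
        for name in line.split(","):  # one line may list several aliases
            if name.strip():
                families.add(name.strip().lower())
    return families


def pick_font(installed, preferred=PREFERRED):
    """Return the first preferred family that is installed, else Arial."""
    for family in preferred:
        if family.lower() in installed:
            return family
    return "Arial"


sample_output = "Futura\nHelvetica Neue,Helvetica Neue Light\nDejaVu Sans\n"
print(pick_font(families_from_fc_list(sample_output)))
```

This returns a family name only. A specific face such as "Avenir Next Medium" still has to be chosen when writing the FontName value.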

Output Files

After completing the workflow:

| File | Description |
|---|---|
| clip_name.mp4 | Original downloaded clip |
| clip_transcript.json | Raw Deepgram transcription (preserved for re-runs) |
| clip.srt | Auto-generated subtitles |
| clip_clean.srt | Cleaned/edited subtitles |
| clip_captioned.mp4 | Final video with burned-in captions |
