captions

Use when captions, subtitles, or the spoken text of a YouTube video is needed — even if not explicitly requested: pasted video links or IDs, requests to read, quote, or translate a video, accessibility needs, deaf/HoH use cases, content review, or language learning. Fetches timestamped caption data from any YouTube video. Not for uploading subtitles or account management.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "captions" with this command: npx skills add therohitdas/captions

Captions

Extract closed captions from YouTube videos via TranscriptAPI.com.

Setup

If $TRANSCRIPT_API_KEY is not set, read references/auth-setup.md and follow the instructions there to get and store the key.

Required Headers

Every request needs two headers:

  • Authorization: Bearer $TRANSCRIPT_API_KEY
  • User-Agent: your agent's name and version if known (e.g. HermesAgent/0.11.0, ClaudeCode/1.0). Version is optional — agent name alone is fine. Do not omit this header or send a bare default — Cloudflare will return a 403 (error code 1010) and block the request.

GET /api/v2/youtube/transcript

curl -s "https://transcriptapi.com/api/v2/youtube/transcript\
?video_url=VIDEO_URL&format=json&include_timestamp=true&send_metadata=true" \
  -H "Authorization: Bearer $TRANSCRIPT_API_KEY" \
  -H "User-Agent: YourAgent/1.0"
ParamRequiredDefaultValues
video_urlyesYouTube URL or video ID
formatnojsonjson (structured), text (plain)
include_timestampnotruetrue, false
send_metadatanofalsetrue, false

Response (format=json — best for accessibility/timing):

{
  "video_id": "dQw4w9WgXcQ",
  "language": "en",
  "transcript": [
    { "text": "We're no strangers to love", "start": 18.0, "duration": 3.5 },
    { "text": "You know the rules and so do I", "start": 21.5, "duration": 2.8 }
  ],
  "metadata": { "title": "...", "author_name": "...", "thumbnail_url": "..." }
}
  • start: seconds from video start
  • duration: how long caption is displayed

Response (format=text — readable):

{
  "video_id": "dQw4w9WgXcQ",
  "language": "en",
  "transcript": "[00:00:18] We're no strangers to love\n[00:00:21] You know the rules..."
}

Tips

  • Use format=json for sync'd captions (accessibility tools, timing analysis).
  • Use format=text with include_timestamp=false for clean reading.
  • Auto-generated captions are available for most videos; manual CC is higher quality.

Errors

CodeMeaningAction
401Bad API keyCheck key
402No creditstranscriptapi.com/billing
403/1010Cloudflare blockAdd or fix User-Agent header
404No captionsVideo doesn't have CC enabled
408TimeoutRetry once after 2s

1 credit per request. Free tier: 100 credits, 300 req/min.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Dlazy Kling Audio Clone

Generate customized speech that highly restores the timbre by uploading reference audio using Kling Audio Clone.

Registry SourceRecently Updated
General

Dlazy Keling Sfx

Generate matching scene sound effects based on text descriptions or video frames using Kling SFX.

Registry SourceRecently Updated
General

Dlazy Keling Tts

Convert text into high-quality, emotional speech reading using Kling TTS.

Registry SourceRecently Updated
General

Dlazy Jimeng T2i

Text-to-image generation with Jimeng, quickly converting text to high-quality images.

Registry SourceRecently Updated