youtube-transcription-generator

Use VLM Run (vlmrun) to generate transcriptions from YouTube videos. Download a video with yt-dlp, then run vlmrun to transcribe it, optionally with timestamps. VLMRUN_API_KEY must be set in .env; follow vlmrun-cli-skill for CLI setup and options.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "youtube-transcription-generator" with this command: npx skills add MehediAhamed/youtube-transcription-generator

YouTube Transcription Generator (VLM Run)

Generate transcriptions from YouTube videos using vlmrun for speech-to-text and optional timestamps. This skill:

  1. Downloads the YouTube video (or audio) with yt-dlp.
  2. Transcribes the video with vlmrun (Orion visual AI).
  3. Saves the transcript to a file (plain text or with timestamps).

Refer to vlmrun-cli-skill for vlmrun CLI setup, environment variables, and all vlmrun chat options.


How the assistant should use this skill

  • Check .env for API key

    • Ensure .env (or .env.local) contains VLMRUN_API_KEY.
    • If missing, instruct the user to set it before running any vlmrun commands.
  • Use vlmrun for transcription only

    • For transcription (and optional timestamps), use the vlmrun CLI with a video file as input (-i <video>).
    • vlmrun accepts video files (e.g. .mp4). For YouTube, the skill first downloads the video with yt-dlp, then passes the file to vlmrun.
  • Workflow

    • User provides a YouTube URL (and optionally output path).
    • Download the video with yt-dlp (or audio only, for a faster and smaller download).
    • Run: vlmrun chat "Transcribe this video with timestamps for each section. Output the full transcript in a clear, readable format." -i <downloaded_file> -o <output_dir>.
    • Capture vlmrun’s response and save it as the transcript file (e.g. transcript.txt).
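The API key check in the first step could be sketched as below. This is a minimal illustration, not the skill's own code: the helper names read_env_file and find_api_key are hypothetical, and the parser assumes simple KEY=VALUE lines rather than the full dotenv syntax.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    p = Path(path)
    if not p.is_file():
        return values
    for raw in p.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(directory=".", environ=None):
    """Return VLMRUN_API_KEY from the shell, .env, or .env.local, else None."""
    environ = os.environ if environ is None else environ
    if environ.get("VLMRUN_API_KEY"):
        return environ["VLMRUN_API_KEY"]
    for name in (".env", ".env.local"):
        key = read_env_file(Path(directory) / name).get("VLMRUN_API_KEY")
        if key:
            return key
    return None


if __name__ == "__main__":
    if find_api_key() is None:
        print("VLMRUN_API_KEY is not set; add it to .env before running vlmrun.")
```

Checking the shell environment first mirrors the troubleshooting advice to verify the key "in .env or the current shell"; the lookup order itself is an assumption.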

Prerequisites

  • Python 3.10+
  • VLMRUN_API_KEY (required for vlmrun)
  • vlmrun CLI (vlmrun[cli])
  • yt-dlp (for downloading YouTube videos)

See vlmrun-cli-skill for detailed vlmrun usage and examples (including video transcription).
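A quick way to confirm these prerequisites before running anything is sketched below. It assumes the two tools are installed as executables named vlmrun and yt-dlp on PATH (the names used in the commands in this document); missing_prerequisites is a hypothetical helper, not part of the skill.

```python
import shutil
import sys

# Executable names as they appear in this document's commands.
REQUIRED_TOOLS = ("vlmrun", "yt-dlp")


def missing_prerequisites(which=shutil.which, version=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

The which and version parameters exist only so the check can be exercised without touching the real environment.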


Installation & Setup

From the youtube-transcription-generator directory:

Windows (PowerShell):

cd path\to\youtube-transcription-generator
uv venv
.venv\Scripts\Activate.ps1
uv pip install -r requirements.txt

macOS/Linux:

cd path/to/youtube-transcription-generator
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

Copy .env_template to .env and set VLMRUN_API_KEY.


Quick Start: Transcribe a YouTube Video

Option A: Run the script (recommended)

# From youtube-transcription-generator directory, with venv activated
python scripts/run_transcription.py "https://www.youtube.com/watch?v=VIDEO_ID" -o ./output

This will:

  1. Download the video with yt-dlp into the output directory.
  2. Run vlmrun to transcribe the video.
  3. Save the transcript as output/transcript.txt (and keep artifacts in output/).

Option B: Manual vlmrun (after downloading the video yourself)

# 1) Download with yt-dlp (--merge-output-format mp4 keeps the merged file as video.mp4)
yt-dlp -f "bv*[ext=mp4]+ba/best[ext=mp4]/best" --merge-output-format mp4 -o video.mp4 "https://www.youtube.com/watch?v=VIDEO_ID"

# 2) Transcribe with vlmrun (see vlmrun-cli-skill for options)
vlmrun chat "Transcribe this video with timestamps for each section. Output the full transcript in a clear, readable format." -i video.mp4 -o ./output

Capture the vlmrun stdout and save it as your transcript, or use --json if you need structured output.
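The two manual steps, plus saving stdout as the transcript, could be chained from Python roughly as follows. This is a sketch under stated assumptions and not the contents of scripts/run_transcription.py: the function names are hypothetical, the format selector and prompt are copied from Option B, and the file names video.mp4 and transcript.txt are taken from the examples in this document.

```python
import subprocess
from pathlib import Path

PROMPT = (
    "Transcribe this video with timestamps for each section. "
    "Output the full transcript in a clear, readable format."
)


def download_command(url, video_path):
    """yt-dlp invocation using the format selector from Option B."""
    return ["yt-dlp", "-f", "bv*[ext=mp4]+ba/best[ext=mp4]/best",
            "-o", str(video_path), url]


def transcribe_command(video_path, output_dir, prompt=PROMPT):
    """vlmrun chat invocation with the video as input (-i) and an output dir (-o)."""
    return ["vlmrun", "chat", prompt, "-i", str(video_path), "-o", str(output_dir)]


def transcribe_youtube(url, output_dir="output", run=subprocess.run):
    """Download, transcribe, and write vlmrun's stdout to transcript.txt."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    video = out / "video.mp4"
    # With subprocess.run, check=True raises CalledProcessError on a non-zero exit.
    run(download_command(url, video), check=True)
    result = run(transcribe_command(video, out), check=True,
                 capture_output=True, text=True)
    transcript = out / "transcript.txt"
    transcript.write_text(result.stdout, encoding="utf-8")
    return transcript
```

Calling transcribe_youtube("https://www.youtube.com/watch?v=VIDEO_ID", "./output") would need both tools installed and VLMRUN_API_KEY set; the run parameter is there so the flow can be tried with a stand-in.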


Prompt variants for vlmrun

  • With timestamps:
    "Transcribe this video with timestamps for each section. Output the full transcript in a clear, readable format."

  • Plain transcript only:
    "Transcribe everything said in this video. Output only the spoken text, no timestamps."

  • Structured (e.g. JSON):
    Use --json and ask for a structured format in the prompt (e.g. list of { "time": "...", "text": "..." }).
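One way to switch between these variants programmatically is sketched below. The first two prompts are quoted from the list above; the wording of the structured prompt and the helper name build_chat_args are assumptions made for illustration.

```python
PROMPTS = {
    "timestamps": (
        "Transcribe this video with timestamps for each section. "
        "Output the full transcript in a clear, readable format."
    ),
    "plain": (
        "Transcribe everything said in this video. "
        "Output only the spoken text, no timestamps."
    ),
    # Assumed wording: the source only says to ask for a structured format.
    "structured": (
        "Transcribe this video. Return a JSON list of objects shaped like "
        '{"time": "...", "text": "..."}.'
    ),
}


def build_chat_args(video, variant="timestamps", output_dir="./output"):
    """Build the vlmrun chat argument list for one prompt variant."""
    if variant not in PROMPTS:
        raise ValueError(f"unknown variant: {variant!r}")
    args = ["vlmrun", "chat", PROMPTS[variant], "-i", video, "-o", output_dir]
    if variant == "structured":
        # --json is the flag this document names for structured output.
        args.append("--json")
    return args
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting problems with the long prompt strings.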


Workflow checklist

  • Confirm vlmrun is installed and VLMRUN_API_KEY is set (see vlmrun-cli-skill).
  • Install dependencies: uv pip install -r requirements.txt (includes vlmrun[cli] and yt-dlp).
  • Run python scripts/run_transcription.py <youtube_url> -o ./output, or download the video and run vlmrun manually (Option B).
  • Find the transcript in the output directory (e.g. output/transcript.txt).

Troubleshooting

  • vlmrun not found
    Activate the venv and run: uv pip install "vlmrun[cli]". See vlmrun-cli-skill.

  • Authentication errors
    Verify VLMRUN_API_KEY in .env or the current shell.

  • yt-dlp fails
    Update yt-dlp: uv pip install -U yt-dlp. Check that the URL points to a valid public YouTube video.

  • Large or long videos
    Use an audio-only download in the script (e.g. -f bestaudio) to reduce file size and speed up transcription.
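The audio-only suggestion could be wired into the download step along these lines. This is a hedged sketch: download_args is a hypothetical helper, the %(ext)s output template is used because the audio container is not known in advance, and whether vlmrun accepts the resulting audio file should be confirmed against vlmrun-cli-skill.

```python
def download_args(url, output_dir=".", audio_only=False):
    """Build yt-dlp arguments, switching to -f bestaudio for long videos."""
    if audio_only:
        selector, name = "bestaudio", "audio.%(ext)s"
    else:
        selector, name = "bv*[ext=mp4]+ba/best[ext=mp4]/best", "video.%(ext)s"
    # yt-dlp fills in %(ext)s with the extension of the downloaded file.
    return ["yt-dlp", "-f", selector, "-o", f"{output_dir}/{name}", url]
```

For example, download_args(url, "./output", audio_only=True) yields a command that writes ./output/audio.<ext>, where the extension depends on the stream yt-dlp selects.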
