YouTube ASR Summarize (local, no tokens)
Use this skill to summarize a YouTube video, even one without subtitles, by downloading its audio and running speech-to-text locally.
Quick start
- One-time deps
brew install yt-dlp ffmpeg
- Create venv + install ASR
python3 -m venv .venv
source .venv/bin/activate
pip install faster-whisper
- Run
python3 scripts/youtube_asr_summarize.py \
--url "https://www.youtube.com/watch?v=<id>" \
--out "/tmp/youtube-asr/<id>" \
--model small \
--lang zh \
--frames 1 \
--timeline-every 180
Outputs in --out:
- summary.md (contains: link + summary + timeline)
- transcript.txt
- transcript.srt
- frames/frame_01.jpg… (if --frames > 0)
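To make the transcript and timeline outputs more concrete, here is a minimal sketch of how timestamped segments could be turned into SRT cues and a coarse timeline. It is not taken from scripts/youtube_asr_summarize.py. The function names (`srt_timestamp`, `to_srt`, `timeline`) are hypothetical, and reading `--timeline-every 180` as "one timeline entry per 180-second window" is an assumption.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as SRT cues numbered from 1."""
    cues = []
    for i, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


def timeline(segments, every: int = 180):
    """Group segment text into consecutive windows of `every` seconds.

    Assumption: each timeline entry covers one fixed-length window and is
    labeled with the window's start time in seconds.
    """
    buckets = {}
    for start, _end, text in segments:
        window_start = int(start // every) * every
        buckets.setdefault(window_start, []).append(text)
    return [(t, " ".join(parts)) for t, parts in sorted(buckets.items())]


if __name__ == "__main__":
    demo = [(0.0, 2.0, "intro"), (100.0, 101.0, "setup"), (185.0, 190.0, "demo")]
    print(to_srt(demo))
    print(timeline(demo, every=180))
```

Each timeline window would still need to be summarized before it goes into summary.md; this sketch only groups the raw text.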
Notes
- The default model, small (CPU/int8), is fast; use --model medium for better accuracy.
- If you need more control, see references/workflow.md.
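For orientation, the download-and-transcribe step that this skill is built around can be sketched roughly as follows. This is an illustrative outline, not the contents of scripts/youtube_asr_summarize.py. The helper names (`build_ytdlp_cmd`, `transcribe`, `download_and_transcribe`) and the `audio.mp3` filename are assumptions. The yt-dlp flags (`-x`, `--audio-format`, `-o`) and the faster-whisper `WhisperModel` call reflect those tools' documented interfaces.

```python
import subprocess
from pathlib import Path


def build_ytdlp_cmd(url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp command that extracts the audio track as MP3."""
    return [
        "yt-dlp",
        "-x",                     # extract audio only
        "--audio-format", "mp3",  # conversion is done by ffmpeg
        "-o", str(Path(out_dir) / "audio.%(ext)s"),
        url,
    ]


def transcribe(audio_path: str, model_size: str = "small", lang: str = "zh"):
    """Run local ASR and return a list of (start, end, text) tuples."""
    # Imported lazily so the command builder works without the package.
    from faster_whisper import WhisperModel

    # CPU + int8 mirrors the default noted above.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, language=lang)
    return [(s.start, s.end, s.text.strip()) for s in segments]


def download_and_transcribe(url: str, out_dir: str,
                            model_size: str = "small", lang: str = "zh"):
    """Download audio into out_dir, then transcribe it."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ytdlp_cmd(url, out_dir), check=True)
    return transcribe(str(Path(out_dir) / "audio.mp3"), model_size, lang)
```

Passing `"medium"` as `model_size` corresponds to the `--model medium` option mentioned above, trading speed for accuracy.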