YouTube Video Analyzer

A professional YouTube video analysis assistant using scene detection + subtitle alignment + parallel analysis architecture.

Prerequisites

Before starting, ensure these tools are installed:

# Check installations
which yt-dlp    # Video/subtitle download
which ffmpeg    # Scene detection and frame extraction

# Install if missing (macOS)
brew install yt-dlp ffmpeg

# Or via pip
pip install yt-dlp

Complete Workflow

Phase 1: Setup and Download

# Create working directory
VIDEO_ID="[extract from URL]"
WORK_DIR="youtube_analysis_$VIDEO_ID"
mkdir -p $WORK_DIR/{video,subtitles,frames,output}

# Download video + subtitles + metadata in one call (fewer requests)
yt-dlp -f "worst[ext=mp4]/best[ext=mp4]" \
       --write-info-json \
       --write-auto-sub --write-sub \
       --sub-lang zh-Hans,zh,en \
       --convert-subs srt \
       --no-playlist \
       -o "$WORK_DIR/video/source.%(ext)s" \
       "YOUTUBE_URL"

# Move subtitles to subtitles/ and keep metadata.json
mv "$WORK_DIR/video/"*.srt "$WORK_DIR/subtitles/" 2>/dev/null || true
cp "$WORK_DIR/video/source.info.json" "$WORK_DIR/metadata.json" 2>/dev/null || true

Phase 2: Scene Detection and Frame Extraction

# Extract keyframes + timestamps in a single decode
ffmpeg -i $WORK_DIR/video/source.mp4 \
       -vf "select='gt(scene,0.3)',showinfo" \
       -vsync vfr \
       $WORK_DIR/frames/scene_%04d.jpg \
       2> $WORK_DIR/ffmpeg_scene.log

# Parse timestamps from log (no second decode)
grep "pts_time" $WORK_DIR/ffmpeg_scene.log | \
  sed 's/.*pts_time:\([0-9.]*\).*/\1/' > $WORK_DIR/frame_timestamps.txt

Scene threshold guidelines:

Video Type	Threshold	Description
Lectures/PPT	0.2-0.3	Fewer changes, capture slides
Technical tutorials	0.25-0.35	Code/UI changes
Vlogs/interviews	0.3-0.4	Moderate changes
Fast-paced/edited	0.4-0.5	Avoid too many frames

Phase 3: Subtitle Parsing and Alignment

Parse the SRT subtitle file and align with extracted frames:

Read subtitle file from $WORK_DIR/subtitles/
Parse timestamp format: 00:01:23,456 --> 00:01:25,789
Match each frame timestamp to corresponding subtitle segment
Create frame-subtitle pairs for analysis

Phase 4: Parallel Segment Analysis

Divide frames into segments (10-15 frames each) and analyze:

For each segment, use this prompt:

分析以下视频片段：

时间范围：{start_time} - {end_time}
帧图片：[Read the frame images]
字幕内容：
{subtitle_text}

请分析：
1. 每帧的视觉内容（图表、代码、流程图、UI等）
2. 结合字幕理解讲解要点
3. 提取关键概念和术语
4. 标注重要的视觉元素
5. 给出关键细节的解释或小结
6. 如果有步骤/代码，提炼可复现的操作点

输出格式：结构化笔记，标注时间戳

Parallel execution tips:

Cap concurrency (e.g., 3–5 segments at once) to avoid rate limits
Retry failed segments and merge results incrementally
Consider de-dup/contact-sheeting similar frames to reduce token use

Phase 5: Final Summary Generation

Merge all segment analyses and generate complete summary:

Use this prompt for final generation:

整合以下视频分析结果，生成完整的学习总结：

{all_segment_analyses}

**必须包含以下内容：**

1. 概览（中英双语）
2. 核心要点列表
3. 场景时间线表格
4. 关键视觉内容（引用帧图片）
5. 详细笔记（按章节组织）
6. 实践要点清单

**详细度要求：**
- 每个章节至少 3-5 条要点（包含解释、原因或影响）
- 对关键术语给出简短定义/释义
- 对关键步骤给出可复现的操作描述
- 重要结论尽量引用对应帧图（scene_XXXX.jpg）

**必须生成以下图表（Mermaid格式）：**

1. **思维导图**（必须）- 展示知识结构
2. **时间线**（必须）- 展示内容分布
3. **流程图**（如有步骤/流程）
4. **概念关系图**（如有概念关联）

Phase 6: Final Deliverables (cleanup)

Keep only final artifacts:

Video file
Chinese/English subtitles (SRT)
Summary document
Frames referenced by the summary

Run:

./scripts/finalize.sh "$WORK_DIR" /path/to/summary.md

Use --keep-work to preserve intermediate files for debugging. When using this skill, always run finalize.sh after the summary is generated to remove intermediate artifacts.

Output Format Template

# [视频标题] 学习总结 / Learning Summary

## 概览 / Overview
[中英双语简介]

## 核心要点 / Key Takeaways
- 要点 1 / Point 1
- 要点 2 / Point 2
- 要点 3 / Point 3

## 知识结构图 / Knowledge Mind Map

```mermaid
mindmap
  root((视频主题))
    核心概念1
      子概念A
      子概念B
    核心概念2
      子概念C
    实践要点
      步骤1
      步骤2

视频时间线 / Video Timeline

gantt
    title 视频内容时间线
    dateFormat mm:ss
    section 引言
    主题介绍 :00:00, 02:00
    section 核心内容
    概念讲解 :02:00, 15:00
    section 总结
    回顾要点 :15:00, 20:00

内容流程图 / Content Flowchart (如适用)

flowchart TD
    A[开始] --> B[步骤1]
    B --> C{判断条件}
    C -->|是| D[步骤2]
    C -->|否| E[步骤3]
    D --> F[结束]
    E --> F

概念关系图 / Concept Relationships (如适用)

graph LR
    A[概念A] --> B[概念B]
    A --> C[概念C]
    B --> D[概念D]
    C --> D

场景时间线 / Scene Timeline

时间	场景描述	关键内容
00:15	标题页	主题介绍
02:30	代码演示	核心实现
05:45	架构图	系统设计

关键视觉内容 / Key Visuals

[00:02:30] - 架构图

架构图 分析 / Analysis: [图片内容说明及重要性]

[00:05:45] - 代码示例

代码示例 分析 / Analysis: [代码说明及要点]

详细笔记 / Detailed Notes

第一章：引言 [00:00 - 02:00]

[详细内容...]

第二章：核心概念 [02:00 - 10:00]

[详细内容...]

第三章：实践演示 [10:00 - 18:00]

[详细内容...]

第四章：总结 [18:00 - 20:00]

[详细内容...]

关键概念释义 / Key Terms

术语 1：解释
术语 2：解释

复现步骤 / Reproduction Steps

步骤 1
步骤 2
步骤 3

常见误区 / Common Pitfalls

误区 1：说明
误区 2：说明

实践要点 / Action Items

实践项 1 / Action 1
实践项 2 / Action 2
实践项 3 / Action 3