youtube-analysis

Extract YouTube video transcripts and produce structured concept analysis with multi-level summaries, key concepts, and actionable takeaways. Pure Python, no API keys, no MCP dependency. Fetches transcripts via youtube-transcript-api with yt-dlp fallback, then Claude analyzes the content directly. Use this skill when the user mentions: analyze youtube video, youtube transcript, summarize this video, what does this video cover, extract concepts from video, video analysis, watch this for me, break down this talk, youtube URL, video summary, lecture notes from video, podcast transcript, conference talk notes, tech talk breakdown, video key points, TL;DR of video, video takeaways, or pastes any URL containing youtube.com or youtu.be.

YouTube Analysis

Extract transcripts from YouTube videos and produce structured concept analysis — key ideas, arguments, technical terms, takeaways, and multi-level summaries — all without API keys or MCP servers.

Reference Files

| File | Purpose |
| --- | --- |
| scripts/fetch_transcript.py | Core transcript + metadata fetcher (CLI + importable) |
| scripts/analyze_video.py | Orchestrator: fetch → structure → export scaffold |
| scripts/utils.py | URL parsing, timestamp formatting, transcript chunking |
| references/analysis-patterns.md | Prompt patterns for each video type |
| assets/output-template.md | Markdown template for final output |

Workflow

User provides YouTube URL
        │
        ▼
┌─────────────────────┐
│  Step 0: Deps check │
└────────┬────────────┘
         ▼
┌─────────────────────┐
│  Step 1: Parse URL  │
└────────┬────────────┘
         ▼
┌─────────────────────┐     ┌──────────────┐
│  Step 2: Transcript │────▶│  yt-dlp      │
│  (youtube-t-api)    │fail │  (fallback)  │
└────────┬────────────┘     └──────┬───────┘
         │◀────────────────────────┘
         ▼
┌─────────────────────┐
│  Step 3: Metadata   │
│ (yt-dlp --dump-json)│
└────────┬────────────┘
         ▼
┌─────────────────────┐
│  Step 4: Claude     │
│  analyzes transcript│
└────────┬────────────┘
         ▼
┌─────────────────────┐
│  Step 5: Export MD  │
└─────────────────────┘

Step 0: Ensure Dependencies

Before running any script, verify dependencies are installed:

uv pip install youtube-transcript-api yt-dlp -q

Or run scripts directly with uv run:

uv run --with youtube-transcript-api --with yt-dlp --no-project python scripts/fetch_transcript.py "URL"

Verify:

python -c "from youtube_transcript_api import YouTubeTranscriptApi; print('OK')"
yt-dlp --version
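The same checks can be scripted. A minimal sketch — the helper names `module_available` and `binary_available` are illustrative, not part of the skill's scripts:

```python
import importlib.util
import shutil

def module_available(name: str) -> bool:
    """True if the Python module can be imported in this environment."""
    return importlib.util.find_spec(name) is not None

def binary_available(name: str) -> bool:
    """True if the executable is found on PATH."""
    return shutil.which(name) is not None

if __name__ == "__main__":
    missing = []
    if not module_available("youtube_transcript_api"):
        missing.append("youtube-transcript-api")
    if not binary_available("yt-dlp"):
        missing.append("yt-dlp")
    print("OK" if not missing else f"Missing: {', '.join(missing)}")
```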

Step 1: URL Parsing and Validation

Use scripts/utils.py:parse_youtube_url() to extract the video ID. Supported formats:

FormatExample
Standard watchyoutube.com/watch?v=dQw4w9WgXcQ
Short URLyoutu.be/dQw4w9WgXcQ
Shortsyoutube.com/shorts/dQw4w9WgXcQ
Embedyoutube.com/embed/dQw4w9WgXcQ
Liveyoutube.com/live/dQw4w9WgXcQ
With paramsyoutube.com/watch?v=dQw4w9WgXcQ&t=120&list=PLxxx
Bare IDdQw4w9WgXcQ
Mobilem.youtube.com/watch?v=dQw4w9WgXcQ
Musicmusic.youtube.com/watch?v=dQw4w9WgXcQ

If parsing fails, ask the user to provide the URL in a standard format.

Step 2: Transcript Extraction

Run fetch_transcript.py to get the transcript:

cd <skill_dir>/scripts
python fetch_transcript.py "YOUTUBE_URL" --lang en

This outputs JSON to stdout. The script:

  1. Primary path: Uses youtube-transcript-api to scrape captions directly (no API key)
  2. Fallback path: If primary fails, uses yt-dlp --write-sub --write-auto-sub to extract subtitle files
  3. Language handling: Tries requested language first, falls back to any available transcript

The returned JSON contains both individual timestamped segments and a joined transcript_text field.
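The exact field names may differ from this; as an assumed illustration, the joined text is simply the segment texts concatenated:

```python
# Illustrative (assumed) shape of the fetch_transcript.py JSON output
data = {
    "video_id": "dQw4w9WgXcQ",
    "transcript": [
        {"text": "welcome to the talk", "start": 0.0, "duration": 3.2},
        {"text": "today we cover caching", "start": 3.2, "duration": 4.1},
    ],
}
# Joined transcript_text derived from the timestamped segments
data["transcript_text"] = " ".join(seg["text"] for seg in data["transcript"])
```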

Or import as a module (used by analyze_video.py):

from fetch_transcript import fetch_video
data = fetch_video("https://youtube.com/watch?v=VIDEO_ID", lang="en")

Step 3: Metadata Extraction

Metadata is fetched automatically by fetch_transcript.py via yt-dlp --dump-json:

  • Title, channel name
  • Duration (seconds)
  • Upload date (YYYY-MM-DD)
  • Description (first 500 chars in scaffold)
  • View count
  • Tags

No separate step needed — fetch_video() returns everything.
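For reference, mapping a yt-dlp --dump-json record to the fields above might look like this. The key names follow yt-dlp's info dict (which stores upload_date as YYYYMMDD); the `summarize_metadata` helper itself is hypothetical, not part of the skill's scripts:

```python
def summarize_metadata(info: dict) -> dict:
    """Pick the fields this skill reports from a yt-dlp info dict."""
    raw_date = info.get("upload_date") or ""  # yt-dlp format: YYYYMMDD
    upload_date = (
        f"{raw_date[:4]}-{raw_date[4:6]}-{raw_date[6:]}"
        if len(raw_date) == 8 else "Unknown"
    )
    return {
        "title": info.get("title", "Unknown"),
        "channel": info.get("channel") or info.get("uploader", "Unknown"),
        "duration_s": info.get("duration"),
        "upload_date": upload_date,
        "description": (info.get("description") or "")[:500],  # first 500 chars
        "view_count": info.get("view_count"),
        "tags": info.get("tags", []),
    }
```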

Step 4: Concept Analysis

This is where you (Claude) do the work. The scripts provide raw data; you perform the analysis.

Analysis Depth

Choose based on user request or video duration:

| Depth | When to Use | Sections to Fill |
| --- | --- | --- |
| quick | User wants fast overview, or video < 10 min | TL;DR, Key Concepts, Takeaways |
| standard | Default for most videos | All template sections |
| deep | User wants thorough breakdown, or video > 30 min | All sections + timestamped section-by-section |

Analysis Process

  1. Read the full transcript from the JSON output
  2. Identify the video type (or use user-provided hint). See references/analysis-patterns.md for type-specific guidance
  3. Extract key concepts: Main ideas, arguments, claims — each as a bullet with brief explanation
  4. Identify technical terms: Definitions as presented in the video
  5. Pull notable statements: Paraphrase key quotes with approximate timestamps
  6. Synthesize takeaways: Actionable items the viewer should consider
  7. Write the TL;DR: One to three sentences capturing the core message
  8. Suggest related topics: Based on concepts mentioned, what should the viewer explore next
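Approximate timestamps for quotes can be rendered from the segment start times. A sketch of the kind of helper utils.py provides for this (the actual signature may differ):

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS past the one-hour mark."""
    total = int(seconds)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}" if h else f"{m}:{s:02d}"
```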

For Deep Analysis

Use utils.chunk_transcript() to break the transcript into 5-minute segments, then analyze each chunk with timestamps:

from utils import chunk_transcript
chunks = chunk_transcript(data["transcript"], chunk_minutes=5)
for chunk in chunks:
    print(f"[{chunk['start_formatted']} - {chunk['end_formatted']}]")
    print(chunk["text"])

Or run the orchestrator with --depth deep:

python analyze_video.py "YOUTUBE_URL" --depth deep
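If chunk_transcript were unavailable, equivalent fixed-window chunking can be sketched like this, assuming segments are dicts with text and start keys as in the fetch output:

```python
def chunk_by_minutes(segments: list[dict], chunk_minutes: int = 5) -> list[dict]:
    """Group timestamped segments into fixed-width time windows."""
    window = chunk_minutes * 60
    buckets: dict[int, list[dict]] = {}
    for seg in segments:
        idx = int(seg["start"] // window)
        buckets.setdefault(idx, []).append(seg)
    chunks = []
    for idx in sorted(buckets):
        chunks.append({
            "start": idx * window,            # window lower bound (seconds)
            "end": (idx + 1) * window,        # window upper bound (seconds)
            "text": " ".join(s["text"] for s in buckets[idx]),
        })
    return chunks
```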

Video Type Patterns

| Type | Key Extraction Focus | See |
| --- | --- | --- |
| Lecture | Thesis, arguments, citations, definitions | references/analysis-patterns.md |
| Tutorial | Steps, tools, prerequisites, gotchas | references/analysis-patterns.md |
| Interview | Perspectives, disagreements, attributed positions | references/analysis-patterns.md |
| Podcast | Topic threads, opinions, recommendations | references/analysis-patterns.md |
| Tech Talk | Architecture, trade-offs, benchmarks, lessons | references/analysis-patterns.md |
| Panel | Consensus vs. disagreement, per-speaker views | references/analysis-patterns.md |

Read references/analysis-patterns.md for detailed extraction guidance per type.

Step 5: Export to Markdown

The orchestrator generates a scaffold:

cd <skill_dir>/scripts
python analyze_video.py "YOUTUBE_URL" --output ./analysis.md --depth standard --type auto

Flags:

  • --output PATH: Where to write (default: ./{sanitized_title}.md)
  • --depth quick|standard|deep: Analysis depth
  • --type auto|lecture|tutorial|interview|podcast|tech-talk|panel: Video type hint
  • --lang CODE: Transcript language (default: en)
  • --json: Output raw JSON instead of Markdown scaffold

The scaffold contains populated metadata and [TO BE ANALYZED] placeholders. Claude replaces these with actual analysis.

Preferred workflow: Run fetch_transcript.py to get JSON, analyze in context, then produce the final Markdown directly using assets/output-template.md as the structure guide. The orchestrator is useful for batch processing or when the user wants a file written.

Error Handling

| Error | Exit Code | Cause | Resolution |
| --- | --- | --- | --- |
| URL parse failure | 1 | Invalid or unsupported URL format | Ask user for standard YouTube URL |
| No transcript | 2 | Video has no captions (manual or auto) | Inform user; suggest a different video |
| Video unavailable | 1 | Private, deleted, or geo-blocked | Inform user of the restriction |
| Age-restricted | 1 | Requires authentication | Inform user; yt-dlp may work with cookies |
| Metadata fetch fail | 0 | yt-dlp network issue | Transcript still works; metadata shows "Unknown" |
| Language unavailable | 0 | Requested lang not available | Auto-falls back to available language |
| yt-dlp not installed | 1 | Missing dependency | Run Step 0 dependency installation |
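When scripting around these exit codes, a thin wrapper can branch on the return code. This is a sketch: the message mapping paraphrases the table above, and `run_fetch` assumes fetch_transcript.py is in the working directory:

```python
import json
import subprocess
import sys

def explain_exit(code: int) -> str:
    """Map fetch_transcript.py exit codes to user-facing explanations."""
    if code == 0:
        return "success (metadata may be 'Unknown' if yt-dlp had a network issue)"
    if code == 2:
        return "no transcript: video has no manual or auto captions"
    if code == 1:
        return "failure: bad URL, unavailable/age-restricted video, or missing dependency"
    return f"unexpected exit code {code}"

def run_fetch(url: str, lang: str = "en") -> dict:
    """Invoke fetch_transcript.py and interpret its exit code."""
    proc = subprocess.run(
        [sys.executable, "fetch_transcript.py", url, "--lang", lang],
        capture_output=True, text=True,
    )
    if proc.returncode == 0:
        return json.loads(proc.stdout)
    raise RuntimeError(explain_exit(proc.returncode))
```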

Limitations

  • No visual analysis: Transcript-only; slides, diagrams, code on screen, and demos are not captured. Note this in output when relevant.
  • Auto-caption quality: Auto-generated captions may contain errors, especially for technical terms, proper nouns, and non-English accents.
  • Music videos: Lyrics may not be available as captions. Music-only content produces poor results.
  • Live streams: Ongoing live streams may have incomplete or unavailable transcripts.
  • Rate limiting: Excessive requests to YouTube may trigger temporary blocks. Space requests if processing multiple videos.
  • Language coverage: Best results for English. Other languages depend on caption availability and quality.
  • Speaker attribution: Transcripts rarely identify individual speakers. Claude infers from context where possible.
