douyin-transcribe

Extract audio from Douyin (抖音, the Chinese version of TikTok) videos and transcribe it to text using Whisper. Trigger when the user sends a Douyin link (v.douyin.com or www.douyin.com/video/) and asks to transcribe it, extract its text, analyze the video content, or summarize it.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "douyin-transcribe" with this command: npx skills add don068589/douyin-video-transcribe

Douyin Video Transcribe

Extract speech from Douyin videos and convert it to text. Supports Chinese and English, and runs on Windows, macOS, and Linux.

Core Principle

Douyin enforces strict anti-scraping measures, so the workflow must:

  1. Load page in browser, wait for video stream
  2. Extract real CDN URL from DOM or network requests
  3. Download with Referer: https://www.douyin.com/ header (403 without it)
  4. Convert audio to 16kHz mono WAV for Whisper

Prerequisites

| Tool | Purpose | Install |
| --- | --- | --- |
| ffmpeg | Audio extraction | brew install ffmpeg / winget install ffmpeg / apt install ffmpeg |
| whisper | Speech-to-text | pip install openai-whisper |
| curl | Download video | Built-in (Windows: curl.exe) |
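
Before starting, it can help to confirm that the tools in the table are actually available. The following is a minimal Python sketch, not part of the skill: the helper name `missing_tools` is hypothetical, and it assumes ffmpeg and curl are found on PATH while Whisper is checked as an importable `whisper` module (which is what `pip install openai-whisper` provides).

```python
import importlib.util
import shutil


def missing_tools(executables=("ffmpeg", "curl"), modules=("whisper",),
                  which=shutil.which, find_spec=importlib.util.find_spec):
    """Return the names of prerequisites that cannot be found.

    `which` and `find_spec` are injectable so the check can be tested
    without depending on what is installed on the current machine.
    """
    # Executables are looked up on PATH.
    missing = [name for name in executables if which(name) is None]
    # Python packages are looked up as importable top-level modules.
    missing += [name for name in modules if find_spec(name) is None]
    return missing
```

Calling `missing_tools()` with no arguments checks the real environment; an empty list means all three prerequisites were found.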

Workflow

1. Resolve Short URL

Douyin share links usually take the short form v.douyin.com/xxx. Resolve the link to its full URL:

# macOS/Linux
curl -sL -o /dev/null -w '%{url_effective}' "https://v.douyin.com/xxx/"

# Windows PowerShell
curl.exe -sL -o NUL -w "%{url_effective}" "https://v.douyin.com/xxx/"

Output: https://www.douyin.com/video/7616020798351871284

The video ID is the 19-digit number in the URL.
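
Share text copied from the app often wraps the link in extra words, and the resolved URL may carry query parameters. The sketch below shows one way to pull out the short link and the video ID in Python. The function names and both regular expressions are assumptions made for illustration; in particular, the set of characters allowed in the short code is a guess, not something the skill specifies.

```python
import re

# Assumed shape of a short share link; the allowed short-code characters are a guess.
SHARE_URL_RE = re.compile(r"https?://v\.douyin\.com/[A-Za-z0-9_-]+/?")
# The numeric ID that follows /video/ in a resolved URL.
VIDEO_ID_RE = re.compile(r"douyin\.com/video/(\d+)")


def extract_share_url(share_text):
    """Return the first v.douyin.com link found in pasted share text, or None."""
    match = SHARE_URL_RE.search(share_text)
    return match.group(0) if match else None


def extract_video_id(resolved_url):
    """Return the numeric video ID from a resolved Douyin URL, or None."""
    match = VIDEO_ID_RE.search(resolved_url)
    return match.group(1) if match else None
```

A `None` result from `extract_video_id` is a useful signal that the short link did not resolve to a douyin.com video page.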

2. Get Video URL

Open the video page in a browser, wait 3-5 seconds, then execute this JavaScript:

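// Return the first real CDN URL among the page's <video> elements, or null.
(() => {
  const videos = document.querySelectorAll('video');
  for (const v of videos) {
    // currentSrc is the source the player actually selected; fall back to src
    const src = v.currentSrc || v.src;
    // Skip blob: streams and the uuu_265 placeholder video
    if (src && src.startsWith('http') && !src.includes('uuu_265')) {
      return src;
    }
  }
  return null;
})()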

Key points:

  • Returns null: the page has not finished loading; wait and retry
  • Contains uuu_265: this is a placeholder video; wait and retry
  • Starts with "blob:": the player is streaming; wait for a real http(s) URL
  • CDN URLs expire (after roughly 2 hours); re-fetch if needed
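
When the JavaScript result is handled on the Python side, the key points above can be turned into a small classifier so the caller knows whether to retry. This is an illustrative sketch only: the function name and the status labels are hypothetical, and the rules simply restate the bullets.

```python
def classify_video_src(src):
    """Map a value returned by the browser snippet to a status label.

    The labels are made up for this sketch:
    "not_loaded", "streaming", "placeholder", "ok", or "unknown".
    """
    if not src:
        return "not_loaded"      # JS returned null or an empty string
    if src.startswith("blob:"):
        return "streaming"       # in-memory stream, not a downloadable URL
    if "uuu_265" in src:
        return "placeholder"     # placeholder video shown before the real one
    if src.startswith("http"):
        return "ok"              # looks like a real CDN URL
    return "unknown"
```

Only the "ok" status should proceed to the download step; every other status means waiting and querying the page again.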

3. Download Video

# macOS/Linux
curl -L -H "Referer: https://www.douyin.com/" -o video.mp4 "<CDN_URL>"

# Windows
curl.exe -L -H "Referer: https://www.douyin.com/" -o video.mp4 "<CDN_URL>"

The Referer header is required; without it the request fails with 403.
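
If curl is not convenient, the same download can be sketched with Python's standard library. The function names and the `timeout` default are assumptions for illustration. Only the Referer header comes from the workflow above; whether the CDN also inspects other headers such as User-Agent is not stated in the source.

```python
import shutil
import urllib.request

DOUYIN_REFERER = "https://www.douyin.com/"


def build_download_request(cdn_url):
    """Build a request for the CDN URL that carries the required Referer header."""
    return urllib.request.Request(cdn_url, headers={"Referer": DOUYIN_REFERER})


def download_video(cdn_url, dest_path, timeout=60):
    """Stream the video to dest_path.

    urlopen follows redirects by default and raises urllib.error.HTTPError
    on a 403, which usually means a missing Referer or an expired URL.
    """
    request = build_download_request(cdn_url)
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest_path
```

For example, `download_video(cdn_url, "video.mp4")` writes the same file as the curl commands above.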

4. Extract Audio

ffmpeg -i video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav -y

Parameters:

  • -ar 16000: 16kHz sample rate (the rate Whisper's models work with)
  • -ac 1: Mono channel
  • -c:a pcm_s16le: 16-bit PCM
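
When the step is driven from a script, the ffmpeg command can be assembled as an argument list, which avoids shell quoting problems with unusual file names. A minimal sketch, assuming ffmpeg is on PATH; the function names are hypothetical, and the options are exactly the ones listed above.

```python
import subprocess


def build_ffmpeg_command(video_path, audio_path):
    """Return the argument list for the 16 kHz mono PCM conversion."""
    return [
        "ffmpeg",
        "-i", video_path,      # input video
        "-ar", "16000",        # 16 kHz sample rate
        "-ac", "1",            # mono
        "-c:a", "pcm_s16le",   # 16-bit PCM
        audio_path,
        "-y",                  # overwrite the output file if it exists
    ]


def extract_audio(video_path, audio_path):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path
```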

5. Transcribe

python -m whisper audio.wav --model small --language zh

Model selection:

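| Model | Size | 5-min video (CPU) | Accuracy | Use case |
| --- | --- | --- | --- | --- |
| tiny | 75MB | ~30s | Fair | Quick preview |
| base | 142MB | ~1min | Good | Daily use |
| small | 466MB | ~3min | Better | Recommended |
| medium | 1.5GB | ~8min | Best | High accuracy |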

Language:

  • Chinese: --language zh
  • English: --language en
  • Auto-detect: omit flag (slower)

By default Whisper writes its output files to the current directory, including audio.txt, audio.srt, and audio.json.
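
The model and language choices above map directly onto command-line arguments, so a script can build the Whisper invocation from a few parameters. The sketch below is illustrative: the function names are hypothetical, and it runs Whisper through the current Python interpreter, which assumes openai-whisper is installed in that environment. The `--output_dir` flag of the Whisper CLI is used to redirect the output files.

```python
import subprocess
import sys


def build_whisper_command(audio_path, model="small", language=None, output_dir=None):
    """Return the argument list for `python -m whisper`.

    Leaving `language` as None omits the flag, so Whisper auto-detects.
    """
    command = [sys.executable, "-m", "whisper", audio_path, "--model", model]
    if language:
        command += ["--language", language]
    if output_dir:
        command += ["--output_dir", output_dir]
    return command


def transcribe(audio_path, **options):
    """Run Whisper; raises CalledProcessError if transcription fails."""
    subprocess.run(build_whisper_command(audio_path, **options), check=True)
```

For example, `transcribe("audio.wav", language="zh")` is equivalent to the command shown at the start of this step.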

Troubleshooting

| Issue | Detection | Solution |
| --- | --- | --- |
| Short URL fails | Returns non-douyin.com | Check link completeness, remove share text noise |
| Video URL not found | JS returns null | Wait 3-5s and retry, max 3 times |
| Placeholder video | URL contains uuu_265 | Page not loaded, wait and retry |
| Download 403 | curl returns 403 | Check Referer header; URL may be expired |
| Whisper hangs | No output for long time | First run downloads model (~460MB for small) |
| Garbled output | Terminal shows gibberish | Normal, read .txt file directly |
| Out of memory | Process killed | Use smaller model (base/tiny) |
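
The "wait and retry, max 3 times" advice in the table can be sketched as a small retry loop. This is an assumption-laden illustration: `fetch_src` stands for whatever callable runs the browser snippet and returns its result (the skill does not define such a function), and the 4-second delay is simply a value inside the suggested 3-5 second range.

```python
import time


def fetch_with_retry(fetch_src, max_attempts=3, delay_seconds=4.0, sleep=time.sleep):
    """Call fetch_src until it yields a usable CDN URL or attempts run out.

    fetch_src is a hypothetical zero-argument callable that returns the
    value produced by the browser-side snippet (a string or None).
    """
    for attempt in range(1, max_attempts + 1):
        src = fetch_src()
        # Usable means: an http(s) URL that is not the uuu_265 placeholder.
        if src and src.startswith("http") and "uuu_265" not in src:
            return src
        if attempt < max_attempts:
            sleep(delay_seconds)  # give the page time to load the real stream
    return None
```

A `None` result after all attempts is the point at which to report the failure rather than keep waiting.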

Output Convention

Name files by video ID and save them to the user-specified directory:

output/
├── 7616020798351871284.mp4   # Original video (optional)
├── 7616020798351871284.wav   # Audio (delete after)
├── 7616020798351871284.txt   # Transcript
└── 7616020798351871284.srt   # Subtitles (optional)
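
The naming convention is easy to centralize in one helper so every step writes to the same place. A minimal sketch with a hypothetical function name; the four extensions are the ones shown in the tree above.

```python
from pathlib import Path


def output_paths(output_dir, video_id):
    """Return the conventional file paths for one video, keyed by extension."""
    base = Path(output_dir)
    return {ext: base / f"{video_id}.{ext}" for ext in ("mp4", "wav", "txt", "srt")}
```

Because Whisper names its output files after the input audio file, transcribing `<video_id>.wav` yields `<video_id>.txt` and `<video_id>.srt`, which matches this layout without renaming.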

Scripts (Optional)

Helper scripts in skill directory:

  • scripts/get_video_url.js: Browser-side video URL extraction with multiple methods
  • scripts/transcribe.py: CLI one-click transcription (requires video URL)

The scripts are accelerators, not requirements. Once you understand the workflow, you can implement the steps yourself.

Notes

  • Article links (/article/): Use browser snapshot directly, no transcription needed
  • Douyin AI summary: Some video pages have AI-generated chapter summaries, extract from snapshot as supplement
  • Other platforms: This skill is for Douyin only. Use yt-dlp for YouTube/Bilibili

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
