Whisper Transcription

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "whisper-transcription" with this command: npx skills add guia-matthieu/clawfu-skills/guia-matthieu-clawfu-skills-whisper-transcription

Transcribe any audio or video to text using OpenAI's Whisper model - the same technology powering ChatGPT voice features.

When to Use This Skill

  • Podcast repurposing - Convert episodes to blog posts, show notes, social snippets

  • Video subtitles - Generate SRT/VTT files for YouTube, social media

  • Interview extraction - Pull quotes and insights from recorded calls

  • Content audit - Make audio/video libraries searchable

  • Translation - Transcribe and translate foreign language content

What Claude Does vs What You Decide

Claude Does                      | You Decide
Structures production workflow   | Final creative direction
Suggests technical approaches    | Equipment and tool choices
Creates templates and checklists | Quality standards
Identifies best practices        | Brand/voice decisions
Generates script outlines        | Final script approval

Dependencies

pip install openai-whisper torch ffmpeg-python click

ffmpeg must also be installed on the system:

macOS: brew install ffmpeg

Ubuntu: sudo apt install ffmpeg
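Before running the commands below, it can help to confirm that the dependencies are actually present. The following is a minimal sketch, not part of the skill's scripts: the helper names are hypothetical, and it assumes the usual import names for the listed packages (`whisper` for openai-whisper, `ffmpeg` for ffmpeg-python).

```python
import importlib.util
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def missing_python_modules(modules=("whisper", "torch", "ffmpeg", "click")):
    """Return the module names from `modules` that cannot be imported."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("ffmpeg on PATH:", ffmpeg_available())
    print("missing Python modules:", missing_python_modules())
```

An empty list from `missing_python_modules()` together with `True` from `ffmpeg_available()` suggests the environment is ready.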

Commands

Transcribe Single File

python scripts/main.py transcribe audio.mp3 --model medium --output transcript.txt

python scripts/main.py transcribe video.mp4 --format srt --output subtitles.srt

Batch Transcription

python scripts/main.py batch ./recordings/ --format txt --output ./transcripts/
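How the batch command pairs input files with output files is not documented here, so the following is only a sketch of one plausible approach. The function name `plan_batch` and the extension list are assumptions for illustration; the actual script may accept different file types.

```python
from pathlib import Path

# Assumed set of media extensions; the real script may use a different list.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mov"}


def plan_batch(input_dir, output_dir, fmt="txt"):
    """Return (source, target) path pairs for every media file in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pairs = []
    for source in sorted(input_dir.iterdir()):
        if source.is_file() and source.suffix.lower() in MEDIA_EXTENSIONS:
            # One output file per input, named after the input's stem.
            pairs.append((source, output_dir / f"{source.stem}.{fmt}"))
    return pairs
```

Planning the pairs first makes it easy to preview what a run would produce before spending time on transcription.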

Transcribe + Translate

python scripts/main.py translate foreign-audio.mp3 --to en

Extract Timestamps

python scripts/main.py timestamps podcast.mp3 --format json
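The commands above wrap the openai-whisper library. As a rough sketch of what happens underneath (the internals of scripts/main.py are not shown here, so the function names below are hypothetical), the library's `load_model` and `transcribe` calls cover transcription, translation to English, and timestamped segments:

```python
import json


def transcribe(path, model_name="medium", translate=False):
    """Run Whisper on one file and return its result dictionary."""
    import whisper  # imported lazily; provided by the openai-whisper package

    model = whisper.load_model(model_name)
    # Whisper's built-in "translate" task produces English text.
    task = "translate" if translate else "transcribe"
    return model.transcribe(path, task=task)


def segments_to_json(result):
    """Keep start, end, and text for each segment and serialize as JSON."""
    rows = [
        {"start": seg["start"], "end": seg["end"], "text": seg["text"].strip()}
        for seg in result["segments"]
    ]
    return json.dumps(rows, indent=2, ensure_ascii=False)
```

For example, `segments_to_json(transcribe("podcast.mp3"))` would yield a JSON list of timestamped segments, assuming the model weights can be downloaded and ffmpeg is installed.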

Examples

Example 1: Podcast to Blog Post

Transcribe 1-hour podcast

python scripts/main.py transcribe episode-42.mp3 --model medium

Output: episode-42.txt (full transcript with timestamps)

Processing time: ~5 min for 1 hour of audio on an M1 Mac

Example 2: YouTube Subtitles

Generate SRT for video upload

python scripts/main.py transcribe marketing-video.mp4 --format srt

Output: marketing-video.srt

Upload directly to YouTube/Vimeo

Example 3: Batch Process Interview Library

Transcribe all recordings in folder

python scripts/main.py batch ./customer-interviews/ --model small --format txt

Output: ./customer-interviews/*.txt (one per audio file)

Model Selection Guide

Model  | Speed   | Accuracy | VRAM  | Best For
tiny   | Fastest | ~70%     | 1 GB  | Quick drafts, short clips
base   | Fast    | ~80%     | 1 GB  | Social media clips
small  | Medium  | ~85%     | 2 GB  | Podcasts, interviews
medium | Slow    | ~90%     | 5 GB  | Professional transcripts
large  | Slowest | ~95%     | 10 GB | Critical accuracy needs

Recommendation: Start with small for most marketing content. Use medium for client deliverables.
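If GPU memory is the limiting factor, the VRAM column of the table can be turned into a simple lookup. This is an illustrative sketch only: `largest_model_that_fits` is a hypothetical helper, and the figures are the approximate values from the table, not measured requirements.

```python
# (model name, approximate VRAM in GB) from the table above, smallest first.
MODELS = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def largest_model_that_fits(vram_gb):
    """Return the largest model whose VRAM figure fits, or None if none do."""
    choice = None
    for name, required in MODELS:
        if required <= vram_gb:
            choice = name  # later entries are larger, so keep overwriting
    return choice


print(largest_model_that_fits(6))  # medium
```

The result is an upper bound on model size; the recommendation above (small for most content, medium for client deliverables) still applies when speed matters more than accuracy.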

Output Formats

Format | Extension | Use Case
txt    | .txt      | Blog posts, analysis
srt    | .srt      | Video subtitles (YouTube)
vtt    | .vtt      | Web video subtitles
json   | .json     | Programmatic access
tsv    | .tsv      | Spreadsheet analysis
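The srt and vtt formats differ mainly in how timestamps are written: SRT uses a comma before the milliseconds, WebVTT a period. The sketch below shows one way to render Whisper-style segments (dicts with `start`, `end`, `text`) as SRT cues. The function names are hypothetical and this is not necessarily how the skill's script formats its output.

```python
def format_timestamp(seconds, separator=","):
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments):
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


print(format_timestamp(3661.5))  # 01:01:01,500
```

Passing `separator="."` to `format_timestamp` gives the WebVTT timestamp style; a full VTT file also needs a `WEBVTT` header line.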

Performance Tips

  • GPU acceleration - 10x faster with CUDA GPU

  • Audio extraction - Script auto-extracts audio from video

  • Chunking - Long files auto-split for memory efficiency

  • Language detection - Automatic, or specify with --language
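The GPU tip above can be applied when loading a model directly through the library. A minimal sketch, assuming PyTorch is the backend (as listed in the dependencies); `pick_device` is a hypothetical helper name:

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```

The result can be passed to the library as `whisper.load_model("small", device=pick_device())`, and a known language can be supplied with `model.transcribe(path, language="fr")` to skip automatic detection.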

Skill Boundaries

What This Skill Does Well

  • Structuring audio production workflows

  • Providing technical guidance

  • Creating quality checklists

  • Suggesting creative approaches

What This Skill Cannot Do

  • Replace audio engineering expertise

  • Make subjective creative decisions

  • Access or edit audio files directly

  • Guarantee commercial success

Related Skills

  • video-processing - Extract audio from video

  • youtube-downloader - Download videos to transcribe

  • content-repurposer - Transform transcripts to content

  • podcast-production - Create podcasts

Skill Metadata

  • Mode: cyborg

category: automation
subcategory: audio-processing
dependencies: [openai-whisper, torch, ffmpeg-python]
difficulty: beginner
time_saved: 10+ hours/week

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
