transcribe-audio

Skill: Transcribe Audio

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "transcribe-audio" with this command: npx skills add barefootford/buttercut/barefootford-buttercut-transcribe-audio

Transcribes video audio using WhisperX and creates clean JSON transcripts with word-level timing data.

When to Use

  • Videos need audio transcripts before visual analysis

Critical Requirements

Use WhisperX, NOT standard Whisper. WhisperX preserves the original video timeline, including leading silence, so transcript timestamps match the actual video timestamps. Run WhisperX directly on the video file and do not extract the audio separately; working from the original file keeps the timestamps aligned.

Workflow

  1. Read Language from Library File

Read the library's library.yaml to get the language code:

# Library metadata
library_name: [library-name]
language: en  # Language code stored here
...
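
As a minimal sketch of this step, the language code could be read with Ruby's standard YAML library. The `library_language` helper and the `"en"` fallback are assumptions made for illustration; only the `language` key comes from the file layout shown above.

```ruby
require "yaml"

# Hypothetical helper: return the language code from library.yaml text.
# Falling back to "en" when the key is missing is an assumption,
# not documented behaviour of the skill.
def library_language(yaml_text, default: "en")
  data = YAML.safe_load(yaml_text) || {}
  data.fetch("language", default).to_s
end

sample = <<~YAML
  # Library metadata
  library_name: demo-library
  language: en
YAML

puts library_language(sample)
```

In practice the text would come from `File.read` on the library's library.yaml; the inline sample keeps the sketch self-contained.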

  2. Run WhisperX

whisperx "/full/path/to/video.mov" \
  --language en \
  --model medium \
  --compute_type float32 \
  --device cpu \
  --output_format json \
  --output_dir libraries/[library-name]/transcripts
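
If the command is assembled from a script rather than typed by hand, one possible approach is to build it as an argument array so that paths containing spaces need no shell quoting. The `whisperx_command` helper below is hypothetical; the flags and values are the ones listed above.

```ruby
require "shellwords"

# Hypothetical helper: build the WhisperX invocation as an argv array.
# Flags and values mirror the command shown in this step.
def whisperx_command(video_path, language:, library_name:)
  [
    "whisperx", video_path,
    "--language", language,
    "--model", "medium",
    "--compute_type", "float32",
    "--device", "cpu",
    "--output_format", "json",
    "--output_dir", File.join("libraries", library_name, "transcripts")
  ]
end

cmd = whisperx_command("/full/path/to/video.mov",
                       language: "en", library_name: "demo-library")

# Print a shell-escaped version for inspection.
puts Shellwords.join(cmd)

# To actually run it: system(*cmd) bypasses the shell and
# returns true when the process exits with status 0.
```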

  3. Prepare Audio Transcript

After WhisperX completes, format the JSON output with prepare_audio_script.rb:

ruby .claude/skills/transcribe-audio/prepare_audio_script.rb \
  libraries/[library-name]/transcripts/video_name.json \
  /full/path/to/original/video_name.mov

This script:

  • Adds video source path as metadata

  • Removes unnecessary fields to reduce file size

  • Prettifies JSON
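
The actual prepare_audio_script.rb is not reproduced here, so the following is only a sketch of what those three steps might look like. The `video_path` key, the choice to drop `word_segments`, and the sample input are all assumptions made for illustration.

```ruby
require "json"

# Hypothetical cleanup, NOT the real prepare_audio_script.rb.
def prepare_transcript(raw_json, video_path)
  data = JSON.parse(raw_json)
  data["video_path"] = video_path   # assumed name for the metadata key
  data.delete("word_segments")      # assumed to be a redundant field
  JSON.pretty_generate(data)        # prettify the output
end

# Invented sample input, shaped loosely like WhisperX JSON output.
raw = '{"segments":[{"start":0.5,"end":1.2,"text":" Hello",' \
      '"words":[{"word":"Hello","start":0.5,"end":1.2}]}],' \
      '"word_segments":[{"word":"Hello","start":0.5,"end":1.2}],' \
      '"language":"en"}'

puts prepare_transcript(raw, "/full/path/to/original/video_name.mov")
```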

  4. Return Success Response

After audio preparation completes, return this structured response to the parent agent:

✓ [video_filename.mov] transcribed successfully
Audio transcript: libraries/[library-name]/transcripts/video_name.json
Video path: /full/path/to/video_filename.mov

DO NOT update library.yaml. The parent agent handles that update, which avoids race conditions when multiple transcriptions run in parallel.
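
A small sketch of how the response above could be assembled from the video path. The `success_response` helper is hypothetical, and it assumes that the square brackets in the template are placeholders and that the three fields are separated by line breaks.

```ruby
# Hypothetical helper: fill in the success response template.
# Assumes the transcript is named after the video file's basename.
def success_response(video_path, library_name)
  filename = File.basename(video_path)        # e.g. clip.mov
  stem = File.basename(video_path, ".*")      # e.g. clip
  [
    "✓ #{filename} transcribed successfully",
    "Audio transcript: libraries/#{library_name}/transcripts/#{stem}.json",
    "Video path: #{video_path}"
  ].join("\n")
end

puts success_response("/videos/clip.mov", "demo-library")
```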

Running in Parallel

This skill is designed to run inside a Task agent for parallel execution:

  • Each agent handles ONE video file

  • Multiple agents can run simultaneously

  • Parent thread updates library.yaml sequentially after each agent completes

  • No race conditions on shared YAML file
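
The pattern described by these bullets can be illustrated with plain Ruby threads standing in for Task agents. This is an analogy only: `transcribe_all` and the block passed to it are hypothetical, and the hash stands in for the parent's library.yaml update.

```ruby
# Workers run concurrently, one video each; only the parent
# touches the shared state, one result at a time.
def transcribe_all(video_paths, &transcribe)
  threads = video_paths.map do |path|
    Thread.new { [path, transcribe.call(path)] }
  end

  # Thread#value waits for the worker and returns its block's result,
  # so the parent applies updates sequentially.
  threads.each_with_object({}) do |thread, library|
    path, transcript = thread.value
    library[path] = transcript   # stand-in for updating library.yaml
  end
end

# The block is a stub that returns where the transcript would be written.
result = transcribe_all(["/videos/a.mov", "/videos/b.mov"]) do |path|
  "libraries/demo-library/transcripts/#{File.basename(path, '.*')}.json"
end

puts result
```

Because workers only return values and never write to the shared structure, no locking is needed in this sketch.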

Next Step

After audio transcription, use the analyze-video skill to add visual descriptions and create the visual transcript.

Installation

Ensure WhisperX is installed. Use the setup skill to verify dependencies.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
