Podcast Producer

Create complete podcast episodes, interviews, and conversation-style audio content.

This is an orchestrator skill that combines:

Script/dialogue generation (Claude)
Multi-speaker voice synthesis (Gemini TTS)
Intro/outro music (Lyria)
Audio assembly (FFmpeg via media-utils)

What You Can Create

Type	Example
Podcast episode	Two hosts discussing a topic
Interview	Q&A format with host and guest
Dialogue	Scripted conversation between characters
Audio drama	Story with multiple characters
Radio show	Formatted audio program with segments

Prerequisites

GOOGLE_API_KEY - For Gemini TTS (voices) and Lyria (music)
FFmpeg installed: brew install ffmpeg (macOS) or apt install ffmpeg (Linux)

Workflow

Step 1: Gather Requirements (REQUIRED)

⚠️ DO NOT skip this step. Use interactive questioning — ask ONE question at a time.

Question Flow

⚠️ Use the AskUserQuestion tool for each question below. Do not just print questions in your response — use the tool to create interactive prompts with the options shown.

Q1: Topic

"I'll create that podcast episode! First — what's the topic?

(What should the hosts discuss?)"

Wait for response.

Q2: Hosts

"Who are the hosts/speakers?

(Names and brief personality — e.g., 'Sarah, enthusiastic tech expert' — max 2 for TTS)"

Wait for response.

Q3: Duration

"How long should the episode be?

5 minutes

10 minutes

15 minutes

Or specify your own"

Wait for response.

Q4: Tone

"What tone?

Professional

Casual/conversational

Funny/entertaining

Serious/educational

Or describe your own"

Wait for response.

Q5: Music

"What music style for intro/outro?

Upbeat pop

Chill lo-fi

Corporate/professional

Electronic

Or describe your own"

Wait for response.

Quick Reference

Question	Determines
Topic	Script content and discussion points
Hosts	Voice selection and script style
Duration	Script length
Tone	Writing style and energy
Music	Lyria prompt for intro/outro

Step 2: Generate the Script

Use Claude to create the dialogue script with speaker labels:

[INTRO MUSIC: 10 seconds, upbeat tech podcast vibe]

Sarah: Welcome back to Tech Talk! I'm Sarah, and today we have something exciting.

Mike: Hey everyone! I'm Mike, and yes - we're diving into AI in healthcare.

Sarah: So Mike, what's the biggest change you've seen this year?

Mike: Great question! The use of AI for diagnostic imaging has exploded...

[Continue dialogue...]

[OUTRO MUSIC: Fade under last lines, 5 seconds after]

Sarah: Thanks for listening! Follow us for more episodes.

Mike: See you next time!

Script Guidelines:

Use speaker names exactly as they'll be configured in TTS
Include music cues in brackets: [INTRO MUSIC: description]
Natural conversation flow with back-and-forth
Aim for ~150 words per minute of audio

Step 3: Plan Asset Generation

Create a manifest of what needs to be generated:

{
  "project": "tech_talk_ai_healthcare",
  "duration_target": "5 minutes",
  "speakers": [
    {"name": "Sarah", "voice": "Kore", "style": "Enthusiastic, upbeat"},
    {"name": "Mike", "voice": "Puck", "style": "Friendly, knowledgeable"}
  ],
  "assets": [
    {
      "type": "music",
      "name": "intro_music",
      "prompt": "upbeat tech podcast, electronic, modern",
      "duration": 15,
      "script": "lyria"
    },
    {
      "type": "dialogue", 
      "name": "main_content",
      "speakers": ["Sarah", "Mike"],
      "text": "[the dialogue script]",
      "script": "gemini_tts"
    },
    {
      "type": "music",
      "name": "outro_music",
      "prompt": "same as intro, fade out",
      "duration": 10,
      "script": "lyria"
    }
  ],
  "assembly": [
    {"action": "mix", "voice": "intro_with_music", "music": "intro_music", "music_volume": 0.8},
    {"action": "concat", "files": ["intro_with_music", "main_content"]},
    {"action": "mix", "voice": "main_content_end", "music": "outro_music", "fade_out": 5}
  ]
}

Step 4: Generate Assets

Execute each generation step:

Generate intro music (Lyria):

python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
  --prompt "upbeat tech podcast, electronic, modern, positive energy" \
  --duration 15 \
  --bpm 120

Generate dialogue (Gemini TTS multi-speaker):

python3 ${CLAUDE_PLUGIN_ROOT}/skills/voice-generation/scripts/gemini_tts.py \
  --multi \
  --speaker "Sarah:Kore" \
  --speaker "Mike:Puck" \
  --text "[dialogue script from Step 2]" \
  --style "Make Sarah sound enthusiastic and upbeat. Mike sounds friendly and knowledgeable."

Generate outro music (same as intro or variation):

python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
  --prompt "upbeat tech podcast, electronic, fade out feel" \
  --duration 10 \
  --bpm 120

Step 5: Assemble the Podcast

Use media-utils to stitch everything together:

Mix intro music with beginning of dialogue:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_mix.py \
  --voice dialogue.wav \
  --music intro_music.wav \
  --music-volume 0.4 \
  --fade-out 3 \
  -o intro_mixed.wav

Concatenate all segments:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_concat.py \
  -i intro_mixed.wav main_dialogue.wav outro_mixed.wav \
  --crossfade 1.0 \
  -o final_podcast.mp3

Step 6: Deliver the Result

Provide:

The final audio file
Summary of what was created
Offer adjustments

Example delivery:

"✅ Your podcast episode is ready!

File: tech_talk_ai_healthcare.mp3 (5:23)

What I created:

Dialogue between Sarah (Kore voice) and Mike (Puck voice)
15s upbeat electronic intro music
10s outro music fade

Want me to:

Adjust the voices or tone?
Change the music style?
Extend or shorten any section?
Add more topics to the discussion?"

Voice Pairing Suggestions

Host Type	Suggested Voice	Description
Main host (energetic)	Kore, Puck, Laomedeia	Firm, upbeat
Co-host (calm)	Charon, Algieba	Informative, smooth
Guest (expert)	Rasalgethi, Gacrux	Knowledgeable, mature
Interviewer	Achird	Friendly
Narrator	Charon, Orus	Informative, clear

Music Style Suggestions

Podcast Type	Music Prompt
Tech/Business	"upbeat electronic, modern, corporate, positive"
Casual/Comedy	"fun, playful, acoustic guitar, lighthearted"
News/Serious	"subtle, professional, ambient, understated"
Storytelling	"cinematic, emotional, orchestral, atmospheric"
Health/Wellness	"calm, peaceful, ambient, gentle piano"
True Crime	"dark, suspenseful, minimal, tension"

Limitations

Max 2 speakers per Gemini TTS call (for more, generate separate files and concatenate)
Lyria is instrumental only - no vocals in music
Duration estimates may vary - dialogue length depends on speaking pace
Music loops if shorter than dialogue

Error Handling

Error	Solution
"GOOGLE_API_KEY not set"	Set up API key per README
"FFmpeg not found"	Install: `brew install ffmpeg`
"google-genai not installed"	Run: `pip install google-genai`
TTS too long	Split script into segments, generate separately, concat

Example Prompts

Simple:

"Create a 3-minute podcast episode about remote work tips with two hosts"

Detailed:

"Create a 5-minute tech podcast episode. Hosts: Alex (enthusiastic, tech-savvy) and Jordan (skeptical, asks good questions). Topic: The future of AI assistants. Include upbeat electronic intro/outro music. Casual but informative tone."

With provided script:

"Turn this dialogue into a podcast episode with intro/outro music: Alex: Hey everyone, welcome back! Jordan: Today we're talking about..."