Speak Response
Vocalize text using a local Qwen3-TTS model. The default voice is the Oracle, cloned from a Dune narrator, with a deep, resonant, prophetic quality.
Quick Examples
Command | Effect
/speak | Last 2 sentences with Oracle voice
/speak 5 | Last 5 sentences with Oracle voice
/speak "The sleeper must awaken." | Specific text with Oracle voice
/speak --preset mood:warm | Last 2 sentences with preset speaker + emotion
/speak --preset "Hello" speaker:Vivian voice:"nurturing" | Preset speaker with custom voice
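The "last 2 sentences" default and the numeric argument could be handled by a small text filter. The following is a naive sketch, assuming sentences end in ., ! or ?; the function name last_sentences is hypothetical, and how the skill actually obtains the last response is not covered here.

```shell
#!/usr/bin/env bash
# Sketch only: last_sentences is a hypothetical helper, not part of the scripts above.
# Keeps the last N sentences of a text (default 2), splitting naively on . ! ?
last_sentences() {
  local text=$1 count=${2:-2}
  printf '%s\n' "$text" \
    | tr '\n' ' ' \
    | grep -oE '[^.!?]+[.!?]+' \
    | sed -E 's/^[[:space:]]+//' \
    | tail -n "$count" \
    | paste -sd ' ' -
}
```

Text without closing punctuation is dropped by this filter, and quotation marks after a period end up attached to the following sentence, so a real implementation would likely need a more careful splitter.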
Default: Oracle Voice
The Oracle voice is deep, resonant, and prophetic, cloned from a Dune narrator. It speaks all text with a sense of ancient wisdom and gravitas.
Default usage - Oracle voice
scripts/speak.sh "The spice must flow."
scripts/speak.sh "He who controls the spice controls the universe."
Limitation
The Oracle uses voice cloning (Base model), which does not support per-message instruction control, so its voice characteristics are fixed. For emotion or mood control, use --preset.
Preset Speakers (--preset)
For emotion and mood control, use --preset to switch to CustomVoice with adjustable instructions:
scripts/speak.sh --preset "<text>" [speaker] [instruction]
Quick Preset Examples
Calm therapeutic voice
scripts/speak.sh --preset "Take a deep breath." Vivian "calm, nurturing, gentle pace"
Excited announcement
scripts/speak.sh --preset "We did it!" Ryan "joyful, excited, enthusiastic"
Serious explanation
scripts/speak.sh --preset "This is important." Eric "serious, measured, emphatic"
Custom Voice Instructions
The model understands rich natural language descriptions:
Aspect | Examples
Emotion | joyful, melancholic, anxious, calm, excited, contemplative
Pace | slow and deliberate, rapid and energetic, measured, hesitant
Intensity | soft and gentle, loud and commanding, whispered, emphatic
Style | warm and nurturing, professional, playful, dramatic
Prosody | with dramatic pauses, rising intonation, emphatic on key words
Mood Presets (Shortcuts)
Preset | Expands To
calm | "calm, soothing, gentle pace"
warm | "warm, empathetic, nurturing tone"
excited | "joyful, excited, enthusiastic"
serious | "serious, measured, authoritative"
gentle | "soft, gentle, whispered"
encouraging | "encouraging, uplifting, sincere"
contemplative | "thoughtful, slow pace, reflective"
Speakers
Speaker | Best For
Ryan (default) | Professional, serious, authoritative
Vivian | Warm, nurturing, therapeutic
Serena | Calm, gentle, contemplative
Dylan | Friendly, casual, playful
Eric | Serious, dramatic, commanding
Aiden | Encouraging, uplifting, energetic
Uncle_Fu | Wise, measured
Ono_Anna | Soft, gentle
Sohee | Clear, professional
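The mood shortcuts and speaker names listed above could be resolved with two small lookups before calling the preset mode. This is a minimal sketch, not the skill's actual implementation: expand_mood and resolve_speaker are hypothetical names, and both the pass-through of unknown moods and the rejection of unknown speakers are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: expand_mood and resolve_speaker are hypothetical helpers.

# Expand a mood shortcut into its full instruction string.
expand_mood() {
  case "$1" in
    calm)          printf '%s\n' "calm, soothing, gentle pace" ;;
    warm)          printf '%s\n' "warm, empathetic, nurturing tone" ;;
    excited)       printf '%s\n' "joyful, excited, enthusiastic" ;;
    serious)       printf '%s\n' "serious, measured, authoritative" ;;
    gentle)        printf '%s\n' "soft, gentle, whispered" ;;
    encouraging)   printf '%s\n' "encouraging, uplifting, sincere" ;;
    contemplative) printf '%s\n' "thoughtful, slow pace, reflective" ;;
    # Assumption: anything else is treated as a literal instruction.
    *)             printf '%s\n' "$1" ;;
  esac
}

# Validate a speaker name; an empty name falls back to Ryan (the listed default).
resolve_speaker() {
  case "$1" in
    Ryan|Vivian|Serena|Dylan|Eric|Aiden|Uncle_Fu|Ono_Anna|Sohee)
      printf '%s\n' "$1" ;;
    "")
      printf '%s\n' "Ryan" ;;
    *)
      # Assumption: unknown speakers are an error rather than silently replaced.
      printf 'unknown speaker: %s\n' "$1" >&2
      return 1 ;;
  esac
}
```

With these in place, a call such as scripts/speak.sh --preset "Take a deep breath." "$(resolve_speaker Vivian)" "$(expand_mood calm)" would pass the expanded instruction in the documented positional form.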
Workflow
- Parse arguments for text and mode (default oracle vs --preset)
- Extract text from last response if not provided
- Default mode: Clone with Oracle voice
- Preset mode: Generate with CustomVoice + instruction
- Audio plays through macOS speakers
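The first steps of this workflow, choosing a mode and collecting the positional arguments, can be sketched as follows. This is an illustration under stated assumptions rather than the contents of scripts/speak.sh: parse_speak_args is a hypothetical name, and the key=value output format is invented for the example.

```shell
#!/usr/bin/env bash
# Sketch only: parse_speak_args is a hypothetical helper, not the real speak.sh.
# Accepts the documented forms:
#   "<text>"                                   -> oracle mode
#   --preset "<text>" [speaker] [instruction]  -> preset mode
parse_speak_args() {
  local mode="oracle" text="" speaker="" instruction=""
  if [ "${1:-}" = "--preset" ]; then
    mode="preset"
    shift
    text=${1:-}
    speaker=${2:-Ryan}      # Ryan is the documented default speaker
    instruction=${3:-}
  else
    text=${1:-}
  fi
  # An empty text is where a caller would fall back to the last response.
  printf 'mode=%s\ntext=%s\nspeaker=%s\ninstruction=%s\n' \
    "$mode" "$text" "$speaker" "$instruction"
}
```

In oracle mode the speaker and instruction stay empty, which matches the limitation that the cloned voice takes no per-message instruction.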
Execution
Oracle voice (default)
scripts/speak.sh "<text>"
Preset speaker with instruction
scripts/speak.sh --preset "<text>" [speaker] [instruction]
Voice Cloning (Custom Voices)
Clone any voice from a 3+ second audio sample:
Get transcript first (use Whisper API)
curl -s https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file="@reference.mp3" -F model="whisper-1"
Clone the voice
scripts/clone.sh "<text to speak>" "<audio_file>" "<transcript>"
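The two steps above can be chained so the transcript never has to be copied by hand. A minimal sketch, assuming scripts/clone.sh takes the three positional arguments shown above: clone_from_sample and the CLONE_SCRIPT override are hypothetical names, and response_format=text is used so the API returns the bare transcript instead of JSON.

```shell
#!/usr/bin/env bash
# Sketch only: clone_from_sample and CLONE_SCRIPT are hypothetical names.
# Transcribes a reference clip, then clones its voice to speak new text.
clone_from_sample() {
  local text=$1 audio=$2 transcript
  [ -f "$audio" ] || { printf 'missing audio file: %s\n' "$audio" >&2; return 1; }

  # response_format=text returns the plain transcript rather than a JSON object.
  transcript=$(curl -s https://api.openai.com/v1/audio/transcriptions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -F file="@$audio" -F model="whisper-1" -F response_format="text") || return 1
  [ -n "$transcript" ] || { printf 'empty transcript\n' >&2; return 1; }

  # Same argument order as the clone command above: text, audio file, transcript.
  "${CLONE_SCRIPT:-scripts/clone.sh}" "$text" "$audio" "$transcript"
}
```

For example, clone_from_sample "The sleeper must awaken." reference.mp3 would transcribe reference.mp3 and hand the result to the clone script.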
Voice Design (Create New Voices)
Design entirely new voices from natural language descriptions:
scripts/design-voice.sh "<sample_text>" "<voice_description>"
Example: Create a warm guide voice
scripts/design-voice.sh \
  "Take a deep breath and feel this moment." \
  "warm, nurturing, gentle pace, empathetic, female"
Then clone the designed voice for reuse:
scripts/clone.sh "New text" designed-voice.wav "Original sample text"
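Designing a voice and then reusing it can be wrapped into one step. This is a sketch under assumptions: design_and_speak, DESIGN_SCRIPT, CLONE_SCRIPT, and DESIGNED_WAV are hypothetical names, and it assumes design-voice.sh writes its output to designed-voice.wav in the current directory, as the clone example above suggests.

```shell
#!/usr/bin/env bash
# Sketch only: design_and_speak and the *_SCRIPT / DESIGNED_WAV overrides are hypothetical.
# Designs a voice from a description, then clones it to speak new text.
design_and_speak() {
  local sample=$1 description=$2 new_text=$3
  # Assumption: the design script writes designed-voice.wav in the working directory.
  local designed=${DESIGNED_WAV:-designed-voice.wav}

  "${DESIGN_SCRIPT:-scripts/design-voice.sh}" "$sample" "$description" || return 1

  # The original sample text doubles as the transcript of the designed clip.
  "${CLONE_SCRIPT:-scripts/clone.sh}" "$new_text" "$designed" "$sample"
}
```

The sample text is passed twice on purpose: once to generate the designed clip, and once as its transcript when cloning.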
See references/moods.md for more instruction examples.