MAI-Voice-1

Synthesize speech with Azure AI Speech using Microsoft's MAI-Voice-1 voices.

Quick start

{baseDir}/scripts/speak.sh --text "Hello Steve"

Defaults:

Voice: en-us-Jasper:MAI-Voice-1
Output: ./mai-voice.mp3
Format: audio-24khz-160kbitrate-mono-mp3

Useful flags

{baseDir}/scripts/speak.sh --text "Hello Steve" --voice en-us-Iris:MAI-Voice-1 --out /tmp/iris.mp3
{baseDir}/scripts/speak.sh --text-file /tmp/input.txt --voice en-us-June:MAI-Voice-1 --style empathy --out /tmp/june.mp3
{baseDir}/scripts/speak.sh --text "Let's go" --voice en-us-Jasper:MAI-Voice-1 --style excitement
{baseDir}/scripts/speak.sh --list-voices

Required env vars

export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"
export AZURE_SPEECH_REGION="eastus"

How to get the API key and region

1. Go to the Azure portal and open your Speech or Foundry Speech resource.
2. Open Keys and Endpoint.
3. Copy one of the resource keys.
4. Copy the resource region, for example eastus.
5. Export them:

export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"
export AZURE_SPEECH_REGION="eastus"

The MAI-Voice docs currently list East US for preview access. If your resource is in a region other than eastus, confirm that it actually supports the model before blaming the script.
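One way to check a region is to ask the Speech service which voices the resource exposes. The sketch below queries the standard text to speech voices/list endpoint and keeps only names ending in :MAI-Voice-1. The helper names filter_mai_voices and list_mai_voices are hypothetical and not part of speak.sh, and it is an assumption that the preview voices show up in this listing at all; an empty result is a hint to investigate, not proof that the model is unavailable.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical and not taken from speak.sh.

# Read a voices/list JSON response on stdin and print the MAI-Voice-1 names.
# Returns non-zero when no such name is found.
filter_mai_voices() {
  local names
  # Assumption: MAI voices appear as quoted strings ending in ":MAI-Voice-1".
  names=$(grep -o '[^"]*:MAI-Voice-1' | sort -u) || true
  [ -n "$names" ] && printf '%s\n' "$names"
}

# Query the regional voices/list endpoint with the exported key and region.
list_mai_voices() {
  curl --fail --silent --show-error \
    -H "Ocp-Apim-Subscription-Key: ${AZURE_SPEECH_KEY}" \
    "https://${AZURE_SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/voices/list" \
    | filter_mai_voices
}

# Example (needs network access and valid credentials):
# list_mai_voices || echo "no MAI-Voice-1 voices listed for ${AZURE_SPEECH_REGION}" >&2
```

The filtering is kept separate from the network call so it can be tried against a saved response without touching the service.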

Optional (sets the output format; the value shown is the default):

export AZURE_SPEECH_OUTPUT_FORMAT="audio-24khz-160kbitrate-mono-mp3"
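Before calling the API it can help to fail early when a variable is missing. The following is a minimal sketch of such a guard, assuming the three variable names above; check_speech_env is a hypothetical helper and may not match how speak.sh validates its environment.

```shell
#!/usr/bin/env bash
# Sketch only: check_speech_env is a hypothetical helper, not part of speak.sh.

check_speech_env() {
  local missing=0 name
  for name in AZURE_SPEECH_KEY AZURE_SPEECH_REGION; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "missing required env var: ${name}" >&2
      missing=1
    fi
  done
  # Fall back to the documented default format when none is set.
  : "${AZURE_SPEECH_OUTPUT_FORMAT:=audio-24khz-160kbitrate-mono-mp3}"
  return "$missing"
}

# Example:
# check_speech_env || exit 1
```

The guard reports every missing variable in one pass rather than stopping at the first, which saves a round trip when both are unset.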

Supported voices

en-us-Jasper:MAI-Voice-1
en-us-June:MAI-Voice-1
en-us-Grant:MAI-Voice-1
en-us-Iris:MAI-Voice-1
en-us-Reed:MAI-Voice-1
en-us-Joy:MAI-Voice-1

API shape

The script calls:

POST https://{AZURE_SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/v1

Headers:

Ocp-Apim-Subscription-Key: {AZURE_SPEECH_KEY}
Content-Type: application/ssml+xml
X-Microsoft-OutputFormat: {format}
User-Agent: curl

Body:

SSML with a MAI voice name
optional mstts:express-as style wrapper
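The request described above can be sketched directly with curl. This is an illustration under stated assumptions, not the contents of speak.sh: xml_escape, build_ssml and synthesize are hypothetical names, the xml:lang value en-US is assumed from the voice names, and the style value is inserted without validation.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not taken from speak.sh.

# Escape the characters that are special inside XML text (& must go first).
xml_escape() {
  printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# build_ssml VOICE TEXT [STYLE]: print an SSML document for one voice,
# wrapping the text in mstts:express-as only when a style is given.
build_ssml() {
  local voice=$1 text style=${3:-}
  text=$(xml_escape "$2")
  if [ -n "$style" ]; then
    text="<mstts:express-as style=\"${style}\">${text}</mstts:express-as>"
  fi
  printf '%s' "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xmlns:mstts=\"https://www.w3.org/2001/mstts\" xml:lang=\"en-US\"><voice name=\"${voice}\">${text}</voice></speak>"
}

# synthesize OUTFILE VOICE TEXT [STYLE]: POST the SSML and save the audio.
synthesize() {
  local out=$1
  shift
  build_ssml "$@" | curl --fail --silent --show-error \
    -X POST "https://${AZURE_SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/v1" \
    -H "Ocp-Apim-Subscription-Key: ${AZURE_SPEECH_KEY}" \
    -H "Content-Type: application/ssml+xml" \
    -H "X-Microsoft-OutputFormat: ${AZURE_SPEECH_OUTPUT_FORMAT:-audio-24khz-160kbitrate-mono-mp3}" \
    -H "User-Agent: curl" \
    --data-binary @- \
    --output "$out"
}

# Example (needs network access and valid credentials):
# synthesize /tmp/iris.mp3 "en-us-Iris:MAI-Voice-1" "Hello Steve"
```

Escaping the text before embedding it matters because a raw & or < in the input would otherwise produce malformed SSML and a rejected request.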

Notes

This uses the Azure Speech REST API, not the Python SDK.
Voice selection is user-configurable.
Style is optional and applied via SSML.
MAI-Voice-1 is currently in public preview.
