Voices and Emotional Styles

Each TTS model ships with a catalog of built-in voices. This page explains how voice names are structured, which speakers support emotional style variants, and how to discover and select voices at runtime.

Naming Convention

Voice names follow a hierarchical pattern that encodes the model, locale, speaker, and optional emotional style.

Magpie TTS Multilingual

Magpie-Multilingual.{LOCALE}.{Speaker}[.{Emotion}]

| Segment | Description | Examples |
| --- | --- | --- |
| Magpie-Multilingual | Model prefix | Always Magpie-Multilingual |
| {LOCALE} | Language and region in uppercase | EN-US, FR-FR, ZH-CN, VI-VN |
| {Speaker} | Speaker identity | Aria, Jason, Leo, Diego |
| {Emotion} | Optional emotional style | Angry, Calm, Happy, Neutral, Sad |

When you omit the emotion segment (for example, Magpie-Multilingual.EN-US.Aria), the model uses a default neutral style for that speaker.

Examples:

  • Magpie-Multilingual.EN-US.Aria – Aria with default style

  • Magpie-Multilingual.EN-US.Aria.Happy – Aria with happy emotional style

  • Magpie-Multilingual.FR-FR.Pascal.Calm – Pascal (French) with calm style

  • Magpie-Multilingual.VI-VN.Long.Fearful – Long (Vietnamese) with fearful style
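The naming pattern above can also be checked in code. The following is a minimal sketch and not part of the Riva clients: `MultilingualVoice` and `parse_multilingual_voice` are hypothetical names, and the sketch assumes that no segment contains a dot.

```python
from dataclasses import dataclass
from typing import Optional

PREFIX = "Magpie-Multilingual"


@dataclass(frozen=True)
class MultilingualVoice:
    locale: str                    # for example "EN-US"
    speaker: str                   # for example "Aria"
    emotion: Optional[str] = None  # None means the speaker's default style


def parse_multilingual_voice(name: str) -> MultilingualVoice:
    """Split Magpie-Multilingual.{LOCALE}.{Speaker}[.{Emotion}] into its parts."""
    parts = name.split(".")
    # Expect the prefix plus two or three non-empty segments.
    if parts[0] != PREFIX or len(parts) not in (3, 4) or not all(parts):
        raise ValueError(f"not a Magpie-Multilingual voice name: {name!r}")
    emotion = parts[3] if len(parts) == 4 else None
    return MultilingualVoice(locale=parts[1], speaker=parts[2], emotion=emotion)
```

For instance, `parse_multilingual_voice("Magpie-Multilingual.FR-FR.Pascal.Calm")` yields locale `FR-FR`, speaker `Pascal`, and emotion `Calm`, while a name without the fourth segment leaves `emotion` as `None`.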

Magpie TTS Zeroshot

Magpie-ZeroShot.{Gender}-{Style}

Built-in voices use a {Gender}-{Style} pattern. Female-1 and Male-1 are the recommended defaults.

Examples: Magpie-ZeroShot.Female-1, Magpie-ZeroShot.Male-Angry

Magpie TTS Flow

English-US-Magpie-Flow.{Gender}[.{Emotion}]

Similar to Zeroshot, but a dot rather than a hyphen separates the emotion from the gender. Default voices use {Gender}-1 (for example, Female-1).

Examples: English-US-Magpie-Flow.Female-1, English-US-Magpie-Flow.Male.Happy
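The difference between the Zeroshot and Flow separators is easiest to see side by side. The sketch below is illustrative only: `parse_builtin_voice` is a hypothetical helper, and it assumes the names follow exactly the patterns and examples shown above.

```python
from typing import Tuple

ZEROSHOT_PREFIX = "Magpie-ZeroShot."
FLOW_PREFIX = "English-US-Magpie-Flow."


def parse_builtin_voice(name: str) -> Tuple[str, str, str]:
    """Return (model, gender, style) for a Zeroshot or Flow voice name."""
    for model, prefix in (("zeroshot", ZEROSHOT_PREFIX), ("flow", FLOW_PREFIX)):
        if name.startswith(prefix):
            rest = name[len(prefix):]
            break
    else:
        raise ValueError(f"not a Zeroshot or Flow voice name: {name!r}")

    # Zeroshot always uses a hyphen (Male-Angry). Flow uses a dot for
    # emotions (Male.Happy) and a hyphen for the defaults (Female-1).
    separator = "." if (model == "flow" and "." in rest) else "-"
    gender, _, style = rest.partition(separator)
    if not gender or not style:
        raise ValueError(f"missing gender or style in: {name!r}")
    return model, gender, style
```

With the examples from this page, `Magpie-ZeroShot.Male-Angry` parses to `("zeroshot", "Male", "Angry")` and `English-US-Magpie-Flow.Male.Happy` parses to `("flow", "Male", "Happy")`.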

Emotional Style Availability

Not all speakers have emotional variants, and the available emotions differ by speaker and locale.

Emotions by Model

Models compared: Multilingual, Zeroshot, Flow

Emotions:

  • Angry

  • Calm

  • Fearful

  • Happy

  • Neutral

  • Sad

  • Disgusted

  • Disgust: ✅ (ES-US, FR-FR, VI-VN only)

  • PleasantSurprise: ✅ (ES-US only)

Multilingual Speakers with Emotional Variants

In the Magpie TTS Multilingual model, emotional styles are available only for specific speaker-locale combinations.

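Locales: EN-US, ES-US, FR-FR, DE-DE, ZH-CN, VI-VN, IT-IT

| Speaker | Emotional variants |
| --- | --- |
| Aria | 6 emotions |
| Jason | 4 emotions |
| Leo | 5 emotions |
| Mia | 5 emotions |
| Ray | 5 emotions |
| Sofia | 5 emotions |
| Diego | 6 emotions |
| Isabela | 7 emotions |
| Pascal | 6 emotions |
| Long | 7 emotions |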

Speakers without emotional variants (for example, Diego in EN-US, Louise in all locales) produce speech in a default neutral style.
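Because emotional variants exist only for some speaker-locale combinations, client code may want to fall back to the default voice when a requested emotion is missing. The following is a minimal sketch of that idea: `resolve_voice` is a hypothetical helper, and `available` stands for whatever set of names your deployment reports.

```python
from typing import Iterable, Optional


def resolve_voice(available: Iterable[str], locale: str, speaker: str,
                  emotion: Optional[str] = None) -> str:
    """Pick the emotional variant if it is served, else the default voice."""
    names = set(available)
    base = f"Magpie-Multilingual.{locale}.{speaker}"
    if emotion and f"{base}.{emotion}" in names:
        return f"{base}.{emotion}"
    if base in names:
        # No such emotional variant: fall back to the speaker's default style.
        return base
    raise LookupError(f"no voice for speaker {speaker!r} in locale {locale!r}")
```

Given a set that contains `Magpie-Multilingual.EN-US.Aria.Happy`, a request for Aria with `Happy` returns that name; a request for Diego in EN-US with `Happy` returns `Magpie-Multilingual.EN-US.Diego`, provided the default voice is in the set.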

Discover Voices at Runtime

List the voices a deployed TTS NIM is currently serving.

# Python client script (talk.py)
python3 python-clients/scripts/tts/talk.py \
    --server 0.0.0.0:50051 \
    --list-voices

# HTTP endpoint (curl)
curl -sS http://localhost:9000/v1/audio/list_voices | jq

# Realtime client script (realtime_tts_client.py)
python3 python-clients/scripts/tts/realtime_tts_client.py \
    --server localhost:9000 \
    --list-voices

The output lists voice names grouped by language code. Pass these names verbatim to the --voice flag when synthesizing speech.
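To use the HTTP listing from a script, one option is to fetch the JSON and collect every string that looks like a voice name. This is a sketch under assumptions: the exact response schema is not described here, so `collect_voice_names` (a hypothetical helper) walks whatever structure it receives instead of relying on specific keys. The URL is the one from the curl command above.

```python
import json
import urllib.request
from typing import Any, List

# Model prefixes taken from the naming conventions on this page.
VOICE_PREFIXES = (
    "Magpie-Multilingual.",
    "Magpie-ZeroShot.",
    "English-US-Magpie-Flow.",
)


def collect_voice_names(payload: Any) -> List[str]:
    """Recursively gather string values that start with a known model prefix."""
    found: List[str] = []
    if isinstance(payload, str):
        if payload.startswith(VOICE_PREFIXES):
            found.append(payload)
    elif isinstance(payload, dict):
        for value in payload.values():
            found.extend(collect_voice_names(value))
    elif isinstance(payload, list):
        for item in payload:
            found.extend(collect_voice_names(item))
    return found


def fetch_voice_names(
    url: str = "http://localhost:9000/v1/audio/list_voices",
) -> List[str]:
    """Query a running TTS NIM and return the sorted, de-duplicated names."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return sorted(set(collect_voice_names(json.load(response))))
```

Calling `fetch_voice_names()` requires a reachable server; `collect_voice_names` can be exercised offline on any saved response.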

Select a Voice

Pass the full voice name to the --voice flag. If you omit it, the server selects the first available voice for the specified --language-code.

python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 \
    --language-code en-US \
    --text "Hello from Aria with a happy tone." \
    --voice Magpie-Multilingual.EN-US.Aria.Happy \
    --output output.wav
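When synthesis is driven from another Python program, the same command can be assembled as an argument list. The sketch below uses only the flags shown above; `build_talk_command` is a hypothetical helper, and its default values simply mirror this example.

```python
from typing import List, Optional


def build_talk_command(text: str,
                       voice: Optional[str] = None,
                       language_code: str = "en-US",
                       server: str = "0.0.0.0:50051",
                       output: str = "output.wav") -> List[str]:
    """Assemble the talk.py invocation as a list suitable for subprocess.run."""
    command = [
        "python3", "python-clients/scripts/tts/talk.py",
        "--server", server,
        "--language-code", language_code,
        "--text", text,
        "--output", output,
    ]
    if voice:
        # Without --voice, the server picks the first available voice
        # for the given language code.
        command += ["--voice", voice]
    return command
```

The result can be passed to `subprocess.run(command, check=True)`. Passing a list rather than a shell string avoids quoting problems when the text contains spaces or quotes.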

Cross-Language Accents

The Magpie TTS Multilingual model lets you pair a voice from one locale with text in another language to produce accented speech. For example, synthesize English text with a French-accented voice:

python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 \
    --language-code en-US \
    --text "This English text is spoken with a French accent." \
    --voice Magpie-Multilingual.FR-FR.Pascal \
    --output output.wav
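A client can detect when a request is cross-language by comparing the locale segment of the voice name with the language code. The following is a small sketch: `voice_locale` and `is_cross_language` are hypothetical helpers, and the comparison ignores case because voice names use uppercase locales (FR-FR) while --language-code uses mixed case (en-US).

```python
def voice_locale(voice: str) -> str:
    """Return the {LOCALE} segment of a Magpie-Multilingual voice name."""
    parts = voice.split(".")
    if len(parts) < 3 or parts[0] != "Magpie-Multilingual":
        raise ValueError(f"not a Magpie-Multilingual voice name: {voice!r}")
    return parts[1]


def is_cross_language(voice: str, language_code: str) -> bool:
    """True when the voice's locale differs from the text's language code."""
    return voice_locale(voice).casefold() != language_code.casefold()
```

For the command above, `is_cross_language("Magpie-Multilingual.FR-FR.Pascal", "en-US")` is true, which signals that the output is expected to carry an accent.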

Complete Voice Reference

For the full list of all available voices per model and locale, refer to the TTS support matrix.