Support Matrix#

This documentation describes the software and hardware that Riva TTS NIM supports.

Hardware#

NVIDIA Riva TTS NIM is supported on NVIDIA Volta or later GPUs (Compute Capability >= 7.0). When selecting models to deploy, make sure they fit within the available GPU memory; 16 GB or more of VRAM is recommended.

GPUs Supported#

| GPU | Precision |
| --- | --- |
| A30, A100 | FP16 |
| H100 | FP16 |
| A2, A10, A16, A40 | FP16 |
| L4, L40, GeForce RTX 40xx | FP16 |
| GeForce RTX 50xx | FP16 |
| Blackwell RTX 60xx | FP16 |
| DGX Spark | FP16 |

Software#

  • Linux operating systems (Ubuntu 22.04 or later recommended)

  • NVIDIA Driver >= 535

  • NVIDIA Docker >= 23.0.1

Supported Models#

Riva TTS NIM supports the following models.

NIM automatically downloads the prebuilt model if one is available for the target GPU (GPUs with Compute Capability >= 8.0). On other GPUs (Compute Capability >= 7.0), it generates an optimized model on the fly from the RMIR model.
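The selection rule above can be sketched in Python. This is only an illustration: the NIM makes this choice internally, and the function name, the `prebuilt_available` flag, and the returned labels are hypothetical. Falling back to RMIR when no prebuilt model exists on a newer GPU is also an assumption, and it holds only for models that list RMIR as a format.

```python
def select_model_format(compute_capability: float,
                        prebuilt_available: bool = True) -> str:
    """Return the model format the documented rule would pick.

    Hypothetical helper for illustration; not part of the NIM API.
    """
    if compute_capability < 7.0:
        # Volta (7.0) is the oldest supported architecture.
        raise ValueError("Compute Capability >= 7.0 is required")
    if compute_capability >= 8.0 and prebuilt_available:
        # Prebuilt models target GPUs with Compute Capability >= 8.0.
        return "prebuilt"
    # Assumption: otherwise the model is built on the fly from RMIR.
    return "rmir"
```

For example, `select_model_format(7.5)` returns `"rmir"`, while `select_model_format(8.0)` returns `"prebuilt"`.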

| Model | Languages Supported | Inference Mode | Publisher | WSL Support | Model Format |
| --- | --- | --- | --- | --- | --- |
| Magpie TTS Multilingual | English (en-US), Spanish (es-US), French (fr-FR), German (de-DE), Mandarin (zh-CN), Vietnamese (vi-VN), Italian (it-IT) | Streaming & Offline | NVIDIA | No | Prebuilt, RMIR |
| Magpie TTS Zeroshot | English (en-US) | Streaming & Offline | NVIDIA | No | Prebuilt, RMIR |
| Magpie TTS Flow | English (en-US) | Offline | NVIDIA | Yes | Prebuilt |
| Fastpitch HifiGAN en-US | English (en-US) | Streaming & Offline | NVIDIA | No | Prebuilt, RMIR |

Note

All models use FP16 precision.

Magpie TTS Multilingual#

This model supports text to speech in English (en-US), Spanish (es-US), French (fr-FR), German (de-DE), Mandarin (zh-CN), Vietnamese (vi-VN), and Italian (it-IT) languages.

To use this model, set CONTAINER_ID to magpie-tts-multilingual. Then, set NIM_TAGS_SELECTOR to one of the values from the following table as required. For more information, refer to Launching the NIM.

Model Profiles#

| Model Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | CPU Memory (GB) | GPU Memory (GB) |
| --- | --- | --- | --- |
| batch_size=1 | offline and streaming | 9.738 | 4.2 |
| batch_size=8 | offline and streaming | 9.738 | 10.87 |
| batch_size=32 | offline and streaming | 8.819 | 31.55 |
| batch_size=64 | offline and streaming | 8.159 | 60.224 |

Note

If NIM_TAGS_SELECTOR is not specified, the default model profile is batch_size=8. However, for hardware with compute capability >= 10, the default is batch_size=1.

The Blackwell platform currently supports only batch_size=1 in the latest release.
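As a rough aid for choosing a NIM_TAGS_SELECTOR value, the following Python sketch encodes the GPU memory figures from the table and the default rule from the note. The function names are hypothetical. Picking the largest batch size that fits in free GPU memory is only one possible heuristic, not a documented recommendation. Treating Compute Capability >= 10 as "Blackwell" is likewise an assumption made here to combine the two statements in the note.

```python
# GPU memory (GB) per batch size, copied from the Model Profiles table.
GPU_MEMORY_GB = {1: 4.2, 8: 10.87, 32: 31.55, 64: 60.224}


def default_batch_size(compute_capability: float) -> int:
    """Default profile used when NIM_TAGS_SELECTOR is not specified."""
    return 1 if compute_capability >= 10 else 8


def pick_profile(free_gpu_gb: float, compute_capability: float) -> str:
    """Return the largest profile that fits in free GPU memory (heuristic)."""
    if compute_capability >= 10:
        # Assumption: these GPUs are Blackwell, which supports batch_size=1 only.
        candidates = [1]
    else:
        candidates = sorted(GPU_MEMORY_GB)
    fitting = [b for b in candidates if GPU_MEMORY_GB[b] <= free_gpu_gb]
    if not fitting:
        raise ValueError("No profile fits in the available GPU memory")
    return f"batch_size={max(fitting)}"
```

The returned string has the same form as the values in the table, so it could be assigned to NIM_TAGS_SELECTOR directly. The table figures cover the model only, so leave headroom for anything else running on the GPU.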

Available Voices#

Magpie-Multilingual.EN-US.Aria.Angry
Magpie-Multilingual.EN-US.Aria.Calm
Magpie-Multilingual.EN-US.Aria.Fearful
Magpie-Multilingual.EN-US.Aria.Happy
Magpie-Multilingual.EN-US.Aria.Neutral
Magpie-Multilingual.EN-US.Aria.Sad
Magpie-Multilingual.EN-US.Aria
Magpie-Multilingual.EN-US.Diego
Magpie-Multilingual.EN-US.Isabela
Magpie-Multilingual.EN-US.Jason.Angry
Magpie-Multilingual.EN-US.Jason.Calm
Magpie-Multilingual.EN-US.Jason.Happy
Magpie-Multilingual.EN-US.Jason.Neutral
Magpie-Multilingual.EN-US.Jason
Magpie-Multilingual.EN-US.Leo.Angry
Magpie-Multilingual.EN-US.Leo.Calm
Magpie-Multilingual.EN-US.Leo.Fearful
Magpie-Multilingual.EN-US.Leo.Neutral
Magpie-Multilingual.EN-US.Leo.Sad
Magpie-Multilingual.EN-US.Leo
Magpie-Multilingual.EN-US.Louise
Magpie-Multilingual.EN-US.Mia.Angry
Magpie-Multilingual.EN-US.Mia.Calm
Magpie-Multilingual.EN-US.Mia.Happy
Magpie-Multilingual.EN-US.Mia.Neutral
Magpie-Multilingual.EN-US.Mia.Sad
Magpie-Multilingual.EN-US.Mia
Magpie-Multilingual.EN-US.Pascal
Magpie-Multilingual.EN-US.Ray.Angry
Magpie-Multilingual.EN-US.Ray.Calm
Magpie-Multilingual.EN-US.Ray.Fearful
Magpie-Multilingual.EN-US.Ray.Happy
Magpie-Multilingual.EN-US.Ray.Neutral
Magpie-Multilingual.EN-US.Ray
Magpie-Multilingual.EN-US.Sofia.Angry
Magpie-Multilingual.EN-US.Sofia.Calm
Magpie-Multilingual.EN-US.Sofia.Fearful
Magpie-Multilingual.EN-US.Sofia.Happy
Magpie-Multilingual.EN-US.Sofia.Neutral
Magpie-Multilingual.EN-US.Sofia

Magpie-Multilingual.ES-US.Aria
Magpie-Multilingual.ES-US.Diego.Angry
Magpie-Multilingual.ES-US.Diego.Calm
Magpie-Multilingual.ES-US.Diego.Disgust
Magpie-Multilingual.ES-US.Diego.Happy
Magpie-Multilingual.ES-US.Diego.Neutral
Magpie-Multilingual.ES-US.Diego.PleasantSurprise
Magpie-Multilingual.ES-US.Diego
Magpie-Multilingual.ES-US.Isabela.Angry
Magpie-Multilingual.ES-US.Isabela.Calm
Magpie-Multilingual.ES-US.Isabela.Fearful
Magpie-Multilingual.ES-US.Isabela.Happy
Magpie-Multilingual.ES-US.Isabela.Neutral
Magpie-Multilingual.ES-US.Isabela.PleasantSurprise
Magpie-Multilingual.ES-US.Isabela.Sad
Magpie-Multilingual.ES-US.Isabela
Magpie-Multilingual.ES-US.Jason
Magpie-Multilingual.ES-US.Leo
Magpie-Multilingual.ES-US.Louise
Magpie-Multilingual.ES-US.Mia
Magpie-Multilingual.ES-US.Pascal
Magpie-Multilingual.ES-US.Ray
Magpie-Multilingual.ES-US.Sofia

Magpie-Multilingual.FR-FR.Aria
Magpie-Multilingual.FR-FR.Diego
Magpie-Multilingual.FR-FR.Isabela
Magpie-Multilingual.FR-FR.Jason
Magpie-Multilingual.FR-FR.Leo
Magpie-Multilingual.FR-FR.Louise
Magpie-Multilingual.FR-FR.Mia
Magpie-Multilingual.FR-FR.Pascal.Angry
Magpie-Multilingual.FR-FR.Pascal.Calm
Magpie-Multilingual.FR-FR.Pascal.Disgust
Magpie-Multilingual.FR-FR.Pascal.Happy
Magpie-Multilingual.FR-FR.Pascal.Neutral
Magpie-Multilingual.FR-FR.Pascal.Sad
Magpie-Multilingual.FR-FR.Pascal
Magpie-Multilingual.FR-FR.Ray
Magpie-Multilingual.FR-FR.Sofia

Magpie-Multilingual.DE-DE.Diego
Magpie-Multilingual.DE-DE.Mia
Magpie-Multilingual.DE-DE.Pascal
Magpie-Multilingual.DE-DE.Sofia

Magpie-Multilingual.ZH-CN.Aria
Magpie-Multilingual.ZH-CN.Diego
Magpie-Multilingual.ZH-CN.HouZhen
Magpie-Multilingual.ZH-CN.Isabela
Magpie-Multilingual.ZH-CN.Long
Magpie-Multilingual.ZH-CN.Louise
Magpie-Multilingual.ZH-CN.Mia
Magpie-Multilingual.ZH-CN.North
Magpie-Multilingual.ZH-CN.Pascal
Magpie-Multilingual.ZH-CN.Ray
Magpie-Multilingual.ZH-CN.Siwei

Magpie-Multilingual.IT-IT.Isabela
Magpie-Multilingual.IT-IT.Pascal

Magpie-Multilingual.VI-VN.Aria
Magpie-Multilingual.VI-VN.Diego
Magpie-Multilingual.VI-VN.Isabela
Magpie-Multilingual.VI-VN.Jason
Magpie-Multilingual.VI-VN.Le
Magpie-Multilingual.VI-VN.Long.Angry
Magpie-Multilingual.VI-VN.Long.Calm
Magpie-Multilingual.VI-VN.Long.Disgust
Magpie-Multilingual.VI-VN.Long.Fearful
Magpie-Multilingual.VI-VN.Long.Happy
Magpie-Multilingual.VI-VN.Long.Neutral
Magpie-Multilingual.VI-VN.Long.Sad
Magpie-Multilingual.VI-VN.Louise
Magpie-Multilingual.VI-VN.Mia
Magpie-Multilingual.VI-VN.North
Magpie-Multilingual.VI-VN.Pascal
Magpie-Multilingual.VI-VN.Ray
Magpie-Multilingual.VI-VN.Sofia
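
The voice names listed above appear to follow the pattern model.LOCALE.Speaker, with an optional .Emotion suffix. This pattern is inferred from the list, not taken from a documented schema, and the `Voice` type and `parse_voice` function below are hypothetical helpers. A minimal Python sketch for splitting a name into its parts:

```python
from typing import NamedTuple, Optional


class Voice(NamedTuple):
    model: str
    locale: str
    speaker: str
    emotion: Optional[str]  # None when the name has no emotion suffix


def parse_voice(name: str) -> Voice:
    """Split a voice name such as 'Magpie-Multilingual.EN-US.Aria.Angry'."""
    parts = name.split(".")
    if len(parts) not in (3, 4):
        raise ValueError(f"Unexpected voice name: {name!r}")
    model, locale, speaker = parts[:3]
    emotion = parts[3] if len(parts) == 4 else None
    return Voice(model, locale, speaker, emotion)
```

This can help when filtering the list, for example to collect every voice whose `locale` is `"FR-FR"`.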

Magpie TTS Zeroshot#

This model synthesizes speech in English (en-US) from input text and an audio prompt of three to ten seconds. A set of built-in voices is also available.
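
Before sending an audio prompt, it can be useful to check that its length falls within the three-to-ten-second range. The sketch below uses Python's standard `wave` module and assumes the prompt is an uncompressed PCM WAV file. The function and constant names are hypothetical, and this page does not specify which audio formats the NIM accepts.

```python
import wave

MIN_PROMPT_SECONDS = 3.0
MAX_PROMPT_SECONDS = 10.0


def prompt_duration_seconds(wav_file) -> float:
    """Return the duration of a PCM WAV file (path or binary file object)."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_prompt(wav_file) -> bool:
    """True if the prompt length is within the three-to-ten-second range."""
    duration = prompt_duration_seconds(wav_file)
    return MIN_PROMPT_SECONDS <= duration <= MAX_PROMPT_SECONDS
```

`is_valid_prompt` accepts either a file path or an open binary file object, as `wave.open` does.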

To use this model, set CONTAINER_ID to magpie-tts-zeroshot. Then, set NIM_TAGS_SELECTOR to one of the values from the following table, based on your requirements. For more information, refer to Launching the NIM.

Note

Access to the Magpie TTS Zeroshot model is restricted. Apply for access.

Model Profiles#

| Model Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | CPU Memory (GB) | GPU Memory (GB) |
| --- | --- | --- | --- |
| name=magpie-tts-zeroshot | offline and streaming | 3.8 | 4.8 |

Available Voices#

Magpie-ZeroShot.Female-1 (Default, Recommended for Female speaker)
Magpie-ZeroShot.Female-Neutral
Magpie-ZeroShot.Female-Angry
Magpie-ZeroShot.Female-Fearful
Magpie-ZeroShot.Female-Calm
Magpie-ZeroShot.Female-Happy
Magpie-ZeroShot.Male-1 (Recommended for Male speaker)
Magpie-ZeroShot.Male-Calm
Magpie-ZeroShot.Male-Neutral
Magpie-ZeroShot.Male-Angry
Magpie-ZeroShot.Male-Fearful

Magpie TTS Flow#

This offline-only model synthesizes speech in English (en-US) from input text. In addition to built-in voices, it supports speech synthesis from an audio prompt of three to ten seconds together with a transcript of that prompt.

To use this model, set CONTAINER_ID to magpie-tts-flow. Then, set NIM_TAGS_SELECTOR to one of the values from the following table, based on your requirements. Refer to Launching the NIM for details.

Note

Access to the Magpie TTS Flow model is restricted. Apply here to request access.

Model Profiles#

| Model Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | CPU Memory (GB) | GPU Memory (GB) |
| --- | --- | --- | --- |
| name=magpie-tts-flow | offline | 7.2 | 5.1 |

Available Voices#

English-US-Magpie-Flow.Female-1 (Default, Recommended for Female speaker)
English-US-Magpie-Flow.Female.Calm
English-US-Magpie-Flow.Female.Fearful
English-US-Magpie-Flow.Female.Happy
English-US-Magpie-Flow.Female.Neutral
English-US-Magpie-Flow.Female.Angry
English-US-Magpie-Flow.Female.Disgusted
English-US-Magpie-Flow.Female.Sad
English-US-Magpie-Flow.Male.Calm
English-US-Magpie-Flow.Male.Fearful
English-US-Magpie-Flow.Male.Happy
English-US-Magpie-Flow.Male.Neutral
English-US-Magpie-Flow.Male.Angry
English-US-Magpie-Flow.Male.Disgusted
English-US-Magpie-Flow.Male.Sad
English-US-Magpie-Flow.Male-1 (Recommended for Male speaker)

Fastpitch HifiGAN en-US#

This model supports text to speech in English (en-US).

To use this model, set CONTAINER_ID to riva-tts. Then, set NIM_TAGS_SELECTOR to one of the values from the following table as required. Refer to Launching the NIM for details.

Model Profiles#

| Model Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | CPU Memory (GB) | GPU Memory (GB) |
| --- | --- | --- | --- |
| name=fastpitch-hifigan-en-us | offline and streaming | 1.4 | 2 |