Support Matrix#

This documentation describes the software and hardware that Riva ASR NIM supports.

Hardware#

NVIDIA Riva ASR NIM is supported on NVIDIA Volta or later GPUs (Compute Capability >= 7.0). When selecting models to deploy, take care not to exceed the available GPU memory; at least 16 GB of VRAM is recommended.

GPUs Supported#

| GPU | Precision |
|---|---|
| V100 | FP16 |
| A30, A100 | FP16 |
| H100 | FP16 |
| A2, A10, A16, A40 | FP16 |
| L4, L40, GeForce RTX 40xx | FP16 |
| GeForce RTX 50xx | FP16 |

Software#

  • Linux operating systems (Ubuntu 22.04 or later recommended)

  • NVIDIA Driver >= 535

  • NVIDIA Docker >= 23.0.1

Supported Models#

Riva ASR NIM supports the following models.

NIM automatically downloads a prebuilt model if one is available for the target GPU (GPUs with Compute Capability >= 8.0); on other supported GPUs (Compute Capability >= 7.0), it generates an optimized model on the fly from the RMIR model.

The NIM_TAGS_SELECTOR environment variable selects the desired model and inference mode. It is specified as comma-separated key-value pairs. Some ASR models support multiple inference modes tuned for different use cases: streaming low latency (str), streaming high throughput (str-thr), and offline (ofl). Setting the mode to all deploys every applicable inference mode.
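For example, the selector can be composed from a profile name and mode taken from the model tables below. This is a minimal shell sketch; the profile names shown come from the Parakeet 0.6b CTC table on this page:

```shell
# NIM_TAGS_SELECTOR is a comma-separated list of key=value pairs.
# Streaming low-latency profile for the Parakeet 0.6b CTC English model:
NIM_TAGS_SELECTOR="name=parakeet-0-6b-ctc-riva-en-us,mode=str"

# Or deploy every applicable inference mode for that model:
NIM_TAGS_SELECTOR="name=parakeet-0-6b-ctc-riva-en-us,mode=all"

# Split on commas to show the key=value structure:
echo "$NIM_TAGS_SELECTOR" | tr ',' '\n'
```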

Note

All models use FP16 precision.

Parakeet 0.6b CTC English#

To use this model, set CONTAINER_ID to parakeet-0-6b-ctc-en-us. Choose a value for NIM_TAGS_SELECTOR from the table below as needed. For further instructions, refer to Launching the NIM.

| Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | Batch Size | CPU Memory (GB) | GPU Memory (GB) |
|---|---|---|---|---|
| name=parakeet-0-6b-ctc-riva-en-us,mode=ofl | offline | 1024 | 3 | 5.8 |
| name=parakeet-0-6b-ctc-riva-en-us,mode=str | streaming | 1024 | 3 | 4 |
| name=parakeet-0-6b-ctc-riva-en-us,mode=str-thr | streaming-throughput | 1024 | 3 | 5 |
| name=parakeet-0-6b-ctc-riva-en-us,mode=all | all | 1024 | 5.3 | 11.5 |
| name=parakeet-0-6b-ctc-riva-en-us,mode=ofl,bs=1 | offline | 1 | 3 | 3 |
| name=parakeet-0-6b-ctc-riva-en-us,mode=str,bs=1 | streaming | 1 | 3 | 3 |

Note

Profiles with a Batch Size of 1 are optimized for the lowest memory usage and support only a single session at a time. These profiles are recommended for WSL2 deployment or scenarios with a single inference request client.
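Putting the pieces together, a single-session streaming launch of this model might look like the sketch below. The registry path, image tag, and port mapping are assumptions for illustration, not values from this page; use the exact command and NGC_API_KEY setup from Launching the NIM.

```shell
# Hypothetical launch sketch -- image path, tag, and port are assumptions;
# follow "Launching the NIM" for the authoritative command.
export NGC_API_KEY=...   # your NGC API key

CONTAINER_ID=parakeet-0-6b-ctc-en-us

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -e NIM_TAGS_SELECTOR="name=parakeet-0-6b-ctc-riva-en-us,mode=str,bs=1" \
  -p 50051:50051 \
  nvcr.io/nim/nvidia/${CONTAINER_ID}:latest
```

The bs=1 profile keeps GPU memory at roughly 3 GB, which suits WSL2 or single-client deployments as noted above.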

Parakeet 1.1b CTC English#

To use this model, set CONTAINER_ID to riva-asr. Choose a value for NIM_TAGS_SELECTOR from the table below as needed. For further instructions, refer to Launching the NIM.

| Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | Batch Size | CPU Memory (GB) | GPU Memory (GB) |
|---|---|---|---|---|
| name=parakeet-1-1b-ctc-riva-en-us,mode=ofl | offline | 1024 | 5 | 6.7 |
| name=parakeet-1-1b-ctc-riva-en-us,mode=str | streaming | 1024 | 5 | 5 |
| name=parakeet-1-1b-ctc-riva-en-us,mode=str-thr | streaming-throughput | 1024 | 5 | 5.9 |
| name=parakeet-1-1b-ctc-riva-en-us,mode=all | all | 1024 | 7.6 | 14 |

Conformer CTC Spanish#

To use this model, set CONTAINER_ID to riva-asr. Choose a value for NIM_TAGS_SELECTOR from the table below as needed. For further instructions, refer to Launching the NIM.

| Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | Batch Size | CPU Memory (GB) | GPU Memory (GB) |
|---|---|---|---|---|
| name=conformer-ctc-riva-es-us,mode=ofl | offline | 1024 | 2 | 5.8 |
| name=conformer-ctc-riva-es-us,mode=str | streaming | 1024 | 2 | 3.6 |
| name=conformer-ctc-riva-es-us,mode=str-thr | streaming-throughput | 1024 | 2 | 4.2 |
| name=conformer-ctc-riva-es-us,mode=all | all | 1024 | 3.1 | 9.8 |

Canary 1b Multilingual#

To use this model, set CONTAINER_ID to riva-asr. Choose a value for NIM_TAGS_SELECTOR from the table below as needed. For further instructions, refer to Launching the NIM.

| Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | Batch Size | CPU Memory (GB) | GPU Memory (GB) |
|---|---|---|---|---|
| name=canary-1b,mode=ofl | offline | 1024 | 6.5 | 13.4 |

Canary 0.6b Turbo Multilingual#

To use this model, set CONTAINER_ID to riva-asr. Choose a value for NIM_TAGS_SELECTOR from the table below as needed. For further instructions, refer to Launching the NIM.

| Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | Batch Size | CPU Memory (GB) | GPU Memory (GB) |
|---|---|---|---|---|
| name=canary-0-6b-turbo,mode=ofl | offline | 1024 | 5.3 | 12.2 |

Whisper Large v3 Multilingual#

To use this model, set CONTAINER_ID to riva-asr. Choose a value for NIM_TAGS_SELECTOR from the table below as needed. For further instructions, refer to Launching the NIM.

| Profile (selected using NIM_TAGS_SELECTOR) | Inference Mode | Batch Size | CPU Memory (GB) | GPU Memory (GB) |
|---|---|---|---|---|
| name=whisper-large-v3,mode=ofl | offline | 1024 | 4.3 | 12.5 |