Support Matrix#

Models#

| Model Name | Model ID | Model Type | Publisher |
|---|---|---|---|
| claire | claire_v2.3.1 | Regression | NVIDIA |
| james | james_v2.3.1 | Regression | NVIDIA |
| mark | mark_v2.3 | Regression | NVIDIA |
| multi | multi_v3.2 | Diffusion | NVIDIA |

Optimized configurations#

| GPU | GPU Memory (GB) | Precision | Batch Size (Regression) | Batch Size (Diffusion) |
|---|---|---|---|---|
| A10G | 24 | FP16 | 35 | 32 |
| A30 | 24 | FP16 | 55 | 41 |
| L4 | 24 | FP16 | 30 | 16 |
| L40S | 48 | FP16 | 80 | 60 |
| RTX 4090 | 24 | FP16 | 80 | 35 |
| RTX 5080 | 16 | FP16 | 75 | 22 |
| RTX 5090 | 32 | FP16 | 100 | 52 |
| RTX 6000 Ada | 48 | FP16 | 70 | 50 |
| RTX PRO 6000 Blackwell | 96 | FP16 | 120 | 130 |
| B200 | 192 | FP16 | 255 | 144 |

Profiles can be automatically selected when you run the Audio2Face-3D NIM.

To verify auto profile selection, run:

```shell
$ docker run -it --rm --network=host --gpus all \
    -e NGC_API_KEY=$NGC_API_KEY nvcr.io/nim/nvidia/audio2face-3d:2.0
```

If a pre-generated profile is found for your GPU, you’ll see this message in the logs:

"timestamp": "2025-02-12 05:53:29,945", "level": "INFO",
"message": "Matched profile_id in manifest from TagsBasedProfileSelector"

Automatic profile selection attempts to choose a compatible pre-generated profile for your GPU. The process works as follows:

  1. The service detects your GPU and maps it to a profile key in the model manifest.

  2. Some GPUs without a dedicated profile are mapped to a compatible alternative (for example, RTX 30 series GPUs are mapped to the A10G profile).

  3. The service then starts using the selected TRT engines:

    • Downloaded into /opt/nim/workspace and copied to /tmp/a2x (default behavior), or

    • Loaded / generated locally into /tmp/a2x when NIM_DISABLE_MODEL_DOWNLOAD=true.

GPUs that have no mapping in the manifest (such as A100 or H100) are not supported with pre-generated profiles. To run on these GPUs, set NIM_DISABLE_MODEL_DOWNLOAD=true and generate TRT engines locally (see Getting Started).
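For GPUs without a manifest mapping, a run along these lines is a sketch of the workaround; `NIM_DISABLE_MODEL_DOWNLOAD` is the flag named above, while any additional engine-generation options depend on your NIM version (see Getting Started):

```shell
# Sketch: run on a GPU with no pre-generated profile (e.g. A100 or H100)
# by disabling the profile download so TRT engines are generated locally
# into /tmp/a2x, as described in the steps above.
docker run -it --rm --network=host --gpus all \
    -e NGC_API_KEY=$NGC_API_KEY \
    -e NIM_DISABLE_MODEL_DOWNLOAD=true \
    nvcr.io/nim/nvidia/audio2face-3d:2.0
```

Local engine generation adds noticeable startup time on first launch, since TensorRT builds the engines for your specific GPU.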

Note

Blackwell-class GPUs have dedicated pre-generated profiles (for example RTX PRO 6000 Blackwell and B200). Auto selection will match those profiles directly.

Note

Audio2Face-3D does not have multi GPU support. If you want to use more GPUs, you will have to start multiple instances of Audio2Face-3D for each GPU and manage how requests are routed to each instance.
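The per-GPU workaround above can be sketched as follows. Container names and host-side port values here are illustrative assumptions; substitute the port your deployment actually exposes and route client requests between the instances yourself (for example, with a load balancer):

```shell
# Sketch: one Audio2Face-3D instance per GPU, each pinned to a single
# device with --gpus. Host ports must differ so both can run on one host.
docker run -d --name a2f-gpu0 --gpus '"device=0"' \
    -e NGC_API_KEY=$NGC_API_KEY -p 52000:52000 \
    nvcr.io/nim/nvidia/audio2face-3d:2.0
docker run -d --name a2f-gpu1 --gpus '"device=1"' \
    -e NGC_API_KEY=$NGC_API_KEY -p 52001:52000 \
    nvcr.io/nim/nvidia/audio2face-3d:2.0
```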

Non-optimized configurations#

GPU memory usage depends on the number of concurrent connections (number_of_streams) configured when starting Audio2Face-3D.

Note

This approximation was observed on an RTX 4090 and RTX 3080 Ti, without generating the TRT engines.

| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| Any NVIDIA GPU with sufficient GPU memory and compute capability | 0.15 * number_of_streams + 9 | FP16 |
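The memory formula above can be evaluated before provisioning a GPU. A minimal sketch, using the approximation from this document (the stream count is an example value):

```shell
# Approximate GPU memory (GB) for a non-optimized configuration:
# 0.15 * number_of_streams + 9, per the table above.
number_of_streams=20
required_gb=$(awk -v n="$number_of_streams" 'BEGIN { printf "%.1f", 0.15 * n + 9 }')
echo "~${required_gb} GB of GPU memory for ${number_of_streams} concurrent streams"
```

For example, 20 concurrent streams need roughly 12 GB, which fits on any of the 16 GB+ GPUs listed in the optimized configurations table.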

Software#

Operating System

Ubuntu 24.04

NVIDIA CUDA

>=12.8, <13.0 (12.9 recommended)

NVIDIA Driver

NVIDIA GPU driver R570+ (R580 recommended) with CUDA 12.8.0+ support

TensorRT

>=10.13, <11.0 (included in the container)

NVIDIA Container Toolkit

latest version

Docker

latest version

  • Other Linux distributions should work but have not been tested by our teams

  • Your Docker environment must support NVIDIA GPUs. Refer to the NVIDIA Container Toolkit documentation for more information.
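A quick way to check the requirements above before starting the NIM is sketched below; `nvidia-smi` is standard NVIDIA driver tooling, and the CUDA image tag is an assumption to verify against the versions you need:

```shell
# Check installed driver version and GPU memory against the Software table.
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv

# Verify that Docker can see the GPUs via the NVIDIA Container Toolkit.
docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu24.04 nvidia-smi
```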

Note

NVIDIA Fabric Manager Requirement

NVIDIA Fabric Manager is required on systems with multiple GPUs connected using NVLink or NVSwitch technology. This typically applies to:

  • Multi-GPU systems with NVLink bridges (e.g., DGX systems, HGX platforms)

  • Systems with NVSwitch fabric interconnects

  • Hosts running NVIDIA B200, H100, A100, V100, or other datacenter GPUs with NVLink

Fabric Manager is not required for:

  • Single GPU systems

  • Multi-GPU systems without NVLink/NVSwitch (PCIe-only configurations)

For installation instructions, refer to the official NVIDIA Fabric Manager documentation. See also CUDA Error 802 “system not yet initialized” on B200 Multi-GPU Systems in the Troubleshooting guide for B200-specific details.