# Support Matrix

## Models
| Model Name | Model ID | Model Type | Publisher |
|---|---|---|---|
| claire | claire_v2.3.1 | Regression | NVIDIA |
| james | james_v2.3.1 | Regression | NVIDIA |
| mark | mark_v2.3 | Regression | NVIDIA |
| multi | multi_v3.2 | Diffusion | NVIDIA |
## Optimized configurations
| GPU | GPU Memory (GB) | Precision | Batch Size (Regression) | Batch Size (Diffusion) |
|---|---|---|---|---|
| A10G | 24 | FP16 | 35 | 32 |
| A30 | 24 | FP16 | 55 | 41 |
| L4 | 24 | FP16 | 30 | 16 |
| L40S | 48 | FP16 | 80 | 60 |
| RTX 4090 | 24 | FP16 | 80 | 35 |
| RTX 5080 | 16 | FP16 | 75 | 22 |
| RTX 5090 | 32 | FP16 | 100 | 52 |
| RTX 6000 Ada | 48 | FP16 | 70 | 50 |
| RTX PRO 6000 Blackwell | 96 | FP16 | 120 | 130 |
| B200 | 192 | FP16 | 255 | 144 |
Profiles can be automatically selected when you run the Audio2Face-3D NIM.
To verify auto profile selection, run:

```shell
docker run -it --rm --network=host --gpus all \
    -e NGC_API_KEY=$NGC_API_KEY nvcr.io/nim/nvidia/audio2face-3d:2.0
```

If a pre-generated profile is found for your GPU, you'll see this message in the logs:

```
"timestamp": "2025-02-12 05:53:29,945", "level": "INFO",
"message": "Matched profile_id in manifest from TagsBasedProfileSelector"
```
When running the Audio2Face-3D NIM, automatic profile selection will attempt to choose a compatible pre-generated profile for your GPU.
More Information on Automatic Profile Selection
This process works as follows:

1. The service detects your GPU and maps it to a profile key in the model manifest.
2. Some GPUs without a dedicated profile are mapped to a compatible alternative (for example, RTX 30 series GPUs are mapped to the A10G profile).
3. The service then starts using the selected TRT engines, which are either:
   - downloaded into `/opt/nim/workspace` and copied to `/tmp/a2x` (default behavior), or
   - loaded or generated locally into `/tmp/a2x` when `NIM_DISABLE_MODEL_DOWNLOAD=true`.

GPUs that have no mapping in the manifest (such as A100 or H100) are not supported with pre-generated profiles. To run on these GPUs, set `NIM_DISABLE_MODEL_DOWNLOAD=true` and generate TRT engines locally (see Getting Started).
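As a sketch of that local-generation path, the command below reuses the image and environment variables shown earlier in this page; it assumes only that `NIM_DISABLE_MODEL_DOWNLOAD` is passed through the container environment, as described above.

```shell
# Sketch: run on a GPU without a pre-generated profile (e.g. A100 or H100).
# NIM_DISABLE_MODEL_DOWNLOAD=true skips the profile download, so TRT engines
# are generated locally into /tmp/a2x on first startup (this can take a while).
docker run -it --rm --network=host --gpus all \
    -e NGC_API_KEY=$NGC_API_KEY \
    -e NIM_DISABLE_MODEL_DOWNLOAD=true \
    nvcr.io/nim/nvidia/audio2face-3d:2.0
```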
Note
Blackwell-class GPUs have dedicated pre-generated profiles (for example RTX PRO 6000 Blackwell and B200). Auto selection will match those profiles directly.
Note
Audio2Face-3D does not have multi GPU support. If you want to use more GPUs, you will have to start multiple instances of Audio2Face-3D for each GPU and manage how requests are routed to each instance.
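One instance per GPU can be sketched as below. The per-device `--gpus` pinning is standard Docker syntax; the host ports are illustrative assumptions, not documented defaults, and request routing across instances is left to your own load-balancing layer.

```shell
# Sketch: one Audio2Face-3D instance pinned to each GPU.
# Host ports (52000/52001) are illustrative; map them to the port your
# deployment actually exposes, and route client requests yourself.
docker run -d --rm --gpus '"device=0"' -p 52000:52000 \
    -e NGC_API_KEY=$NGC_API_KEY nvcr.io/nim/nvidia/audio2face-3d:2.0
docker run -d --rm --gpus '"device=1"' -p 52001:52000 \
    -e NGC_API_KEY=$NGC_API_KEY nvcr.io/nim/nvidia/audio2face-3d:2.0
```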
## Non-optimized configurations
The GPU memory usage depends on the number of concurrent connections (`number_of_streams`) configured when starting up Audio2Face-3D.
Note
This approximation was observed on an RTX 4090 and RTX 3080 Ti, without generating the TRT engines.
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| Any NVIDIA GPU with sufficient GPU memory and compute capability | 0.15 * number_of_streams + 9 | FP16 |
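The memory formula in the table can be evaluated directly when sizing a deployment; a minimal sketch, using an example stream count:

```shell
# Estimate non-optimized GPU memory usage (GB) from the formula above:
# 0.15 * number_of_streams + 9 (observed without generating TRT engines).
number_of_streams=10
awk -v s="$number_of_streams" 'BEGIN { printf "%.1f GB\n", 0.15 * s + 9 }'
# prints "10.5 GB"
```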
## Software
| Software | Supported Version |
|---|---|
| Operating System | Ubuntu 24.04 |
| NVIDIA CUDA | >=12.8, <13.0 (12.9 recommended) |
| NVIDIA Driver | NVIDIA GPU driver R570+ (R580 recommended) with CUDA 12.8.0+ support |
| TensorRT | >=10.13, <11.0 (included in the container) |
| NVIDIA Container Toolkit | latest version |
| Docker | latest version |
Other Linux distributions should work but have not been tested by our teams.
Your Docker environment must support NVIDIA GPUs. Refer to the NVIDIA Container Toolkit documentation for more information.
Note
NVIDIA Fabric Manager Requirement
NVIDIA Fabric Manager is required on systems with multiple GPUs connected using NVLink or NVSwitch technology. This typically applies to:

- Multi-GPU systems with NVLink bridges (e.g., DGX systems, HGX platforms)
- Systems with NVSwitch fabric interconnects
- Hosts running NVIDIA B200, H100, A100, V100, or other datacenter GPUs with NVLink

Fabric Manager is not required for:

- Single GPU systems
- Multi-GPU systems without NVLink/NVSwitch (PCIe-only configurations)
For installation instructions, refer to the official NVIDIA Fabric Manager documentation. See also CUDA Error 802 “system not yet initialized” on B200 Multi-GPU Systems in the Troubleshooting guide for B200-specific details.