# Support Matrix

## Models

| Model Name | Model ID | Publisher |
|------------|-------------|-----------|
| mark | mark_v2.3 | NVIDIA |
| claire | claire_v2.3 | NVIDIA |
| james | james_v2.3 | NVIDIA |

## Optimized configurations

| GPU | GPU Memory (GB) | Precision |
|---------|-----------------|-----------|
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| L40S | 48 | FP16 |
| A100 | 15 | FP16 |
| A30 | 24 | FP16 |
| H100 | 80 | FP16 |
| RTX6000 | 48 | FP16 |
| RTX4090 | 24 | FP16 |

Profiles can be automatically selected when you run the Audio2Face-3D NIM.

To verify automatic profile selection, run:

```shell
docker run -it --rm --network=host --gpus all \
  -e NGC_API_KEY=$NGC_CLI_API_KEY nvcr.io/nim/nvidia/audio2face-3d:1.3
```

If a pre-generated profile is found for your GPU, you'll see this message in the logs:

```
"timestamp": "2025-02-12 05:53:29,945", "level": "INFO",
"message": "Matched profile_id in manifest from TagsBasedProfileSelector"
```
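To check which selector fired without scanning the full output, you can filter the container logs. This is a sketch: the container name `a2f-3d` is an illustrative choice, not part of the product.

```shell
# Run the NIM in the background under an assumed name so its logs can be inspected.
docker run -d --name a2f-3d --network=host --gpus all \
  -e NGC_API_KEY=$NGC_CLI_API_KEY nvcr.io/nim/nvidia/audio2face-3d:1.3

# A pre-generated profile match logs "TagsBasedProfileSelector";
# the GB20x fallback logs "Selecting compatibility mode profile".
docker logs a2f-3d 2>&1 | \
  grep -E "TagsBasedProfileSelector|Selecting compatibility mode profile"
```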

When running the Audio2Face-3D NIM, automatic profile selection attempts to choose a compatible pre-generated profile for your GPU. If no compatible profile is found, the service automatically falls back to the GB20x compatibility mode profile.

### More Information on Automatic Profile Selection

This process works as follows:

1. The service first tries to match your GPU with a pre-generated profile.

2. If no match is found, it uses the FallbackTRTProfileSelector to select the GB20x compatibility mode profile.

3. The service then starts using the downloaded engines.

For example, on GPUs such as the RTX 3080 that have no pre-generated profile, the GB20x compatibility mode profile is used automatically.

You can verify the fallback is working by checking the logs for the “Selecting compatibility mode profile” message when running on an unsupported GPU.

Note

Audio2Face-3D does not have multi-GPU support. To use more than one GPU, start a separate Audio2Face-3D instance for each GPU and manage how requests are routed to each instance.
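The workaround in the note above can be sketched with Docker's per-device `--gpus` syntax. This is illustrative only: the container names are assumptions, and because `--network=host` makes both instances share the host's ports, any per-instance port configuration (not shown) is left to your deployment.

```shell
# Hypothetical two-GPU setup: one Audio2Face-3D instance pinned to each GPU.
docker run -d --name a2f-gpu0 --network=host --gpus '"device=0"' \
  -e NGC_API_KEY=$NGC_CLI_API_KEY nvcr.io/nim/nvidia/audio2face-3d:1.3

docker run -d --name a2f-gpu1 --network=host --gpus '"device=1"' \
  -e NGC_API_KEY=$NGC_CLI_API_KEY nvcr.io/nim/nvidia/audio2face-3d:1.3
```

Routing requests across the instances (for example, with a reverse proxy or load balancer) is up to you.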

## Non-optimized configurations

GPU memory usage depends on the number of concurrent connections (`number_of_streams`) configured when starting up Audio2Face-3D.

Note

This approximation was observed on an RTX 4090 and an RTX 3080 Ti, without generating the TRT engines.

| GPU | GPU Memory (GB) | Precision |
|-----|-----------------|-----------|
| Any NVIDIA GPU with sufficient GPU memory and compute capability | 0.15 * `number_of_streams` + 9 | FP16 |
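The memory approximation in the table above can be turned into a quick sizing check. This is a plain restatement of the formula, not an official tool, and the function names are illustrative.

```python
def estimated_gpu_memory_gb(number_of_streams: int) -> float:
    """Approximate GPU memory (GB) used by Audio2Face-3D in a
    non-optimized configuration: 0.15 GB per stream plus a 9 GB base."""
    return 0.15 * number_of_streams + 9


def max_streams_for(gpu_memory_gb: float) -> int:
    """Largest number_of_streams whose estimate fits in the given memory."""
    return int((gpu_memory_gb - 9) / 0.15)


# 10 concurrent streams need roughly 10.5 GB under this approximation,
# and a 24 GB GPU fits about 100 streams.
print(estimated_gpu_memory_gb(10))  # 10.5
print(max_streams_for(24))          # 100
```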

## Software

| Software | Supported Version |
|----------|-------------------|
| Operating System | Ubuntu 22.04/24.04 (bare-metal or with WSL) |
| NVIDIA CUDA | 12.6 |
| NVIDIA Driver | 535.183.06 (Data Center GPUs), 560.35.03 (RTX GPUs), 560.94 (Windows WSL) |
| NVIDIA Container Toolkit | latest version |
| Docker | latest version |

- Any Linux distribution should work, but distributions other than Ubuntu have not been tested by our teams.

- Your Docker environment must support NVIDIA GPUs. Refer to the NVIDIA Container Toolkit documentation for more information.
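As a quick sanity check that your Docker environment can see the GPUs, you can run the standard NVIDIA Container Toolkit sample workload. Any image with `nvidia-smi` available through the driver mount works; plain `ubuntu` is the example used in the toolkit's own documentation.

```shell
# Should print the nvidia-smi table for every GPU visible to Docker.
docker run --rm --gpus all ubuntu nvidia-smi
```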