Getting Started#

The steps below will help you set up and run the Audio2Face-3D NIM and use our sample application to receive blendshapes, audio, and emotions.

Prerequisites#

Check the Support Matrix to make sure you have the supported hardware and software stack.

Read the instructions corresponding to your operating system.

Windows System - Using Windows Subsystem for Linux (WSL)

These instructions are written for Ubuntu 22.04 inside WSL 2. Run all the steps inside the WSL terminal unless otherwise noted.

  1. Set up Docker without Docker Desktop

Install Docker using the convenience script:

$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sudo sh ./get-docker.sh

Add your user account to the docker group:

$ sudo groupadd docker
$ sudo usermod -aG docker <username>

Log out of your system and log back in, then run a sanity check:

$ docker run hello-world

You should see “Hello from Docker!” printed out.

Install the Docker Compose plugin:

$ sudo apt-get update
$ sudo apt-get install docker-compose-plugin

Check that the installation was successful by running:

$ docker compose version

Set up iptables compatibility:

$ sudo update-alternatives --config iptables

When prompted, choose Selection 1, with the path /usr/sbin/iptables-legacy.

Shut down the WSL instance by either closing the terminal window or typing the following in PowerShell:

$ wsl --shutdown Ubuntu-22.04

Start a WSL instance and check Docker status:

$ service docker status

You should see “active (running)” in the messages. To exit, press q.

  2. Install CUDA Toolkit

Once an NVIDIA driver is installed on the Windows system, CUDA becomes available within WSL 2; therefore, do not install any NVIDIA Linux driver within WSL 2.

To install cuda-toolkit-12-6, run these commands:

$ wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
$ sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ wget https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda-repo-wsl-ubuntu-12-6-local_12.6.2-1_amd64.deb
$ sudo dpkg -i cuda-repo-wsl-ubuntu-12-6-local_12.6.2-1_amd64.deb
$ sudo cp /var/cuda-repo-wsl-ubuntu-12-6-local/cuda-*-keyring.gpg /usr/share/keyrings/
$ sudo apt-get update
$ sudo apt-get -y install cuda-toolkit-12-6

Alternatively, to install the latest CUDA Toolkit, visit NVIDIA Developer - CUDA downloads WSL and follow the instructions.

  3. Install NVIDIA Container Toolkit

If any of the steps below fails, follow the official NVIDIA Container Toolkit docs instead.

Configure the production repository:

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
   && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
   sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
   sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Optionally, configure the repository to use experimental packages:

$ sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

Update the package list and install the NVIDIA Container Toolkit:

$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit

  4. Configure Docker with NVIDIA Container Toolkit

Run the following commands:

$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker

If all goes well, you should be able to start a Docker container and run nvidia-smi inside it to see information about your GPU. An example is shown below; keep in mind that the numbers will vary for your hardware.

$ sudo docker run --rm --gpus all ubuntu nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03             Driver Version: 560.35.03   CUDA Version: 12.6       |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:01:00.0 Off |                  Off |
|  0%   41C    P8               7W / 450W |    287MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
Linux System - Using Ubuntu 22.04

  1. Set up Docker without Docker Desktop

Install Docker using the convenience script:

$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sudo sh ./get-docker.sh

Add your user account to the docker group:

$ sudo groupadd docker
$ sudo usermod -aG docker <username>

Log out of your system and log back in, then run a sanity check:

$ docker run hello-world

You should see “Hello from Docker!” printed out.

Install the Docker Compose plugin:

$ sudo apt-get update
$ sudo apt-get install docker-compose-plugin

Check that the installation was successful by running:

$ docker compose version

Set up iptables compatibility:

$ sudo update-alternatives --config iptables

When prompted, choose Selection 1, with the path /usr/sbin/iptables-legacy.

Restart your system and check the Docker status:

$ service docker status

You should see “active (running)” in the messages. To exit, press q.

  2. Install CUDA Toolkit

To install cuda-toolkit-12-6, run these commands:

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
$ sudo dpkg -i cuda-keyring_1.1-1_all.deb
$ sudo apt-get update
$ sudo apt-get -y install cuda-toolkit-12-6

Alternatively, to install the latest CUDA Toolkit, visit NVIDIA Developer - CUDA downloads Ubuntu 22.04, and follow the instructions.

  3. Install NVIDIA Container Toolkit

If any of the steps below fails, follow the official NVIDIA Container Toolkit docs instead.

Configure the production repository:

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
   && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
   sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
   sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Optionally, configure the repository to use experimental packages:

$ sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

Update the package list and install the NVIDIA Container Toolkit:

$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit

  4. Configure Docker with NVIDIA Container Toolkit

Run the following commands:

$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker

If all goes well, you should be able to start a Docker container and run nvidia-smi inside it to see information about your GPU. An example is shown below; keep in mind that the numbers will vary for your hardware.

$ sudo docker run --rm --gpus all ubuntu nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06    Driver Version: 535.183.06    CUDA Version: 12.6   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A10G         On   | 00000000:01:00.0 Off |                    0 |
|  0%   33C    P8    18W / 300W |      0MiB / 23028MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

NVAIE access#

To download the Audio2Face-3D NIM Microservice, you need an active subscription to an NVIDIA AI Enterprise product.

Contact a sales representative through this form and request access to NVIDIA AI Enterprise Essentials.

NGC Personal Key#

Set up your NGC Personal Key if you have not done so already.

Go to the NGC personal key setup page of the NGC website and select Generate Personal Key.

When prompted with the Generate Personal Key form, choose your key's Name and Expiration, then select all services under Services Included.

You will then receive your Personal Key; make sure to save it somewhere safe.

Export the API key#

To run the A2F-3D NIM, export the API key generated in the previous step into the NGC_API_KEY environment variable:

$ export NGC_API_KEY=<value>

To make the key available at shell startup, run the following command if you are using bash. Make sure you replace <value> with the actual API key.

$ echo "export NGC_API_KEY=<value>" >> ~/.bashrc

Docker Login to NGC#

To pull the NIM container image, you need to log in to the nvcr.io Docker registry. The username is $oauthtoken and the password is the API key generated earlier and stored in NGC_API_KEY. Run the following command to log in:

$ echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
Login Succeeded

Launching the Audio2Face-3D NIM#

There are two quick ways to start the Audio2Face-3D NIM: use a pre-generated TensorRT (TRT) engine for supported GPUs, or generate a TRT engine for your NVIDIA GPU. Supported GPUs are listed in the table below:

Supported Models#

GPU        NIM_MANIFEST_PROFILE
A10G       009c006e7958606db6ce0923059aac8d2002c4e33d0807652486ca3821dcbfff
A100       044e16bb7cbfdca9d5d39a191261adc117edbbebb63f7372f9cfdbf83485b230
H100       2ec560cc5ce6c80ae7c7b5b44a2c348c7b4bd024355cdb111cdf08855331750c
RTX6000    8dd2ad5b0bd70c8cbed7c36d68c8a435b3f95f9014c832299053b3bcd37eb9d8
RTX4090    d6ecb540a388274c7e6191995371cabcede89ad87725c74c9837df64a73fffd7
L40S       5e21ca4dcb2ba7792e170a331fa25dccc4b5bae0b8ed91e9253f7a68b47d7802
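
If you script the launch, a small helper can pick the profile matching your GPU. The sketch below is a hypothetical convenience script, not part of the NIM tooling: it maps the GPU name reported by nvidia-smi to the values in the table above, and its naive substring matching may need adjusting for your GPU's exact product name.

# pick_profile.py -- hypothetical helper, not part of the NIM tooling.
# Maps the GPU name reported by nvidia-smi to a NIM_MANIFEST_PROFILE
# value from the Supported Models table above.
import subprocess

PROFILES = {
    "A10G": "009c006e7958606db6ce0923059aac8d2002c4e33d0807652486ca3821dcbfff",
    "A100": "044e16bb7cbfdca9d5d39a191261adc117edbbebb63f7372f9cfdbf83485b230",
    "H100": "2ec560cc5ce6c80ae7c7b5b44a2c348c7b4bd024355cdb111cdf08855331750c",
    "RTX6000": "8dd2ad5b0bd70c8cbed7c36d68c8a435b3f95f9014c832299053b3bcd37eb9d8",
    "RTX4090": "d6ecb540a388274c7e6191995371cabcede89ad87725c74c9837df64a73fffd7",
    "L40S": "5e21ca4dcb2ba7792e170a331fa25dccc4b5bae0b8ed91e9253f7a68b47d7802",
}

# Query the product name of the first GPU, e.g. "NVIDIA GeForce RTX 4090".
name = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"], text=True
).splitlines()[0].strip()
compact = name.replace(" ", "")

profile = next((p for gpu, p in PROFILES.items() if gpu in compact), None)
if profile:
    print(profile)  # use as NIM_MANIFEST_PROFILE
else:
    print(f"No pre-generated engine for {name}; launch with NIM_DISABLE_MODEL_DOWNLOAD=true")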

  1. For supported GPUs, launch using a pre-generated TRT engine

    Run this command, replacing <manifest_profile_id> with the value from the table above that corresponds to your GPU:

    $  export NIM_MANIFEST_PROFILE=<manifest_profile_id>
    $  docker run -it --rm --name audio2face-3d \
         --gpus all \
         --network=host \
         -e NGC_API_KEY=$NGC_API_KEY \
         -e NIM_MANIFEST_PROFILE=$NIM_MANIFEST_PROFILE \
         nvcr.io/nim/nvidia/audio2face-3d:1.2
    
  2. For other NVIDIA GPUs, launch and generate the TRT engine

    $  docker run -it --rm --name audio2face-3d \
         --gpus all \
         --network=host \
         -e NGC_API_KEY=$NGC_API_KEY \
         -e NIM_DISABLE_MODEL_DOWNLOAD=true \
         nvcr.io/nim/nvidia/audio2face-3d:1.2
    

Note

When you start the service, you might encounter warnings labeled as GStreamer-WARNING. These warnings occur because some libraries are missing from the container. However, they are safe to ignore, as these libraries are not used by Audio2Face-3D.

Docker flags explained

You can find the explanation of each flag used in the docker commands above in the list below:

-it
    --interactive + --tty (see Docker docs)

--rm
    Delete the container after it stops (see Docker docs)

--name
    Give a name to the NIM container. Use any preferred value.

--gpus all
    Expose all NVIDIA GPUs inside the container. See the configuration page for mounting specific GPUs.

--network=host
    Connect the container to the host machine's network (see Docker docs)

-e NGC_API_KEY=$NGC_API_KEY
    Set the NGC_API_KEY environment variable in the container to the value of the NGC_API_KEY environment variable on the local machine.

-e NIM_MANIFEST_PROFILE=$NIM_MANIFEST_PROFILE
    Set the NIM_MANIFEST_PROFILE environment variable in the container.

-e NIM_DISABLE_MODEL_DOWNLOAD=<value>
    Set the NIM_DISABLE_MODEL_DOWNLOAD environment variable in the container. The value can be true or false; it controls whether the A2F-3D NIM downloads the model from NGC.

Running Inference#

Audio2Face-3D exposes a gRPC API. You can quickly try it out using the A2F-3D Python interaction application. Follow the instructions below to set it up:

$ git clone https://github.com/NVIDIA/Audio2Face-3D-Samples.git
$ cd Audio2Face-3D-Samples/scripts/audio2face_3d_microservices_interaction_app
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip3 install ../../proto/sample_wheel/nvidia_ace-1.2.0-py3-none-any.whl
$ pip3 install -r requirements.txt

To check if the service is ready to handle inference requests:

$ python3 a2f_3d.py health_check --url 0.0.0.0:52000
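
You can also probe readiness programmatically. The sketch below assumes the service exposes the standard gRPC health-checking protocol (grpc.health.v1); if it does not, fall back to the a2f_3d.py health_check command above. It requires the grpcio and grpcio-health-checking packages.

# health_probe.py -- minimal readiness probe; a sketch assuming the
# service implements the standard gRPC health-checking protocol
# (grpc.health.v1). Requires: pip install grpcio grpcio-health-checking
import grpc
from grpc_health.v1 import health_pb2, health_pb2_grpc

channel = grpc.insecure_channel("0.0.0.0:52000")
stub = health_pb2_grpc.HealthStub(channel)
response = stub.Check(health_pb2.HealthCheckRequest(service=""))
# Prints SERVING when the service is ready to accept inference requests.
print(health_pb2.HealthCheckResponse.ServingStatus.Name(response.status))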

To run inference on one of the example audios:

$ python3 a2f_3d.py run_inference ../../example_audio/Claire_neutral.wav config/config_claire.yml -u 0.0.0.0:52000

This command prints out where the results are saved, in a log similar to the following:

Input audio header info:
Sample rate: 16000 Hz
Bit depth: 16 bits
Channels: 1
Receiving data from server...
.............................
Received status message with value: 'sent all data'
Status code: '0'
Saving data into output_000001 folder...

You can then explore the A2F-3D NIM output animations by running the command below, replacing <output_folder> with the folder name printed by the run_inference command.

$ ls -l <output_folder>/
-rw-rw-r-- 1 user user    328 Nov 14 15:46 a2f_3d_input_emotions.csv
-rw-rw-r-- 1 user user  65185 Nov 14 15:46 a2f_3d_smoothed_emotion_output.csv
-rw-rw-r-- 1 user user 291257 Nov 14 15:46 animation_frames.csv
-rw-rw-r-- 1 user user 406444 Nov 14 15:46 out.wav
  • out.wav: contains the audio received

  • animation_frames.csv: contains the blendshapes

  • a2f_3d_input_emotions.csv: contains the emotions provided as input in the gRPC protocol

  • a2f_3d_smoothed_emotion_output.csv: contains emotions smoothed over time
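
For post-processing, the CSV outputs can be read with standard tooling. Below is a minimal Python sketch; it assumes animation_frames.csv begins with a header row of column names followed by one row per animation frame, a layout that may differ between Audio2Face-3D versions.

# inspect_frames.py -- minimal sketch; assumes animation_frames.csv has a
# header row followed by one row per animation frame (layout may vary).
import csv

with open("output_000001/animation_frames.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    frames = list(reader)

print(f"{len(frames)} frames x {len(header)} columns")
print("First columns:", header[:5])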

Model Caching#

The first time it runs, the Audio2Face-3D NIM downloads the model from NGC. You can cache this model locally by using a Docker volume mount. Follow the example below and set the LOCAL_NIM_CACHE environment variable to the desired local path. Make sure the local path has read, write, and execute permissions (777).

$ mkdir -p ~/.cache/audio2face-3d
$ chmod 777 ~/.cache/audio2face-3d
$ export LOCAL_NIM_CACHE=~/.cache/audio2face-3d

Then run the Audio2Face-3D NIM and mount the folder at /tmp/a2x inside the Docker container:

$  docker run -it --rm --name audio2face-3d \
     --gpus all \
     --network=host \
     -e NGC_API_KEY=$NGC_API_KEY \
     -e NIM_MANIFEST_PROFILE=$NIM_MANIFEST_PROFILE \
     -v "$LOCAL_NIM_CACHE:/tmp/a2x" \
     nvcr.io/nim/nvidia/audio2face-3d:1.2

For subsequent runs of the Audio2Face-3D NIM using the cache, set NIM_DISABLE_MODEL_DOWNLOAD to true:

$  docker run -it --rm --name audio2face-3d \
     --gpus all \
     --network=host \
     -e NGC_API_KEY=$NGC_API_KEY \
     -e NIM_MANIFEST_PROFILE=$NIM_MANIFEST_PROFILE \
     -e NIM_DISABLE_MODEL_DOWNLOAD=true \
     -v "$LOCAL_NIM_CACHE:/tmp/a2x" \
     nvcr.io/nim/nvidia/audio2face-3d:1.2

Stopping the container#

You can stop and remove the running container by passing its name to the docker stop and docker rm commands:

$ docker stop audio2face-3d
$ docker rm audio2face-3d