Warning
Usage Restrictions
You may not use the Software or any of its components for the purpose of emotion recognition. Any technology included in the Software may only be used as fully integrated in the Software and consistent with all applicable documentation.
Getting Started#
The steps below will help you set up and run the Audio2Face-3D NIM and use our sample application to receive blendshapes, audio, and emotions.
Prerequisites#
Check the Support Matrix to make sure you have the supported hardware and software stack.
Read the instructions corresponding to your Operating System.
Windows System - Using Windows Subsystem for Linux (WSL2)
Refer to the NVIDIA NIM on WSL2 public documentation for detailed instructions and prerequisites for WSL installation and setup.
Use the NVIDIA NIM WSL2 installer to install WSL2 with the NVIDIA AI Workbench included. This installer will install podman, NVIDIA Container Toolkit, and NVIDIA AI Workbench.
Note
The default user created is ‘workbench’, and Docker is not installed. Use podman to run the Audio2Face-3D NIM.
Refer to the instructions later on this page to get the NGC API key and log in to the nvcr.io registry. Instead of docker, use podman.
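With podman, the registry login looks like this (same $oauthtoken username and NGC_API_KEY password as in the docker instructions later on this page):
$ echo "$NGC_API_KEY" | podman login nvcr.io --username '$oauthtoken' --password-stdin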
To run the Audio2Face-3D NIM, use the following command:
$ podman run -it --rm --device nvidia.com/gpu=all --network=host --shm-size=8GB \
-e NGC_API_KEY=$NGC_API_KEY \
-e NIM_RELAX_MEM_CONSTRAINTS=1 \
nvcr.io/nim/nvidia/audio2face-3d:1.3
To run the A2F-3D Python interaction application, follow the instructions below:
On a separate Windows terminal (using cmd or PowerShell), run WSL, clone the samples repository, and navigate to the interaction app folder.
$ wsl -d Ubuntu-22.04
$ git clone https://github.com/NVIDIA/Audio2Face-3D-Samples.git
$ cd Audio2Face-3D-Samples
$ git checkout tags/v1.3
$ cd scripts/audio2face_3d_microservices_interaction_app
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip3 install ../../proto/sample_wheel/nvidia_ace-1.2.0-py3-none-any.whl
$ pip3 install -r requirements.txt
To check if the service is ready to handle inference requests:
$ python3 a2f_3d.py health_check --url 0.0.0.0:52000
To run inference on one of the example audios:
$ python3 a2f_3d.py run_inference ../../example_audio/Mark_neutral.wav config/config_james.yml -u 0.0.0.0:52000
Linux System - Using Ubuntu 22.04 & 24.04
Set up Docker without Docker Desktop
Install Docker using the convenience script:
$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sudo sh ./get-docker.sh
Add your user account to the docker group:
$ sudo groupadd docker
$ sudo usermod -aG docker <username>
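If you prefer not to log out, you can apply the new group membership to your current shell instead (standard Linux usage; this affects only that shell session):
$ newgrp docker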
Otherwise, log out of your system and log back in. Then run a sanity check:
$ docker run hello-world
You should see “Hello from Docker!” printed out.
Install the Docker Compose plugin:
$ sudo apt-get update
$ sudo apt-get install docker-compose-plugin
Check that the installation was successful by running:
$ docker compose version
Set up iptables compatibility:
$ sudo update-alternatives --config iptables
When prompted, choose Selection 1, with the path /usr/sbin/iptables-legacy.
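Alternatively, you can set it non-interactively:
$ sudo update-alternatives --set iptables /usr/sbin/iptables-legacy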
Restart your system and check the Docker status:
$ service docker status
You should see “active (running)” in the messages. To exit, press q.
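The convenience script usually enables the Docker service at boot; if it is not enabled on your system, you can enable and start it explicitly (standard systemd usage):
$ sudo systemctl enable --now docker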
Install CUDA Toolkit
For cuda-toolkit-12-6 on Ubuntu 22.04, run:
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
$ sudo dpkg -i cuda-keyring_1.1-1_all.deb
$ sudo apt-get update
$ sudo apt-get -y install cuda-toolkit-12-6
For cuda-toolkit-12-6 on Ubuntu 24.04, run:
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
$ sudo dpkg -i cuda-keyring_1.1-1_all.deb
$ sudo apt-get update
$ sudo apt-get -y install cuda-toolkit-12-6
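You can verify the toolkit installed correctly by querying the compiler version (the versioned install path is used here because /usr/local/cuda may not be on your PATH):
$ /usr/local/cuda-12.6/bin/nvcc --version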
Alternatively, to install the latest CUDA Toolkit, visit NVIDIA Developer - CUDA downloads and follow the instructions.
Install NVIDIA Container Toolkit
If any of the steps below fails, follow the official NVIDIA Container Toolkit docs instead.
Configure the production repository:
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
Optionally, configure the repository to use experimental packages:
$ sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
Update the packages list from the repository:
$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
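You can confirm the toolkit CLI is installed by printing its version:
$ nvidia-ctk --version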
Configure Docker with NVIDIA Container Toolkit
Run the following instructions:
$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
If all goes well, you should be able to start a Docker container and run nvidia-smi inside it to see information about your GPU. An example is shown below; the exact numbers will vary with your hardware.
$ sudo docker run --rm --gpus all ubuntu nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06 Driver Version: 535.183.06 CUDA Version: 12.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A10G On | 00000000:01:00.0 Off | 0 |
| 0% 33C P8 18W / 300W | 0MiB / 23028MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
NVAIE access#
To download Audio2Face-3D NIM Microservice you need an active subscription to an NVIDIA AI Enterprise product.
Contact a Sales representative through this form and request access to NVIDIA AI Enterprise Essentials.
NGC Personal Key#
Set up your NGC Personal Key if you have not done so already.
Go to the NGC personal key setup page of the NGC website and select Generate Personal Key. When prompted with the Generate Personal Key form, choose your key Name and Expiration, then select all services for Services Included. You will then receive your Personal Key; make sure to save it somewhere safe.
Export the API key#
Export the API key generated in the previous step in the NGC_API_KEY environment variable to run the A2F-3D NIM:
$ export NGC_API_KEY=<value>
To make the key available at startup, run the following command if you are using bash. Make sure you replace <value> with the actual API key.
$ echo "export NGC_API_KEY=<value>" >> ~/.bashrc
Docker Login to NGC#
To pull the NIM container image, you need to log in to the nvcr.io docker registry. The username is $oauthtoken and the password is the API key generated earlier and stored in NGC_API_KEY. Run the following command to log in:
$ echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
Login Succeeded
Launching the Audio2Face-3D NIM#
There are two quick ways to start the Audio2Face-3D NIM:
Use a pre-generated TRT engine for supported GPUs
Generate a TRT engine for your NVIDIA GPU
Pregenerated TRT engine#
The Audio2Face-3D NIM provides pre-generated engines for the following GPUs:
A10G
A100
A30
H100
L4
L40s
RTX6000
RTX4090
RTX 50 Series (using GB20x compatibility mode profile)
To list available model profiles:
$ docker run -it --rm --network=host --gpus all \
--entrypoint nim_list_model_profiles nvcr.io/nim/nvidia/audio2face-3d:1.3
Check the Manual profile selection section to set the profile manually.
Manual profile selection
Supported GPU profiles can be found in the table below:
GPU Type | Profile ID
---|---
A10G | cb81f87bcd530fdec6bf29a96b83d6837e4a57ccd6f3622847c178988fb191ec
A30 | 5c9af5028db0c53e8c4f9db6db151ea839e18c2c270566229fc98f60d1ef993f
A100 | e59c2f97d15763a368ae33b4c9419a83348d73193c5aa79f224d6022113afaaa
A100-SXM4-40GB | a2e2b4ff4edb677c0445275a5e3f7ea3b47233b8d517bd34aebb4df65f060e62
A100-PCIE | ee0960f5b9b2ed6321b1c0dcf58e3c76d2e3d8d9420bb64fe5e960cae724a4cd
A100-PCIE-40GB | 2e01ff71a695f41bcfdff354fff1465983308d11a2716333bdf891d98cf465c2
H100 | 095153281fce60754c149b217e7775118c36450adfd5df2590b624cef810a767
H100-NVL | 3d3a70ba2ae10496b827bf85468ae85b0c8ad6d65684c4f72e17fe57907eae88
H100-PCIE | c5fc10d30a2d1f1c514867dff26d8707ff1a9404d29312c4a3228e8288eca31a
L40S | c23fd2abf84952c6bdbe17378b865c562cab8784dac21d31aa36c30bdd6296c8
L4 | 2cec6eaafc5552880952775c50d95f02d4f6ef5b64ba6ea3f29bce5be0449bec
RTX6000 | 7296c3153bf4005ca20ebfd5e975b183b3e8a1ac189d2830dc09118eaedf5fd0
RTX4090 | c761e52b62df2a2a46047aed74dd6e1da8826f3596bec3c197372c7592478f6b
GB20x (compatibility mode) | f4f4bc7183a661f81ab8f7a7bdbc1935d8397139593547a8d6513ee334a94375
Run this command, changing <manifest_profile_id> to the value from the table above that corresponds to your GPU:
$ export NIM_MANIFEST_PROFILE=<manifest_profile_id>
$ docker run -it --rm --name audio2face-3d \
--gpus all \
--network=host \
-e NGC_API_KEY=$NGC_API_KEY \
-e NIM_MANIFEST_PROFILE=$NIM_MANIFEST_PROFILE \
nvcr.io/nim/nvidia/audio2face-3d:1.3
When the Audio2Face-3D NIM is deployed, it uses the james_v2.3 model with tongue animation disabled by default, as shown in the logs below. To enable tongue animation, refer to the Flexible Configuration Management section in the Audio2Face-3D NIM Container Deployment and Configuration Guide.
[info] Tongue animation is disabled
[info] Will use A2F stylization ids: inference_model_id=james_v2.3, blendshape_id=james_topo2_v2.2, tongue_blendshape_id=
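To follow the service logs from another terminal, standard Docker usage works, since the container was named with the --name flag above:
$ docker logs -f audio2face-3d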
Note
When you start the service, you might encounter warnings labeled as GStreamer-WARNING. These warnings occur because some libraries are missing from the container. However, they are safe to ignore, as these libraries are not used by Audio2Face-3D.
Docker flags explained
The table below explains each flag in the docker command above:
Flag | Description
---|---
-it | Run the container interactively with a pseudo-TTY (see Docker docs)
--rm | Delete the container after it stops (see Docker docs)
--name audio2face-3d | Give a name to the NIM container. Use any preferred value.
--gpus all | Expose all NVIDIA GPUs inside the container. See the configuration page for mounting specific GPUs.
--network=host | Connect container to host machine network. (see Docker docs)
-e NGC_API_KEY | Add the NGC_API_KEY variable to the container environment
-e NIM_MANIFEST_PROFILE | Add the NIM_MANIFEST_PROFILE variable to the container environment
nvcr.io/nim/nvidia/audio2face-3d:1.3 | Set the NIM container image and tag to run
Running Inference#
Audio2Face-3D exposes a gRPC API. You can quickly try out the API by using the A2F-3D Python interaction application. Follow the instructions below to set it up:
$ git clone https://github.com/NVIDIA/Audio2Face-3D-Samples.git
$ cd Audio2Face-3D-Samples
$ git checkout tags/v1.3
$ cd scripts/audio2face_3d_microservices_interaction_app
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip3 install ../../proto/sample_wheel/nvidia_ace-1.2.0-py3-none-any.whl
$ pip3 install -r requirements.txt
Note
Audio2Face-3D NIM v1.3 continues to use v1.2.0 of the nvidia_ace gRPC python module.
To check if the service is ready to handle inference requests:
$ python3 a2f_3d.py health_check --url 0.0.0.0:52000
To run inference on one of the example audios:
$ python3 a2f_3d.py run_inference ../../example_audio/Mark_neutral.wav config/config_james.yml -u 0.0.0.0:52000
This command prints out where the results are saved, in a log similar to:
Input audio header info:
Sample rate: 16000 Hz
Bit depth: 16 bits
Channels: 1
Receiving data from server...
.............................
Status code: SUCCESS
Received status message with value: 'sent all data'
Saving data into output_000001 folder...
You can then explore the A2F-3D NIM output animations by running the command below, replacing <output_folder> with the name of the folder printed by the run_inference command.
$ ls -l <output_folder>/
-rw-rw-r-- 1 user user 328 Nov 14 15:46 a2f_3d_input_emotions.csv
-rw-rw-r-- 1 user user 65185 Nov 14 15:46 a2f_3d_smoothed_emotion_output.csv
-rw-rw-r-- 1 user user 291257 Nov 14 15:46 animation_frames.csv
-rw-rw-r-- 1 user user 406444 Nov 14 15:46 out.wav
out.wav: contains the audio received
animation_frames.csv: contains the blendshapes
a2f_3d_input_emotions.csv: contains the emotions provided as input in the gRPC protocol
a2f_3d_smoothed_emotion_output.csv: contains emotions smoothed over time
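For a quick look at the blendshape channel names, you can print the first row of the animation CSV (a minimal sketch, assuming the files begin with a header row):
$ head -n 1 <output_folder>/animation_frames.csv | tr ',' '\n' | head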
Note
The maximum length of a single audio buffer sent over gRPC is 10 seconds.
The maximum length of an audio clip to process is 300 seconds.
This information can be found in the Audio2Face-3D NIM Container Deployment and Configuration Guide under the Advanced Configuration File section.
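To check a clip's duration before sending it, a tool such as ffprobe can help (this assumes the ffmpeg package is installed; soxi or similar would also work):
$ ffprobe -v quiet -show_entries format=duration -of csv=p=0 ../../example_audio/Mark_neutral.wav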
Model Caching#
When running the first time, the Audio2Face-3D NIM will download the model from NGC. You can cache this model locally
by using a Docker volume mount. Follow the example below and set the LOCAL_NIM_CACHE
environment variable to the
desired local path. Make sure the local path has execute, read and write permissions (777
permissions).
$ mkdir -p ~/.cache/audio2face-3d
$ chmod 777 ~/.cache/audio2face-3d
$ export LOCAL_NIM_CACHE=~/.cache/audio2face-3d
Then simply run the Audio2Face-3D NIM and mount the folder inside the Docker container at /tmp/a2x. This will download and store the models in LOCAL_NIM_CACHE.
$ docker run -it --rm --name audio2face-3d \
--gpus all \
--network=host \
-e NGC_API_KEY=$NGC_API_KEY \
-e NIM_MANIFEST_PROFILE=$NIM_MANIFEST_PROFILE \
-v "$LOCAL_NIM_CACHE:/tmp/a2x" \
nvcr.io/nim/nvidia/audio2face-3d:1.3
Once the models have been stored locally, you can start the Audio2Face-3D NIM as shown below, using the NIM_DISABLE_MODEL_DOWNLOAD flag.
$ docker run -it --rm --name audio2face-3d \
--gpus all \
--network=host \
-e NIM_DISABLE_MODEL_DOWNLOAD=true \
-v "$LOCAL_NIM_CACHE:/tmp/a2x" \
nvcr.io/nim/nvidia/audio2face-3d:1.3
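You can confirm the models were cached by listing the mount directory (the exact layout inside the cache may vary between releases):
$ ls -l $LOCAL_NIM_CACHE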
Stopping the container#
You can stop and remove the running container by passing its name to the docker stop and docker rm commands:
$ docker stop audio2face-3d
$ docker rm audio2face-3d
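Alternatively, a running container can be stopped and removed in one step (standard Docker usage):
$ docker rm -f audio2face-3d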