Quick start#

The steps below will help you set up and run the microservice on a Linux system and use our simple sample application to receive blendshapes in real time.

For Windows, we recommend using WSL by following the WSL Setup Guide. After setting up WSL, you can follow any page in the Audio2Face-3D Authoring section of the documentation, but make sure to run the commands inside the WSL terminal.

Prerequisites#

This documentation assumes the following system requirements:

OS:      Ubuntu 22.04
CUDA:    12.6
Driver:  535.183.06 (for Data Center GPUs), 560.35.03 (for RTX GPUs)
Docker:  latest

  • Any Linux distribution should work, but other distributions have not been tested by our teams.

  • For Windows Subsystem for Linux (WSL), the 560.94 driver is expected to work.

  • Some newer CUDA 12.x versions have not been fully tested and may cause issues during TRT model generation.

  • The sample application runs inside a Python Docker container.
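To quickly confirm that your driver, CUDA version, and Docker installation match the prerequisites above, you can run the checks below. This is only a sanity-check sketch, not an official validation step; the CUDA base image tag shown is an assumption, so substitute any CUDA image available to you.

$ nvidia-smi          # reports the installed driver version and supported CUDA version
$ docker --version    # reports the installed Docker version
$ docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi   # verifies GPU access from inside a container (image tag assumed)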

NGC ACE EA access#

To download the Audio2Face-3D Authoring Container, you need access to NGC nvidia/ace. You can request access by filling out the ACE EA application.

Note

Early Access (EA) products are available for selected customers.

NGC Access and Cloud Function Run Key#

You will need an NGC account to access NGC resources and self-host the A2F-3D Authoring Microservice. A separate Cloud Function Run Key is also needed to use the service. Please reach out to your NVIDIA account manager if you do not have access to NGC or have not been assigned a Cloud Function Run Key.
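If you plan to self-host, you will typically need to authenticate Docker against the NGC registry before pulling the container. A minimal sketch is shown below, assuming you have generated an NGC API key; the literal username $oauthtoken is the standard NGC registry login, and the key itself is pasted as the password.

$ docker login nvcr.io
Username: $oauthtoken
Password: <paste your NGC API key>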

Setup sample application#

You can download the sample application by cloning the NVIDIA/Audio2Face-3D-Samples repository, then go to the early_access/a2f-3d-authoring-sample-app subfolder.

$ git clone https://github.com/NVIDIA/Audio2Face-3D-Samples.git
$ cd Audio2Face-3D-Samples/early_access/a2f-3d-authoring-sample-app

Inside you will find the client_nvcf_deploy.py script. Follow the instructions in the Requirements section of the README.md file to set up the Python dependencies inside a Python environment.
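One way to do this is with a virtual environment, sketched below; the requirements.txt filename is an assumption, so use whatever dependency file the README specifies.

$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements.txt   # install the sample app dependencies (file name assumed; follow the README)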

Trying Audio2Face-3D Authoring Microservice#

Using your assigned Cloud Function Run Key, a function-id, and a version-id, you can communicate with the NVCF deployment of the A2F-3D Authoring MS. Note that the API key used here should be your assigned Cloud Function Run Key, not your NGC API Key. Inside the Docker container from the previous step, you can run:

$ python3 client_nvcf_deploy.py data_capture --function-id {FUNCTION_ID} --version-id {VERSION_ID} --apikey {API_KEY} --audio-clip ../../example_audio/Claire_neutral.wav

You can find the right IDs in this table:

Model  | Function ID                          | Version ID
Mark   | 5d8b0f5f-6d3c-4987-b066-5c0be0fdda00 | 6b2e47e4-98b9-485b-8b31-20977683613d
Claire | 4dd6b29e-f4bd-45a8-8b14-c420f8b83f5b | 235462c6-8d92-4e7d-9ac8-c550b3459bee
James  | cc615922-ec78-4cdf-9b72-388ddf1935c4 | 812c80db-2e62-4e8e-955b-b1de4801febd
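For example, to capture data with the Claire model, substitute its IDs from the table into the command above (the Run Key placeholder remains your own key):

$ python3 client_nvcf_deploy.py data_capture --function-id 4dd6b29e-f4bd-45a8-8b14-c420f8b83f5b --version-id 235462c6-8d92-4e7d-9ac8-c550b3459bee --apikey {API_KEY} --audio-clip ../../example_audio/Claire_neutral.wav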

Note

This Microservice is hosted in the US, so for remote regions it is recommended to self-deploy the container.

The results of the script are saved in two files:

  • output_blendshape.csv: contains the blendshapes with their names, values, and time codes.

  • output_emotions.csv: contains the emotions with their names, values, and time codes.

The time codes are relative to the beginning of the audio file.
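As a quick sanity check, you can inspect the capture with a few lines of Python. The exact column names in the CSV are an assumption here, so adjust them to match the header row of your output file.

import csv

# Load the captured blendshape samples; check the header row of
# output_blendshape.csv and adjust any column names you rely on.
with open("output_blendshape.csv", newline="") as f:
    rows = list(csv.DictReader(f))

print(f"{len(rows)} blendshape samples")
print(rows[0])  # first sample: name, value, and time code fields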

For additional functionality, such as checking the service health, you can follow the full guide for the NVCF version of the sample app.

We recommend the following workflow:

  • (Optional) Self-deploy the Authoring Microservice.

  • Try the sample app to check the health of your self-deployed service or of the NVCF endpoint.

  • Try interactive avatar tuning by connecting Maya-ACE to the Authoring Microservice. Follow this section.