Container Deployment#

We offer Docker containers via the NGC registry for deployment purposes. This guide demonstrates how to deploy a working Audio2Face Authoring Microservice by configuring and running the Docker image.

You need to log in to the nvcr.io Docker registry; follow the instructions in Generate API Key on NGC.
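A login sketch, once you have generated an NGC API key (the key value below is a placeholder — substitute your own):

```shell
# Log in to the NGC container registry.
# The username is the literal string "$oauthtoken"; the password is your NGC API key.
export NGC_API_KEY='<your-ngc-api-key>'   # placeholder, not a real key
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
```

Piping the key via `--password-stdin` keeps it out of your shell history and process list.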

Dependencies#

To deploy the microservice you will need the following dependencies:

Using Docker Compose#

The easiest way to run the Audio2Face Authoring Microservice is with Docker Compose.

To get started with a quick deploy save the following file as docker-compose.yaml:

docker-compose.yaml
services:
  a2f-authoring:
    image: nvcr.io/eevaigoeixww/ace-ea/a2f-authoring:0.2.8
    volumes:
      - models_a2f:/tmp/a2f
      - models_a2e:/tmp/a2e
    command: bash -c 'python3 generate_trt_models.py ${A2F_MODEL_NAME} && a2f-authoring /app/configs/${A2F_MODEL_NAME}-config.json'
    network_mode: "host"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [ gpu ]

volumes:
  models_a2f: {}
  models_a2e: {}

You can then set the A2F_MODEL_NAME environment variable to either mark_v2.2 or claire_v1.3, depending on which model you want to deploy, and run docker compose up.

export A2F_MODEL_NAME=claire_v1.3
docker compose up
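As an alternative to exporting the variable in your shell, Docker Compose also reads variables from a .env file placed next to docker-compose.yaml; a minimal sketch:

```shell
# .env — placed alongside docker-compose.yaml.
# Compose substitutes ${A2F_MODEL_NAME} in the service command from this file.
A2F_MODEL_NAME=claire_v1.3
```

With this file in place, a plain `docker compose up` picks up the model name automatically.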

Wait for the service to report that it is ready; you can then start interacting with it.

...
a2f-authoring-1  | 2024-09-18T15:33:57.763448Z  INFO a2f_authoring: Service is initialized!

You are now running a local deployment of the Audio2Face Authoring Microservice.
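If you are scripting the deployment, one way to block until that log line appears is to follow the Compose logs; a sketch, assuming the service name a2f-authoring from the compose file above:

```shell
# Start the stack in the background, then wait for the readiness log line.
docker compose up -d
docker compose logs -f a2f-authoring | grep -m1 'Service is initialized!' \
  && echo "Service is ready"
```

`grep -m1` exits after the first match, which terminates the `logs -f` stream and unblocks the script.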

Single container deployment#

If you prefer not to deploy the container with Docker Compose, you can run it with plain docker commands.

docker run -it --rm --network=host --gpus all nvcr.io/eevaigoeixww/ace-ea/a2f-authoring:0.2.8 ./run_claire_model.sh

If you wish to run the mark_v2.2 model instead:

docker run -it --rm --network=host --gpus all nvcr.io/eevaigoeixww/ace-ea/a2f-authoring:0.2.8 ./run_mark_model.sh
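If you want generated TRT engines to persist across container runs, you can mount the same named volumes the Compose file uses; a sketch, assuming the run scripts write to the same /tmp/a2f and /tmp/a2e paths as the Compose command:

```shell
# Reuse the cache volumes from the Compose deployment so TRT engines
# generated on a previous run can be reused instead of rebuilt.
docker run -it --rm --network=host --gpus all \
  -v models_a2f:/tmp/a2f \
  -v models_a2e:/tmp/a2e \
  nvcr.io/eevaigoeixww/ace-ea/a2f-authoring:0.2.8 ./run_claire_model.sh
```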

Note

To find the checksum of nvcr.io/eevaigoeixww/ace-ea/a2f-authoring:0.2.8, run:

docker images --no-trunc  nvcr.io/eevaigoeixww/ace-ea/a2f-authoring:0.2.8

You should get:

REPOSITORY                                  TAG       IMAGE ID                                                                  CREATED      SIZE
nvcr.io/eevaigoeixww/ace-ea/a2f-authoring   0.2.8     sha256:7cfde05965c24adb4cacab442641d1282228a66323608e314b51fc6bb64f90a2   xxxx ago     10.9GB
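To verify the pulled image against the expected checksum programmatically, docker inspect can extract the image ID for comparison; a minimal sketch:

```shell
# Compare the local image ID against the expected checksum.
EXPECTED='sha256:7cfde05965c24adb4cacab442641d1282228a66323608e314b51fc6bb64f90a2'
ACTUAL=$(docker inspect --format '{{.Id}}' nvcr.io/eevaigoeixww/ace-ea/a2f-authoring:0.2.8)
if [ "$ACTUAL" = "$EXPECTED" ]; then
  echo "Image checksum matches"
else
  echo "Checksum mismatch: $ACTUAL" >&2
fi
```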

TRT Engine Generation#

During deployment, a TRT engine is generated to optimize the models for the target GPU. This engine must be regenerated whenever the deployment environment changes, in particular when the GPU changes to one with a different architecture or compute capability. A generated TRT engine can potentially be reused on machines with exactly the same controlled configuration (same hardware and same Docker image), but it is recommended to always regenerate the engine whenever hardware changes are made.
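One way to decide whether a cached engine can be reused is to record which GPU the engine was built for and compare on the next start; a sketch, assuming a driver recent enough to support the compute_cap query field (the marker-file path is hypothetical, and generate_trt_models.py is the script from the Compose command above):

```shell
# Record which GPU the TRT engine was built for; regenerate if it changed.
MARKER=/tmp/a2f/.engine_gpu          # hypothetical marker file
CURRENT_GPU=$(nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader)

if [ ! -f "$MARKER" ] || [ "$(cat "$MARKER")" != "$CURRENT_GPU" ]; then
  echo "GPU changed or no engine yet; regenerating TRT engine..."
  python3 generate_trt_models.py "${A2F_MODEL_NAME}"
  mkdir -p "$(dirname "$MARKER")"
  echo "$CURRENT_GPU" > "$MARKER"
fi
```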