Overview

NVIDIA Audio2Face is a component of NVIDIA ACE that delivers generative AI avatar animation driven by audio and emotion inputs.

Now available as part of the Audio2Face solutions, the Audio2Face microservice brings Omniverse Audio2Face and Audio2Emotion technology to the cloud. It enables 3D avatars to be driven by audio input, combining NVIDIA's AI models with a supported rendering engine, and can be deployed on premises or in the cloud.

The Audio2Face microservice converts speech into facial animation in the form of ARKit blendshapes, including emotional expression. The service automatically detects emotions in the input audio and uses them to drive key poses and shapes that replicate a character's facial performance; emotions can also be specified directly as part of the input to the microservice. A rendering engine can consume the blendshape output to display a 3D avatar's performance.
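
The data flow this describes can be pictured as speech audio (plus optional emotion weights) in, timestamped ARKit blendshape frames out. The following Python sketch illustrates that shape only; the names `BlendshapeFrame` and `animate` are hypothetical stand-ins for illustration, not the microservice's actual API, which is defined by its own endpoint protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class BlendshapeFrame:
    """One frame of facial animation as ARKit blendshape weights."""
    time_code: float                                          # seconds into the audio
    weights: Dict[str, float] = field(default_factory=dict)   # e.g. {"jawOpen": 0.42}

def animate(audio_pcm: bytes, sample_rate_hz: int,
            emotion: Optional[Dict[str, float]] = None) -> List[BlendshapeFrame]:
    """Send speech audio, and optionally explicit emotion weights such as
    {"joy": 0.8}, to the service; receive blendshape frames that a rendering
    engine can apply to an ARKit-compatible face rig.

    Placeholder body: a real client would stream the audio to the deployed
    endpoint and collect the returned frames.
    """
    return [BlendshapeFrame(time_code=0.0, weights={"jawOpen": 0.0})]
```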

This microservice supports multiple simultaneous input streams, enabling workflows in which many users connect and generate animation output at the same time.
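
As a sketch of that one-to-many usage, each connected user holds an independent input stream while the service produces output for all of them concurrently. The asyncio example below models this on the client side; `animate_stream` is a hypothetical placeholder for whatever streaming call the deployed endpoint provides.

```python
import asyncio

async def animate_stream(user_id: str, audio_pcm: bytes) -> None:
    # Placeholder: a real client would stream `audio_pcm` to the service and
    # apply the returned blendshape frames to this user's avatar as they arrive.
    await asyncio.sleep(0)  # simulate waiting on network I/O
    print(f"{user_id}: received animation for {len(audio_pcm)} bytes of audio")

async def main() -> None:
    # Four users connect at once; each stream is processed independently.
    one_second_silence = b"\x00\x00" * 16_000  # 1 s of 16-bit PCM at 16 kHz
    await asyncio.gather(
        *(animate_stream(f"user-{i}", one_second_silence) for i in range(4))
    )

asyncio.run(main())
```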

Figure: Audio2Face Microservice cloud deployment, one to many.

The microservice can connect with other UCS (Unified Cloud Services) microservices that support its endpoint protocols.

Figure: Audio2Face Microservice interface.

Try out the A2F experience on our demo website.