Architecture overview#
Audio2Face-3D NIM supports two main gRPC services:
A Bidirectional Streaming gRPC for receiving audio data and sending animation data.
[Alpha version] A Unary gRPC for getting the current configuration of the microservice.
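The bidirectional contract — audio chunks in, animation frames out on the same stream — can be modeled in plain Python as a generator pipeline. This is a sketch only; the function name and frame format are assumptions for illustration, not the real Audio2Face-3D API:

```python
from typing import Iterable, Iterator, List

def animate_stream(audio_chunks: Iterable[bytes]) -> Iterator[List[float]]:
    """Model of a bidirectional stream: for each incoming audio chunk,
    the service emits one animation frame (here, dummy blendshape weights)."""
    for chunk in audio_chunks:
        # A real service would run inference on the audio; this sketch
        # just derives a placeholder frame from the chunk length.
        yield [float(len(chunk)), 0.0, 0.0]

# Two dummy audio chunks in, two animation frames out.
frames = list(animate_stream([b"\x00" * 320, b"\x00" * 640]))
```

The key property the model captures is that frames are produced incrementally as audio arrives, rather than after the whole input is received.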
Two more gRPC services are still supported to preserve backward compatibility; however, we recommend the bidirectional endpoint:
A Client Streaming gRPC for receiving audio data, where Audio2Face-3D acts as the server.
A Client Streaming gRPC for sending animation data, where Audio2Face-3D acts as the client.
Audio2Face-3D supports concurrent input streams, allowing multiple users to connect and simultaneously generate animation outputs.
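From the client side, concurrent streams simply mean one stream per user, which can be driven with a thread pool. The session function below is a stand-in for a real stream session, not the actual client library:

```python
from concurrent.futures import ThreadPoolExecutor

def run_session(user_id: int) -> int:
    """Stand-in for one user's bidirectional stream session; returns the
    number of animation frames received (one per audio chunk in this model)."""
    audio_chunks = [b"\x00" * 320] * 5  # five dummy chunks for this user
    return sum(1 for _chunk in audio_chunks)

# Three simulated users, each with their own concurrent stream.
with ThreadPoolExecutor(max_workers=3) as pool:
    frame_counts = list(pool.map(run_session, range(3)))
```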
Note
Audio2Face-3D can run either in bidirectional streaming mode as a NIM or in the legacy mode, but not both.
Audio2Face-3D NIM Data Flow#
An Audio2Face-3D deployment, in its simplest configuration, consists of a single A2F-3D microservice. Optionally, you can connect telemetry services that collect traces and metrics exposed by the A2F-3D microservice.
The overall architecture of an A2F-3D deployment is illustrated below:

A2F-3D with a single gRPC client#
The arrow represents one asynchronous bidirectional gRPC stream. The dotted line represents the telemetry data exposed in OpenTelemetry (OTel) format. You can connect this to any service that understands the OTel data format. The generation of telemetry data is optional and can be disabled.
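Because the telemetry is plain OTel, any OTLP-capable collector can receive it. One possible OpenTelemetry Collector configuration that accepts traces and metrics over OTLP/gRPC and prints them for inspection might look like this (the endpoint and exporter choice are assumptions, not a shipped configuration):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # OTLP/gRPC listen address (assumed default)
processors:
  batch: {}
exporters:
  debug: {}                      # prints received telemetry to stdout
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```

In a production deployment you would replace the `debug` exporter with a backend exporter such as one for your tracing or metrics store.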
Audio2Face-3D Legacy Data Flow#

Legacy A2F-3D with a single gRPC client#
In the legacy data flow, the Audio2Face-3D service combines a gRPC server, which handles incoming audio streams, with a gRPC client component that forwards the resulting animation streams to subsequent microservices (MS), for example in a pipeline. This dual role allows A2F-3D to fit seamlessly into a larger service network.
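The legacy server-plus-client shape can be modeled as a forwarding stage: a stream is received, each chunk is turned into a frame, and the frame is pushed out to the next microservice. The names and frame shape below are illustrative, not the real interfaces:

```python
from typing import Callable, Iterable, List

def legacy_forward(audio_chunks: Iterable[bytes],
                   downstream: Callable[[List[float]], None]) -> None:
    """Model of the legacy pipeline stage: A2F-3D acts as a gRPC *server*
    for the incoming audio stream and as a gRPC *client* toward the
    downstream microservice, represented here by a callback."""
    for chunk in audio_chunks:
        frame = [float(len(chunk))]   # placeholder animation frame
        downstream(frame)             # stand-in for the client-streaming call

# The downstream MS is modeled as a list collecting forwarded frames.
received: List[List[float]] = []
legacy_forward([b"a" * 10, b"b" * 20], received.append)
```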