Audio2Face Controller Microservice

Overview

The Audio2Face (A2F) Controller simplifies the management and integration of the A2F microservice within larger workflows. It acts as both the source of the A2F inputs and the destination of the A2F outputs, so clients interact with the A2F service through a single bi-directional API. This makes it possible to use A2F either as a standalone application or as part of a complex pipeline involving additional microservices.

Communication

The A2F Controller microservice receives its data over a bi-directional streaming RPC. The input data is composed of:

  1. an audio stream header containing information about the upcoming audio data, as well as face parameters, post-processing options and blendshape parameters.

  2. audio data, as well as emotion data with the time code at which to start applying the emotion.

The output data is composed of:

  1. an animation header containing information about the blendshape names, audio output format, etc.

  2. blendshape data with time code, as well as audio data and camera position if any.

See the gRPC prototypes section for a detailed description of the gRPC prototypes.
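The ordering of the input messages described above (one header, then audio data with optional emotion data) can be sketched as a Python generator. This is only an illustration: `AudioStreamHeader`, `AudioChunk`, `build_requests` and all their fields are hypothetical stand-ins, not the actual gRPC messages, and attaching the emotion to the first chunk is an arbitrary choice made for the example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Union


@dataclass
class AudioStreamHeader:
    # Hypothetical stand-in for the real header message.
    sample_rate_hz: int
    face_params: dict
    post_processing: dict
    blendshape_params: dict


@dataclass
class AudioChunk:
    # Hypothetical stand-in for the audio and emotion message.
    audio: bytes
    emotion: Optional[dict] = None
    emotion_time_code_s: Optional[float] = None


def build_requests(
    header: AudioStreamHeader,
    pcm: bytes,
    chunk_bytes: int,
    emotion: Optional[dict] = None,
    emotion_time_code_s: float = 0.0,
) -> Iterator[Union[AudioStreamHeader, AudioChunk]]:
    """Yield the header first, then the audio split into fixed-size chunks."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    yield header
    for index, offset in enumerate(range(0, len(pcm), chunk_bytes)):
        chunk = pcm[offset:offset + chunk_bytes]
        if index == 0 and emotion is not None:
            # Emotion data carries the time code at which it starts to apply.
            yield AudioChunk(chunk, emotion, emotion_time_code_s)
        else:
            yield AudioChunk(chunk)
```

With a real client, a generator of this shape would typically be passed as the request iterator of the streaming call, while the responses (animation header, then blendshape data) are read from the returned stream.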

ID management

A key difference between the Audio2Face Microservice and the A2F Controller Microservice is ID management.

  • Audio2Face Microservice expects IDs to be given as input to the gRPC call and will serve the same ID as output for the related stream.

  • The A2F Controller bi-directional connection requires no IDs. This microservice hides the IDs from clients external to the cluster.

For that reason, the A2F Controller generates UUIDs itself when communicating with Audio2Face. These UUIDs fill the ID fields used in the Audio2Face Microservice gRPC interface.
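As a rough illustration of this ID handling, the sketch below generates one set of UUIDs per client connection and keeps them internal. `StreamIdRegistry` and the field names `request_id` and `stream_id` are assumptions made for the example; they are not taken from the Audio2Face gRPC interface.

```python
import uuid
from typing import Dict, Hashable, Tuple


class StreamIdRegistry:
    """Hypothetical helper mapping a client connection to internal IDs."""

    def __init__(self, field_names: Tuple[str, ...] = ("request_id", "stream_id")):
        # The field names are placeholders, not the real A2F ID fields.
        self._field_names = field_names
        self._ids: Dict[Hashable, Dict[str, str]] = {}

    def open(self, connection: Hashable) -> Dict[str, str]:
        """Generate fresh UUIDs for a new connection and remember them."""
        ids = {name: str(uuid.uuid4()) for name in self._field_names}
        self._ids[connection] = ids
        return ids

    def lookup(self, connection: Hashable) -> Dict[str, str]:
        """Return the IDs previously generated for this connection."""
        return self._ids[connection]

    def close(self, connection: Hashable) -> None:
        """Forget the IDs once the stream has ended."""
        self._ids.pop(connection, None)
```

The client never sees these values: it only holds its own connection, and the registry resolves that connection to the IDs used on the Audio2Face side.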

Configuration

ac_a2f_config.yaml
audio2face:
  # URL to reach Audio2Face
  send-audio:
    ip: 0.0.0.0
    port: 50000
  receive-anim-data:
    # Port where to open a server to receive the animation data from A2F
    port: 51000
    # Maximum amount of time that A2F Controller will wait when not
    # receiving data from A2F, before cutting the connection
    max_wait_time_idle_ms: 30000

public-interface:
  # Port exposed publicly outside the cluster
  # The provided Python app connects to it
  port: 52000
  # Maximum number of connected users
  # We advise using the same number as in the A2F config
  max-user-number: 10

common:
  # Maximum processing time, in seconds
  # After this timeout the connection to A2F will be cut
  max_processing_duration_second: 300
  # Maximum duration, in seconds, of a single audio buffer sent over the gRPC stream
  max_audio_buffer_size_second: 10
  # Maximum duration, in seconds, of the audio clip to process
  max_audio_clip_size_second: 300
  # Maximum allowed sample rate, in Hz
  max_sample_rate: 144000 # 144kHz
  # Interval, in seconds, between FPS log lines for each stream
  fps_logging_interval_second: 1
  garbage_collector:
    # Enable or disable the garbage collector
    enabled: true
    # How often the garbage collector should run, in seconds
    interval_run_second: 10
    # If the garbage collector finds streams holding
    # more than N seconds of data, it will delete data
    # until the amount falls below this threshold.
    # Clients are expected to retrieve data promptly so that
    # the service doesn't retain the data excessively.
    max_size_stored_data_second: 60
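
The garbage collector behavior described in the configuration comments can be sketched as follows. This is a minimal illustration, not the service's implementation: `trim_stream` is a hypothetical function, the per-stream storage is modeled as a deque of `(duration_in_seconds, payload)` pairs, and dropping the oldest data first is an assumption, since the configuration does not say which data is deleted.

```python
from collections import deque
from typing import Deque, Tuple


def trim_stream(
    frames: Deque[Tuple[float, bytes]],
    max_size_stored_data_second: float,
) -> float:
    """Drop the oldest frames until the stored duration fits the threshold.

    Returns the number of seconds of data that were deleted.
    """
    stored = sum(duration for duration, _ in frames)
    dropped = 0.0
    while frames and stored > max_size_stored_data_second:
        # Assumption: the oldest data sits at the left and is deleted first.
        duration, _ = frames.popleft()
        stored -= duration
        dropped += duration
    return dropped
```

With `max_size_stored_data_second: 60`, a stream holding 70 one-second frames would lose its 10 oldest frames on the next run, which happens every `interval_run_second` seconds when `enabled` is true.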