Migration Guide#

Migrating from 1.3 to 2.0#

Audio2Face-3D NIM v2.0 introduces significant changes to the deployment-time configuration file schemas and adds support for diffusion-based animation models. Configuration files from v1.3 are not directly compatible with v2.0 and require manual migration.

For the updated default configuration files and examples, refer to the Audio2Face-3D NIM Container Deployment and Configuration Guide.

Stylization Configuration Changes#

The a2f section has been restructured:

Old format (v1.3):

a2f:
  inference_model_id: claire_v2.3
  blendshape_id: claire_topo1_v2.1
  tongue_blendshape_id: claire_tongue_v1.0
  enable_tongue_blendshapes: true

New format (v2.0):

a2f:
  # regression / diffusion
  inference_type: regression

  regression_model:
    inference_model_id: claire_v2.3.1

  diffusion_model:
    inference_model_id: multi_v3.2
    identity: claire
    constant_noise: true

  enable_tongue_blendshapes: true

Key Changes#

  1. New inference type selector: The inference_type field selects between regression (fast, deterministic) and diffusion (higher quality, more expressive) animation modes.

  2. Nested model configuration: Model IDs are now specified under regression_model and diffusion_model sections instead of at the top level of a2f.

  3. Removed fields: blendshape_id and tongue_blendshape_id are no longer used.

  4. Updated model versions: Regression models updated to claire_v2.3.1, james_v2.3.1, mark_v2.3. The new diffusion model is multi_v3.2.

  5. New tongue parameters in face_params: Added tongue_strength, tongue_height_offset, and tongue_depth_offset to control tongue animation.

    face_params:
      # ... existing params ...
      tongue_strength: 1.3
      tongue_height_offset: 0.0
      tongue_depth_offset: 0.0
    
  6. Extended tongue blendshapes: Added 16 new tongue blendshapes to blendshape_params sections (weight_multipliers, weight_offsets, active_poses, cancel_poses, symmetry_poses):

    • TongueTipUp, TongueTipDown, TongueTipLeft, TongueTipRight

    • TongueRollUp, TongueRollDown, TongueRollLeft, TongueRollRight

    • TongueUp, TongueDown, TongueLeft, TongueRight

    • TongueIn, TongueStretch, TongueWide, TongueNarrow
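
    As an illustrative fragment (the blendshape names come from the list above; the values are placeholders, not recommended settings), the new tongue blendshapes can be tuned under blendshape_params like this:

```yaml
blendshape_params:
  weight_multipliers:
    # ... existing face blendshapes ...
    TongueTipUp: 1.0
    TongueTipDown: 1.0
    TongueRollUp: 1.0
    TongueStretch: 1.0
    # ... and the remaining tongue blendshapes listed above ...
  weight_offsets:
    TongueTipUp: 0.0
    # ...
```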

  7. New blendshape streaming controls in advanced_config.yaml: Added pipeline_parameters.burst_mode and pipeline_parameters.blendshape_streaming_fps to control how blendshapes are sent to the client.

    pipeline_parameters:
      # Burst Mode: Send all frames as fast as possible (~20-30ms total)
      # WARNING: May cause AnimGraph buffer overflow and lip sync issues in Tokkio
      # Values: false = Rate-limited streaming (RECOMMENDED), true = Burst mode
      burst_mode: false
    
      # Streaming Frame Rate (only used when burst_mode = false)
      # Delay per frame = 1000 / blendshape_streaming_fps milliseconds
      # Recommended: 90 (Tokkio/Production), 120-240 (Low latency), 30-60 (Bandwidth-constrained)
      blendshape_streaming_fps: 90
    
  8. GPU blendshape solver in advanced_config.yaml: Added a2f.use_gpu_solver option (default: true). When enabled, blendshape solving runs entirely on GPU, improving performance by avoiding CPU-GPU data transfers.
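
    As a sketch of how this might appear in advanced_config.yaml (only the use_gpu_solver key and its default are taken from this guide; the comments are illustrative):

```yaml
a2f:
  # GPU blendshape solver (default: true). Disabling it falls back to a
  # CPU solve, which adds CPU-GPU data transfers per frame.
  use_gpu_solver: true
```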

Migration Steps#

  1. Replace the top-level inference_model_id, blendshape_id, and tongue_blendshape_id fields with the new nested regression_model and diffusion_model sections.

  2. Add the inference_type field set to regression (or diffusion if using the new diffusion model).

  3. Update the model version in regression_model.inference_model_id (e.g., claire_v2.3 → claire_v2.3.1).

  4. Add the diffusion_model section with inference_model_id: multi_v3.2 and the appropriate identity.

  5. Add tongue parameters to face_params if tongue animation control is desired.

  6. Add the 16 new tongue blendshapes to all blendshape_params subsections if custom blendshape tuning is used.
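
The steps above can be sketched as a small script (a hypothetical helper, not an official tool; it operates on plain dicts, e.g. loaded with yaml.safe_load, and the version mapping covers only the model names listed in this guide):

```python
# Hypothetical migration helper for the a2f section (v1.3 -> v2.0).
# Regression model renames listed in this guide; anything else is kept as-is.
MODEL_VERSION_MAP = {
    "claire_v2.3": "claire_v2.3.1",
    "james_v2.3": "james_v2.3.1",
}

def migrate_a2f_section(old_cfg: dict, identity: str) -> dict:
    """Convert a v1.3 'a2f' mapping to the v2.0 nested layout."""
    old = old_cfg["a2f"]
    old_model = old["inference_model_id"]
    return {
        "a2f": {
            # Step 2: add the inference type selector.
            "inference_type": "regression",
            # Steps 1 and 3: nest the (version-bumped) regression model id.
            "regression_model": {
                "inference_model_id": MODEL_VERSION_MAP.get(old_model, old_model),
            },
            # Step 4: add the diffusion model section.
            "diffusion_model": {
                "inference_model_id": "multi_v3.2",
                "identity": identity,
                "constant_noise": True,
            },
            # blendshape_id / tongue_blendshape_id are intentionally dropped.
            "enable_tongue_blendshapes": old.get("enable_tongue_blendshapes", True),
        }
    }

old_cfg = {
    "a2f": {
        "inference_model_id": "claire_v2.3",
        "blendshape_id": "claire_topo1_v2.1",
        "tongue_blendshape_id": "claire_tongue_v1.0",
        "enable_tongue_blendshapes": True,
    }
}
new_cfg = migrate_a2f_section(old_cfg, identity="claire")
```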

Migrating from 1.2 to 1.3#

No action is needed. The Audio2Face-3D NIM configuration files are backwards compatible between versions 1.2 and 1.3.

Migrating from 1.0 to 1.2#

In version 1.0, the Audio2Face-3D NIM was a suite of two microservices: the Audio2Face Microservice and the Audio2Face Controller. In version 1.2, these are consolidated into a single microservice. This page guides you through what has changed between the two versions.

Audio2Face Controller#

The Audio2Face Controller’s functionality has been integrated into the Audio2Face-3D Microservice. The gRPC proto service remains the same.

Service Interface#

service A2FControllerService {
  rpc ProcessAudioStream(stream nvidia_ace.controller.v1.AudioStream)
      returns (stream nvidia_ace.controller.v1.AnimationDataStream) {}
}

Audio2Face-3D Microservice#

Audio stream header#

The new emotion_params field in the AudioStreamHeader message controls temporal smoothing in the Audio2Emotion SDK.

message AudioStreamHeader {
  nvidia_ace.audio.v1.AudioHeader audio_header = 1;

  nvidia_ace.a2f.v1.FaceParameters face_params = 2;

  nvidia_ace.a2f.v1.EmotionPostProcessingParameters emotion_post_processing_params = 3;

  nvidia_ace.a2f.v1.BlendShapeParameters blendshape_params = 4;

  nvidia_ace.a2f.v1.EmotionParameters emotion_params = 5;
}

The new EmotionParameters message introduces control over emotion smoothing through time. The live_transition_time field defines the duration over which the emotion is smoothed, and the beginning_emotion field provides the initial set of emotions to the smoothing algorithm.

message EmotionParameters {
  optional float live_transition_time = 1;

  map<string, float> beginning_emotion = 2;
}
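
As an illustration of the intended semantics (this is a linear-interpolation sketch, not the Audio2Emotion SDK implementation), live_transition_time could be interpreted like this:

```python
def smoothed_weight(begin: float, target: float,
                    elapsed_s: float, live_transition_time: float) -> float:
    """Blend one emotion weight from its beginning_emotion value toward
    the detected target over live_transition_time seconds (linear sketch)."""
    if live_transition_time <= 0.0:
        return target  # no smoothing window: jump straight to the target
    t = min(elapsed_s / live_transition_time, 1.0)  # progress in [0, 1]
    return begin + (target - begin) * t

# Half-way through a 0.5 s transition, a weight moving from 0.0 to 1.0 is 0.5.
print(smoothed_weight(0.0, 1.0, elapsed_s=0.25, live_transition_time=0.5))
```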

Blendshape parameters#

The new enable_clamping_bs_weight field in the BlendShapeParameters message controls whether the returned blendshape values are clamped between 0 and 1. Clamping is applied after multipliers and offsets.

Blendshape clamping is a post-processing step that ensures blendshape weights stay within the standard [0.0, 1.0] range expected by most animation systems. The A2F neural network can produce values outside this range, so clamping normalizes them for compatibility with downstream renderers.

  • Clamping ON (true): Values guaranteed 0.0-1.0, safe for renderers expecting normalized weights. Recommended for production.

  • Clamping OFF (false): Values can exceed range (e.g., 1.2, -0.1), preserves full model output fidelity. Useful for debugging/analysis.

message BlendShapeParameters {

  map<string, float> bs_weight_multipliers = 1;

  map<string, float> bs_weight_offsets = 2;

  optional bool enable_clamping_bs_weight = 3;
}
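
The post-processing order described above can be sketched as follows (an illustrative snippet, not the service's actual code):

```python
def post_process_weight(raw: float, multiplier: float = 1.0, offset: float = 0.0,
                        enable_clamping_bs_weight: bool = True) -> float:
    """Apply a per-blendshape multiplier and offset, then optionally clamp."""
    w = raw * multiplier + offset
    if enable_clamping_bs_weight:
        w = max(0.0, min(1.0, w))  # clamping happens after multiplier/offset
    return w

# A raw weight of 0.8 with a 1.5x multiplier overshoots the [0, 1] range;
# with clamping enabled the result is pulled back to 1.0.
print(post_process_weight(0.8, multiplier=1.5))
print(post_process_weight(0.8, multiplier=1.5, enable_clamping_bs_weight=False))
```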

Migrating Configuration files#

To make migrating configuration files easier, we provide a tool:

  1. Clone the repository: NVIDIA/Audio2Face-3D-Samples.git

  2. Check out the v1.2 tag: git checkout tags/v1.2

  3. Go to the migration/deployment_configuration_files_from_v1.0_to_v1.2/ subfolder.

Then follow the setup instructions below:

Configuration file migration guide

This sample Python app allows you to migrate your A2F-3D config files from v1.0 to v1.2.

Prerequisite

Install:

  • python3

  • python3-venv

Setup a virtual environment and install the needed packages:

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt

Steps

There are two possibilities, depending on whether the A2F-3D config files you want to migrate are used for:

  1. running the docker container

  2. deploying the UCS app

Updating docker container configs

Update:

  • docker_container_configs/a2f_config.yaml

  • docker_container_configs/ac_a2f_config.yaml

with your own config files.

Then run:

$ python3 convert_configuration_files.py docker_config

This will generate new config files compatible with A2F-3D v1.2 and print the folder name.

Updating the UCS app configs

Update:

  • ucs_app_configs/a2f_config.yaml

with your own config file.

Then run:

$ python3 convert_configuration_files.py ucs

This will generate new config files compatible with A2F-3D v1.2 and print the folder name.

Migrating Kubernetes deployment#

The quick deployment resource for Audio2Face-3D via NGC is no longer available. For a straightforward Kubernetes deployment, refer to the detailed steps in this guide: Kubernetes Deployment.