Audio2Face#

Note

Be sure to set up a character for animation before adding Audio2Face to your application.

System Requirements#

Windows:

  • 64-bit Windows 10 or later

Linux:

  • 64-bit Linux OS with libstdc++ version 6.0.30 or later

    • Ubuntu 22.04 or later

    • Fedora 36 or later

    • Debian 12.0 or later

Audio2Face Connection Setting#

Changed in version 2.1: Added settings for NVCF API key, function ID, and function version.

The ACE plugin’s project settings have an option for setting the default Audio2Face server to connect to: Edit > Project Settings… > Plugins > NVIDIA ACE > Default A2F Server Config. The configuration has multiple fields:

  • Dest URL: The server address must include scheme (http or https), host (IP address or hostname), and port number. For example, http://203.0.113.37:52000 or https://a2x.example.com:52010 (both fictional examples). To connect to NVIDIA Cloud Function (NVCF), set the server address to https://grpc.nvcf.nvidia.com:443.

  • API Key: If you are not connecting to an NVCF-hosted Audio2Face service, leave this blank. You can get an API key through https://build.nvidia.com/nvidia/audio2face to connect to NVCF-hosted Audio2Face services.

  • NvCF Function Id: If you are not connecting to an NVCF-hosted Audio2Face service, leave this blank. You can get an NVCF Function ID through https://build.nvidia.com/nvidia/audio2face to connect to NVCF-hosted Audio2Face services.

  • NvCF Function Version: Optional. Leave this blank unless you need to target a specific function version.

Note

We highly recommend using the NVCF option whenever possible.

You can change the Audio2Face connection settings at runtime by using the ACE > Audio2Face > Override Audio2Face Connection Info blueprint function, or by calling UACEBlueprintLibrary::SetA2XConnectionInfo from C++.

You can fetch the current Audio2Face connection settings at runtime by using the ACE > Audio2Face > Get Audio2Face Connection Info blueprint function, or by calling UACEBlueprintLibrary::GetA2XConnectionInfo from C++. Current settings are a combination of the project defaults and the runtime overrides.

The project settings are stored in your project’s DefaultEngine.ini file. If your API key is too sensitive to include in a project text file, consider setting it at runtime using the Override Audio2Face Connection Info blueprint function.
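For example, you could keep the API key out of DefaultEngine.ini by reading it from an environment variable and applying it at startup. The following is a minimal C++ sketch only: the FACEConnectionInfo struct and its field names are assumptions based on the settings fields listed above, so check the UACEBlueprintLibrary header for the actual signature of SetA2XConnectionInfo.

// Hypothetical sketch: override the Audio2Face connection at runtime so the NVCF API key
// never needs to be stored in DefaultEngine.ini. Struct and field names are assumptions;
// verify them against the plugin's UACEBlueprintLibrary declaration.
void ApplyAudio2FaceConnectionOverride()
{
    FACEConnectionInfo ConnectionInfo;  // assumed struct name
    ConnectionInfo.DestUrl = TEXT("https://grpc.nvcf.nvidia.com:443");
    ConnectionInfo.ApiKey = FPlatformMisc::GetEnvironmentVariable(TEXT("NVCF_API_KEY"));
    ConnectionInfo.NvcfFunctionId = TEXT("<your NVCF function ID>");
    // NvcfFunctionVersion left empty unless a specific version is required

    UACEBlueprintLibrary::SetA2XConnectionInfo(ConnectionInfo);
}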

Import Speech Clips#

The NVIDIA ACE plugin’s Audio2Face feature supports animating a character from speech stored in Sound Wave assets. Any sample rate is supported as input. The plugin converts the clip to 16000 Hz mono at runtime before sending it to the Audio2Face service.

If you don’t already have Sound Wave assets in your project you can import the speech audio clips you want to animate:

  1. Open the Content Drawer and select the folder where you want to import your clips.

  2. Right-click in the content pane and select Import to [path]….

  3. Navigate to a supported file (.wav, .ogg, .flac, .aif) and open it.

  4. Verify that a new Sound Wave asset appears in the Content Drawer.

See Unreal documentation for more details about import options.

Note

In some cases, the Sound Wave asset may not be usable by the ACE plugin unless it is fully loaded. It is recommended to set Loading Behavior Override to ForceInline in the Sound Wave asset’s properties. The plugin logs a warning in the LogACERuntime category if an asset can’t be read because it isn’t fully loaded.

Animating a Character from a Sound Wave Audio Clip#

Changed in version 2.3: Added async version of blueprint node.

To animate a character from an audio clip stored in a Sound Wave asset, use the latent blueprint node Animate Character from Sound Wave Async on the character actor. These instructions describe the blueprint interface, but you can also call UACEBlueprintLibrary::AnimateCharacterFromSoundWave from C++.

Depending on your application, there are many ways to determine which character to animate. Some options are:

  • have a single default character that is animated

  • automatically animate the character that the player is looking at or the closest character

  • provide some UI for selecting a character

After you’ve chosen a character Actor, animate it from a Sound Wave asset:

  1. Call the ACE > Audio2Face > Animate Character From Sound Wave Async function.

  2. Provide the actor corresponding to the character you want to animate. If the actor has an ACE Audio Curve Source component attached, the speech clip is sent to NVIDIA Audio2Face.

  3. Provide the speech clip asset as Sound Wave input.

  4. Optionally, provide an Audio2Face Emotion struct as ACEEmotionParameters input.

  5. Optionally, provide an Audio2Face Parameters input.

  6. Optionally, provide an Audio2Face provider name. If no provider name is specified, a default Audio2Face provider will be chosen. See Changing Audio2Face Providers (Optional) for details.

  7. When the “Audio Send Completed” execution pin activates in a later frame, the Success return value indicates whether the audio clip was successfully sent to Audio2Face.

Animate Character From Sound Wave node
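If you are calling from C++ instead of placing the blueprint node, the call might look like the sketch below. The parameter order and exact types of UACEBlueprintLibrary::AnimateCharacterFromSoundWave are assumptions based on the inputs listed above; verify them against the plugin headers.

// Hypothetical sketch: animate a character actor from a Sound Wave asset with default
// emotion, parameter, and provider settings. Signature details are assumptions.
void AnimateCharacterFromClip(AActor* CharacterActor, USoundWave* SpeechClip)
{
    const bool bSent = UACEBlueprintLibrary::AnimateCharacterFromSoundWave(
        CharacterActor,                      // actor with an ACE Audio Curve Source component
        SpeechClip,                          // Sound Wave input
        TOptional<FAudio2FaceEmotion>(),     // ACEEmotionParameters: use detected emotions
        nullptr,                             // Audio2Face Parameters: use service defaults
        FName("Default"));                   // Audio2Face provider name

    if (!bSent)
    {
        UE_LOG(LogTemp, Warning, TEXT("Failed to send speech clip to Audio2Face"));
    }
}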

Note

There is also a non-async Animate Character from Sound Wave blueprint node, which is only present for compatibility with earlier plugin versions. It’s recommended to use the async version to avoid blocking application logic while audio data is sent to Audio2Face.

Animating a Character from a Local WAV File (Optional)#

Changed in version 2.3: Added async version of blueprint node.

The plugin supports animating a character from a local WAV file at runtime.

For example, this could be used in an application where you can supply your own audio files for character speech. It’s similar to animating from a Sound Wave asset, but in the case of a WAV file, the audio won’t be stored in an Unreal asset and isn’t baked into the application’s content.

Use the latent blueprint node Animate Character from Wav File Async on the character actor. You can also call UACEBlueprintLibrary::AnimateCharacterFromWavFile from C++.

Animate Character From Wav File node
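From C++, a call might look like this sketch; the path parameter type and ordering are assumptions, so check the declaration of UACEBlueprintLibrary::AnimateCharacterFromWavFile in the plugin.

// Hypothetical sketch: animate a character from a WAV file on disk.
// The exact signature is an assumption; verify against the plugin headers.
const FString WavPath = FPaths::Combine(FPaths::ProjectSavedDir(), TEXT("Speech/example.wav"));
UACEBlueprintLibrary::AnimateCharacterFromWavFile(
    CharacterActor,                      // actor with an ACE Audio Curve Source component
    WavPath,                             // local WAV file, not an Unreal asset
    TOptional<FAudio2FaceEmotion>(),     // use detected emotions
    nullptr,                             // use service default Audio2Face parameters
    FName("Default"));                   // Audio2Face provider name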

Note

There is also a non-async Animate Character from Wav File blueprint node, which is only present for compatibility with earlier plugin versions. It’s recommended to use the async version to avoid blocking application logic while audio data is sent to Audio2Face.

Animating a Character From Raw Audio Samples (Optional)#

Added in version 2.3.

If your application needs to feed audio generated at runtime into the ACE Unreal plugin, then providing a Sound Wave asset or WAV file as described above may not be an option. For these cases, the plugin exposes a C++ API. Any audio sample rate is supported and samples may be in PCM16 or float32 format.

  1. Add "ACERuntime" to the PrivateDependencyModuleNames in your module’s .Build.cs file, as shown in the example after this list.

  2. #include "ACERuntimeModule.h" in your source file.

  3. Call FACERuntimeModule::Get().AnimateFromAudioSamples() and FACERuntimeModule::Get().EndAudioSamples() from your code.
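For example, with a hypothetical game module named MyGame, the relevant line in MyGame.Build.cs looks like this:

// MyGame.Build.cs -- module name is hypothetical
using UnrealBuildTool;

public class MyGame : ModuleRules
{
    public MyGame(ReadOnlyTargetRules Target) : base(Target)
    {
        PublicDependencyModuleNames.AddRange(new string[] { "Core", "CoreUObject", "Engine" });

        // Required for ACERuntimeModule.h and FACERuntimeModule
        PrivateDependencyModuleNames.Add("ACERuntime");
    }
}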

The signatures of the exposed C++ APIs are:

// Receive animations using audio from a float sample buffer.
// If bEndOfSamples = true, pending audio data will be flushed and any subsequent call to AnimateFromAudioSamples will
// start a new session.
bool AnimateFromAudioSamples(IACEAnimDataConsumer* Consumer, TArrayView<const float> SamplesFloat, int32 NumChannels,
  int32 SampleRate, bool bEndOfSamples, TOptional<FAudio2FaceEmotion> EmotionParameters,
  UAudio2FaceParameters* Audio2FaceParameters, FName A2FProviderName = FName("Default"));

// Receive animations using audio from an int16 PCM sample buffer.
// If bEndOfSamples = true, pending audio data will be flushed and any subsequent call to AnimateFromAudioSamples will
// start a new session.
bool AnimateFromAudioSamples(IACEAnimDataConsumer* Consumer, TArrayView<const int16> SamplesInt16, int32 NumChannels,
  int32 SampleRate, bool bEndOfSamples, TOptional<FAudio2FaceEmotion> EmotionParameters,
  UAudio2FaceParameters* Audio2FaceParameters, FName A2FProviderName = FName("Default"));

// Indicate no more samples for the current audio clip. Any subsequent call to AnimateFromAudioSamples will start a
// new session.
// Use this if your last call to AnimateFromAudioSamples had bEndOfSamples = false, and now the audio stream has ended.
bool EndAudioSamples(IACEAnimDataConsumer* Consumer);

Parameter descriptions:

  • Consumer: The component that will receive the animations. This will typically be a UACEAudioCurveSourceComponent attached to a character.

  • SamplesFloat or SamplesInt16: The buffer containing the audio samples.

  • NumChannels: 1 for mono, 2 for stereo.

  • SampleRate: Samples per second of the source audio buffer.

  • bEndOfSamples: If you have the entire audio clip at once, set this to true. Otherwise set this to false and call AnimateFromAudioSamples multiple times as chunks of audio become available.

  • EmotionParameters: Optional overrides to the default inferred emotion behavior.

  • Audio2FaceParameters: Optional overrides for model-specific Audio2Face facial animation behavior.

  • A2FProviderName: The Audio2Face provider to use. You can obtain a list of available providers at runtime with UACEBlueprintLibrary::GetAvailableA2FProviderNames(). See Changing Audio2Face Providers (Optional) for details.
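As an example, the sketch below streams runtime-generated float audio into the plugin in chunks, assuming the target character has a UACEAudioCurveSourceComponent attached and that the component can be passed directly as the IACEAnimDataConsumer. The surrounding audio-producing code, chunk size, and sample rate are placeholders.

#include "ACERuntimeModule.h"
// also include the header that declares UACEAudioCurveSourceComponent (path not shown in this document)

// Sketch: feed runtime-generated mono float audio into Audio2Face chunk by chunk.
// Assumes UACEAudioCurveSourceComponent implements IACEAnimDataConsumer, as described above.
void StreamAudioChunk(AActor* CharacterActor, TArrayView<const float> Chunk, bool bLastChunk)
{
    UACEAudioCurveSourceComponent* Consumer =
        CharacterActor->FindComponentByClass<UACEAudioCurveSourceComponent>();
    if (Consumer == nullptr)
    {
        return;  // character isn't set up to receive ACE animations
    }

    FACERuntimeModule::Get().AnimateFromAudioSamples(
        Consumer,
        Chunk,
        /*NumChannels=*/1,
        /*SampleRate=*/16000,   // placeholder; any sample rate is accepted
        /*bEndOfSamples=*/bLastChunk,
        /*EmotionParameters=*/TOptional<FAudio2FaceEmotion>(),
        /*Audio2FaceParameters=*/nullptr);
}

// If the stream ends without a final bEndOfSamples = true chunk, close it explicitly:
// FACERuntimeModule::Get().EndAudioSamples(Consumer);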

Adjusting Character Emotion (Optional)#

Audio2Face detects emotions from the audio input and adjusts the character animation accordingly. But if your application has information about character emotion, you can also provide this to Audio2Face, which blends the application-provided emotion overrides with the detected emotion. Functions that animate a character accept an ACEEmotionParameters input of type Audio2FaceEmotion, where individual emotion values can be overridden. Each emotion override value must be between 0.0 and 1.0; values outside that range are ignored. A value of 0.0 represents a neutral emotion.

The Audio2FaceEmotion struct can also change how detected emotions are processed. A summary of the available options:

  • Overall Emotion Strength: Multiplier applied globally after the mix of emotions is done. Valid range: 0.0 to 1.0. Default: 0.6.

  • Detected Emotion Contrast: Increases the spread of detected emotion values by pushing them higher or lower. Valid range: 0.3 to 3.0. Default: 1.0.

  • Max Detected Emotions: Firm limit on the number of detected emotion values. Valid range: 1 to 6. Default: 3.

  • Detected Emotion Smoothing: Coefficient for smoothing detected emotions over time. Valid range: 0.0 to 1.0. Default: 0.7.

  • Emotion Override Strength: Blend between detected emotions (0.0) and override emotions (1.0). Valid range: 0.0 to 1.0. Default: 0.5.

  • Emotion Overrides: Individual emotion override values, each either disabled or in the range 0.0 to 1.0. Default: disabled.

Audio2FaceEmotion struct
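As an illustration only, setting the override blend and one individual emotion override from C++ might look like the sketch below; the member names shown are guesses based on the option names above, so confirm them against the FAudio2FaceEmotion declaration in the plugin.

// Hypothetical sketch: blend an application-provided emotion with detected emotions.
// Member names are assumptions; check the FAudio2FaceEmotion struct in the plugin headers.
FAudio2FaceEmotion Emotion;
Emotion.OverallEmotionStrength = 0.8f;    // global multiplier after the emotion mix
Emotion.EmotionOverrideStrength = 0.5f;   // 0.0 = detected emotions only, 1.0 = overrides only
Emotion.EmotionOverrides.Joy = 0.9f;      // individual override in the 0.0 to 1.0 range

// Pass Emotion as the ACEEmotionParameters input of any of the animate functions above.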

Note

Emotion and face parameter inputs won’t have any effect for audio clips shorter than 0.5 seconds.

Changing Audio2Face Providers (Optional)#

Added in version 2.2.

The base NVIDIA ACE plugin (NV_ACE_Reference.uplugin) supports selecting an Audio2Face provider at runtime. Additional Audio2Face providers may be implemented by other Unreal plugins.

Use the Get Available Audio2Face Provider Names blueprint function to get a list of available providers at runtime. These provider names may be passed as a parameter to any of the Audio2Face functions described in this documentation, to choose which provider you want to use.

The base NVIDIA ACE plugin includes these providers:

  • “RemoteA2F”: The default Audio2Face provider, available on all supported platforms. Executes Audio2Face remotely by connecting to an NVCF-hosted service or a separately deployed service. Currently incompatible with the Animation Stream feature, so “RemoteA2F” can’t be used in the same application as Animation Stream.

  • “LegacyA2F”: An alternate remote Audio2Face provider, available only on Windows. Provides the same functionality as “RemoteA2F” but using an implementation from earlier plugin releases. If you see new issues when upgrading to plugin version 2.3.0 or later, you can try “LegacyA2F” to see if it helps. Contact NVIDIA support to report any issues with “RemoteA2F” that would be fixed by using “LegacyA2F”. This provider is expected to be removed in a future plugin update.
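For example, you can check at runtime that a preferred provider is actually available before requesting it. The sketch below assumes UACEBlueprintLibrary::GetAvailableA2FProviderNames() returns an array of FName values; verify the return type against the plugin headers.

// Sketch: prefer "RemoteA2F" when the plugin reports it, otherwise fall back to the default.
FName ProviderName = FName("Default");
const TArray<FName> Providers = UACEBlueprintLibrary::GetAvailableA2FProviderNames();
if (Providers.Contains(FName("RemoteA2F")))
{
    ProviderName = FName("RemoteA2F");
}
// Pass ProviderName as the Audio2Face provider name input to the animate functions.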

Adjusting Audio2Face Parameters (Optional)#

Certain Audio2Face service parameters can be overridden by the application. These parameters tend to be tightly coupled with the model deployed to the service, so changing them from the application is typically not recommended. If you think you need to change any of them, refer to the Audio2Face service documentation for details on what they do.

Set parameters by string name. The available parameters might change depending on the version of the service you have deployed. The set of available parameters for the v1.0 Audio2Face service is:

  • skinStrength: Controls the range of motion of the skin. Valid range: 0.0 to 2.0. Default: 1.0.

  • upperFaceStrength: Controls the range of motion on the upper regions of the face. Valid range: 0.0 to 2.0. Default: 1.0.

  • lowerFaceStrength: Controls the range of motion on the lower regions of the face. Valid range: 0.0 to 2.0. Default: 1.0.

  • eyelidOpenOffset: Adjusts the default pose of eyelid open-close (-1.0 means fully closed, 1.0 means fully open). Valid range: -1.0 to 1.0. Default: depends on deployed model.

  • blinkStrength: Valid range: 0.0 to 2.0. Default: 1.0.

  • lipOpenOffset: Adjusts the default pose of lip close-open (-1.0 means fully closed, 1.0 means fully open). Valid range: -0.2 to 0.2. Default: depends on deployed model.

  • upperFaceSmoothing: Applies temporal smoothing to the upper face motion. Valid range: 0.0 to 0.1. Default: 0.001.

  • lowerFaceSmoothing: Applies temporal smoothing to the lower face motion. Valid range: 0.0 to 0.1. Default: depends on deployed model.

  • faceMaskLevel: Determines the boundary between the upper and lower regions of the face. Valid range: 0.0 to 1.0. Default: 0.6.

  • faceMaskSoftness: Determines how smoothly the upper and lower face regions blend on the boundary. Valid range: 0.001 to 0.5. Default: 0.0085.

  • tongueStrength: Valid range: 0.0 to 3.0. Default: depends on deployed model.

  • tongueHeightOffset: Valid range: -3.0 to 3.0. Default: depends on deployed model.

  • tongueDepthOffset: Valid range: -3.0 to 3.0. Default: depends on deployed model.

Audio2FaceParameters object
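As an illustration only, overriding parameters by string name from C++ might look like the sketch below. This document only establishes that UAudio2FaceParameters is keyed by string name; the SetParameter accessor shown here is a hypothetical name, so check the class declaration for the real API.

// Hypothetical sketch: override two Audio2Face service parameters by string name.
// SetParameter() is an assumed accessor; verify against the UAudio2FaceParameters header.
UAudio2FaceParameters* Params = NewObject<UAudio2FaceParameters>();
Params->SetParameter(TEXT("skinStrength"), 1.2f);    // within the valid range 0.0 to 2.0
Params->SetParameter(TEXT("faceMaskLevel"), 0.55f);  // boundary between upper and lower face
// Pass Params as the Audio2Face Parameters input of the animate functions.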

Note

Emotion and face parameter inputs won’t have any effect for audio clips shorter than 0.5 seconds.