Audio2Face#
Note
Be sure to set up a character for animation before adding Audio2Face to your application.
System Requirements#
Windows:
- 64-bit Windows 10 or later

Linux:
- 64-bit Linux OS with libstdc++ version 6.0.30 or later
- Ubuntu 22.04 or later
- Fedora 36 or later
- Debian 12.0 or later
Audio2Face Connection Setting#
Changed in version 2.1: Added settings for NVCF API key, function ID, and function version.
The ACE plugin’s project settings have an option for setting the default Audio2Face server to connect to: Edit > Project Settings… > Plugins > NVIDIA ACE > Default A2F Server Config. The configuration has multiple fields:
Dest URL: The server address must include a scheme (http or https), host (IP address or hostname), and port number. For example, http://203.0.113.37:52000 or https://a2x.example.com:52010 (both fictional examples). To connect to NVIDIA Cloud Function (NVCF), set the server address to https://grpc.nvcf.nvidia.com:443.

API Key: If you are not connecting to an NVCF-hosted Audio2Face service, leave this blank. To connect to an NVCF-hosted Audio2Face service, you can get an API key through https://build.nvidia.com/nvidia/audio2face.

NvCF Function Id: If you are not connecting to an NVCF-hosted Audio2Face service, leave this blank. To connect to an NVCF-hosted Audio2Face service, you can get an NVCF function ID through https://build.nvidia.com/nvidia/audio2face.

NvCF Function Version: Optional. Leave this blank unless you need to pin a specific function version.
Note
We highly recommend using the NVCF option whenever possible.
You can change the Audio2Face connection settings at runtime by using
the ACE > Audio2Face > Override Audio2Face Connection Info blueprint
function, or by calling UACEBlueprintLibrary::SetA2XConnectionInfo
from C++.
You can fetch the current Audio2Face connection settings at runtime by
using the ACE > Audio2Face > Get Audio2Face Connection Info
blueprint function, or by calling
UACEBlueprintLibrary::GetA2XConnectionInfo
from C++. Current
settings are a combination of the project defaults and the runtime
overrides.
The project settings are stored in your project’s DefaultEngine.ini
file. If your API key is too sensitive to include in a project text
file, consider setting it at runtime using the Override Audio2Face
Connection Info blueprint function.
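For reference, a minimal C++ sketch of overriding the connection at runtime is shown below. The exact parameter list of UACEBlueprintLibrary::SetA2XConnectionInfo and the header name are not reproduced in this guide, so treat the include and argument shape as assumptions and check the plugin source for the real declaration.

```cpp
// Minimal sketch (assumed signature): overriding the Audio2Face connection at runtime
// instead of storing an API key in DefaultEngine.ini.
#include "ACEBlueprintLibrary.h"   // assumed header for UACEBlueprintLibrary

void ConfigureA2FConnection(const FString& ApiKey, const FString& FunctionId)
{
    const FString DestUrl = TEXT("https://grpc.nvcf.nvidia.com:443");

    // Hypothetical call shape -- the plugin may take a config struct or extra arguments.
    UACEBlueprintLibrary::SetA2XConnectionInfo(DestUrl, ApiKey, FunctionId);
}
```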
Import Speech Clips#
The NVIDIA ACE plugin’s Audio2Face feature supports animating a character from speech stored in Sound Wave assets. Any input sample rate is supported; the plugin converts the clip to 16000 Hz mono at runtime before sending it to the Audio2Face service.
If you don’t already have Sound Wave assets in your project, you can import the speech audio clips you want to animate:

1. Open the Content Drawer and select the folder where you want to import your clips.
2. Right-click in the content pane and select Import to [path]….
3. Navigate to a supported file (.wav, .ogg, .flac, .aif) and open it.
4. Verify that a new Sound Wave asset appears in the Content Drawer.
See Unreal documentation for more details about import options.
Note
In some cases, the Sound Wave asset may not be usable by the ACE plugin unless it is fully loaded. It is recommended to set Loading Behavior Override to ForceInline on the Sound Wave asset’s properties. The plugin logs a warning in the LogACERuntime category if an asset can’t be read because it isn’t fully loaded.
Animating a Character from a Sound Wave Audio Clip#
Changed in version 2.3: Added async version of blueprint node.
To animate a character from an audio clip stored in a Sound Wave asset,
use the latent blueprint node Animate Character from Sound Wave Async on
the character actor. These instructions describe the blueprint
interface, but you can also call
UACEBlueprintLibrary::AnimateCharacterFromSoundWave
from C++.
Depending on your application, there are many ways to determine which character to animate. Some options are:

- have a single default character that is animated
- automatically animate the character that the player is looking at, or the closest character
- provide some UI for selecting a character
After you’ve chosen a character Actor, animate it from a Sound Wave asset:
1. Call the ACE > Audio2Face > Animate Character From Sound Wave Async function.
2. Provide the actor corresponding to the character you want to animate. If the actor has an ACE Audio Curve Source component attached, this sends the speech clip to NVIDIA Audio2Face.
3. Provide the speech clip asset as the Sound Wave input.
4. Optionally, provide an Audio2Face Emotion struct as the ACEEmotionParameters input.
5. Optionally, provide an Audio2Face Parameters input.
6. Optionally, provide an Audio2Face provider name. If no provider name is specified, a default Audio2Face provider is chosen. See Changing Audio2Face Providers (Optional) for details.

When the “Audio Send Completed” execution pin activates in a later frame, the Success return value indicates whether the audio clip was successfully sent to Audio2Face.
Note
There is also a non-async Animate Character from Sound Wave blueprint node, which is only present for compatibility with earlier plugin versions. It’s recommended to use the async version to avoid blocking application logic while audio data is sent to Audio2Face.
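The C++ equivalent looks roughly like the sketch below. This guide doesn’t reproduce the exact signature of UACEBlueprintLibrary::AnimateCharacterFromSoundWave, so the argument list here simply mirrors the blueprint inputs and should be treated as an assumption; check the plugin header for the real declaration.

```cpp
// Sketch (assumed signature): animating a character from a Sound Wave asset in C++.
// The argument list mirrors the blueprint inputs described above and may not match
// the plugin's actual declaration.
#include "ACEBlueprintLibrary.h"   // assumed header for UACEBlueprintLibrary

void PlaySpeechOnCharacter(AActor* CharacterActor, USoundWave* SpeechClip)
{
    UACEBlueprintLibrary::AnimateCharacterFromSoundWave(
        CharacterActor,            // actor with an ACE Audio Curve Source component
        SpeechClip,                // any sample rate; converted to 16 kHz mono by the plugin
        /*EmotionParameters*/ {},  // optional Audio2Face Emotion overrides
        /*Audio2FaceParameters*/ nullptr,
        /*A2FProviderName*/ FName("Default"));
}
```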
Animating a Character from a Local WAV File (Optional)#
Changed in version 2.3: Added async version of blueprint node.
The plugin supports animating a character from a local WAV file at runtime.
For example, this could be used in an application where you can supply your own audio files for character speech. It’s similar to animating from a Sound Wave asset, but in the case of a WAV file, the audio won’t be stored in an Unreal asset and isn’t baked into the application’s content.
Use the latent blueprint node
Animate Character from Wav File Async
on the character actor. You
can also call UACEBlueprintLibrary::AnimateCharacterFromWavFile
from
C++.
Note
There is also a non-async Animate Character from Wav File blueprint node, which is only present for compatibility with earlier plugin versions. It’s recommended to use the async version to avoid blocking application logic while audio data is sent to Audio2Face.
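As with the Sound Wave case, a rough C++ sketch is shown below. The file-path parameter and remaining arguments of UACEBlueprintLibrary::AnimateCharacterFromWavFile are assumptions based on the blueprint node, not the plugin’s actual declaration.

```cpp
// Sketch (assumed signature): animating a character from a WAV file on disk.
#include "ACEBlueprintLibrary.h"   // assumed header for UACEBlueprintLibrary

void PlayWavOnCharacter(AActor* CharacterActor)
{
    // Hypothetical path supplied by the application at runtime.
    const FString WavPath = FPaths::Combine(FPaths::ProjectSavedDir(), TEXT("speech.wav"));
    UACEBlueprintLibrary::AnimateCharacterFromWavFile(
        CharacterActor, WavPath,
        /*EmotionParameters*/ {}, /*Audio2FaceParameters*/ nullptr,
        /*A2FProviderName*/ FName("Default"));
}
```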
Animating a Character From Raw Audio Samples (Optional)#
Added in version 2.3.
If your application needs to feed audio generated at runtime into the ACE Unreal plugin, then providing a Sound Wave asset or WAV file as described above may not be an option. For these cases, the plugin exposes a C++ API. Any audio sample rate is supported and samples may be in PCM16 or float32 format.
1. Add "ACERuntime" to the PrivateDependencyModuleNames in your module’s .Build.cs file.
2. #include "ACERuntimeModule.h" in your source file.
3. Call FACERuntimeModule::Get().AnimateFromAudioSamples() and FACERuntimeModule::Get().EndAudioSamples() from your code.
The signatures of the exposed C++ APIs are:
// Receive animations using audio from a float sample buffer.
// If bEndOfSamples = true, pending audio data will be flushed and any subsequent call to AnimateFromAudioSamples
// will start a new session.
bool AnimateFromAudioSamples(IACEAnimDataConsumer* Consumer, TArrayView<const float> SamplesFloat, int32 NumChannels,
    int32 SampleRate, bool bEndOfSamples, TOptional<FAudio2FaceEmotion> EmotionParameters,
    UAudio2FaceParameters* Audio2FaceParameters, FName A2FProviderName = FName("Default"));

// Receive animations using audio from an int16 PCM sample buffer.
// If bEndOfSamples = true, pending audio data will be flushed and any subsequent call to AnimateFromAudioSamples
// will start a new session.
bool AnimateFromAudioSamples(IACEAnimDataConsumer* Consumer, TArrayView<const int16> SamplesInt16, int32 NumChannels,
    int32 SampleRate, bool bEndOfSamples, TOptional<FAudio2FaceEmotion> EmotionParameters,
    UAudio2FaceParameters* Audio2FaceParameters, FName A2FProviderName = FName("Default"));

// Indicate no more samples for the current audio clip. Any subsequent call to AnimateFromAudioSamples will start a
// new session.
// Use this if your last call to AnimateFromAudioSamples had bEndOfSamples = false, and now the audio stream has ended.
bool EndAudioSamples(IACEAnimDataConsumer* Consumer);
Parameter descriptions:

- Consumer: The component that will receive the animations. This will typically be a UACEAudioCurveSourceComponent attached to a character.
- SamplesFloat or SamplesInt16: The buffer containing the audio samples.
- NumChannels: 1 for mono, 2 for stereo.
- SampleRate: Samples per second of the source audio buffer.
- bEndOfSamples: If you have the entire audio clip at once, set this to true. Otherwise set this to false and call AnimateFromAudioSamples multiple times as chunks of audio become available.
- EmotionParameters: Optional overrides to the default inferred-emotion behavior.
- Audio2FaceParameters: Optional overrides for model-specific Audio2Face facial animation behavior.
- A2FProviderName: The Audio2Face provider to use. You can obtain a list of available providers at runtime with UACEBlueprintLibrary::GetAvailableA2FProviderNames(). See Changing Audio2Face Providers (Optional) for details.
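The sketch below shows both usage patterns permitted by these signatures: sending a complete clip in a single call, and streaming chunks followed by EndAudioSamples. Only the AnimateFromAudioSamples and EndAudioSamples calls come from the documented API; how the consumer component and audio buffers are obtained is application-specific, the UACEAudioCurveSourceComponent header name is assumed, and the implicit cast from the component to IACEAnimDataConsumer assumes the component implements that interface, as stated in the parameter description above.

```cpp
// Sketch: feeding runtime-generated audio to Audio2Face through the ACERuntime module.
#include "ACERuntimeModule.h"
#include "ACEAudioCurveSourceComponent.h"   // assumed header for UACEAudioCurveSourceComponent

// Case 1: the entire clip is available at once.
void SendWholeClip(AActor* CharacterActor, TArrayView<const float> Samples, int32 SampleRate)
{
    // The consumer is typically the ACE Audio Curve Source component on the character.
    IACEAnimDataConsumer* Consumer =
        CharacterActor->FindComponentByClass<UACEAudioCurveSourceComponent>();
    if (Consumer == nullptr)
    {
        return;
    }

    FACERuntimeModule::Get().AnimateFromAudioSamples(
        Consumer, Samples, /*NumChannels*/ 1, SampleRate,
        /*bEndOfSamples*/ true,
        /*EmotionParameters*/ TOptional<FAudio2FaceEmotion>(),
        /*Audio2FaceParameters*/ nullptr);
}

// Case 2: audio arrives in chunks, for example from a streaming TTS service.
void SendChunk(IACEAnimDataConsumer* Consumer, TArrayView<const int16> Chunk, int32 SampleRate)
{
    FACERuntimeModule::Get().AnimateFromAudioSamples(
        Consumer, Chunk, /*NumChannels*/ 1, SampleRate,
        /*bEndOfSamples*/ false,
        TOptional<FAudio2FaceEmotion>(), nullptr);
}

void FinishStream(IACEAnimDataConsumer* Consumer)
{
    // No more samples for the current clip; the next AnimateFromAudioSamples call
    // starts a new session.
    FACERuntimeModule::Get().EndAudioSamples(Consumer);
}
```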
Adjusting Character Emotion (Optional)#
Audio2Face detects emotions from the audio input and uses them to adjust the character animation appropriately. If your application has its own information about character emotion, you can also provide it to Audio2Face, which blends application-provided emotion overrides with the detected emotions. Functions that animate a character accept an ACEEmotionParameters input of type Audio2FaceEmotion, where individual emotion values can be overridden. Each emotion override value must be between 0.0 and 1.0; values outside that range are ignored. A value of 0.0 represents a neutral emotion.
The Audio2FaceEmotion struct can change how detected emotions are processed. The following table shows a summary of available options:
| Parameter | Description | Valid Range | Default |
|---|---|---|---|
| Overall Emotion Strength | Multiplier applied globally after the mix of emotions is done | 0.0 – 1.0 | 0.6 |
| Detected Emotion Contrast | Increases the spread of detected emotion values by pushing them higher or lower | 0.3 – 3.0 | 1.0 |
| Max Detected Emotions | Firm limit on the number of detected emotion values | 1 – 6 | 3 |
| Detected Emotion Smoothing | Coefficient for smoothing detected emotions over time | 0.0 – 1.0 | 0.7 |
| Emotion Override Strength | Blend between detected emotions (0.0) and override emotions (1.0) | 0.0 – 1.0 | 0.5 |
| Emotion Overrides | Individual emotion override values, each in the range 0.0 – 1.0 | Disabled or 0.0 – 1.0 | Disabled |
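As a rough illustration of providing overrides from C++, the sketch below populates the emotion struct before passing it to one of the animate calls. The member names are inferred from the parameter names in the table and may not match the plugin’s actual FAudio2FaceEmotion fields, so treat them as placeholders and check the struct definition.

```cpp
// Hypothetical sketch: building emotion overrides before animating a character.
// Field names are placeholders inferred from the table above; verify them against
// the plugin's FAudio2FaceEmotion definition.
#include "Audio2FaceEmotion.h"   // assumed header for FAudio2FaceEmotion

TOptional<FAudio2FaceEmotion> MakeEmotionOverride()
{
    FAudio2FaceEmotion Emotion;
    Emotion.OverallEmotionStrength = 0.8f;    // global multiplier, valid range 0.0 - 1.0
    Emotion.EmotionOverrideStrength = 1.0f;   // 1.0 = use only the override values
    Emotion.EmotionOverrides.Joy = 0.9f;      // hypothetical per-emotion override, 0.0 - 1.0
    return Emotion;
}
```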
Note
Emotion and face parameter inputs have no effect for audio clips shorter than 0.5 seconds.
Changing Audio2Face Providers (Optional)#
Added in version 2.2.
The base NVIDIA ACE plugin (NV_ACE_Reference.uplugin) supports selecting an Audio2Face provider at runtime. Additional Audio2Face providers may be implemented by other Unreal plugins.
Use the Get Available Audio2Face Provider Names blueprint function to get a list of available providers at runtime. These provider names may be passed as a parameter to any of the Audio2Face functions described in this documentation, to choose which provider you want to use.
The base NVIDIA ACE plugin includes these providers:
- “RemoteA2F”: The default Audio2Face provider, available on all supported platforms. It executes Audio2Face remotely by connecting to an NVCF-hosted service or a separately deployed service. It is currently incompatible with the Animation Stream feature, so it can’t be used in the same application as Animation Stream.
- “LegacyA2F”: An alternate remote Audio2Face provider, available only on Windows. It provides the same functionality as “RemoteA2F” but uses an implementation from earlier plugin releases. If you see new issues when upgrading to plugin version 2.3.0 or later, try “LegacyA2F” to see whether it helps, and contact NVIDIA support to report any issue with “RemoteA2F” that switching to “LegacyA2F” fixes. This provider is expected to be removed in a future plugin update.
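A short C++ sketch of querying and selecting a provider is shown below. The return type of UACEBlueprintLibrary::GetAvailableA2FProviderNames() is assumed to be TArray<FName> and the header name is assumed, so verify both against the plugin source.

```cpp
// Sketch: listing Audio2Face providers and choosing one by name.
// Assumption: GetAvailableA2FProviderNames() returns a TArray<FName>.
#include "ACEBlueprintLibrary.h"   // assumed header for UACEBlueprintLibrary

FName PickA2FProvider()
{
    const TArray<FName> Providers = UACEBlueprintLibrary::GetAvailableA2FProviderNames();
    for (const FName& Name : Providers)
    {
        UE_LOG(LogTemp, Log, TEXT("Available Audio2Face provider: %s"), *Name.ToString());
    }

    // Prefer the legacy provider if present (e.g. while diagnosing a regression after
    // upgrading the plugin); otherwise fall back to the plugin default.
    return Providers.Contains(FName("LegacyA2F")) ? FName("LegacyA2F") : FName("Default");
}
```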
Adjusting Audio2Face Parameters (Optional)#
Certain Audio2Face service parameters can be overridden by the application. These parameters tend to be tightly coupled with the model deployed to the service. Typically, it’s not recommended to change these in the application. If you think you need to change any of these, refer to the Audio2Face service documentation for details on what they do.
Set parameters by string name. The available parameters might change depending on the version of the service you have deployed. The parameters available for the v1.0 Audio2Face service are:
| Parameter | Description | Valid Range | Default |
|---|---|---|---|
| skinStrength | Controls the range of motion of the skin | 0.0 – 2.0 | 1.0 |
| upperFaceStrength | Controls the range of motion on the upper regions of the face | 0.0 – 2.0 | 1.0 |
| lowerFaceStrength | Controls the range of motion on the lower regions of the face | 0.0 – 2.0 | 1.0 |
| eyelidOpenOffset | Adjusts the default pose of eyelid open-close (-1.0 means fully closed, 1.0 means fully open) | -1.0 – 1.0 | depends on deployed model |
| blinkStrength | | 0.0 – 2.0 | 1.0 |
| lipOpenOffset | Adjusts the default pose of lip close-open (-1.0 means fully closed, 1.0 means fully open) | -0.2 – 0.2 | depends on deployed model |
| upperFaceSmoothing | Applies temporal smoothing to the upper face motion | 0.0 – 0.1 | 0.001 |
| lowerFaceSmoothing | Applies temporal smoothing to the lower face motion | 0.0 – 0.1 | depends on deployed model |
| faceMaskLevel | Determines the boundary between the upper and lower regions of the face | 0.0 – 1.0 | 0.6 |
| faceMaskSoftness | Determines how smoothly the upper and lower face regions blend on the boundary | 0.001 – 0.5 | 0.0085 |
| tongueStrength | | 0.0 – 3.0 | depends on deployed model |
| tongueHeightOffset | | -3.0 – 3.0 | depends on deployed model |
| tongueDepthOffset | | -3.0 – 3.0 | depends on deployed model |
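If you do need to override one of these, the C++ animate calls accept a UAudio2FaceParameters object. The sketch below assumes a hypothetical string-keyed setter on that object, since this guide only states that parameters are set by string name; check the UAudio2FaceParameters header for the real accessor.

```cpp
// Hypothetical sketch: overriding an Audio2Face parameter by string name.
// SetParameter is an assumed accessor -- the actual UAudio2FaceParameters API may
// expose a map property or a differently named setter.
#include "Audio2FaceParameters.h"   // assumed header for UAudio2FaceParameters

UAudio2FaceParameters* MakeFaceParameterOverrides(UObject* Outer)
{
    UAudio2FaceParameters* Params = NewObject<UAudio2FaceParameters>(Outer);
    // Damp lower-face motion slightly; valid range for lowerFaceStrength is 0.0 - 2.0.
    Params->SetParameter(TEXT("lowerFaceStrength"), 0.8f);
    return Params;
}
```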
Note
Emotion and face parameter inputs have no effect for audio clips shorter than 0.5 seconds.