Release Notes
v1.2.0
SDK Versions
Audio2Face: 0.22.4
Audio2Emotion: 0.7.9
Features
The new service is now available as a downloadable NIM, seamlessly integrating into the NVIDIA NIM ecosystem.
New James 2.3 inference model provides better lip sync quality, stronger upper-face expressions for different emotions, and fewer lip-stretch artifacts during silence.
New Claire 2.3 inference model provides better lip sync quality, including for the F, V, M, B, P, U, and S sounds, and stronger upper-face expressions for different emotions.
New Mark 2.3 inference model provides better lip sync quality, including for the F, V, M, B, P, U, and S sounds.
Introduced support for bidirectional streaming with gRPC, enabling real-time communication between clients and the service while eliminating the need for the previously required A2F Controller.
Added runtime control for clamping blendshape values between 0 and 1.
Integrated OpenTelemetry for advanced observability, providing unified tracing and metrics.
Added functionality to download pre-built TensorRT (TRT) engines from NVCF, reducing service setup complexity.
Introduced an experimental gRPC endpoint for exporting configurations for a running service instance.
Updated the logging system to output application logs in structured JSON format.
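To illustrate the blendshape clamping item above, here is a minimal sketch of what limiting values to the range 0 to 1 means for a single animation frame. It assumes blendshape weights are available as a name-to-float mapping; the function name `clamp_blendshapes`, the `enabled` flag, and the blendshape names are hypothetical and are not part of the service's API.

```python
from typing import Dict


def clamp_blendshapes(weights: Dict[str, float], enabled: bool = True) -> Dict[str, float]:
    """Return a copy of `weights` with every value limited to [0.0, 1.0].

    `enabled` stands in for a runtime toggle (hypothetical name); when it is
    False the values pass through unchanged.
    """
    if not enabled:
        return dict(weights)
    return {name: min(1.0, max(0.0, value)) for name, value in weights.items()}


# Illustrative frame with one value above 1 and one below 0.
frame = {"JawOpen": 1.3, "MouthSmileLeft": -0.2, "EyeBlinkLeft": 0.5}
clamped = clamp_blendshapes(frame)
unclamped = clamp_blendshapes(frame, enabled=False)
```

Clamping of this kind is useful when a downstream rig only accepts normalized weights; leaving it off preserves overshoot for rigs that can handle it.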
v1.0.0
SDK Versions
Audio2Face: 0.17.0
Audio2Emotion: 0.2.2
Features
New Claire 1.3 inference model provides enhanced lip movement and better accuracy for P and M sounds.
New Mark 2.2 inference model provides better lip sync and facial performance quality when used with MetaHuman characters.
Users can now specify preferred emotions, enabling personalized outputs tailored to specific applications such as interactive avatars and virtual assistants.
Added emotional output to the microservice to help align other downstream animation components.
New output audio sampling rates supported in addition to 16 kHz: 22.05 kHz, 44.1 kHz, and 48 kHz.
Added the ability to tune each stream at runtime with unique face parameters, emotion parameters, blendshape multipliers, and blendshape offsets.
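As a rough illustration of the per-stream blendshape multipliers and offsets mentioned above, the sketch below applies them to one frame of weights. It assumes the tuning is an affine adjustment of the form `value * multiplier + offset`; that order of operations, the function name `tune_blendshapes`, and the blendshape names are assumptions for illustration, not the documented behavior of the service.

```python
from typing import Dict


def tune_blendshapes(
    weights: Dict[str, float],
    multipliers: Dict[str, float],
    offsets: Dict[str, float],
) -> Dict[str, float]:
    """Scale, then shift, each blendshape weight.

    Blendshapes without an entry keep a multiplier of 1.0 and an offset of 0.0,
    so untuned shapes are returned unchanged.
    """
    return {
        name: value * multipliers.get(name, 1.0) + offsets.get(name, 0.0)
        for name, value in weights.items()
    }


# Illustrative per-stream settings: exaggerate the jaw, leave the blink alone.
frame = {"JawOpen": 0.5, "EyeBlinkLeft": 0.25}
tuned = tune_blendshapes(
    frame,
    multipliers={"JawOpen": 2.0},
    offsets={"JawOpen": 0.25},
)
```

Because each stream carries its own dictionaries in this sketch, two concurrent streams can use different settings without affecting each other.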
Key improvements
Improved the gRPC protocol to use less data and provide a more efficient stream for scalability. A USD parser is no longer required.
Reworked blendshape solve threading for better scalability.