Tokkio LLM-RAG - A2F-2D#
Note
This workflow uses Audio2Face-2D. To access this resource, apply for ACE Early Access. Once approved, verify that you have access to the resource; otherwise some of the hyperlinks on this page might not work.
Introduction#
Tokkio-A2F-2D, powered by Audio2Face-2D (Speech Live Portrait), lets users set up an avatar by providing a single png portrait image file. Users can configure various settings to balance performance against quality, and can customize some facial behaviors such as eye focus and eye blink. The configuration options available in the A2F-2D microservice are explained in detail in the A2F-2D user guide.
This reference app is a variation of Tokkio LLM-RAG using A2F-2D as its renderer. Other workflows such as Tokkio Retail can also be used with this particular rendering option.
Minimum GPU Requirements#
Streams | L4 | A10
---|---|---
1 stream | 2x | 2x
3 streams | 4x | 4x
Prerequisite#
A frontal face portrait png image file
Access to ACE Early Access assets in NGC
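Before deploying, it can help to confirm that the portrait file really is a PNG. The following is a minimal sketch that uses only the Python standard library: it checks the fixed 8-byte PNG signature and reads the width and height from the IHDR chunk. The function names `read_png_size` and `check_portrait` are hypothetical, and the check says nothing about whether the image shows a frontal face.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) from the first 24 bytes of a PNG, or raise ValueError."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8-11 hold the IHDR chunk length, bytes 12-15 the chunk type,
    # followed by the big-endian width and height.
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def check_portrait(path: str) -> tuple[int, int]:
    """Read only the header of the file at `path` and return its dimensions."""
    with open(path, "rb") as f:
        return read_png_size(f.read(24))
```

Calling `check_portrait("portrait.png")` (a placeholder path) raises `ValueError` for a JPEG that was merely renamed to `.png`.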
Architecture#
The diagram below gives an overview of the components that make up the Tokkio-A2F-2D app and how they relate to and interact with each other.
Deployment#
Helm Chart Deployment#
To deploy the Tokkio-A2F-2D app with default settings, use the sample A2F-2D helm chart. Refer to Integrating Customization changes with rebuild for details on how to install the chart with Helm.
UCS App Spec Deployment#
To modify the default settings and rebuild the app with the UCS CLI tool, first download the sample UCS app spec, then run the build command.
Note
Refer to Integrating Customization changes with rebuild for instructions on installing, configuring, and running the UCS CLI tool.
$ ucf_app_builder_cli app build <ucs-app> <ucs-param>
Replace the placeholders above as follows:
<ucs-app> == UCS app yaml file
<ucs-param> == UCS param yaml file
Running the App#
Run the Front-end to test your deployment.
Note
The customization steps below are optional.
Customizations#
Additional steps specific to the Tokkio A2F-2D app customization are explained below.
For a complete guide to the available configuration options and general tuning guidelines, refer to the A2F-2D user guide.
Portrait picture customization#
1. Create a frontal face png file#
A png portrait image file that clearly shows the frontal face of the person being animated is required. A sample portrait image is shown below:
2. Replace the default png file in the POD#
Note
This change does not persist: when the chat-controller POD is restarted, the default portrait png replaces the updated png.
$ kubectl cp <portrait-png> <chat_controller_pod_name>:/workspace/riva/lp_portrait.png
Replace the placeholders above as follows:
<portrait-png> == path of your portrait png file
<chat_controller_pod_name> == POD name of the chat-controller
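Because the copy has to be repeated after every POD restart, it may be worth scripting. The sketch below only assembles the `kubectl cp` arguments with the destination path used above. The helper name `copy_portrait_command` and the optional `namespace` argument are assumptions for illustration, and the POD name can be looked up first with `kubectl get pods`.

```python
import subprocess

# Destination path inside the chat-controller POD, as given in the command above.
PORTRAIT_DEST = "/workspace/riva/lp_portrait.png"


def copy_portrait_command(portrait_png, pod_name, namespace=None):
    """Return the kubectl argument list that copies the portrait into the POD."""
    cmd = ["kubectl"]
    if namespace:
        # -n selects the namespace; omit it to use the current context's default.
        cmd += ["-n", namespace]
    cmd += ["cp", portrait_png, f"{pod_name}:{PORTRAIT_DEST}"]
    return cmd


def copy_portrait(portrait_png, pod_name, namespace=None):
    """Run the copy; raises CalledProcessError if kubectl fails."""
    subprocess.run(copy_portrait_command(portrait_png, pod_name, namespace), check=True)
```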
Animation Property Tuning#
1. Create a JSON config file#
When initiating the gRPC connection with the A2F-2D microservice, a configuration JSON is required to specify preferences such as image quality and animation behavior.
For demonstration, a sample JSON config is shown below.
{
  "lp_config": {
    "animation_cropping_mode": "ANIMATION_CROPPING_MODE_BLEND",
    "model_selection": "MODEL_SELECTION_PERF",
    "eye_blink_config": {
      "blink_frequency": {
        "value": 15,
        "unit": "UNIT_TIMES_PER_MINUTE"
      },
      "blink_duration": {
        "value": 6,
        "unit": "UNIT_FRAME"
      }
    },
    "gaze_look_away_config": {
      "enable_gaze_look_away": false,
      "max_look_away_offset": {
        "value": 20,
        "unit": "UNIT_DEGREE_ANGLE"
      },
      "min_look_away_interval": {
        "value": 240,
        "unit": "UNIT_FRAME"
      },
      "look_away_interval_range": {
        "value": 60,
        "unit": "UNIT_FRAME"
      }
    },
    "mouth_expression_config": {
      "mouth_expression_multiplier": 1.0
    }
  },
  "endpoint_config": {
    "input_media_config": {
      "audio_input_config": {
        "stream_config": {
          "stream_type": "GRPC"
        },
        "channels": 1,
        "channel_index": 0,
        "layout": "AUDIO_LAYOUT_INTERLEAVED",
        "sample_rate_hz": 16000,
        "chunk_duration_ms": 20,
        "encoding": "AUDIO_ENCODING_RAW",
        "decoder_config": {
          "raw_dec_config": {
            "format": "AUDIO_FORMAT_S16LE"
          }
        }
      }
    },
    "output_media_config": {
      "audio_output_config": {
        "stream_config": {
          "stream_type": "UDP",
          "udp_params": {
            "host": "127.0.0.1",
            "port": "9017"
          }
        },
        "payloader_config": {
          "type": "PAYLOADER_RTP"
        },
        "sample_rate_hz": 16000,
        "chunk_duration_ms": 20,
        "encoding": "AUDIO_ENCODING_RAW",
        "encoder_config": {
          "raw_enc_config": {
            "format": "AUDIO_FORMAT_S16BE"
          }
        }
      },
      "video_output_config": {
        "stream_config": {
          "stream_type": "UDP",
          "udp_params": {
            "host": "127.0.0.1",
            "port": "9019"
          }
        },
        "payloader_config": {
          "type": "PAYLOADER_RTP"
        },
        "encoding": "H264",
        "encoder_config": {
          "h264_enc_config": {
            "idr_frame_interval": 30
          }
        }
      }
    }
  },
  "quality_profile": "SPEECH_LP_QUALITY_PROFILE_LOW_LATENCY"
}
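The audio input values in the sample imply a fixed chunk size, which is useful to know when feeding audio to the service. The sketch below works through the arithmetic, assuming `chunk_duration_ms` is the duration of each audio chunk and that `AUDIO_FORMAT_S16LE` means 16-bit (2-byte) samples. The function names are hypothetical.

```python
def samples_per_chunk(sample_rate_hz: int, chunk_duration_ms: int) -> int:
    """Number of samples per channel in one chunk."""
    return sample_rate_hz * chunk_duration_ms // 1000


def bytes_per_chunk(sample_rate_hz: int, chunk_duration_ms: int,
                    channels: int, bytes_per_sample: int = 2) -> int:
    """Size in bytes of one raw audio chunk (default: 16-bit samples)."""
    return samples_per_chunk(sample_rate_hz, chunk_duration_ms) * channels * bytes_per_sample


# Values taken from audio_input_config in the sample JSON.
print(samples_per_chunk(16000, 20))            # 320 samples
print(bytes_per_chunk(16000, 20, channels=1))  # 640 bytes
```

With the sample settings, each 20 ms chunk of mono 16 kHz audio is therefore 320 samples, or 640 bytes of raw data.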
Configuration options explanation:
animation_cropping_mode
- Portrait image cropping preference - ANIMATION_CROPPING_MODE_FACEBOX, ANIMATION_CROPPING_MODE_BLEND, ANIMATION_CROPPING_MODE_INSET_BLEND
model_selection
- MODEL_SELECTION_PERF for performance mode and MODEL_SELECTION_QUALITY for quality mode
eye_blink_config
- Customize the eye blink behavior of the avatar such as blink_frequency and blink_duration
gaze_look_away_config
- Redirect the eyes to look away and specify the angle as well as the intervals
mouth_expression_config
- Multiplier to exaggerate the mouth expression
quality_profile
- Different modes of execution based on the preference of performance vs. quality - SPEECH_LP_QUALITY_PROFILE_LOW_LATENCY, SPEECH_LP_QUALITY_PROFILE_ULTRA_LOW_LATENCY, SPEECH_LP_QUALITY_PROFILE_HIGH_QUALITY, SPEECH_LP_QUALITY_PROFILE_ULTRA_HIGH_QUALITY
For complete details on the configuration options, refer to protos/v1/speech_live_portrait.proto under the A2F-2D quick start guide file browser.
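A typo in an enum string is easy to make when editing the config by hand. As a rough sketch, the check below compares three fields against the values named on this page. The allowed sets are limited to those values and may be incomplete, so the proto file remains the authoritative reference. The name `find_invalid_options` is hypothetical.

```python
# Allowed values as listed on this page only; the proto file is authoritative.
ALLOWED_VALUES = {
    ("lp_config", "animation_cropping_mode"): {
        "ANIMATION_CROPPING_MODE_FACEBOX",
        "ANIMATION_CROPPING_MODE_BLEND",
        "ANIMATION_CROPPING_MODE_INSET_BLEND",
    },
    ("lp_config", "model_selection"): {
        "MODEL_SELECTION_PERF",
        "MODEL_SELECTION_QUALITY",
    },
    ("quality_profile",): {
        "SPEECH_LP_QUALITY_PROFILE_LOW_LATENCY",
        "SPEECH_LP_QUALITY_PROFILE_ULTRA_LOW_LATENCY",
        "SPEECH_LP_QUALITY_PROFILE_HIGH_QUALITY",
        "SPEECH_LP_QUALITY_PROFILE_ULTRA_HIGH_QUALITY",
    },
}


def find_invalid_options(config: dict) -> list[str]:
    """Return one message per field whose value is not in the known set."""
    problems = []
    for path, allowed in ALLOWED_VALUES.items():
        node = config
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
        if node is None:
            continue  # Absent fields are not flagged by this sketch.
        if not isinstance(node, str) or node not in allowed:
            problems.append(f"{'.'.join(path)}: unexpected value {node!r}")
    return problems
```

Load the file with `json.load` and pass the resulting dictionary to the function. An empty list means none of the three checked fields holds an unexpected value.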
Note
Save the JSON object to a file. You will need it to configure your UCS app spec in step 2, Configure the UCS app spec.
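When experimenting with several tuning variants, it can be convenient to derive each one from the sample config instead of editing the file by hand. The sketch below merges nested overrides into a copy of a base config and writes the result to disk. The helper names `with_overrides` and `save_config` are hypothetical, and the output file name is only an example.

```python
import copy
import json


def with_overrides(base: dict, overrides: dict) -> dict:
    """Return a deep copy of `base` with nested `overrides` merged in."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested sections so sibling settings are preserved.
            result[key] = with_overrides(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def save_config(config: dict, path: str) -> None:
    """Write the config as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
        f.write("\n")
```

For example, `with_overrides(base, {"lp_config": {"model_selection": "MODEL_SELECTION_QUALITY"}})` switches to quality mode while leaving the blink and gaze settings untouched, and `save_config(tuned, "lp_config.json")` writes the result.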
2. Configure the UCS app spec#
Specify the LP config JSON file created in Animation Property Tuning in the UCS app yaml, under the chat-controller section, as shown below:
Note
Make sure the A2F-2D config json exists at the location specified in <path_to_lp_config> on your local disk.
...
- name: chat-controller
  type: ucf.svc.ace-agent.chat-controller
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
  files:
    lp_config.json: <path_to_lp_config>
...