Advanced Usage#
Model Caching#
When the container launches for the first time, it downloads the required models from NGC. To avoid downloading the models on subsequent runs, you can cache them locally by using a cache directory:
# Create the cache directory on the host machine
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
chmod 777 "$LOCAL_NIM_CACHE"
# Choose manifest profile id based on target architecture.
export MANIFEST_PROFILE_ID=<enter_valid_manifest_profile_id>
# Run the container with the cache directory mounted in the appropriate location
docker run -it --rm --name=maxine-audio2face-2d-nim \
--runtime=nvidia \
--gpus all \
--shm-size=8GB \
-e NGC_API_KEY=$NGC_API_KEY \
-e NIM_MANIFEST_PROFILE=$MANIFEST_PROFILE_ID \
-e NIM_HTTP_API_PORT=8000 \
-p 8000:8000 \
-p 8001:8001 \
-v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
nvcr.io/nim/nvidia/maxine-audio2face-2d:latest
For more information about MANIFEST_PROFILE_ID, refer to Model Manifest Profiles.
SSL Enablement#
Audio2Face-2D NIM provides an SSL mode to ensure secure communication between clients and the server by encrypting data in transit. To enable SSL, you need to provide the path to the SSL certificate and key files in the container. The following example shows how to do this:
export NGC_API_KEY=<add-your-api-key>
# Host directory that contains ssl_cert_server.pem and ssl_key_server.pem
SSL_CERT=path/to/ssl_key
docker run -it --rm --name=maxine-audio2face-2d-nim \
--runtime=nvidia \
--gpus all \
--shm-size=8GB \
-v "$SSL_CERT:/opt/nim/crt/:ro" \
-e NGC_API_KEY=$NGC_API_KEY \
-p 8000:8000 \
-p 8001:8001 \
-e NIM_SSL_MODE="mtls" \
-e NIM_SSL_CERT_PATH="/opt/nim/crt/ssl_cert_server.pem" \
-e NIM_SSL_KEY_PATH="/opt/nim/crt/ssl_key_server.pem" \
nvcr.io/nim/nvidia/maxine-audio2face-2d:latest
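NIM_SSL_MODE can be set to “mtls”, “tls”, or “disabled”. If set to “mtls”, the container uses mutual TLS authentication, in which the client and the server each present a certificate. If set to “tls”, the container uses TLS authentication, in which only the server presents a certificate.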
For more information, refer to NIM SSL Configuration.
Be sure to verify the permissions of the SSL certificate and key files on the host machine. The container will not be able to access the files if they are not readable by the user running the container.
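A quick preflight check on the host can catch the permission problem described above before the container is launched. The following is a minimal sketch and not part of the NIM tooling: the function name `check_ssl_files` is hypothetical, and the default file names are the ones used in the example command. It checks readability only for the current host user, which may differ from the user that runs inside the container.

```python
import os

# Values that the text above lists for NIM_SSL_MODE.
VALID_SSL_MODES = {"mtls", "tls", "disabled"}


def check_ssl_files(cert_dir, ssl_mode="mtls",
                    cert_name="ssl_cert_server.pem",
                    key_name="ssl_key_server.pem"):
    """Return a list of problems; an empty list means the check passed.

    Hypothetical helper: it only inspects the host directory that is
    mounted into the container; it does not talk to the NIM service.
    """
    problems = []
    if ssl_mode not in VALID_SSL_MODES:
        problems.append(f"unsupported NIM_SSL_MODE: {ssl_mode!r}")
        return problems
    if ssl_mode == "disabled":
        return problems  # no certificate or key files are needed
    for name in (cert_name, key_name):
        path = os.path.join(cert_dir, name)
        if not os.path.isfile(path):
            problems.append(f"missing file: {path}")
        elif not os.access(path, os.R_OK):
            problems.append(f"not readable by current user: {path}")
    return problems
```

Calling `check_ssl_files("path/to/ssl_key")` and printing the returned list before `docker run` gives a short report of anything that needs fixing.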
Multiple Concurrent Inputs#
To run the server in multi-input concurrent mode, set the environment variable MAXINE_MAX_CONCURRENCY_PER_GPU to an integer greater than 1 in the server container. The server then accepts up to that many concurrent inputs per GPU.
Because Triton distributes the workload equally across all GPUs, a server with NUM_GPUS GPUs supports a total of NUM_GPUS * MAXINE_MAX_CONCURRENCY_PER_GPU concurrent inputs.
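The capacity arithmetic above can be sketched in a few lines, for example when sizing a client-side worker pool. The function name `total_concurrent_inputs` is hypothetical and exists only for illustration; the values are supplied by the caller, not read from the server.

```python
def total_concurrent_inputs(num_gpus, max_concurrency_per_gpu):
    """Return NUM_GPUS * MAXINE_MAX_CONCURRENCY_PER_GPU.

    Hypothetical helper that mirrors the formula in the text.
    """
    if num_gpus < 1 or max_concurrency_per_gpu < 1:
        raise ValueError("both values must be positive integers")
    return num_gpus * max_concurrency_per_gpu


# Example: 4 GPUs with MAXINE_MAX_CONCURRENCY_PER_GPU=2
print(total_concurrent_inputs(4, 2))  # prints 8
```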
NIM Service Configuration Parameters#
Model Selection and Animation Mode#
model_selection: Model selection, Performance or Quality.
  Supported values: ModelSelection.MODEL_SELECTION_PERF, ModelSelection.MODEL_SELECTION_QUALITY
  Default: ModelSelection.MODEL_SELECTION_QUALITY
animation_crop_mode: Audio2Face animation cropping mode.
  Supported values: AnimationCroppingMode.ANIMATION_CROPPING_MODE_FACEBOX, AnimationCroppingMode.ANIMATION_CROPPING_MODE_REGISTRATION_BLENDING, AnimationCroppingMode.ANIMATION_CROPPING_MODE_INSET_BLENDING
  Default: AnimationCroppingMode.ANIMATION_CROPPING_MODE_REGISTRATION_BLENDING
Gaze and Eye Movement#
enable_lookaway: Flag to enable gaze lookaway.
  Supported values: 0, 1
  Default: 0
lookaway_max_offset: Maximum integer value of gaze offset when lookaway is enabled.
  Range: [5, 25]
  Default: 20
lookaway_interval_min: Minimum number of frames at which random lookaway occurs.
  Range: [1, 600]
  Default: 90
lookaway_interval_range: Range for picking the number of frames for random lookaway.
  Range: [1, 600]
  Default: 240
blink_frequency: Frequency of eye blinks per minute.
  Range: [0, 120]
  Default: 6
blink_duration: Duration of an eye blink.
  Range: [2, 150]
  Default: 10
Mouth Expression and Head Pose#
mouth_expression_multiplier: Multiplier to exaggerate mouth expression.
  Range: [1.0f, 2.0f]
  Default: 1.4f (Quality mode), 1.0f (Performance mode)
head_pose_mode: Head pose animation mode.
  Supported values: HeadPoseMode.HEAD_POSE_MODE_RETAIN_FROM_PORTRAIT_IMAGE, HeadPoseMode.HEAD_POSE_MODE_PRE_DEFINED_ANIMATION, HeadPoseMode.HEAD_POSE_MODE_USER_DEFINED_ANIMATION
  Default: HeadPoseMode.HEAD_POSE_MODE_RETAIN_FROM_PORTRAIT_IMAGE
head_pose_multiplier: Multiplier to dampen the range of head pose animation.
  Range: [0.0f, 1.0f]
  Default: 1.0f (Quality mode), 0.4f (Performance mode)
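Because every numeric parameter has a documented range, a client can check values before sending a request. The following is a minimal sketch: the names `PARAM_RANGES` and `find_out_of_range` are hypothetical, the ranges are copied from the lists above, and the bracket notation is assumed to mean inclusive bounds. How the service itself treats out-of-range values is not covered here.

```python
# Documented ranges as (min, max), assumed inclusive.
PARAM_RANGES = {
    "lookaway_max_offset": (5, 25),
    "lookaway_interval_min": (1, 600),
    "lookaway_interval_range": (1, 600),
    "blink_frequency": (0, 120),
    "blink_duration": (2, 150),
    "mouth_expression_multiplier": (1.0, 2.0),
    "head_pose_multiplier": (0.0, 1.0),
}


def find_out_of_range(params):
    """Return {name: (value, (lo, hi))} for each value outside its range.

    Keys that are absent from params are skipped, since all of these
    parameters are optional.
    """
    bad = {}
    for name, (lo, hi) in PARAM_RANGES.items():
        if name in params and not (lo <= params[name] <= hi):
            bad[name] = (params[name], (lo, hi))
    return bad


# Example: blink_frequency is valid, lookaway_max_offset is too large.
print(find_out_of_range({"blink_frequency": 15, "lookaway_max_offset": 30}))
# prints {'lookaway_max_offset': (30, (5, 25))}
```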
User-Defined Head Pose Animation#
input_head_rotation: Quaternion defining head pose rotation.
  Format: [qx, qy, qz, qw]
  Range: ±20° in Euler angles; out-of-range values are clamped
  Default: NA
input_head_translation: Vector3f defining head pose translation.
  Format: [tx, ty, sz]
  Range: [±0.03, ±0.02, 0.97-1.03]
  Default: NA
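Head rotations are often easier to author as Euler angles than as quaternions. The sketch below converts angles in degrees to the [qx, qy, qz, qw] format and keeps translations inside the documented range. It rests on assumptions that the source does not confirm: roll, pitch, and yaw are taken as rotations about the x, y, and z axes composed in yaw-pitch-roll (ZYX) order, and the function names are hypothetical. Verify the axis convention against the service before relying on it.

```python
import math

MAX_ANGLE_DEG = 20.0  # documented limit for head rotation, in Euler angles
# Documented limits for [tx, ty, sz].
TRANSLATION_LIMITS = [(-0.03, 0.03), (-0.02, 0.02), (0.97, 1.03)]


def _clamp(value, lo, hi):
    return max(lo, min(hi, value))


def head_rotation_quaternion(roll_deg, pitch_deg, yaw_deg):
    """Return [qx, qy, qz, qw] for Euler angles clamped to +/-20 degrees.

    Assumption: ZYX (yaw-pitch-roll) composition; the service may use a
    different convention.
    """
    # Half angles in radians, after clamping each input angle.
    r, p, y = (math.radians(_clamp(a, -MAX_ANGLE_DEG, MAX_ANGLE_DEG)) / 2.0
               for a in (roll_deg, pitch_deg, yaw_deg))
    cr, sr = math.cos(r), math.sin(r)
    cp, sp = math.cos(p), math.sin(p)
    cy, sy = math.cos(y), math.sin(y)
    qx = sr * cp * cy - cr * sp * sy
    qy = cr * sp * cy + sr * cp * sy
    qz = cr * cp * sy - sr * sp * cy
    qw = cr * cp * cy + sr * sp * sy
    return [qx, qy, qz, qw]


def clamp_head_translation(translation):
    """Clamp [tx, ty, sz] component-wise to the documented range."""
    return [_clamp(v, lo, hi)
            for v, (lo, hi) in zip(translation, TRANSLATION_LIMITS)]
```

Clamping on the client side is a design choice of this sketch; it keeps the request inside the documented limits rather than depending on how the service handles out-of-range input.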
Note
Running both MODEL_SELECTION_QUALITY and MODEL_SELECTION_PERF modes simultaneously in a NIM launch requires a high-end GPU. On lower-end GPUs, we recommend relaunching NIM when switching between performance and quality modes as needed. Because the Triton Inference Server loads models into GPU memory, lower-end GPUs might encounter memory limitations, potentially leading to out-of-memory issues.
Setting Parameters for A2F2D NIM#
Python#
feature_params is a Python dictionary that holds the feature parameter name and value pairs.
In feature_params, the key portrait_image is required and all the other keys are optional. In the absence of optional keys, default values are used.
feature_params = {
    "portrait_image": portrait_image_encoded,
    "model_selection": ModelSelection.MODEL_SELECTION_QUALITY,
    "animation_crop_mode": AnimationCroppingMode.ANIMATION_CROPPING_MODE_REGISTRATION_BLENDING,
    "enable_lookaway": 1,
    "lookaway_max_offset": 20,
    "lookaway_interval_min": 240,
    "lookaway_interval_range": 90,
    "blink_frequency": 15,
    "blink_duration": 6,
    "mouth_expression_multiplier": 1.4,
    "head_pose_mode": head_pose_mode,
    "head_pose_multiplier": 1.0,
    "input_head_rotation": rotation_data_stream,
    "input_head_translation": translation_data_stream,
}
# Note: input_head_rotation and input_head_translation are only required for head_pose_mode = HEAD_POSE_MODE_USER_DEFINED_ANIMATION
NodeJS#
The parameter portrait_image is required and all the other parameters are optional. In the absence of optional parameters, default values are used.
/*
 * APIs for all of the parameters
 * +-----------------------------+-----------------------------------+
 * | Param                       | API call                          |
 * +-----------------------------+-----------------------------------+
 * | model_selection             | setModelSelection(val)            |
 * | animation_crop_mode         | setAnimationCropMode(val)         |
 * | enable_lookaway             | setEnableLookaway(val)            |
 * | lookaway_max_offset         | setLookawayMaxOffset(val)         |
 * | lookaway_interval_min       | setLookawayIntervalMin(val)       |
 * | lookaway_interval_range     | setLookawayIntervalRange(val)     |
 * | blink_frequency             | setBlinkFrequency(val)            |
 * | blink_duration              | setBlinkDuration(val)             |
 * | mouth_expression_multiplier | setMouthExpressionMultiplier(val) |
 * | head_pose_mode              | setHeadPoseMode(val)              |
 * | head_pose_multiplier        | setHeadPoseMultiplier(val)        |
 * | input_head_rotation         | setInputHeadRotation(val)         |
 * | input_head_translation      | setInputHeadTranslation(val)      |
 * +-----------------------------+-----------------------------------+
 */
// Note: input_head_rotation and input_head_translation are only required for head_pose_mode=3 (HEAD_POSE_MODE_USER_DEFINED_ANIMATION)