Advanced Usage#

Model Caching#

When the container starts for the first time, it downloads the required models from NGC. To avoid downloading the models on subsequent runs, you can cache them locally by using a cache directory:

# Create the cache directory on the host machine
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
chmod 777 "$LOCAL_NIM_CACHE"

# Choose manifest profile id based on target architecture.
export MANIFEST_PROFILE_ID=<enter_valid_manifest_profile_id>

# Run the container with the cache directory mounted in the appropriate location
docker run -it --rm --name=maxine-eye-contact-nim \
  --runtime=nvidia \
  --gpus all \
  --shm-size=8GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -e NIM_MANIFEST_PROFILE=$MANIFEST_PROFILE_ID \
  -e NIM_HTTP_API_PORT=8000 \
  -p 8000:8000 \
  -p 8001:8001 \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  nvcr.io/nim/nvidia/maxine-eye-contact:latest

For more information about MANIFEST_PROFILE_ID, refer to the NIM Model Profile Table.

SSL Enablement#

Eye-Contact NIM provides an SSL mode to ensure secure communication between clients and the server by encrypting data in transit. To enable SSL, you need to provide the path to the SSL certificate and key files in the container. The following example shows how to do this:

export NGC_API_KEY=<add-your-api-key>
# Host directory containing the SSL certificate and key files
SSL_CERT=path/to/ssl_certs

docker run -it --rm --name=maxine-eye-contact-nim \
  --runtime=nvidia \
  --gpus all \
  --shm-size=8GB \
  -v $SSL_CERT:/opt/nim/crt/:ro \
  -e NGC_API_KEY=$NGC_API_KEY \
  -p 8000:8000 \
  -p 8001:8001 \
  -e NIM_SSL_MODE="mtls" \
  -e NIM_SSL_CERT_PATH="/opt/nim/crt/ssl_cert_server.pem" \
  -e NIM_SSL_KEY_PATH="/opt/nim/crt/ssl_key_server.pem" \
  nvcr.io/nim/nvidia/maxine-eye-contact:latest

NIM_SSL_MODE can be set to “mtls”, “tls”, or “disabled”. With “mtls”, the container uses mutual TLS authentication, in which both server and client present certificates; with “tls”, only the server presents a certificate. For more information, refer to NIM SSL Configuration.

Be sure to verify the permissions of the SSL certificate and key files on the host machine. The container will not be able to access the files if they are not readable by the user running the container.
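One way to check those permissions before starting the container is to inspect the mode bits from Python. The following is an illustrative stdlib sketch; the file names are placeholders for whatever certificate and key you mount into the container:

```python
import os
import stat

def is_world_readable(path: str) -> bool:
    """Return True if the file at `path` grants read permission to others."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)

# Placeholder file names; substitute the certificate and key files
# that you bind-mount into /opt/nim/crt/.
for f in ("ssl_cert_server.pem", "ssl_key_server.pem"):
    if os.path.exists(f):
        print(f, "readable by others:", is_world_readable(f))
```

If the check returns False for either file, adjust the permissions (for example, with chmod) before launching the container.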

Multiple Concurrent Inputs#

To run the server in multi-input concurrent mode, set the environment variable MAXINE_MAX_CONCURRENCY_PER_GPU to an integer greater than 1 in the server container. The server will then accept as many concurrent inputs per GPU as specified by the MAXINE_MAX_CONCURRENCY_PER_GPU variable.

Since Triton distributes the workload equally across all GPUs, if there are NUM_GPUS GPUs, the total number of concurrent inputs supported by the server will be NUM_GPUS * MAXINE_MAX_CONCURRENCY_PER_GPU.
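As a concrete illustration of that arithmetic (the deployment numbers below are hypothetical):

```python
# Hypothetical deployment: 2 GPUs, with the server container started
# with MAXINE_MAX_CONCURRENCY_PER_GPU=4.
NUM_GPUS = 2
MAXINE_MAX_CONCURRENCY_PER_GPU = 4

# Triton distributes the workload equally across GPUs,
# so the total capacity is the product of the two values.
total_concurrent_inputs = NUM_GPUS * MAXINE_MAX_CONCURRENCY_PER_GPU
print(total_concurrent_inputs)  # → 8
```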

The Eye Contact NIM uses NVENC/NVDEC hardware acceleration for video encoding and decoding.

  • GPUs that lack NVENC/NVDEC hardware support, including A100, H100, and B100 products, are not supported.

  • Some GPUs support only a limited number of concurrent NVENC sessions, which means NIM can only process that same number of concurrent inputs on those GPUs.

  • Some GPUs support only certain YUV formats for H264 reading and writing.

For details, refer to the Video Encode and Decode GPU Support Matrix.

Note

If incoming requests to the NIM exceed the GPU’s maximum concurrent encode/decode limit, the processing fails.

Feature Parameters#

The following parameters are available to configure the functionality of the NIM.

Encoding Options#

  • lossless - Enables lossless video encoding. This setting overrides any bitrate configuration to ensure maximum quality output, though it results in larger file sizes.

python eye-contact.py --lossless

  • bitrate - Sets the target bitrate for video encoding in bits per second (bps). Higher bitrates result in better video quality but larger file sizes. The default is 3,000,000 bps (3 Mbps). For example, setting --bitrate 5000000 would target 5 Mbps encoding.

python eye-contact.py --bitrate 5000000

  • idr-interval - Sets the interval between IDR (Instantaneous Decoding Refresh) frames in the encoded video. IDR frames are special I-frames that clear all reference buffers, allowing the video to be decoded from that point without needing previous frames. A lower interval (for example, 8 frames) provides better seeking/editing capabilities but increases file size, while a higher interval (for example, 30 frames) reduces file size but may make seeking less precise. The default is 8 frames.

python eye-contact.py --idr-interval 10

  • Custom encoding - Allows passing custom encoding parameters as a JSON string. This provides fine-grained control over the encoding process.

python eye-contact.py --custom-encoding-params '{"idrinterval": "20", "maxbitrate": "3000000"}'
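Building the JSON string programmatically avoids shell-quoting mistakes. A minimal sketch, using only the two properties shown in the example above:

```python
import json

# Keys mirror the custom encoding properties from the example above;
# values are passed as strings, matching the documented invocation.
params = {"idrinterval": "20", "maxbitrate": "3000000"}
encoded = json.dumps(params)
print(encoded)  # → {"idrinterval": "20", "maxbitrate": "3000000"}
```

The resulting string can then be passed as the value of --custom-encoding-params.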

Note

Custom encoding parameters are for expert users who need fine-grained control over video encoding. Incorrect values may cause encoding failures or poor-quality output. To configure the NVENC encoder, refer to the properties of the DeepStream H264 encoder plugin.

Arguments to Control Feature Behavior#

The following arguments affect the overall behavior of the feature, such as enabling or disabling temporal filtering or gaze redirection.

  • temporal - (UINT32) Flag to control temporal filtering (default 0xffffffff, enabled). When enabled, the landmark computation for eye contact is temporally optimized.

  • detect_closure - (UINT32) Flag to toggle detection of eye closure and occlusion. Turning this off disables blink and occlusion detection, which might be desirable in estimation-only mode if you still want gaze estimates when the eyes are occluded. Not recommended for gaze redirection. Value is either 0 or 1 (default 0).

  • eye_size_sensitivity - (UINT32) Eye size sensitivity parameter that modifies the blending parameters to use a larger region around the eyes for blending. Integer value from 2 to 6 (default 3).

Randomized Look Away Parameters#

A continuous redirection of gaze to look at the camera might give a perception of staring. Some users might find this effect unnatural or undesired. To occasionally break eye contact, you can enable randomized look away in gaze redirection. Although the gaze is always expected to redirect toward the camera within the range of operation, enabling look away makes the user occasionally break gaze lock to the camera with a micro-movement of the eyes at randomly chosen time intervals. The enable_lookaway parameter must be set to 1 to enable this feature. Additionally, you can use the optional parameters lookaway_max_offset, lookaway_interval_min, and lookaway_interval_range to tune the extent and frequency of look away.

  • enable_lookaway - (UINT32) Flag to toggle look away. If set to on, the eyes are redirected to look away for a random period occasionally to avoid staring. Value is either 0 or 1 (default 0).

  • lookaway_max_offset - (UINT32) Maximum value of gaze offset angle (degrees) during a random look away when look away is enabled. Requires enable_lookaway to be set to 1. Integer value from 1 to 10 (default 5).

  • lookaway_interval_min - (UINT32) Minimum limit for the number of frames at which random look away occurs when look away is enabled. Requires enable_lookaway to be set to 1. Integer value from 1 to 600 (default 3).

  • lookaway_interval_range - (UINT32) Range for picking the number of frames at which random look away occurs when look away is enabled. Requires enable_lookaway to be set to 1. Integer value from 1 to 600 (default 8).
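One way to read the two interval parameters together is that the next look away occurs after a frame count drawn between the minimum and the minimum plus the range. The following is a hypothetical sketch of that interpretation, not the NIM's actual implementation:

```python
import random

def next_lookaway_frame(interval_min: int = 3, interval_range: int = 8) -> int:
    """Pick the frame count until the next look away, assuming a uniform
    draw over [interval_min, interval_min + interval_range].
    Defaults mirror lookaway_interval_min and lookaway_interval_range."""
    return random.randint(interval_min, interval_min + interval_range)

# With the defaults, every draw falls between 3 and 11 frames.
sample = next_lookaway_frame()
```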

Range Control#

The gaze redirection feature redirects the eyes to look at the camera within a certain range of head and eye motion in which eye contact is desired and looks natural. Beyond this range, the feature gradually transitions away from looking at the camera toward the estimated gaze and eventually turns off in a seamless manner. To accommodate various use cases and user preferences, optional range parameters let you control the range of gaze angles and head poses in which gaze redirection occurs, as well as the range in which the transition occurs before redirection turns off.

gaze_pitch_threshold_low and gaze_yaw_threshold_low define the parameters for the pitch and yaw angles of the estimated gaze within which gaze is redirected toward the camera. Beyond these angles, redirected gaze transitions away from the camera and toward the estimated gaze, turning off redirection beyond gaze_pitch_threshold_high and gaze_yaw_threshold_high respectively.

Similarly, head_pitch_threshold_low and head_yaw_threshold_low define the parameters for pitch and yaw angles of the head pose within which gaze is redirected toward the camera. Beyond these angles, redirected gaze transitions away from the camera and toward the estimated gaze, turning off redirection beyond head_pitch_threshold_high and head_yaw_threshold_high.

  • gaze_pitch_threshold_low - (FP32) Gaze pitch threshold (degrees) at which the redirection starts transitioning away from camera toward estimated gaze. Float value from 10 to 35 (default 25).

  • gaze_pitch_threshold_high - (FP32) Gaze pitch threshold (degrees) at which the redirection is equal to estimated gaze and the gaze redirection is turned off beyond this angle. Float value from 10 to 35 (default 30).

  • gaze_yaw_threshold_low - (FP32) Gaze yaw threshold (degrees) at which the redirection starts transitioning away from camera toward estimated gaze. Float value from 10 to 35 (default 20).

  • gaze_yaw_threshold_high - (FP32) Gaze yaw threshold (degrees) at which the redirection is equal to estimated gaze and the gaze redirection is turned off beyond this angle. Float value from 10 to 35 (default 30).

  • head_pitch_threshold_low - (FP32) Head pose pitch threshold (degrees) of the estimated head pose at which redirection starts transitioning away from camera and toward the estimated gaze. Float value from 10 to 35 (default 20).

  • head_pitch_threshold_high - (FP32) Head pose pitch threshold (degrees) of the estimated head pose at which redirection equals the estimated gaze and redirection is turned off beyond this angle. Float value from 10 to 35 (default 25).

  • head_yaw_threshold_low - (FP32) Head pose yaw threshold (degrees) at which the redirection starts transitioning away from camera toward estimated gaze. Float value from 10 to 35 (default 25).

  • head_yaw_threshold_high - (FP32) Head pose yaw threshold (degrees) of the estimated head pose at which redirection equals the estimated gaze and redirection is turned off beyond this angle. Float value from 10 to 35 (default 30).
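The transition between a low and a high threshold can be pictured as a blend between fully camera-directed gaze and the estimated gaze. The following is a hypothetical illustration using a linear falloff; the actual blending curve used by the NIM is not documented here:

```python
def redirection_weight(angle: float, low: float, high: float) -> float:
    """Weight of camera-directed gaze for a given angle (degrees):
    1.0 at or below `low`, 0.0 at or beyond `high`, and a linear
    blend in between. Assumes low < high."""
    a = abs(angle)
    if a <= low:
        return 1.0
    if a >= high:
        return 0.0
    return (high - a) / (high - low)

# With the default gaze pitch thresholds (low=25, high=30):
print(redirection_weight(25.0, 25.0, 30.0))  # → 1.0 (fully redirected to camera)
print(redirection_weight(27.5, 25.0, 30.0))  # → 0.5 (halfway through the transition)
print(redirection_weight(30.0, 25.0, 30.0))  # → 0.0 (redirection turned off)
```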

For more details on the technical aspects of this algorithm, see the technical blog post Improve Human Connection in Video Conferences with NVIDIA Maxine Eye Contact.