Maxine Eye Contact (Latest)

Advanced Usage

This section provides a detailed breakdown of the inferencing script for more advanced users.

Ensure that the provided client software is on your PYTHONPATH, then run the following code to set up the client.


import os
import sys

import grpc

sys.path.append(os.path.join(os.getcwd(), "../interfaces"))

# Import the gRPC compiler auto-generated Maxine Eye Contact library
from eye_contact import eyecontact_pb2, eyecontact_pb2_grpc

The NIM invocation uses bidirectional gRPC streaming. To generate the request data stream, define a Python generator function: a simple function that yields values one at a time, where each yield returns a chunk to be streamed. When feature parameters are supplied, the first item in the stream is a configuration object that sets the NVIDIA Maxine Eye Contact feature parameters.


def generate_request_for_inference(
    input_filepath: str = "input.mp4", params: dict = {}
):
    """Generator to produce the request data stream

    Args:
        input_filepath: Path to input file
        params: Parameters for the feature
    """
    DATA_CHUNKS = 64 * 1024  # bytes; the mp4 file is sent in 64 KB chunks
    if params:
        # If params is supplied, the first item in the input stream
        # is a config object carrying the parameters
        yield eyecontact_pb2.RedirectGazeRequest(
            config=eyecontact_pb2.RedirectGazeConfig(**params)
        )
    with open(input_filepath, "rb") as fd:
        while True:
            buffer = fd.read(DATA_CHUNKS)
            if buffer == b"":
                break
            yield eyecontact_pb2.RedirectGazeRequest(video_file_data=buffer)

The following parameters are available for use in this NIM (an example parameter dictionary is shown after the list):

  • temporal - (UINT32) Flag to control temporal filtering (default 0xffffffff). When enabled, the landmark computation for eye contact is temporally optimized.

  • detect_closure - (UINT32) Flag to toggle detection of eye closure and occlusion on/off. Value is either 0 or 1 (default 0).

  • eye_size_sensitivity - (UINT32) Eye size sensitivity parameter, an integer value between 2 and 6 (default 3).

  • enable_lookaway - (UINT32) Flag to toggle look-away on/off. If enabled, the eyes are occasionally redirected to look away for a random period to avoid staring. Value is either 0 or 1 (default 0).

  • lookaway_max_offset - (UINT32) Maximum gaze offset angle (degrees) during a random look-away, an integer value between 1 and 10 (default 5).

  • lookaway_interval_min - (UINT32) Minimum number of frames at which a random look-away occurs, an integer value between 1 and 600 (default 100).

  • lookaway_interval_range - (UINT32) Range for picking the number of frames at which a random look-away occurs, an integer value between 1 and 600 (default 250).

  • gaze_pitch_threshold_low - (FP32) Gaze pitch threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 20).

  • gaze_pitch_threshold_high - (FP32) Gaze pitch threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 30).

  • gaze_yaw_threshold_low - (FP32) Gaze yaw threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 20).

  • gaze_yaw_threshold_high - (FP32) Gaze yaw threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 30).

  • head_pitch_threshold_low - (FP32) Head pose pitch threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 15).

  • head_pitch_threshold_high - (FP32) Head pose pitch threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 25).

  • head_yaw_threshold_low - (FP32) Head pose yaw threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 25).

  • head_yaw_threshold_high - (FP32) Head pose yaw threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 30).
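
For illustration, here is a minimal sketch of a parameter dictionary using names from the list above; the specific values are examples only, not recommendations:

# Example parameter dictionary; values are illustrative only
params = {
    "detect_closure": 1,        # toggle eye closure/occlusion detection on
    "eye_size_sensitivity": 4,  # integer between 2 and 6
    "enable_lookaway": 1,       # occasionally look away to avoid staring
    "lookaway_max_offset": 5,   # degrees, integer between 1 and 10
}

# When supplied, this dictionary becomes the first item of the request stream:
# eyecontact_pb2.RedirectGazeRequest(
#     config=eyecontact_pb2.RedirectGazeConfig(**params)
# )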

Before invoking the NIM, define a function that handles the incoming stream and writes it to an output file. For more details on the technical aspects of this algorithm, please refer to the technical blog.


from typing import Iterator


def write_output_file_from_response(
    response_iter: Iterator[eyecontact_pb2.RedirectGazeResponse],
    output_filepath: str = "output.mp4",
) -> None:
    """Write the output file from the incoming gRPC data stream.

    Args:
        response_iter: Responses from the server
        output_filepath: Path to output file
    """
    with open(output_filepath, "wb") as fd:
        for response in response_iter:
            fd.write(response.video_file_data)

Now that the request generator and output writer are set up, connect to the NIM and invoke it. The input file path is held in the variable input_filepath, and the output file is written to the location in the variable output_filepath. params is a Python dictionary that holds feature parameter name and value pairs; if params is empty, default values are used.

Wait for a message confirming that the function invocation has completed before checking the output file. Fill in the correct host and port for your target in the code snippet below:


import time

input_filepath = "../assets/sample_input.mp4"
output_filepath = "output.mp4"
params = {}
# params = {"eye_size_sensitivity": 4, "detect_closure": 1}  # example of setting parameters

with grpc.insecure_channel(target="localhost:8004") as channel:
    try:
        stub = eyecontact_pb2_grpc.MaxineEyeContactServiceStub(channel)
        start_time = time.time()
        responses = stub.RedirectGaze(
            generate_request_for_inference(input_filepath=input_filepath, params=params)
        )
        if params:
            # If we passed a config, the first output in the stream
            # is an echo, which we ignore
            _ = next(responses)
        write_output_file_from_response(
            response_iter=responses, output_filepath=output_filepath
        )
        end_time = time.time()
        print(
            f"Function invocation completed in {end_time - start_time:.2f}s, the output file is generated."
        )
    except BaseException as e:
        print(e)

To run the server in multi-input concurrent mode, set the environment variable MAXINE_MAX_CONCURRENCY_PER_GPU to an integer greater than 1 in the server container. The server will then accept as many concurrent inputs per GPU as specified by the MAXINE_MAX_CONCURRENCY_PER_GPU variable.

Since Triton distributes the workload equally across all GPUs, if there are NUM_GPUS GPUs, the total number of concurrent inputs supported by the server will be NUM_GPUS * MAXINE_MAX_CONCURRENCY_PER_GPU.
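
As a client-side illustration, the snippet below submits several inputs in parallel with a thread pool, reusing the helper functions defined above. This is a minimal sketch: the input file names are hypothetical, and it assumes the server was started with a MAXINE_MAX_CONCURRENCY_PER_GPU value large enough to accept the inputs concurrently.

from concurrent.futures import ThreadPoolExecutor


def process_one(input_filepath: str, output_filepath: str) -> None:
    # Each worker opens its own channel and streams one file end to end
    with grpc.insecure_channel(target="localhost:8004") as channel:
        stub = eyecontact_pb2_grpc.MaxineEyeContactServiceStub(channel)
        responses = stub.RedirectGaze(
            generate_request_for_inference(input_filepath=input_filepath)
        )
        write_output_file_from_response(
            response_iter=responses, output_filepath=output_filepath
        )


# Hypothetical input files
inputs = ["input_0.mp4", "input_1.mp4"]
with ThreadPoolExecutor(max_workers=len(inputs)) as pool:
    for i, path in enumerate(inputs):
        pool.submit(process_one, path, f"output_{i}.mp4")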

When the container starts for the first time, it will download the required models from NGC. To avoid downloading the models on subsequent runs, you can cache them locally by using a cache directory:


# Create the cache directory on the host machine
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
chmod 777 $LOCAL_NIM_CACHE

# Run the container with the cache directory mounted in the appropriate location
docker run -it --rm --name=maxine-eye-contact-nim \
  --net host \
  --runtime=nvidia \
  --gpus all \
  --shm-size=8GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -e MAXINE_MAX_CONCURRENCY_PER_GPU=1 \
  -e NIM_HTTP_API_PORT=9000 \
  -e NIM_GRPC_API_PORT=50051 \
  -p 9000:9000 \
  -p 50051:50051 \
  -v "$LOCAL_NIM_CACHE:/home/nvs/.cache/nim" \
  nvcr.io/nim/nvidia/maxine-eye-contact:latest
