Advanced Usage
This section provides a detailed breakdown of the inferencing script for more advanced users.
Ensure that the provided client software is in PYTHONPATH, and run the following code to set up the client.
import os
import sys
import grpc
sys.path.append(os.path.join(os.getcwd(), "../interfaces"))
# Import the gRPC-compiler auto-generated Maxine Eye Contact library
from eye_contact import eyecontact_pb2, eyecontact_pb2_grpc
The NIM invocation uses bidirectional gRPC streaming. To generate the request data stream, define a Python generator function: a function that yields one value per call, each yield returning a chunk to be streamed. If feature parameters are supplied, the first item in the stream is a configuration object that sets the NVIDIA Maxine Eye Contact feature parameters.
def generate_request_for_inference(
    input_filepath: str = "input.mp4", params: dict = {}
):
    """Generator to produce the request data stream

    Args:
        input_filepath: Path to input file
        params: Parameters for the feature
    """
    DATA_CHUNKS = 64 * 1024  # bytes, we send the mp4 file in 64KB chunks
    if params:  # if params is supplied, the first item in the input stream is a config object with parameters
        yield eyecontact_pb2.RedirectGazeRequest(
            config=eyecontact_pb2.RedirectGazeConfig(**params)
        )
    with open(input_filepath, "rb") as fd:
        while True:
            buffer = fd.read(DATA_CHUNKS)
            if buffer == b"":
                break
            yield eyecontact_pb2.RedirectGazeRequest(video_file_data=buffer)
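The config-first streaming pattern above can be sketched without gRPC or the generated protobuf classes. The `FakeConfig` and `FakeRequest` classes below are hypothetical stand-ins for the generated message types, used only to illustrate how a consumer sees one optional configuration item followed by fixed-size data chunks.

```python
import io
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for the generated protobuf request types
@dataclass
class FakeConfig:
    detect_closure: int = 0

@dataclass
class FakeRequest:
    config: Optional[FakeConfig] = None
    video_file_data: bytes = b""

def generate_requests(data: bytes, params: dict = None, chunk_size: int = 4):
    """Yield an optional config item first, then fixed-size data chunks."""
    if params:
        yield FakeRequest(config=FakeConfig(**params))
    stream = io.BytesIO(data)
    while True:
        buffer = stream.read(chunk_size)
        if buffer == b"":
            break
        yield FakeRequest(video_file_data=buffer)

items = list(generate_requests(b"abcdefgh", params={"detect_closure": 1}))
# The first item carries only the config; the rest carry 4-byte chunks
assert items[0].config is not None
assert [r.video_file_data for r in items[1:]] == [b"abcd", b"efgh"]
```

The real generator works the same way, except the items are `RedirectGazeRequest` messages and the chunks come from the input mp4 file.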
The following parameters are available for use in this NIM:
- temporal - (UINT32) Flag to control temporal filtering (default 0xffffffff). When set to true, the landmark computation for eye contact is temporally optimized.
- detect_closure - (UINT32) Flag to toggle detection of eye closure and occlusion on/off. Value is either 0 or 1 (default 0).
- eye_size_sensitivity - (UINT32) Eye size sensitivity parameter, an integer value between 2 and 6 (default 3).
- enable_lookaway - (UINT32) Flag to toggle look away on/off. If set to on, the eyes are occasionally redirected to look away for a random period to avoid staring. Value is either 0 or 1 (default 0).
- lookaway_max_offset - (UINT32) Maximum value of the gaze offset angle (degrees) during a random look away, an integer value between 1 and 10 (default 5).
- lookaway_interval_min - (UINT32) Minimum limit for the number of frames at which random look away occurs, an integer value between 1 and 600 (default 100).
- lookaway_interval_range - (UINT32) Range for picking the number of frames at which random look away occurs, an integer value between 1 and 600 (default 250).
- gaze_pitch_threshold_low - (FP32) Gaze pitch threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 20).
- gaze_pitch_threshold_high - (FP32) Gaze pitch threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 30).
- gaze_yaw_threshold_low - (FP32) Gaze yaw threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 20).
- gaze_yaw_threshold_high - (FP32) Gaze yaw threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 30).
- head_pitch_threshold_low - (FP32) Head pose pitch threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 15).
- head_pitch_threshold_high - (FP32) Head pose pitch threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 25).
- head_yaw_threshold_low - (FP32) Head pose yaw threshold (degrees) at which the redirection starts transitioning away from the camera towards the estimated gaze, a float between 10 and 35 (default 25).
- head_yaw_threshold_high - (FP32) Head pose yaw threshold (degrees) at which the redirection is equal to the estimated gaze, a float between 10 and 35 (default 30).
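Out-of-range values can be caught client-side before a request is sent. The helper below is not part of the NIM client; it is a small illustrative sketch that checks a params dictionary against the documented ranges for a subset of the integer parameters listed above.

```python
# Documented (min, max) ranges for a subset of the integer parameters
INT_RANGES = {
    "detect_closure": (0, 1),
    "eye_size_sensitivity": (2, 6),
    "enable_lookaway": (0, 1),
    "lookaway_max_offset": (1, 10),
    "lookaway_interval_min": (1, 600),
    "lookaway_interval_range": (1, 600),
}

def validate_params(params: dict) -> list:
    """Return a list of human-readable errors for out-of-range values."""
    errors = []
    for name, value in params.items():
        if name in INT_RANGES:
            lo, hi = INT_RANGES[name]
            if not lo <= value <= hi:
                errors.append(f"{name}={value} outside [{lo}, {hi}]")
    return errors

assert validate_params({"eye_size_sensitivity": 4, "detect_closure": 1}) == []
assert validate_params({"eye_size_sensitivity": 9}) == ["eye_size_sensitivity=9 outside [2, 6]"]
```

Validating locally gives clearer error messages than waiting for the server to reject a malformed configuration.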
Before invoking the NIM, define a function that handles the incoming stream and writes it to an output file. For more details on the technical aspects of this algorithm, please refer to the technical blog.
from typing import Iterator
def write_output_file_from_response(
    response_iter: Iterator[eyecontact_pb2.RedirectGazeResponse],
    output_filepath: str = "output.mp4",
) -> None:
    """Function to write the output file from the incoming gRPC data stream.

    Args:
        response_iter: Responses from the server
        output_filepath: Path to output file
    """
    with open(output_filepath, "wb") as fd:
        for response in response_iter:
            fd.write(response.video_file_data)
Now that we have the request generator and output iterator set up, connect to the NIM and invoke it. The input file path is in the variable input_filepath, and the output file is written at the location in the variable output_filepath. params is a Python dictionary that holds the feature parameter name and value pairs; if params is empty, default values are used.
Wait for a message confirming that the function invocation has completed before checking the output file. Fill in the correct host and port for your target in the code snippet below:
import time

input_filepath = "../assets/sample_input.mp4"
output_filepath = "output.mp4"
params = {}
# params = {"eye_size_sensitivity": 4, "detect_closure": 1}  # example of setting parameters

with grpc.insecure_channel(target="localhost:8004") as channel:
    try:
        stub = eyecontact_pb2_grpc.MaxineEyeContactServiceStub(channel)
        start_time = time.time()
        responses = stub.RedirectGaze(
            generate_request_for_inference(input_filepath=input_filepath, params=params)
        )
        if params:
            _ = next(responses)  # if we passed the config, the first output
            # in the stream will be an echo which we will ignore
        write_output_file_from_response(
            response_iter=responses, output_filepath=output_filepath
        )
        end_time = time.time()
        print(
            f"Function invocation completed in {end_time - start_time:.2f}s, the output file is generated."
        )
    except BaseException as e:
        print(e)
To run the server in multi-input concurrent mode, set the environment variable MAXINE_MAX_CONCURRENCY_PER_GPU to an integer greater than 1 in the server container. The server will then accept as many concurrent inputs per GPU as specified by the MAXINE_MAX_CONCURRENCY_PER_GPU variable.
Since Triton distributes the workload equally across all GPUs, if there are NUM_GPUS GPUs, the total number of concurrent inputs supported by the server is NUM_GPUS * MAXINE_MAX_CONCURRENCY_PER_GPU.
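The capacity arithmetic can be sketched directly; the function name below is illustrative, not part of the NIM API.

```python
def total_concurrent_inputs(num_gpus: int, per_gpu: int) -> int:
    """Triton spreads work evenly across GPUs, so capacity scales linearly."""
    return num_gpus * per_gpu

# e.g. 2 GPUs with MAXINE_MAX_CONCURRENCY_PER_GPU=4 -> 8 concurrent inputs
assert total_concurrent_inputs(2, 4) == 8
```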
When the container starts for the first time, it will download the required models from NGC. To avoid downloading the models on subsequent runs, you can cache them locally by using a cache directory:
# Create the cache directory on the host machine
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
chmod 777 "$LOCAL_NIM_CACHE"
# Run the container with the cache directory mounted in the appropriate location
docker run -it --rm --name=maxine-eye-contact-nim \
--net host \
--runtime=nvidia \
--gpus all \
--shm-size=8GB \
-e NGC_API_KEY=$NGC_API_KEY \
-e MAXINE_MAX_CONCURRENCY_PER_GPU=1 \
-e NIM_HTTP_API_PORT=9000 \
-e NIM_GRPC_API_PORT=50051 \
-p 9000:9000 \
-p 50051:50051 \
-v "$LOCAL_NIM_CACHE:/home/nvs/.cache/nim" \
nvcr.io/nim/nvidia/maxine-eye-contact:latest