Basic Inference#
Perform a health check on the gRPC endpoint.
Install grpcurl from github.com/fullstorydev/grpcurl/releases. Example commands to run on Ubuntu:
wget https://github.com/fullstorydev/grpcurl/releases/download/v1.9.1/grpcurl_1.9.1_linux_amd64.deb
sudo dpkg -i grpcurl_1.9.1_linux_amd64.deb
Download the health checking proto:
wget https://raw.githubusercontent.com/grpc/grpc/master/src/proto/grpc/health/v1/health.proto
Run the health check:
grpcurl --plaintext --proto health.proto localhost:8001 grpc.health.v1.Health/Check
If the service is ready, you get a response similar to the following:
{ "status": "SERVING" }
Note
To use grpcurl with an SSL-enabled server, avoid the --plaintext argument; instead, use --cacert with a CA certificate, --key with a private key, or --cert with a certificate file. For more details, refer to grpcurl --help.
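The equivalent setup in Python uses grpc.ssl_channel_credentials. A minimal sketch, assuming the certificate file paths (they mirror the sample client's defaults shown later on this page):
import grpc

# Load the CA certificate; the key and certificate chain are needed only for mTLS.
with open("../ssl_key/ssl_ca_cert.pem", "rb") as f:
    root_cert = f.read()
with open("../ssl_key/ssl_key_client.pem", "rb") as f:
    private_key = f.read()
with open("../ssl_key/ssl_cert_client.pem", "rb") as f:
    cert_chain = f.read()

creds = grpc.ssl_channel_credentials(
    root_certificates=root_cert,
    private_key=private_key,       # omit for one-way TLS
    certificate_chain=cert_chain,  # omit for one-way TLS
)
channel = grpc.secure_channel("localhost:8001", creds)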
Download the Eye Contact Python client code by cloning the Clients repository (NVIDIA-Maxine/nim-clients):
git clone https://github.com/NVIDIA-Maxine/nim-clients.git

# Go to the 'eye-contact' folder
cd nim-clients/eye-contact/
Install the required dependencies:
sudo apt-get install python3-pip
pip install -r requirements.txt
Compile the Protos (Optional)#
If you want to use the client code provided in the Clients repository (NVIDIA-Maxine/nim-clients), you can skip this step.
The proto files are available in the eye-contact/protos folder. You can compile them to generate client interfaces in your preferred programming language. For more details, refer to Supported languages in the gRPC documentation.
The following is an example of how to compile the protos for Python on Linux and Windows.
The grpcio version needed for compilation is listed in requirements.txt.
To compile protos on Linux:
# Go to eye-contact/protos/linux folder
cd eye-contact/protos/linux
chmod +x compile_protos.sh
./compile_protos.sh
To compile protos on Windows:
# Go to eye-contact/protos/windows folder
cd eye-contact/protos/windows
./compile_protos.bat
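The compile scripts wrap a protoc invocation. If you prefer to call the compiler directly from Python, the following is a rough equivalent using the grpc_tools package (the proto filename below is a placeholder; use the actual files in the eye-contact/protos folder):
from grpc_tools import protoc

# Generates <name>_pb2.py (message classes) and <name>_pb2_grpc.py (service stubs).
protoc.main([
    "grpc_tools.protoc",
    "-I", ".",                 # proto import path
    "--python_out", ".",       # output directory for message classes
    "--grpc_python_out", ".",  # output directory for gRPC stubs
    "eyecontact.proto",        # placeholder filename
])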
Input and Output#
The input and output of the NVIDIA Eye Contact NIM are mp4 files. The input file must use the H.264 codec for video and can include audio. The output video file also uses the H.264 codec for video and contains the same audio as the input video file.
Note
Videos with Variable Frame Rate (VFR) are not supported.
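Because VFR input is rejected, it can help to check a file before sending it. A heuristic sketch using ffprobe (comparing the nominal and average frame rates is a common heuristic, not an official validation):
import json
import subprocess

def looks_vfr(path):
    """Heuristic: for constant-frame-rate files, ffprobe's nominal
    (r_frame_rate) and average (avg_frame_rate) rates usually match."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate,avg_frame_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    stream = json.loads(out)["streams"][0]
    return stream["r_frame_rate"] != stream["avg_frame_rate"]

print(looks_vfr("../assets/sample_transactional.mp4"))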
Input Modes#
The NVIDIA Eye Contact NIM supports two distinct input processing modes, depending on the capabilities of the input file:
Transactional Mode (Default)#
In transactional mode, the entire input video file must be received and processed as a complete unit by the NIM before returning results.
This mode is suitable for the following use cases:
Processing of small video files, as these files are copied to the NIM in their entirety before inference can begin.
Applications that can wait for complete processing before receiving output.
Videos that are not optimized for streaming.
To run the NVIDIA Eye Contact NIM in transactional mode, simply run the NVIDIA Eye Contact sample client without any additional flags:
Go to the scripts directory:
cd scripts
Send a gRPC request:
python eye-contact.py --target <server_ip:port> \
--input <input file path> \
--output <output file path along with file name>
Streaming Mode#
In streaming mode, the NIM can process incoming video frames incrementally as soon as frame information is available. The output frames are streamed back to the client immediately after processing.
This is the preferred mode for the following use cases:
Streamable video inputs.
Applications that do not want to wait until the entire file is completely uploaded.
Large video files that benefit from incremental processing and are therefore not constrained by the disk space available on the server.
| Aspect | Transactional Mode | Streaming Mode |
|---|---|---|
| Data Storage | Entire video and audio files are temporarily copied to disk | Only the frames being processed are temporarily held in memory |
| Processing Start | NIM waits to receive the entire file before starting | NIM starts processing as soon as the data chunk for the first frame arrives |
| Processing Timing | Processing begins after all data is received | Continuous processing without waiting for complete input |
| Output Delivery | The complete output video is returned to the client after inference finishes for the whole video | Output frames are generated and returned immediately |
The --streaming option works with streamable videos, in which the metadata is positioned at the beginning of the file. Videos that are not streamable can be easily converted to a streamable format.
To make any video streamable, use FFmpeg with the following command:
ffmpeg -i sample_video.mp4 -movflags +faststart sample_streamable.mp4
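The +faststart flag moves the moov box (the container metadata) ahead of the mdat box (the media data). If you want to verify the result, the following is a small sketch that walks the top-level MP4 boxes (a structural check of the container, not an official NIM validation):
import struct

def is_streamable(path):
    """Return True if the 'moov' box precedes the 'mdat' box."""
    with open(path, "rb") as f:
        while header := f.read(8):
            if len(header) < 8:
                return False
            size, box = struct.unpack(">I4s", header)
            if box == b"moov":
                return True
            if box == b"mdat":
                return False
            if size == 1:  # 64-bit extended size follows the header
                size = struct.unpack(">Q", f.read(8))[0]
                f.seek(size - 16, 1)
            elif size == 0:  # box extends to the end of the file
                return False
            else:
                f.seek(size - 8, 1)
    return False

print(is_streamable("sample_streamable.mp4"))  # expected: True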
You can then specify the streamable video as input to the NIM by using the --input parameter and enabling the --streaming option.
python eye-contact.py --target <server_ip:port> \
--input <input file path> \
--output <output file path along with file name> \
--streaming
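Under the hood, streaming mode follows the standard gRPC bidirectional-streaming pattern: the client yields the input file in chunks while consuming output frames as they arrive. The following is a simplified sketch of that pattern; the message, stub, field, and method names are illustrative assumptions, since the real ones come from the interfaces compiled from eye-contact/protos:
import grpc
# Hypothetical generated modules; the real names come from the compiled protos.
# import eyecontact_pb2, eyecontact_pb2_grpc

CHUNK_SIZE = 64 * 1024  # chunk size is an assumption, not a NIM requirement

def chunked_requests(path, request_cls):
    # Yield the video as a stream of messages so the server can start
    # inference as soon as the first frame's data arrives.
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            yield request_cls(video_file_data=chunk)  # field name: assumption

def run(target, in_path, out_path, stub_cls, request_cls):
    with grpc.insecure_channel(target) as channel:
        stub = stub_cls(channel)
        # Bidirectional streaming: output frames arrive while the input
        # is still being uploaded.
        responses = stub.Process(chunked_requests(in_path, request_cls))  # method name: assumption
        with open(out_path, "wb") as out:
            for resp in responses:
                out.write(resp.video_file_data)  # field name: assumption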
Note
The first inference is not indicative of the model’s actual performance because it includes the time taken by the Triton Inference Server to load the models in addition to the time required to process the inference request.
Usage for Preview API Request#
python eye-contact.py --preview-mode \
--target grpc.nvcf.nvidia.com:443 \
--function-id <function_id> \
--api-key $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC \
--input <input_file_path> \
--output <output_file_path>
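In preview mode, the client authenticates each call with gRPC metadata rather than channel flags. The following is a sketch of that pattern; the header names follow common NVCF conventions and are assumptions here:
import grpc

api_key = "nvapi-..."            # your NGC API key
function_id = "your-function-id" # NVCF function ID

# The NVCF endpoint is served over TLS, so use default SSL credentials.
channel = grpc.secure_channel("grpc.nvcf.nvidia.com:443",
                              grpc.ssl_channel_credentials())
metadata = [
    ("authorization", f"Bearer {api_key}"),  # header name: assumption
    ("function-id", function_id),            # header name: assumption
]
# Pass the metadata with each stub call, for example:
# responses = stub.Process(request_iterator, metadata=metadata)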
To view details of command-line arguments, run this command:
python eye-contact.py -h
You get a response similar to the following. All parameters are optional.
options:
-h, --help show this help message and exit
--preview-mode Flag to send request to preview NVCF NIM server on https://build.nvidia.com/nvidia/eyecontact/api. (default: False)
--ssl-mode {DISABLED,MTLS,TLS} Flag to set SSL mode. (default: DISABLED)
--ssl-key SSL_KEY The path to ssl private key. (default: ../ssl_key/ssl_key_client.pem)
--ssl-cert SSL_CERT The path to ssl certificate chain. (default: ../ssl_key/ssl_cert_client.pem)
--ssl-root-cert SSL_ROOT_CERT The path to ssl root certificate. (default: ../ssl_key/ssl_ca_cert.pem)
--target TARGET IP:port of gRPC service, when hosted locally. Use grpc.nvcf.nvidia.com:443 when hosted on NVCF. (default: 127.0.0.1:8001)
--api-key API_KEY NGC API key required for authentication; utilized when using TRY API, ignored otherwise. (default: None)
--function-id FUNCTION_ID NVCF function ID for the service; utilized when using TRY API, ignored otherwise. (default: None)
--input INPUT The path to the input video file. (default: ../assets/sample_transactional.mp4)
--output OUTPUT The path for the output video file. (default: output.mp4)
--streaming Flag to enable grpc streaming mode. Required for streamable video input. (default: False)
--bitrate BITRATE Output video bitrate in bps. This is only applicable when lossless mode is disabled. (default: 3000000)
--idr-interval IDR_INTERVAL The interval for IDR frames in the output video. This is only applicable when lossless mode is disabled. (default: 8)
--lossless Flag to enable lossless mode for video encoding. (default: False)
--custom-encoding-params CUSTOM_ENCODING_PARAMS Custom encoding parameters in JSON format. (default: None)
--temporal TEMPORAL Flag to control temporal filtering. (default: 4294967295)
--detect-closure DETECT_CLOSURE Flag to toggle detection of eye closure and occlusion on/off. (default: 0)
--eye-size-sensitivity EYE_SIZE_SENSITIVITY Eye size sensitivity parameter (range: [2, 6]). (default: 3)
--enable-lookaway {0,1} Flag to toggle look away on/off. (default: 0)
--lookaway-max-offset LOOKAWAY_MAX_OFFSET Maximum value of gaze offset angle (degrees) during a random look away (range: [1, 10]). (default: 5)
--lookaway-interval-min LOOKAWAY_INTERVAL_MIN Minimum limit for the number of frames at which random look away occurs (range: [1, 600]). (default: 100)
--lookaway-interval-range LOOKAWAY_INTERVAL_RANGE Range for picking the number of frames at which random look away occurs (range: [1, 600]). (default: 250)
--gaze-pitch-threshold-low GAZE_PITCH_THRESHOLD_LOW Gaze pitch threshold (degrees) at which the redirection starts transitioning (range: [10, 35]). (default: 20.0)
--gaze-pitch-threshold-high GAZE_PITCH_THRESHOLD_HIGH Gaze pitch threshold (degrees) at which the redirection is equal to estimated gaze (range: [10, 35]). (default: 30.0)
--gaze-yaw-threshold-low GAZE_YAW_THRESHOLD_LOW Gaze yaw threshold (degrees) at which the redirection starts transitioning (range: [10, 35]). (default: 20.0)
--gaze-yaw-threshold-high GAZE_YAW_THRESHOLD_HIGH Gaze yaw threshold (degrees) at which the redirection is equal to estimated gaze (range: [10, 35]). (default: 30.0)
--head-pitch-threshold-low HEAD_PITCH_THRESHOLD_LOW Head pose pitch threshold (degrees) at which the redirection starts transitioning away from camera towards estimated gaze (range: [10, 35]). (default: 15.0)
--head-pitch-threshold-high HEAD_PITCH_THRESHOLD_HIGH Head pose pitch threshold (degrees) at which the redirection is equal to estimated gaze (range: [10, 35]). (default: 15.0)
--head-yaw-threshold-low HEAD_YAW_THRESHOLD_LOW Head pose yaw threshold (degrees) at which the redirection starts transitioning (range: [10, 35]). (default: 15.0)
--head-yaw-threshold-high HEAD_YAW_THRESHOLD_HIGH Head pose yaw threshold (degrees) at which the redirection is equal to estimated gaze (range: [10, 35]). (default: 15.0)
Example Commands#
Basic inference with default settings:
python eye-contact.py --target 127.0.0.1:8001 --input ../assets/sample_transactional.mp4 --output output.mp4
Using streaming mode for streamable videos:
python eye-contact.py --target 127.0.0.1:8001 --input ../assets/sample_streamable.mp4 --output output.mp4 --streaming
Advanced configuration example:
python eye-contact.py --target 127.0.0.1:8001 --input ../assets/sample_transactional.mp4 --output output.mp4 --custom-encoding-params '{"control-rate": "1", "bitrate": 3000000, "tuning-info-id": "3", "temporalaq": true}'
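When scripting the advanced configuration, building the JSON with json.dumps avoids shell-quoting mistakes. The following is a small sketch that invokes the sample client as a subprocess (parameter values are copied from the example above):
import json
import subprocess

encoding_params = {
    "control-rate": "1",
    "bitrate": 3000000,
    "tuning-info-id": "3",
    "temporalaq": True,
}

subprocess.run([
    "python", "eye-contact.py",
    "--target", "127.0.0.1:8001",
    "--input", "../assets/sample_transactional.mp4",
    "--output", "output.mp4",
    "--custom-encoding-params", json.dumps(encoding_params),
], check=True)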
Python Notebook#
For an interactive experience and to explore all feature parameters, we provide a comprehensive Python notebook that demonstrates the Eye Contact service capabilities.
The Python notebook is located at examples/maxine-eye-contact.ipynb in the eye-contact client folder.