Overview
NVIDIA Eye Contact NIM uses state-of-the-art AI models to redirect a user's gaze toward the camera in real time, simulating natural eye contact and improving engagement in remote video communication. The NVIDIA Eye Contact NIM models are built on the NVIDIA software platform, incorporating CUDA, TensorRT, and Triton, to provide out-of-the-box GPU acceleration.
Architecture
NVIDIA Eye Contact operates on a region of interest around the eyes, known as the eye patch. The eye patch is extracted from a video frame using the NVIDIA face tracking pipeline, which computes the 2D facial landmarks and the six-degree-of-freedom (6DOF) head pose from that frame. This head pose is then fed into the eye contact network.
The eye contact network has a disentangled encoder-decoder architecture. The encoder estimates the gaze angle from the input eye patch, along with a set of features known as embeddings. Based on these embeddings, the decoder redirects the gaze in the input patch so that the eyes appear to look toward the camera.
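The data flow described above can be sketched as follows. This is only an illustrative toy, not the NVIDIA model: the class `ToyEyeContactNet`, the patch size, the embedding size, the random linear weights, and the convention that a gaze of (0, 0) means "looking at the camera" are all assumptions made for the example. It shows the disentangled interface only: the encoder returns a gaze estimate separately from the embedding, and the decoder is driven by the embedding plus a replacement gaze.

```python
from dataclasses import dataclass

import numpy as np

PATCH_SHAPE = (32, 64)  # hypothetical eye-patch height and width
EMBED_DIM = 16          # hypothetical embedding size


@dataclass
class EncoderOutput:
    gaze: np.ndarray       # estimated (pitch, yaw), shape (2,)
    embedding: np.ndarray  # remaining features, shape (EMBED_DIM,)


class ToyEyeContactNet:
    """Random linear stand-in for an encoder-decoder; not the real network."""

    def __init__(self, seed: int = 0) -> None:
        rng = np.random.default_rng(seed)
        n = PATCH_SHAPE[0] * PATCH_SHAPE[1]
        self.w_gaze = rng.normal(scale=0.01, size=(2, n))
        self.w_embed = rng.normal(scale=0.01, size=(EMBED_DIM, n))
        self.w_dec = rng.normal(scale=0.01, size=(n, EMBED_DIM + 2))

    def encode(self, patch: np.ndarray) -> EncoderOutput:
        # The gaze and the embedding are produced as separate outputs.
        x = patch.reshape(-1)
        return EncoderOutput(gaze=self.w_gaze @ x, embedding=self.w_embed @ x)

    def decode(self, embedding: np.ndarray, target_gaze: np.ndarray) -> np.ndarray:
        # The decoder sees the embedding plus whichever gaze it is asked to render.
        z = np.concatenate([embedding, target_gaze])
        return (self.w_dec @ z).reshape(PATCH_SHAPE)


def redirect_to_camera(net: ToyEyeContactNet, patch: np.ndarray):
    """Encode the patch, then decode it with the gaze replaced by the target."""
    enc = net.encode(patch)
    target = np.zeros(2)  # assumption: (0, 0) represents looking at the camera
    return net.decode(enc.embedding, target), enc.gaze
```

Because the gaze is kept apart from the embedding, redirection in this sketch amounts to swapping one small vector before decoding, while the rest of the encoded patch is passed through unchanged.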
The final stage of the pipeline involves blending the eye patch back into the original video frame using an inverse transformation. More details on the model can be found here.
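The extract-then-blend round trip can be illustrated with a small NumPy sketch. It assumes the eye patch is obtained from the frame through a 2x3 affine transform and pasted back with that transform's inverse; the helper names (`warp_nearest`, `blend_patch_back`), the nearest-neighbour sampling, the grayscale frame, and the hard-edged paste are all simplifications chosen for the example, not the blending method the NIM uses.

```python
import numpy as np


def to_3x3(matrix: np.ndarray) -> np.ndarray:
    """Promote a 2x3 affine matrix to a 3x3 homogeneous matrix."""
    return np.vstack([matrix, [0.0, 0.0, 1.0]])


def warp_nearest(src: np.ndarray, matrix: np.ndarray, out_shape: tuple):
    """Warp a 2D image with a 2x3 affine matrix mapping src (x, y) to dst (x, y).

    Returns the warped image and a boolean mask of the destination pixels
    whose source location fell inside the source image.
    """
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # For every destination pixel, look up where it came from in the source.
    src_xy = np.linalg.inv(to_3x3(matrix))[:2] @ dst
    sx = np.rint(src_xy[0]).astype(int)
    sy = np.rint(src_xy[1]).astype(int)
    valid = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    out_flat = np.zeros(h * w, dtype=src.dtype)
    out_flat[valid] = src[sy[valid], sx[valid]]
    return out_flat.reshape(out_shape), valid.reshape(out_shape)


def blend_patch_back(frame: np.ndarray, patch: np.ndarray,
                     frame_to_patch: np.ndarray) -> np.ndarray:
    """Paste the patch into the frame using the inverse of the extraction transform."""
    patch_to_frame = np.linalg.inv(to_3x3(frame_to_patch))[:2]
    warped, mask = warp_nearest(patch, patch_to_frame, frame.shape)
    out = frame.copy()
    out[mask] = warped[mask]  # hard paste; a soft mask would feather the seam
    return out


# Usage: extract a 12x8 patch whose top-left corner is at frame (x=10, y=5).
frame = np.arange(600, dtype=float).reshape(20, 30)
frame_to_patch = np.array([[1.0, 0.0, -10.0],
                           [0.0, 1.0, -5.0]])
eye_patch, _ = warp_nearest(frame, frame_to_patch, (8, 12))
redirected = eye_patch + 100.0  # stand-in for the network's output
result = blend_patch_back(frame, redirected, frame_to_patch)
```

Only the pixels covered by the patch change in `result`; the rest of the frame is left untouched, which is the property the inverse transformation is meant to guarantee.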
Try It Out
Try the NVIDIA Eye Contact NIM at build.nvidia.com/nvidia/eyecontact.
To try the NVIDIA Eye Contact NIM API without hosting your own servers, use the Try API feature, which runs on the NVIDIA Cloud Functions (NVCF) backend.