Overview
The TAO Toolkit Computer Vision (CV) Inference Pipeline is a C++ based SDK that provides APIs to build applications on top of inferences from purpose-built, pre-trained AI models. The underlying framework provides a foundation for multimodal applications. For example, the Gaze Estimation sample application combines Face Detection with Facial Landmarks (Fiducial Keypoints) Estimation.
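To illustrate how such a composition fits together, the sketch below chains the three inferences that make up gaze estimation. Every type and function name in it is hypothetical, invented for illustration; it is not the SDK's actual API.

```cpp
// Hypothetical types and function names, for illustration only;
// they do not reflect the SDK's actual interface.
#include <cstdint>
#include <vector>

struct Image     { int width = 0, height = 0; std::vector<uint8_t> rgb; };
struct BBox      { float x = 0, y = 0, w = 0, h = 0; };
struct Landmarks { std::vector<float> points; };
struct Gaze      { float yaw = 0, pitch = 0; };

// In a real client, each stage would issue one inference request against
// a purpose-built model hosted by Triton. Stub bodies keep this compilable.
BBox DetectFace(const Image&)                                  { return {}; }  // FaceNet
Landmarks EstimateLandmarks(const Image&, const BBox&)         { return {}; }  // fiducial keypoints
Gaze EstimateGaze(const Image&, const BBox&, const Landmarks&) { return {}; }  // GazeNet

// Gaze estimation is multimodal: it consumes both the face box and the
// facial landmarks produced by the two upstream models.
Gaze RunGazePipeline(const Image& frame) {
  BBox face    = DetectFace(frame);
  Landmarks lm = EstimateLandmarks(frame, face);
  return EstimateGaze(frame, face, lm);
}

int main() { Image frame; RunGazePipeline(frame); }
```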
The TAO Computer Vision Inference Pipeline is made up of three key components:
NVIDIA Triton Inference Server: Hosts and serves the AI models
NVIDIA TAO Toolkit Converter: Converts pre-trained TAO models into highly optimized TensorRT engines
Inference Client (x86 or L4T): Samples written in C++ that demonstrate how to use the APIs to request Computer Vision inferences (a minimal client sketch follows this list)
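To give a feel for the client/server split, here is a minimal sketch of an inference request against Triton using NVIDIA's open-source Triton C++ gRPC client. The server address, model name, input tensor name, and shape are placeholder assumptions; the actual samples wrap such requests behind the TAO Toolkit CV API.

```cpp
// Minimal Triton gRPC inference request. Assumes the open-source Triton
// client library; model/tensor names and the shape are placeholders.
#include <cstdint>
#include <iostream>
#include <memory>
#include <vector>

#include "grpc_client.h"  // Triton client library header

namespace tc = triton::client;

int main() {
  std::unique_ptr<tc::InferenceServerGrpcClient> client;
  tc::Error err =
      tc::InferenceServerGrpcClient::Create(&client, "localhost:8001");
  if (!err.IsOk()) { std::cerr << err << "\n"; return 1; }

  // Describe the input tensor (placeholder name and shape).
  std::vector<int64_t> shape{1, 3, 416, 736};
  tc::InferInput* raw_input = nullptr;
  tc::InferInput::Create(&raw_input, "input_1", shape, "FP32");
  std::shared_ptr<tc::InferInput> input(raw_input);

  std::vector<float> pixels(1 * 3 * 416 * 736, 0.0f);  // preprocessed frame
  input->AppendRaw(reinterpret_cast<const uint8_t*>(pixels.data()),
                   pixels.size() * sizeof(float));

  // Request inference from the model hosted by Triton.
  tc::InferOptions options("facenet");  // placeholder model name
  std::vector<tc::InferInput*> inputs{input.get()};
  tc::InferResult* result = nullptr;
  err = client->Infer(&result, options, inputs);
  if (!err.IsOk()) { std::cerr << err << "\n"; return 1; }

  delete result;  // the caller owns the returned result
  return 0;
}
```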
The purpose-built AI models supported by this Inference Pipeline include:
Face Detection from FaceNet
Gaze Estimation from GazeNet
| Model | Model output format | Prunable | INT8 | Compatible with TAO Toolkit CV Inference Pipeline |
|---|---|---|---|---|
| | Encrypted ONNX | Yes | Yes | Yes |
| | Encrypted ONNX | No | No | Yes |
| | Encrypted UFF | Yes | Yes | Yes |
| | Encrypted ONNX | No | Yes | Yes |
| | Encrypted ONNX | No | No | Yes |
| | Encrypted ONNX | No | Yes | Yes |
| | Encrypted ONNX | No | No | Yes |
Users can retrain supported TAO networks, drop the optimized TensorRT engines into the NVIDIA Triton Inference Server's model repository, and build their own AI applications and use cases with the TAO Toolkit CV API.
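As an illustration, a retrained model dropped into Triton's model repository might be laid out as follows. The model name, tensor names, and dimensions below are hypothetical placeholders and must match the actual exported network.

```
model_repository/
└── my_retrained_model/        # hypothetical model name
    ├── config.pbtxt
    └── 1/
        └── model.plan          # TensorRT engine from the TAO Toolkit Converter
```

```
# config.pbtxt -- tensor names and dims are placeholders; they must match
# the exported TAO model.
name: "my_retrained_model"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    dims: [ 3, 416, 736 ]
  }
]
output [
  {
    name: "output_bbox"
    data_type: TYPE_FP32
    dims: [ 4, 26, 46 ]
  }
]
```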
The deployment of the Inference Pipeline is managed by the TAO Toolkit CV Quick Start Scripts: bash scripts that pull and start the relevant containers, compile the TAO models into TensorRT engines, and start the Triton Server.