Integrating TAO CV Models with Triton Inference Server
The NVIDIA TAO Toolkit provides an easy-to-use interface for generating accurate, optimized models for a range of computer vision use cases. These models are typically deployed through the DeepStream SDK or Riva pipelines.
Triton Inference Server is NVIDIA's open-source inference serving software for efficiently deploying deep neural networks (DNNs) from several frameworks, such as TensorRT, TensorFlow, and ONNX Runtime. As part of this release, TAO Toolkit now provides a reference application outlining the steps required to deploy a trained model to Triton.
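For context, Triton serves models from a model repository, where each model carries a small configuration file describing its backend and tensor shapes. The sketch below shows what such a layout and config.pbtxt might look like for a TAO DetectNet_v2 model exported as a TensorRT engine; the model name, tensor names, and dimensions here are illustrative assumptions, not values prescribed by the reference application.

```
# Layout of a Triton model repository (all names are illustrative):
#
#   model_repository/
#   └── detectnet_v2_tao/
#       ├── config.pbtxt
#       └── 1/
#           └── model.plan   <- TensorRT engine exported from TAO
#
# config.pbtxt for the model above:
name: "detectnet_v2_tao"
platform: "tensorrt_plan"
max_batch_size: 16
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 544, 960 ]
  }
]
output [
  {
    name: "output_cov/Sigmoid"   # per-class coverage (confidence) grid
    data_type: TYPE_FP32
    dims: [ 3, 34, 60 ]
  },
  {
    name: "output_bbox/BiasAdd"  # raw bounding-box regression grid
    data_type: TYPE_FP32
    dims: [ 12, 34, 60 ]
  }
]
```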
Currently, the TAO Toolkit Triton Apps repository provides reference implementations for integrating several types of TAO computer vision models with Triton Inference Server; a client-side sketch of querying such a deployment follows below.
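Once a model is served, a client queries it over Triton's HTTP or gRPC APIs. Below is a minimal Python sketch using the tritonclient package against a server assumed to be running at localhost:8000; the model and tensor names mirror the illustrative configuration above and would need to match your actual deployment.

```python
# Minimal sketch of querying a TAO model served by Triton over HTTP.
# Assumes: pip install tritonclient[http], a server at localhost:8000,
# and the illustrative model/tensor names from the config sketch above.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Dummy preprocessed batch: 1 x 3 x 544 x 960 CHW float image.
image = np.random.rand(1, 3, 544, 960).astype(np.float32)

inputs = [httpclient.InferInput("input_1", list(image.shape), "FP32")]
inputs[0].set_data_from_numpy(image)

outputs = [
    httpclient.InferRequestedOutput("output_cov/Sigmoid"),
    httpclient.InferRequestedOutput("output_bbox/BiasAdd"),
]

response = client.infer("detectnet_v2_tao", inputs=inputs, outputs=outputs)
coverage = response.as_numpy("output_cov/Sigmoid")   # class confidence grid
bboxes = response.as_numpy("output_bbox/BiasAdd")    # raw bbox regressions
print(coverage.shape, bboxes.shape)
```

The raw tensors still require model-specific post-processing (for example, clustering DetectNet_v2's coverage and bbox grids into final detections), which is what the reference applications in the repository demonstrate end to end.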
For more information on how to deploy these models using Triton, please refer to the documentation and source code hosted in this GitHub repository.