Integrating TAO CV Models with Triton Inference Server
The NVIDIA TAO Toolkit provides users with an easy interface to generate accurate and optimized models for computer vision and conversational AI use cases. These models are generally deployed via the DeepStream SDK or Riva pipelines.
Triton is NVIDIA-developed inference serving software that efficiently deploys deep neural networks (DNNs) from several frameworks, such as TensorRT, TensorFlow, and ONNX Runtime. As part of this release, TAO Toolkit provides a reference application outlining the steps required to deploy a trained model into Triton.
Currently, TAO Toolkit supports integration of the following types of models into Triton
For more information on how to deploy these models using Triton, refer to the documentation and source code hosted in this GitHub repository.
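As a rough orientation before reading the reference application, the sketch below shows the general shape of a Triton model repository for a TensorRT engine: one directory per model, a numbered version subdirectory holding `model.plan`, and a `config.pbtxt` beside it. This is not the reference application's code. The helper `make_model_repository`, the model name, and the batch size are hypothetical, and a real TAO model would typically also need its input and output tensors declared in the configuration, as described in the repository's documentation.

```python
from pathlib import Path


def make_model_repository(root, model_name, version=1, max_batch_size=8):
    """Create a skeleton Triton model repository entry for one TensorRT model.

    Hypothetical helper for illustration only. It writes a minimal
    config.pbtxt and returns the path where the TensorRT engine file
    (model.plan) is expected to be placed.
    """
    model_dir = Path(root) / model_name
    version_dir = model_dir / str(version)  # Triton versions are numeric directories
    version_dir.mkdir(parents=True, exist_ok=True)

    # Minimal configuration; input/output tensor definitions are omitted
    # here and depend on the specific model being deployed.
    config = (
        f'name: "{model_name}"\n'
        'platform: "tensorrt_plan"\n'
        f"max_batch_size: {max_batch_size}\n"
    )
    config_path = model_dir / "config.pbtxt"
    config_path.write_text(config)

    # The exported engine is copied here by the user; this sketch does not create it.
    return config_path, version_dir / "model.plan"
```

Calling `make_model_repository("model_repository", "hypothetical_tao_model")` would leave a `model_repository/hypothetical_tao_model/` directory containing `config.pbtxt` and an empty `1/` subdirectory, into which the exported engine is copied as `model.plan`.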