
Integrating TAO CV Models with Triton Inference Server

The NVIDIA TAO Toolkit provides users with an easy interface to generate accurate and optimized models for computer vision and conversational AI use cases. These models are generally deployed via the DeepStream SDK or Riva pipelines.

Triton is an NVIDIA-developed inference software solution for efficiently deploying Deep Neural Networks (DNNs) from several frameworks, such as TensorRT, TensorFlow, and ONNX Runtime. As part of this release, TAO Toolkit provides a reference application outlining the steps required to deploy a trained model into Triton.
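As a rough illustration of what deploying a model into Triton involves, the sketch below lays out a Triton model repository for a TensorRT engine: a directory per model containing a `config.pbtxt` file and a numbered version subdirectory holding `model.plan`. This is a minimal sketch and not the TAO reference application itself. The helper `build_model_repository`, the model name `demo_model`, and the placeholder engine bytes are all hypothetical, and a real deployment would typically also declare the model's input and output tensors in `config.pbtxt`.

```python
import tempfile
from pathlib import Path


def build_model_repository(
    root: Path,
    model_name: str,
    engine_bytes: bytes,
    max_batch_size: int = 1,
) -> Path:
    """Write <root>/<model_name>/config.pbtxt and <root>/<model_name>/1/model.plan.

    Hypothetical helper for illustration only; it is not part of TAO or Triton.
    """
    model_dir = root / model_name
    version_dir = model_dir / "1"  # Triton expects numeric version subdirectories
    version_dir.mkdir(parents=True, exist_ok=True)

    # "model.plan" is Triton's default file name for a TensorRT engine.
    (version_dir / "model.plan").write_bytes(engine_bytes)

    # Minimal configuration. Input/output tensor definitions are omitted here
    # because they depend on the specific model (assumption: added separately).
    config = (
        f'name: "{model_name}"\n'
        'platform: "tensorrt_plan"\n'
        f"max_batch_size: {max_batch_size}\n"
    )
    (model_dir / "config.pbtxt").write_text(config)
    return model_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder bytes stand in for a real serialized TensorRT engine;
        # Triton would not be able to load this file.
        repo = build_model_repository(Path(tmp), "demo_model", b"placeholder")
        for path in sorted(repo.rglob("*")):
            print(path.relative_to(tmp))
```

With a real engine file in place, the repository root is what would be passed to the server, for example `tritonserver --model-repository=<root>`.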

Currently, the TAO Toolkit Triton apps repository provides reference implementations for integrating the following types of models into Triton Inference Server:

For more information on how to deploy these models using Triton, please refer to the documentation and source code hosted in this GitHub repository.