Integrating TAO CV Models with Triton Inference Server

The NVIDIA TAO Toolkit provides users with an easy interface to generate accurate and optimized models for a number of computer vision use cases. These models are generally deployed via the DeepStream SDK or Riva pipelines.

Triton Inference Server is NVIDIA-developed inference software for efficiently deploying deep neural networks (DNNs) from several frameworks, such as TensorRT, TensorFlow, and ONNX Runtime. As part of this release, TAO Toolkit now provides a reference application outlining the steps required to deploy a trained model into Triton.
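To give a rough sense of what deploying into Triton involves, the sketch below lays out one entry in a Triton model repository: a per-model directory holding a config.pbtxt file and a numbered version subdirectory for the serialized TensorRT engine. This is only an illustration and is not the code from the TAO reference application. The model name, tensor names, and tensor shapes are hypothetical placeholders; the real values depend on the trained model.

```python
import tempfile
from pathlib import Path


def render_config(model_name, input_name, input_dims, output_name,
                  output_dims, max_batch_size=8):
    """Return the text of a minimal config.pbtxt for a TensorRT engine.

    All tensor names and shapes are supplied by the caller; the values
    used in the demo below are placeholders, not those of a real TAO model.
    """
    def dims(values):
        return ", ".join(str(v) for v in values)

    return (
        f'name: "{model_name}"\n'
        'platform: "tensorrt_plan"\n'
        f"max_batch_size: {max_batch_size}\n"
        "input [\n"
        "  {\n"
        f'    name: "{input_name}"\n'
        "    data_type: TYPE_FP32\n"
        f"    dims: [ {dims(input_dims)} ]\n"
        "  }\n"
        "]\n"
        "output [\n"
        "  {\n"
        f'    name: "{output_name}"\n'
        "    data_type: TYPE_FP32\n"
        f"    dims: [ {dims(output_dims)} ]\n"
        "  }\n"
        "]\n"
    )


def create_model_entry(repo_root, model_name, version=1, **config_kwargs):
    """Create <repo_root>/<model_name>/config.pbtxt and the version directory.

    Returns the config path and the path where the engine file is expected.
    The engine itself is not created here; it has to be generated separately.
    """
    model_dir = Path(repo_root) / model_name
    version_dir = model_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    config_path = model_dir / "config.pbtxt"
    config_path.write_text(render_config(model_name, **config_kwargs))
    return config_path, version_dir / "model.plan"


if __name__ == "__main__":
    # Hypothetical model and tensor names, used only for illustration.
    with tempfile.TemporaryDirectory() as repo:
        config_path, engine_path = create_model_entry(
            repo,
            "example_tao_model",
            input_name="input_1",
            input_dims=(3, 224, 224),
            output_name="output_1",
            output_dims=(1000,),
        )
        print(config_path.read_text())
        print("Place the TensorRT engine at:", engine_path)
```

Once the engine file is in place, the server can be pointed at the repository root, for example with tritonserver --model-repository=<path>. The reference application covers the remaining work, such as model-specific preprocessing and postprocessing on the client side.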

Currently, the TAO Toolkit Triton apps repository provides reference implementations for integrating the following types of models into Triton Inference Server:

For more information on how to deploy these models using Triton, please refer to the documentation and source code hosted in this GitHub repository.

© Copyright 2023, NVIDIA. Last updated on Sep 13, 2023.