Integrating TLT CV Models with Triton Inference Server
The NVIDIA Transfer Learning Toolkit (TLT) provides users with an easy interface to generate accurate and optimized models for computer vision and conversational AI use cases. These models are generally deployed via the DeepStream SDK or Riva pipelines.
Triton is NVIDIA-developed inference serving software that efficiently deploys deep neural networks (DNNs) from several frameworks, such as TensorRT, TensorFlow, and ONNX Runtime. As part of this release, TLT provides a reference application outlining the steps required to deploy a trained model into Triton.
Currently, TLT supports integration of the following types of models into Triton
For more information on how to deploy these models using Triton, please refer to the documentation and source code hosted in this GitHub repository.
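As a rough illustration of what deploying a model into Triton involves, the sketch below lays out a Triton model repository for a TensorRT engine: a model directory containing a `config.pbtxt` file and a numbered version subdirectory holding `model.plan`. It is not taken from the TLT reference application. The helper name `write_model_repository`, the model name, the tensor names (`input_1`, `predictions`), and the dimensions are all hypothetical placeholders; the real values depend on the exported model.

```python
import tempfile
from pathlib import Path


def write_model_repository(root, model_name, engine_bytes,
                           input_name, input_dims,
                           output_name, output_dims,
                           max_batch_size=1, version=1):
    """Create <root>/<model_name>/<version>/model.plan and config.pbtxt.

    Hypothetical helper: tensor names and dims are supplied by the caller
    and must match the actual exported engine.
    """
    model_dir = Path(root) / model_name
    version_dir = model_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)

    # Triton looks for a serialized TensorRT engine named model.plan by default.
    (version_dir / "model.plan").write_bytes(engine_bytes)

    def dims_text(dims):
        return ", ".join(str(d) for d in dims)

    # Minimal model configuration in protobuf text format.
    config = (
        f'name: "{model_name}"\n'
        'platform: "tensorrt_plan"\n'
        f"max_batch_size: {max_batch_size}\n"
        "input [\n"
        "  {\n"
        f'    name: "{input_name}"\n'
        "    data_type: TYPE_FP32\n"
        f"    dims: [ {dims_text(input_dims)} ]\n"
        "  }\n"
        "]\n"
        "output [\n"
        "  {\n"
        f'    name: "{output_name}"\n'
        "    data_type: TYPE_FP32\n"
        f"    dims: [ {dims_text(output_dims)} ]\n"
        "  }\n"
        "]\n"
    )
    (model_dir / "config.pbtxt").write_text(config)
    return model_dir


if __name__ == "__main__":
    # Placeholder bytes stand in for a real serialized engine.
    with tempfile.TemporaryDirectory() as repo:
        path = write_model_repository(
            repo, "example_classifier", b"placeholder-engine",
            "input_1", [3, 224, 224], "predictions", [20],
        )
        print((path / "config.pbtxt").read_text())
```

Triton would then be started with its model repository option pointing at the root directory. The exact engine export and preprocessing steps for each TLT model are covered by the reference application rather than by this sketch.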