Triton Tutorials#

For users encountering the “Tensor in” and “Tensor out” approach to deep learning inference, getting started with Triton can raise many questions. The goal of this repository is to familiarize users with Triton’s features and to provide guides and examples that ease migration. For a feature-by-feature explanation, refer to the Triton Inference Server documentation.
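To make the “Tensor in” and “Tensor out” idea concrete, here is a minimal sketch of the request body a client might send to Triton's HTTP endpoint, which follows the KServe v2 inference protocol. The helper `build_infer_request`, the tensor name `INPUT0`, and the example values are hypothetical and chosen only for illustration; real tensor names, shapes, and datatypes come from the deployed model's configuration.

```python
import json


def build_infer_request(input_name, shape, datatype, data):
    """Build a KServe-v2-style inference request body for a single input tensor.

    `data` is the tensor flattened in row-major order.
    """
    # Check that the flattened data matches the declared shape.
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError(
            f"shape {list(shape)} needs {expected} elements, got {len(data)}"
        )
    return {
        "inputs": [
            {
                "name": input_name,      # hypothetical tensor name
                "shape": list(shape),
                "datatype": datatype,    # e.g. "FP32", "INT64"
                "data": list(data),
            }
        ]
    }


# Tensor in: a 1x4 FP32 tensor (illustrative values only).
body = build_infer_request("INPUT0", [1, 4], "FP32", [0.1, 0.2, 0.3, 0.4])
payload = json.dumps(body)

# Assumption: a server on the default HTTP port would accept this payload via
#   POST http://localhost:8000/v2/models/<model_name>/infer
# and reply with an "outputs" list of tensors in the same layout (tensor out).
print(payload)
```

The sketch only builds the payload and does not contact a server. In practice, the `tritonclient` Python package wraps this protocol, so hand-building JSON is mainly useful for understanding what goes over the wire.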

Getting Started Checklist#

Quick Deploy#

The focus of these examples is to demonstrate deployment for models trained with various frameworks. These are quick demonstrations made with an understanding that the user is somewhat familiar with Triton.

Deploy a …#

LLM Tutorials#

The table below lists some popular models that are supported in our tutorials.

Note: This is not an exhaustive list of what Triton supports, just what is included in the tutorials.

What does this repository contain?#

This repository contains the following resources:

  • Conceptual Guide: This guide focuses on building a conceptual understanding of the general challenges faced whilst building inference infrastructure and how to best tackle these challenges with Triton Inference Server.

  • Quick Deploy: This is a set of guides on deploying a model from your preferred framework to the Triton Inference Server. These guides assume a basic understanding of the Triton Inference Server. We recommend reviewing the getting started material first for a complete understanding.

  • HuggingFace Guide: This guide walks the user through the different ways a HuggingFace model can be deployed using the Triton Inference Server.

  • Feature Guides: This folder is meant to house Triton’s feature-specific examples.

  • Migration Guide: Migrating from an existing solution to Triton Inference Server? Get an understanding of the general architecture that might best fit your use case.

  • Agentic Workflow Guide: This guide provides a set of tutorials designed to help you deploy AI agents efficiently using the Triton Inference Server.

Adding Requests#

To request an example, open an issue and describe what you would like to see. Want to make a contribution? Open a pull request and tag an Admin.