Installing Triton

The Triton Inference Server is available as a pre-built Docker container, or you can build it from source.

The Triton Docker container is available on the NVIDIA GPU Cloud (NGC).

Before you can pull a container from the NGC container registry, you must have Docker and nvidia-docker installed. For DGX users, this is explained in the Preparing to Use NVIDIA Containers Getting Started Guide. If you are not using a DGX system, follow the nvidia-docker installation documentation to install the most recent versions of CUDA, Docker, and nvidia-docker.
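Before pulling the Triton container, it can be useful to confirm that Docker can see your GPUs. A common sanity check, assuming a recent nvidia-docker setup where GPU access is exposed through the --gpus flag, is to run nvidia-smi inside a CUDA base container (the exact CUDA image tag shown here is illustrative; pick one compatible with your driver):

```shell
# Verify that Docker can access the GPU by running nvidia-smi
# inside a CUDA base container. If this prints your GPU table,
# the nvidia-docker setup is working.
docker run --gpus all --rm nvidia/cuda:11.0-base nvidia-smi
```

If the command fails with an error about the --gpus flag or the NVIDIA runtime, revisit the nvidia-docker installation steps before continuing.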

After performing the above setup, you can pull the Triton container using the following command:

docker pull nvcr.io/nvidia/tritonserver:20.08-py3

Replace 20.08 with the version of Triton Inference Server that you want to pull.
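Once the container is pulled, you can start the server by pointing it at a model repository. A minimal sketch, assuming a local directory /full/path/to/model_repository laid out as a Triton model repository (the path and published ports here are illustrative; 8000, 8001, and 8002 are Triton's default HTTP, gRPC, and metrics ports):

```shell
# Launch Triton with one GPU, mounting the local model repository
# into the container at /models and exposing the default service ports.
docker run --gpus=1 --rm \
  -p8000:8000 -p8001:8001 -p8002:8002 \
  -v/full/path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:20.08-py3 \
  tritonserver --model-repository=/models
```

On startup, the server logs which models it loaded; you can then confirm it is live by querying the health endpoint, for example with curl localhost:8000/v2/health/ready.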