The Triton Inference Server is available as a pre-built Docker container or you can build it from source.
The Triton Docker container is available on the NVIDIA GPU Cloud (NGC).
Before you can pull a container from the NGC container registry, you must have Docker and nvidia-docker installed. For DGX users, this is explained in the Preparing To Use NVIDIA Containers Getting Started Guide. For users of other systems, follow the nvidia-docker installation documentation to install the most recent versions of CUDA, Docker, and nvidia-docker.
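To confirm that Docker and the NVIDIA container runtime are set up correctly before pulling Triton, a common sanity check is to run nvidia-smi inside a CUDA base container. This is a sketch, not part of the official instructions; the CUDA image tag shown is an example and may differ for your CUDA version.

```shell
# Sanity check (example image tag): if the GPU is visible to containers,
# this prints the nvidia-smi table listing your GPUs and driver version.
docker run --gpus all --rm nvidia/cuda:11.0-base nvidia-smi
```

If this command fails, revisit the nvidia-docker installation steps before attempting to pull the Triton container.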
After performing the above setup, you can pull the Triton container using the following command:
docker pull nvcr.io/nvidia/tritonserver:20.08-py3
Replace 20.08 with the version of the Triton Inference Server that you want to pull.
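Once the container is pulled, it can be launched with docker run. The sketch below assumes a model repository on the host at /path/to/model/repository (a placeholder path you must substitute); ports 8000, 8001, and 8002 are Triton's default HTTP, GRPC, and metrics ports.

```shell
# Sketch of launching the server from the pulled image (paths are placeholders):
docker run --gpus=1 --rm \
  -p8000:8000 -p8001:8001 -p8002:8002 \
  -v /path/to/model/repository:/models \
  nvcr.io/nvidia/tritonserver:20.08-py3 \
  tritonserver --model-repository=/models
```

The -v flag mounts the host model repository into the container at /models, and the --model-repository argument tells the tritonserver executable where to find it.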