The Triton Inference Server is available as a pre-built Docker container on the NVIDIA GPU Cloud (NGC) container registry, or you can build it from source.
Before you can pull a container from the NGC container registry, you must have Docker and nvidia-docker installed. DGX users should follow the instructions in the Preparing To Use NVIDIA Containers Getting Started Guide. All other users should follow the nvidia-docker installation documentation to install the most recent versions of CUDA, Docker, and nvidia-docker.
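Once Docker and nvidia-docker are installed, it can be useful to confirm that containers can see the GPU before pulling Triton. The following is a minimal smoke test, not part of the official guides; the CUDA base image tag is an example, and any available CUDA image works.

```shell
# Check that Docker is installed and that containers have GPU access.
if command -v docker >/dev/null 2>&1; then
    docker --version
    # nvidia-smi inside a container should list the host GPUs.
    # (Example image tag; substitute any CUDA image you have available.)
    docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
else
    echo "Docker is not installed; see the nvidia-docker installation docs." >&2
fi
```

If `nvidia-smi` prints your GPUs, the NVIDIA container runtime is working and you can proceed to pull the Triton image.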
After performing the above setup, you can pull the Triton container using the following command:
docker pull nvcr.io/nvidia/tritonserver:20.06-py3
Replace 20.06 with the version of Triton Inference Server that you want to pull.
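A convenient way to manage the version substitution is to keep the release tag in a variable, so the same snippet works for any published release. This is a sketch, assuming the standard `<yy.mm>-py3` tag format; the guard lets it run cleanly on machines without Docker.

```shell
# Pin the Triton release you want; change this to pull a different version
# (assumes the registry's <yy.mm>-py3 tag format).
VERSION=20.06
IMAGE="nvcr.io/nvidia/tritonserver:${VERSION}-py3"

if command -v docker >/dev/null 2>&1; then
    docker pull "${IMAGE}"
    # Sanity-check that the image arrived locally.
    docker images "nvcr.io/nvidia/tritonserver"
else
    echo "Docker is not installed; install it before pulling ${IMAGE}." >&2
fi
```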