The Triton Inference Server exposes both HTTP/REST and GRPC endpoints based on the standard inference protocols proposed by the KFServing project. To fully expose all of its capabilities, Triton also implements a number of HTTP/REST and GRPC extensions to the KFServing inference protocol.

The HTTP/REST and GRPC protocols provide endpoints to check server and model health and to retrieve metadata and statistics. Additional endpoints allow model loading and unloading, and inferencing. See the KFServing and extension documentation for details.
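As a rough sketch of how a client might address these HTTP/REST endpoints, the snippet below builds the v2-protocol URLs and an inference request body. It assumes Triton's default HTTP port 8000 and a hypothetical model named `simple`; actually sending the request requires a running server.

```python
import json
from urllib.request import Request, urlopen


def ready_url(base):
    """Server readiness (health) endpoint of the v2 inference protocol."""
    return f"{base}/v2/health/ready"


def model_metadata_url(base, model):
    """Model metadata endpoint for a named model."""
    return f"{base}/v2/models/{model}"


def build_infer_request(name, shape, datatype, data):
    """Build a v2-protocol inference request body with one input tensor."""
    return {
        "inputs": [
            {"name": name, "shape": shape, "datatype": datatype, "data": data}
        ]
    }


base = "http://localhost:8000"  # assumed default HTTP port
body = build_infer_request("INPUT0", [1, 4], "FP32", [1.0, 2.0, 3.0, 4.0])
req = Request(
    f"{base}/v2/models/simple/infer",  # "simple" is a placeholder model name
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
# resp = urlopen(req)  # uncomment against a live Triton server
```

The GRPC endpoints cover the same operations; a GRPC client would use the generated service stubs instead of constructing URLs.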