HTTP/REST and GRPC Protocol

This directory contains documents related to the HTTP/REST and GRPC protocols used by Triton. Triton uses the KServe community standard inference protocols plus several extensions that are defined in the extension documents in this directory.

Note that some extensions add new fields to the inference protocols, while others define new protocols that Triton follows. Please refer to the extension documents for details.

For the GRPC protocol, the protobuf specification is also available, as is the protobuf specification for the GRPC health checking protocol.

Restricted Protocols

You can configure the Triton endpoints, which implement the protocols, to restrict access to some protocols and to control network settings. Please refer to the protocol customization guide for details.

IPv6

Assuming your host or Docker configuration supports IPv6 connections, tritonserver can be configured to use IPv6 HTTP endpoints as follows:

$ tritonserver ... --http-address ipv6:[::1]&
...
I0215 21:04:11.572305 571 grpc_server.cc:4868] Started GRPCInferenceService at 0.0.0.0:8001
I0215 21:04:11.572528 571 http_server.cc:3477] Started HTTPService at ipv6:[::1]:8000
I0215 21:04:11.614167 571 http_server.cc:184] Started Metrics Service at ipv6:[::1]:8002

This can be confirmed via netstat, for example:

$ netstat -tulpn | grep tritonserver
tcp6      0      0 :::8000      :::*      LISTEN      571/tritonserver
tcp6      0      0 :::8001      :::*      LISTEN      571/tritonserver
tcp6      0      0 :::8002      :::*      LISTEN      571/tritonserver

The endpoint can then be tested via curl, for example:

$ curl -6 --verbose "http://[::1]:8000/v2/health/ready"
*   Trying ::1:8000...
* TCP_NODELAY set
* Connected to ::1 (::1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: [::1]:8000
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
<
* Connection #0 to host ::1 left intact
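
The same readiness check can be scripted. The following is a minimal Python sketch, not part of Triton or its client libraries: the helper names `ready_url` and `is_ready` are hypothetical, and the default host `::1` and port `8000` are taken from the example above. It uses only the standard library and wraps IPv6 literals in square brackets, as URLs require.

```python
import ipaddress
import urllib.request


def ready_url(host: str, port: int = 8000) -> str:
    """Build the /v2/health/ready URL, bracketing IPv6 literals (hypothetical helper)."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, e.g. a hostname such as "localhost"
    netloc = f"[{host}]" if is_v6 else host
    return f"http://{netloc}:{port}/v2/health/ready"


def is_ready(host: str = "::1", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True only if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and non-2xx HTTP errors.
        return False


if __name__ == "__main__":
    print(ready_url("::1"))
    print("ready" if is_ready() else "not ready")
```

With a server started as shown above, `is_ready()` should mirror the curl result. Any connection failure or non-200 response is reported as not ready.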

Mapping Triton Server Error Codes to HTTP Status Codes

This table maps various Triton Server error codes to their corresponding HTTP status codes. It can be used as a reference guide for understanding how Triton Server errors are handled in HTTP responses.

| Triton Server Error Code | HTTP Status Code | Description |
| --- | --- | --- |
| TRITONSERVER_ERROR_INTERNAL | 500 | Internal Server Error |
| TRITONSERVER_ERROR_NOT_FOUND | 404 | Not Found |
| TRITONSERVER_ERROR_UNAVAILABLE | 503 | Service Unavailable |
| TRITONSERVER_ERROR_UNSUPPORTED | 501 | Not Implemented |
| TRITONSERVER_ERROR_UNKNOWN, TRITONSERVER_ERROR_INVALID_ARG, TRITONSERVER_ERROR_ALREADY_EXISTS, TRITONSERVER_ERROR_CANCELLED | 400 | Bad Request (default for other errors) |
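
For client-side tooling or tests, the mapping in the table above can be expressed as a small lookup. This is an illustrative Python sketch rather than Triton's own implementation: the function name `http_status_for` is hypothetical, and the error codes are represented as plain strings purely for the example. Any code without an explicit entry falls back to 400, matching the table's default row.

```python
from http import HTTPStatus

# Explicit rows from the table; every other error code maps to 400 Bad Request.
_EXPLICIT_STATUS = {
    "TRITONSERVER_ERROR_INTERNAL": HTTPStatus.INTERNAL_SERVER_ERROR,   # 500
    "TRITONSERVER_ERROR_NOT_FOUND": HTTPStatus.NOT_FOUND,              # 404
    "TRITONSERVER_ERROR_UNAVAILABLE": HTTPStatus.SERVICE_UNAVAILABLE,  # 503
    "TRITONSERVER_ERROR_UNSUPPORTED": HTTPStatus.NOT_IMPLEMENTED,      # 501
}


def http_status_for(error_code: str) -> HTTPStatus:
    """Return the HTTP status for a Triton error code name (hypothetical helper)."""
    return _EXPLICIT_STATUS.get(error_code, HTTPStatus.BAD_REQUEST)


if __name__ == "__main__":
    for code in ("TRITONSERVER_ERROR_NOT_FOUND", "TRITONSERVER_ERROR_INVALID_ARG"):
        status = http_status_for(code)
        print(code, int(status), status.phrase)
```

Keeping only the four explicit rows in the dictionary and defaulting everything else to 400 mirrors the table's structure, so a newly added error code is treated as a bad request unless it is listed explicitly.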