Capabilities

Warning: You are currently viewing an out-of-date version of the Triton documentation. For the latest documentation, visit the Triton documentation on GitHub.

The following table shows which backends support each major inference server feature. See Datatypes for information on the datatypes supported by each backend.

| Feature                 | TensorRT | TensorFlow | Caffe2 | ONNX Runtime | PyTorch | Custom |
|-------------------------|----------|------------|--------|--------------|---------|--------|
| Multi-GPU               | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Multi-Model             | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Batching                | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Dynamic Batching        | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Sequence Batching       | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Variable-Size Tensors   | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Shape Tensor            | Yes      |            |        |              |         |        |
| Tensor Reshape          | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| String Datatype         |          | Yes        |        | Yes          |         | Yes    |
| HTTP API                | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| GRPC API                | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| GRPC Streaming API      | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Ensembling              | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| Shared Memory API       | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
| CUDA Shared Memory API  | Yes      | Yes        | Yes    | Yes          | Yes     | Yes    |
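Several of the features above, such as dynamic batching, are enabled per model in that model's `config.pbtxt`. As a minimal sketch (the batch sizes and queue delay shown here are illustrative values, not recommendations), a model that supports batching can opt into the dynamic batcher like this:

```proto
# Fragment of a model's config.pbtxt. With the dynamic batcher
# enabled, the server combines individual inference requests
# into larger batches on the fly to improve throughput.
dynamic_batching {
  # Batch sizes the scheduler prefers to build (illustrative values).
  preferred_batch_size: [ 4, 8 ]

  # Maximum time a request may wait in the queue while a
  # preferred batch is being assembled (illustrative value).
  max_queue_delay_microseconds: 100
}
```

The trade-off controlled by `max_queue_delay_microseconds` is latency versus batch size: a longer delay gives the scheduler more opportunity to form a preferred batch, at the cost of added per-request latency.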