Version 1 to Version 2 Migration
Version 2 of Triton does not generally maintain backwards compatibility with version 1. Specifically, you should take the following items into account when transitioning from version 1 to version 2.
The Triton executables and libraries are in /opt/tritonserver. The Triton executable is /opt/tritonserver/bin/tritonserver.
Some tritonserver command-line arguments have been removed, changed, or given different default behavior in version 2.
--api-version, --http-health-port, --grpc-infer-thread-count, --grpc-stream-infer-thread-count, --allow-poll-model-repository, --allow-model-control and --tf-add-vgpu are removed.
The default for --model-control-mode is changed to none.
--tf-allow-soft-placement and --tf-gpu-memory-fraction are renamed to --backend-config="tensorflow,allow-soft-placement=<true,false>" and --backend-config="tensorflow,gpu-memory-fraction=<float>".
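The argument changes above can be applied mechanically to an existing launch command. The following is a minimal sketch, not part of Triton: migrate_args is a hypothetical helper, and it assumes every argument is written in the --flag=value form (a value passed as a separate token is not recognized). It drops the removed flags and rewrites the two TensorFlow flags into their --backend-config equivalents.

```python
# Flags the migration notes list as removed in version 2.
REMOVED_FLAGS = {
    "--api-version",
    "--http-health-port",
    "--grpc-infer-thread-count",
    "--grpc-stream-infer-thread-count",
    "--allow-poll-model-repository",
    "--allow-model-control",
    "--tf-add-vgpu",
}

# Version 1 TensorFlow flags and the backend-config setting that replaces them.
RENAMED_TF_FLAGS = {
    "--tf-allow-soft-placement": "allow-soft-placement",
    "--tf-gpu-memory-fraction": "gpu-memory-fraction",
}


def migrate_args(v1_args):
    """Return (v2_args, dropped) for a list of version 1 arguments.

    Hypothetical helper: only the --flag=value spelling is handled.
    """
    v2_args, dropped = [], []
    for arg in v1_args:
        name, sep, value = arg.partition("=")
        if name in REMOVED_FLAGS:
            # Removed in version 2: report it instead of passing it on.
            dropped.append(arg)
        elif name in RENAMED_TF_FLAGS and sep:
            # No shell quotes are needed here because this is an argv list.
            setting = RENAMED_TF_FLAGS[name]
            v2_args.append(f"--backend-config=tensorflow,{setting}={value}")
        else:
            # Everything else (including a renamed flag without a value)
            # passes through unchanged.
            v2_args.append(arg)
    return v2_args, dropped


if __name__ == "__main__":
    # Illustrative version 1 command line; the values are placeholders.
    old = ["--api-version=1", "--tf-gpu-memory-fraction=0.5"]
    new, dropped = migrate_args(old)
    print("version 2 arguments:", new)
    print("dropped:", dropped)
```

The helper does not add --model-control-mode: because its default is now none, review any deployment that relied on the version 1 behavior by hand.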
The HTTP/REST and GRPC protocols, while conceptually similar to version 1, are completely changed in version 2. See inference protocols for more information.
Python and C++ client libraries are re-implemented to match the new HTTP/REST and GRPC protocols. The Python client no longer depends on a C++ shared library and so should be usable on any platform that supports Python. See client libraries for more information.
Building Triton has changed significantly in version 2. See build for more information.
In the Docker containers, the environment variables that indicate the Triton version now have a TRITON prefix, for example, TRITON_SERVER_VERSION.
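Scripts that run inside the container and read the version from the environment need to use the new name. A minimal sketch, assuming only the TRITON_SERVER_VERSION variable named above; the helper triton_server_version is hypothetical and takes the environment as a parameter so it can be exercised outside a container:

```python
import os


def triton_server_version(environ=os.environ):
    """Return the value of TRITON_SERVER_VERSION, or None if it is unset."""
    return environ.get("TRITON_SERVER_VERSION")


if __name__ == "__main__":
    version = triton_server_version()
    if version is None:
        print("TRITON_SERVER_VERSION is not set")
    else:
        print("Triton server version:", version)
```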