Release Notes
TensorRT CLI
Minor improvements will be pushed continuously; you are not expected to upgrade for them.
Major changes can be expected at a roughly monthly cadence, and you are expected to upgrade your version of the CLI when they ship.
Until we release TensorRT-Cloud CLI 1.0, expect some API-breaking changes with new releases.
TensorRT-Cloud
Minor improvements will be continuously pushed to production to provide enhancements as soon as possible.
Major API-breaking changes will be announced clearly in the release notes. We expect to make some API-breaking changes as we receive feedback from EA customers. When an API-breaking change lands, we expect you to upgrade to a newer version of the CLI.
Backward-compatibility support will be considered for GA.
TensorRT-Cloud 0.2.0 Early Access (EA)
Announcements
The TensorRT-Cloud CLI tool is now available on PyPI.
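Installation follows the standard pip workflow. A minimal sketch, assuming the PyPI package is published under the name trt-cloud (matching the CLI command shown elsewhere in these notes; verify the exact name on pypi.org):

```shell
# Install (or upgrade) the TensorRT-Cloud CLI from PyPI.
# Package name assumed to match the trt-cloud CLI command.
pip install --upgrade trt-cloud

# Confirm the installation and list available subcommands.
trt-cloud --help
```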
Key Features and Enhancements
The following features and enhancements have been added to this release:
Added support for access to pre-built engines through TensorRT-Cloud.
Added support for more NVIDIA GeForce GPUs. For more information, refer to Planned GPU Support.
Breaking API Changes
CLI flags:
The trt-cloud build --weightless flag was renamed to --strip-weights.
trt-cloud build --strip-weights (formerly --weightless) no longer performs a refit automatically. Local refit is now opt-in with --local-refit.
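The rename and the new opt-in refit behavior can be illustrated with a before/after invocation. This is a sketch using only the flags named in these notes; the remaining build arguments (model source and so on) are elided with "..." because they are not specified here:

```shell
# Old (0.1.x): weight-stripped build; refit was performed automatically.
trt-cloud build --weightless ...

# New (0.2.0): the flag is now --strip-weights, and local refit is opt-in.
trt-cloud build --strip-weights --local-refit ...
```

If --local-refit is omitted, the returned engine remains weight-stripped and must be refitted separately with the original ONNX weights.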
Limitations
Input model files have a maximum file size of 5 GB.
This will be fixed in future releases. For now, models larger than 5 GB should use the weightless flow. Refer to the Weight-Stripped Engine Generation section for information on weightless engine building.
Refit requires a GPU of the same SM version as was used to build the engine. (This is a TensorRT limitation.)
By default, weight-stripped engines must be refitted with the original ONNX weights. Only engines that were built with the --refit flag in the trtexec argument list may be refitted with arbitrary weights. Fully refittable engines might have some performance degradation.
Custom plugins or any custom ops are not supported. Only built-in TensorRT ops and plugins will work.
Input ONNX models must come from one of the following:
S3
GitHub
Local machine
The TensorRT-Cloud server has a daily limit on the amount of data it can process for building engines on Windows. If TensorRT-Cloud hits this limit on a given day, then building on Windows will not be available for the rest of the day.
Known Issues
For inquiries and to report issues, contact tensorrt-cloud-contact@nvidia.com.
TensorRT-Cloud 0.1.1 Early Access (EA)
Announcements
The TensorRT-Cloud CLI tool will be published to PyPI in the near future.
Key Features and Enhancements
The following features and enhancements have been added to this release:
Added support for on-demand building of TensorRT engines from ONNX models for closed EA accounts.
Added support for a wide range of NVIDIA GeForce GPUs for building TensorRT engines. For more information, refer to Planned GPU Support.
Limitations
Input model files have a maximum file size of 5 GB.
This will be fixed in future releases. For now, models larger than 5 GB should use the weightless flow. Refer to the Weight-Stripped Engine Generation section for information on weightless engine building.
Refit requires a GPU of the same SM version as was used to build the engine. (This is a TensorRT limitation.)
By default, weightless engines must be refitted with the original ONNX weights. Only engines that were built with the --refit flag in the trtexec argument list may be refitted with arbitrary weights. Fully refittable engines might have some performance degradation.
Custom plugins or any custom ops are not supported. Only built-in TensorRT ops and plugins will work.
Input ONNX models must come from one of the following:
S3
GitHub
Local machine
The TensorRT-Cloud server has a daily limit on the amount of data it can process for building engines on Windows. If TensorRT-Cloud hits this limit on a given day, then building on Windows will not be available for the rest of the day.
Known Issues
For inquiries and to report issues, contact tensorrt-cloud-contact@nvidia.com.