NVIDIA Deep Learning TensorRT Documentation - Last updated May 12, 2020

NVIDIA TensorRT


Release Notes
NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA GPUs. It is designed to work with the deep learning frameworks commonly used for training. TensorRT focuses specifically on running an already trained network quickly and efficiently on a GPU to generate a result, a process also known as inferencing. These release notes describe the key features, software enhancements and improvements, and known issues for the TensorRT 7.1.0 Early Access (EA) product package.
Support Matrix
These support matrices provide a look into the supported platforms, features, and hardware capabilities of the TensorRT 7.1.0 Early Access (EA) APIs, parsers, and layers.
Installation Guide
This TensorRT 7.1.0 Early Access (EA) Installation Guide provides the installation requirements, a list of what is included in the TensorRT package, and step-by-step instructions for installing TensorRT.

Inference Library


API
This is the API documentation for the NVIDIA TensorRT library. The following set of APIs allows developers to import pre-trained models, calibrate their networks for INT8 inference, and build and deploy optimized networks. Networks can be imported directly from NVCaffe, or from other frameworks via the UFF or ONNX formats. They may also be created programmatically using the C++ or Python API by instantiating individual layers and setting parameters and weights directly.
Developer Guide
This TensorRT 7.1.0 Early Access (EA) Developer Guide demonstrates how to use the C++ and Python APIs to implement the most common deep learning layers. It shows how to take an existing model built with a deep learning framework and use it to build a TensorRT engine with the provided parsers. The Developer Guide also provides step-by-step instructions for common user tasks, such as creating a TensorRT network definition, invoking the TensorRT builder, serializing and deserializing an engine, and feeding the engine with data to perform inference, all using either the C++ or Python API.
Samples Support Guide
This TensorRT 7.1.0 Early Access (EA) Samples Support Guide provides a detailed look into every TensorRT sample that is included in the package.

Performance


Best Practices For TensorRT Performance
This Best Practices guide covers various performance considerations related to deploying networks using TensorRT 7.1.0 Early Access (EA). These sections assume that you have a model that is working at an appropriate level of accuracy and that you can successfully run inference on your model with TensorRT.

Optimized Frameworks


Container Release Notes
The TensorRT container is an easy-to-use container for TensorRT development. The container allows the TensorRT samples to be built, modified, and executed. These release notes provide a list of key features, packaged software included in the container, software enhancements and improvements, and any known issues for the 20.03 and earlier releases. The TensorRT container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.

Licenses


SLA
This document is the Software License Agreement (SLA) for NVIDIA TensorRT. This document contains specific license terms and conditions for NVIDIA TensorRT. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.

Archives


Documentation Archives
This Archives document provides access to previously released TensorRT documentation versions.