TensorRT Documentation

NVIDIA TensorRT is an SDK that facilitates high-performance machine learning inference. It complements training frameworks such as TensorFlow, PyTorch, and MXNet. It focuses on running an already-trained network quickly and efficiently on NVIDIA hardware.

Attention

Refer to the Release Notes, which describe the newest features, software enhancements, improvements, and known issues for this TensorRT release.

  • The Quick Start Guide is a starting point for developers who want to try out the TensorRT SDK; specifically, it demonstrates how to quickly construct an application to run inference on a TensorRT engine.

  • The Support Matrix provides an overview of the supported platforms, features, and hardware capabilities of the TensorRT APIs, parsers, and layers.

  • The Installing TensorRT section provides the installation requirements, a list of what is included in the TensorRT package, and step-by-step instructions for installing TensorRT.

  • The Architecture section provides an overview of TensorRT's core components and how they fit together, from network definition through engine building and runtime execution.

  • The Inference Library section demonstrates how to use the C++ and Python APIs for implementing the most common deep learning layers. It shows how you can take an existing model built with a deep learning framework and build a TensorRT engine using the provided parsers.

    • The Sample Support Guide provides an overview of all the supported TensorRT samples on GitHub and in the product package. The samples cover areas such as recommenders, machine comprehension, character recognition, image classification, and object detection.

  • The Performance section introduces trtexec, a command-line tool for TensorRT performance benchmarking, and shows how to use it to measure the inference performance of your deep learning models.
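As a quick illustration, a typical trtexec workflow builds an engine from an ONNX model and benchmarks it. This is a minimal sketch: the file names and the input tensor name/shape are placeholders for your own model, and flag availability can vary by TensorRT version. Running it requires an NVIDIA GPU with TensorRT installed.

```shell
# Build an engine from an ONNX model with FP16 enabled, and benchmark it.
# "model.onnx" and the input name/shape below are placeholders.
trtexec --onnx=model.onnx \
        --saveEngine=model.plan \
        --fp16 \
        --shapes=input:1x3x224x224

# Later, benchmark the serialized engine directly without rebuilding it.
trtexec --loadEngine=model.plan
```

Saving the engine during the first run avoids repeating the (often lengthy) build step on subsequent benchmarking runs.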

  • The API section enables developers working in C++- and Python-based development environments, as well as those experimenting with TensorRT, to easily parse models (for example, from ONNX) and generate and run PLAN files.

    • The API Migration Guide highlights the TensorRT API modifications. If you are unfamiliar with these changes, refer to our sample code for clarification.
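To make the parse-and-generate flow concrete, here is a minimal sketch of building a serialized engine (a PLAN file) from an ONNX model with the TensorRT Python API. It assumes TensorRT 8.x-style bindings; "model.onnx" and "model.plan" are placeholder paths, and running it requires the tensorrt package and an NVIDIA GPU.

```python
import tensorrt as trt

# A logger is required by most TensorRT entry points.
logger = trt.Logger(trt.Logger.WARNING)

# Create a builder and an explicit-batch network definition.
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)

# Populate the network by parsing the ONNX model ("model.onnx" is a placeholder).
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

# Configure the build (here: a 1 GiB workspace limit) and serialize the engine.
config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)
serialized_engine = builder.build_serialized_network(network, config)

# Write the PLAN file for later deserialization by the TensorRT runtime.
with open("model.plan", "wb") as f:
    f.write(serialized_engine)
```

The resulting PLAN file can then be deserialized with `trt.Runtime(logger).deserialize_cuda_engine(...)` or benchmarked directly with trtexec.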

  • The Reference section answers commonly asked questions about typical use cases and provides additional resources for assistance.