This TensorRT 8.0.3 Quick Start Guide is a starting point for
developers who want to try out the TensorRT SDK; specifically, it demonstrates
how to quickly construct an application that runs inference on a TensorRT engine.
NVIDIA TensorRT is a C++ library that facilitates high-performance
inference on NVIDIA GPUs. It is designed to work in conjunction with the deep
learning frameworks that are commonly used for training. TensorRT focuses specifically
on running an already-trained network quickly and efficiently on a GPU to generate a
result, a process also known as inferencing.
These release notes describe the key features, software enhancements and improvements,
and known issues for the TensorRT 8.0.3 product package.
These support matrices provide a look into the supported
platforms, features, and hardware capabilities of the TensorRT 8.0.3 APIs, parsers, and
layers.
This TensorRT 8.0.3 Installation Guide provides the installation
requirements, a list of what is included in the TensorRT package, and step-by-step
instructions for installing TensorRT.
This is the API Reference documentation for the NVIDIA TensorRT
library. The following set of APIs allows developers to import pre-trained models,
calibrate networks for INT8, and build and deploy optimized networks with TensorRT.
Networks can be imported from ONNX. They may also be created programmatically using the
C++ or Python API by instantiating individual layers and setting parameters and weights
directly.
This TensorRT Developer Guide demonstrates how to use the C++ and
Python APIs for implementing the most common deep learning layers. It shows how you can
take an existing model built with a deep learning framework and build a TensorRT engine
using the provided parsers. The Developer Guide also provides step-by-step instructions
for common user tasks such as creating a TensorRT network definition, invoking the
TensorRT builder, serializing and deserializing engines, and feeding the engine with
data to perform inference, all while using either the C++ or Python API.
This Samples Support Guide provides an overview of all the
supported TensorRT 8.0.3 samples included on GitHub and in the product package. The
TensorRT samples specifically help in areas such as recommenders, machine comprehension,
character recognition, image classification, and object detection.
This Best Practices Guide covers various performance
considerations related to deploying networks using TensorRT 8.0.3. These sections assume
that you have a model that is working at an appropriate level of accuracy and that you
are able to successfully use TensorRT to do inference for your model.
PyTorch-Quantization is a toolkit for training and evaluating
PyTorch models with simulated quantization. Quantization can be added to the model
automatically or manually, allowing the model to be tuned for accuracy and performance.
The quantized model can be exported to ONNX and imported to an upcoming version of
TensorRT.
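Simulated ("fake") quantization of the kind described above can be illustrated without any framework: values are rounded onto an INT8 grid but kept in floating point, so the model sees quantization error during training. A minimal NumPy sketch, assuming symmetric quantization with max-abs calibration (one common scale choice, not necessarily the toolkit's default):

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    """Simulate symmetric INT8 quantization: round to the integer grid,
    then immediately dequantize back to floating point."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for INT8
    amax = np.abs(x).max()                    # max-abs ("amax") calibration
    scale = amax / qmax if amax > 0 else 1.0  # one quantization step
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale                          # still float, but on the INT8 grid

x = np.array([0.05, -1.3, 2.0, 0.7], dtype=np.float32)
xq = fake_quantize(x)
# The error introduced is at most half a quantization step.
step = 2.0 / 127
assert np.all(np.abs(x - xq) <= step / 2 + 1e-7)
```

Because the output stays in floating point, such a layer can be dropped into an ordinary training loop; only at export time are the scales carried over to a truly quantized runtime.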
This document is the LICENSE AGREEMENT FOR NVIDIA SOFTWARE
DEVELOPMENT KITS as it applies to NVIDIA TensorRT. It contains the specific license
terms and conditions for NVIDIA TensorRT. By accepting this agreement, you agree to
comply with all the terms and conditions applicable to the specific product(s)
included herein.