TensorRT 6.3.1 Installation Instructions
Ensure you are familiar with the following installation requirements and notes.
• If you are using the TensorRT Python API and PyCUDA isn’t already installed on your system, see
Installing PyCUDA. If you encounter any issues with PyCUDA usage, you may need to recompile it yourself. For more information, see
https://wiki.tiker.net/PyCuda/Installation/Linux.
• Ensure you are familiar with the TensorRT 6.3.1 Release Notes in the DRIVE OS 5.2.0.0 SDK/PDK Release Notes.
• Verify that you have the CUDA Toolkit installed; version 10.2 is supported.
• The PyTorch examples have been tested with PyTorch 1.3.0, but may work with older versions.
• If the target system has both TensorRT and one or more training frameworks installed on it, the simplest strategy is to use the same version of cuDNN for the training frameworks as the one that TensorRT ships with. If this is not possible, or for some reason strongly undesirable, be careful to properly manage the side-by-side installation of cuDNN on the single system. In some cases, depending on the training framework being used, this may not be possible without patching the training framework sources.
• The libnvcaffe_parser.so library functionality from previous versions is included in libnvparsers.so since TensorRT 5.0. The installed symbolic link for libnvcaffe_parser.so is updated to point to the new libnvparsers.so library. The static library libnvcaffe_parser.a is also symbolically linked to libnvparsers_static.a.
Installing TensorRT
The Debian installation automatically installs any dependencies; however, it:
• requires sudo or root privileges to install
• provides no flexibility as to which location TensorRT is installed into
• requires that the CUDA Toolkit and cuDNN have also been installed using Debian
• does not allow more than one minor version of TensorRT to be installed at the same time
TensorRT versions: TensorRT is a product made up of separately versioned components. The version on the product conveys important information about the significance of new features while the library version conveys information about the compatibility or incompatibility of the API.
Product/Component | Previous Released Version | Current Version | Version Description |
TensorRT product | 6.2.0 | 6.3.1 | +1.0 when significant new capabilities are added. +0.1 when capabilities have been improved. |
nvinfer libraries, headers, samples, and documentation | 6.2.0 | 6.3.1 | +1.0 when the API or ABI changes in a non-compatible way. +0.1 when the API or ABI changes are backward compatible. |
UFF: uff-converter-tf Debian package | 6.2.0 | 6.3.1 | +0.1 while we are developing the core functionality. Set to 1.0 when we have all base functionality in place. |
UFF: uff-*.whl file | 0.6.6 | 0.6.6 | |
graphsurgeon: graphsurgeon-tf Debian package | 6.2.0 | 6.3.1 | +0.1 while we are developing the core functionality. Set to 1.0 when we have all base functionality in place. |
graphsurgeon: graphsurgeon-*.whl file | 0.4.1 | 0.4.1 | |
libnvinfer Python packages: python-libnvinfer, python-libnvinfer-dev, python3-libnvinfer, python3-libnvinfer-dev Debian packages | 6.2.0 | 6.3.1 | +1.0 when the API or ABI changes in a non-compatible way. +0.1 when the API or ABI changes are backward compatible. |
libnvinfer Python packages: tensorrt.whl file | 6.2.0 | 6.3.1 | |
Debian Installation
This section contains instructions for a developer installation and an app server installation.
Developer Installation: The following instructions set up a full TensorRT development environment with samples, documentation and both the C++ and Python API.
1. Ensure you are a member of the NVIDIA Developer Program. If not, follow the prompts to gain access.
2. Click NVIDIA SDK Manager.
3. Log into the NVIDIA Developer Program.
4. Click NVIDIA SDK Manager to continue.
5. Select your download location. Your download begins.
Using the NVIDIA Machine Learning Network Repo For Debian Installation
Note: | NVIDIA recommends the NVIDIA CUDA network repository be set up first before setting up the NVIDIA Machine Learning network repository to satisfy package dependencies. We provide some example commands below to accomplish this task. For more information, see the CUDA installation chapter in the DRIVE OS 5.2.0.0 SDK Development Guide. |
1. Install the NVIDIA CUDA network repository installation package.
os="ubuntu1x04"
cuda="x.y.z"
wget https://developer.download.nvidia.com/compute/cuda/repos/${os}/x86_64/cuda-repo-${os}_${cuda}-1_amd64.deb
sudo dpkg -i cuda-repo-*.deb
Where:
• os is the OS version; for Ubuntu 18.04, use ubuntu1804
• cuda is the CUDA version; for this release, use 10.2.89
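Filled in, the download step above expands as follows. This sketch only composes and prints the final URL and package filename so the substitution can be sanity-checked before running wget; the values are the ones documented above.

```shell
# Compose (but do not download) the CUDA network repo package URL.
os="ubuntu1804"
cuda="10.2.89"
url="https://developer.download.nvidia.com/compute/cuda/repos/${os}/x86_64/cuda-repo-${os}_${cuda}-1_amd64.deb"
echo "$url"
echo "cuda-repo-${os}_${cuda}-1_amd64.deb"
```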
2. Install the NVIDIA Machine Learning network repository installation package.
os="ubuntu1x04"
wget https://developer.download.nvidia.com/compute/machine-learning/repos/${os}/x86_64/nvidia-machine-learning-repo-${os}_1.0.0-1_amd64.deb
sudo dpkg -i nvidia-machine-learning-repo-*.deb
sudo apt-get update
3. Install the TensorRT package that fits your particular needs.
a. For only running TensorRT C++ applications:
sudo apt-get install libnvinfer6 libnvonnxparsers6 libnvparsers6 libnvinfer-plugin6
b. For also building TensorRT C++ applications:
sudo apt-get install libnvinfer-dev libnvonnxparsers-dev libnvparsers-dev libnvinfer-plugin-dev
c. For running TensorRT Python applications:
sudo apt-get install python-libnvinfer python3-libnvinfer
4. When using the NVIDIA Machine Learning network repository, Ubuntu will by default install TensorRT for the latest CUDA version. The following commands will install libnvinfer6 for an older CUDA version and hold the libnvinfer6 package at this version. Replace 6.x.x with your version of TensorRT and cudax.x with your CUDA version for your install.
version="6.x.x-1+cudax.x"
sudo apt-get install libnvinfer6=${version} libnvonnxparsers6=${version} libnvparsers6=${version} libnvinfer-plugin6=${version} libnvinfer-dev=${version} libnvonnxparsers-dev=${version} libnvparsers-dev=${version} libnvinfer-plugin-dev=${version} python-libnvinfer=${version} python3-libnvinfer=${version}
sudo apt-mark hold libnvinfer6 libnvonnxparsers6 libnvparsers6 libnvinfer-plugin6 libnvinfer-dev libnvonnxparsers-dev libnvparsers-dev libnvinfer-plugin-dev python-libnvinfer python3-libnvinfer
If you want to upgrade to the latest version of TensorRT or the latest version of CUDA, unhold the packages using the following command.
sudo apt-mark unhold libnvinfer6 libnvonnxparsers6 libnvparsers6 libnvinfer-plugin6 libnvinfer-dev libnvonnxparsers-dev libnvparsers-dev libnvinfer-plugin-dev python-libnvinfer python3-libnvinfer
You may need to repeat these steps for libcudnn7 to prevent cuDNN from being updated to the latest CUDA version. Refer to the TensorRT 6.3.1 Release Notes in the DRIVE OS 5.2.0.0 SDK/PDK Release Notes for the specific version of cuDNN that was tested with your version of TensorRT. Example commands for downgrading and holding the cuDNN version can be found in the Safety Supported Samples And Tools section. Refer to the CUDA installation section of the DRIVE OS 5.2.0.0 SDK Development Guide.
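As a concrete illustration of the pinning pattern in step 4, this sketch only composes the version string and the apt-get command line and prints it, without touching the system. The 6.3.1 and cuda10.2 values are assumed examples; substitute your own.

```shell
# Compose (but do not run) the pinned install command from step 4 above.
trt="6.3.1"       # assumed TensorRT version; replace with yours
cuda="cuda10.2"   # assumed CUDA suffix; replace with yours
version="${trt}-1+${cuda}"
cmd="sudo apt-get install"
for p in libnvinfer6 libnvonnxparsers6 libnvparsers6 libnvinfer-plugin6; do
  cmd="$cmd ${p}=${version}"
done
echo "$cmd"
```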
If both the NVIDIA Machine Learning network repository and a TensorRT local repository are enabled at the same time you may observe package conflicts with either TensorRT or cuDNN. You will need to configure APT so that it prefers local packages over network packages. You can do this by creating a new file at /etc/apt/preferences.d/local-repo with the following lines:
Package: *
Pin: origin ""
Pin-Priority: 1001
Note: | This preference change will affect more than just TensorRT in the unlikely event that you have other repositories which are also not downloaded over HTTP(S). To revert APT to its original behavior simply remove the newly created file. |
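The pin file above can also be created from a script. This sketch writes it to a scratch path so it can run without root privileges; on a real system the destination is /etc/apt/preferences.d/local-repo.

```shell
# Write the APT pin described above to a scratch location and show it.
prefs="$(mktemp -d)/local-repo"
cat > "$prefs" <<'EOF'
Package: *
Pin: origin ""
Pin-Priority: 1001
EOF
cat "$prefs"
```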
Additional Installation Methods
Aside from installing TensorRT from the product package, you can also install TensorRT from the following locations:
• TensorRT container. The TensorRT container provides an easy method for deploying TensorRT with all necessary dependencies already packaged in the container. For information about installing TensorRT via a container, see the
TensorRT Container Release Notes.
Safety Supported Samples and Tools
Title | TensorRT Sample Name | Description |
dlaSafetyBuilder | dlaSafetyBuilder | A tool to generate the NvMedia DLA loadable from models without having to develop your own application. |
dlaSafetyRuntime | dlaSafetyRuntime | A tool to load a DLA loadable and run inference using safety certified NvMedia DLA APIs. |
“Hello World” For TensorRT Safety | sampleSafeMNIST | Consists of two parts: build and infer. The build part of this sample demonstrates how to use the builder flag for safety. The inference part of this sample demonstrates how to use the safe runtime, engine, and execution context. |
Safety C++ Samples
You can find the Safety samples in the /usr/src/tensorrt/samples package directory. The Safety samples listed in the table above are shipped with TensorRT.
Running Safety Samples
To run one of the Safety samples, the process typically involves the following steps:
1. Download the dataset.
2. Download the prototxt file.
3. Put all the images and files into the data directory.
4. Compile the sample.
For more information on running samples, see the README.md file included with the sample.
“Hello World” For TensorRT Safety
What does this sample do?
This sample, sampleSafeMNIST, consists of two parts: build and infer. The build part demonstrates how to use the IBuilderConfig::setEngineCapability() builder flag for safety. The inference part demonstrates how to use the safe runtime, engine, and execution context.
The build part builds a safe version of a TensorRT engine and saves it into a binary file, then the infer part loads the prebuilt safe engine and performs inference on an input image. The infer part uses the safety header proxy, with the CMakeLists.txt file demonstrating how to build it against the safety subset. This sample can be run in FP16 and INT8 modes.
Specifically, this sample demonstrates how to:
• Perform the basic setup and initialization of TensorRT using the Caffe parser
• Import a Caffe model using the C++ parser API
• Preprocess the input and store the result in a managed buffer
• Build an engine in C++
• Serialize a model in C++
• Perform inference in C++
For step-by-step instructions, refer to the DRIVE OS 5.2.0.0 TensorRT 6.3.1 API Reference PDF in the DRIVE OS 5.2.0.0 SDK product package.
Where is this sample located?
If using the Debian or RPM package, this sample is maintained under the /usr/src/tensorrt/samples/sampleSafeMNIST directory. If using the tar or zip package, the sample is at <extracted_path>/samples/sampleSafeMNIST.
How do I get started?
Refer to the /usr/src/tensorrt/samples/sampleSafeMNIST/README.md file for detailed information about how this sample works, sample code, and step-by-step instructions on how to run and verify its output.
Cross Compiling Samples For AArch64 Users
The following sections show how to cross compile TensorRT samples for AArch64 users.
Prerequisites
1. Install the CUDA cross-platform toolkit for the corresponding target and set the environment variable CUDA_INSTALL_DIR.
export CUDA_INSTALL_DIR="your cuda install dir"
Where CUDA_INSTALL_DIR is set to /usr/local/cuda by default.
2. Install the cuDNN cross-platform libraries for the corresponding target and set the environment variable CUDNN_INSTALL_DIR.
export CUDNN_INSTALL_DIR="your cudnn install dir"
Where CUDNN_INSTALL_DIR is set to CUDA_INSTALL_DIR by default.
3. Install the TensorRT cross-compilation Debian packages for the corresponding target.
Note: | If you are using the tar file release for the target platform, then you can safely skip this step. The tar file release already includes the cross compile libraries so no additional packages are required. |
• AArch64 QNX: libnvinfer-dev-cross-qnx
• AArch64 Linux: libnvinfer-dev-cross-aarch64
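The defaults described in steps 1 and 2 can be applied defensively before building, using shell parameter defaults. This is a sketch; the unset at the top exists only so the demonstration always exercises the documented fallback values.

```shell
# Demonstrate the documented defaults for the cross-compile variables.
unset CUDA_INSTALL_DIR CUDNN_INSTALL_DIR   # demo only; keep your own values
CUDA_INSTALL_DIR="${CUDA_INSTALL_DIR:-/usr/local/cuda}"
CUDNN_INSTALL_DIR="${CUDNN_INSTALL_DIR:-$CUDA_INSTALL_DIR}"
echo "CUDA_INSTALL_DIR=$CUDA_INSTALL_DIR"
echo "CUDNN_INSTALL_DIR=$CUDNN_INSTALL_DIR"
```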
Building Samples For AArch64 QNX
Download the QNX toolchain and export the following environment variables.
export QNX_HOST=/path/to/your/qnx/toolchain/host/linux/x86_64
export QNX_TARGET=/path/to/your/qnx/toolchain/target/qnx7
Build the samples by issuing:
cd /path/to/TensorRT/samples
make TARGET=qnx
Building Samples For AArch64 Linux
For AArch64 Linux, you first need to install the corresponding GCC cross-compiler, aarch64-linux-gnu-g++. On Ubuntu, this can be installed via:
sudo apt-get install g++-aarch64-linux-gnu
Build the samples by issuing:
cd /path/to/TensorRT/samples
make TARGET=aarch64
Upgrading
Upgrading TensorRT to the latest version is only supported when the currently installed TensorRT version is one of the two most recent public releases. For example, TensorRT 6.x.x supports upgrading from TensorRT 5.1.x and TensorRT 6.0.x. If you want to upgrade from an older, unsupported version, upgrade incrementally until you reach the latest version of TensorRT.
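The supported-upgrade rule above can be expressed as a small helper; the version patterns in this sketch encode only the two releases named above.

```shell
# Return "yes" if a direct upgrade to TensorRT 6.x.x is supported from
# the given installed version (5.1.x or 6.0.x per the rule above).
supports_direct_upgrade() {
  case "$1" in
    5.1.*|6.0.*) echo yes ;;
    *) echo no ;;
  esac
}
supports_direct_upgrade 5.1.6   # yes
supports_direct_upgrade 5.0.2   # no: upgrade incrementally first
```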
Ubuntu Users
The following section provides step-by-step instructions for upgrading TensorRT for Ubuntu users.
Upgrading From TensorRT 5.x.x To TensorRT 6.x.x
These upgrade instructions are for Ubuntu users only. When upgrading from TensorRT 5.x.x to TensorRT 6.x.x, ensure you are familiar with the following.
Using a Debian file:
• The Debian packages are designed to upgrade your development environment without removing any runtime components that other packages and programs might rely on. If you installed TensorRT 5.x.x via a Debian package and you upgrade to TensorRT 6.x.x, your documentation, samples, and headers will all be updated to the TensorRT 6.x.x content. After you have downloaded the new local repo, use apt-get to upgrade your system to the new version of TensorRT.
os="ubuntu1x04"
tag="cudax.x-trt6.x.x.x-ga-yyyymmdd"
sudo dpkg -i nv-tensorrt-repo-${os}-${tag}_1-1_amd64.deb
sudo apt-get update
sudo apt-get install tensorrt libcudnn7
• If using Python 2.7:
sudo apt-get install python-libnvinfer-dev
• If using Python 3:
sudo apt-get install python3-libnvinfer-dev
• If you are using the uff-converter and/or graphsurgeon, then you should also upgrade those Debian packages to the latest versions.
sudo apt-get install uff-converter-tf graphsurgeon-tf
• After you upgrade, ensure you have a directory /usr/src/tensorrt and the corresponding version shown by the dpkg -l tensorrt command is 6.x.x.x.
• The libnvinfer6 package will not be removed until you use:
sudo apt-get autoremove
• If installing a Debian package on a system where the previously installed version was from a tar file, note that the Debian package will not remove the previously installed files. Unless a side-by-side installation is desired, it would be best to remove the older version before installing the new version to avoid compiling against outdated libraries.
• If you are currently or were previously using the NVIDIA Machine Learning network repository, then it may conflict with the version of libcudnn7 that is expected to be installed from the local repository for TensorRT. The following commands will change libcudnn7 to version 7.6.x.x, which is supported and tested with TensorRT 6.x.x, and hold the libcudnn7 package at this version. Replace cudax.x with the appropriate CUDA version for your install.
version="7.6.x.x-1+cudax.x"
sudo apt-get install libcudnn7=${version} libcudnn7-dev=${version}
sudo apt-mark hold libcudnn7 libcudnn7-dev
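One of the post-upgrade checks earlier in this list is that dpkg -l tensorrt reports version 6.x.x.x. This sketch shows one way to pull that field out in a script; it parses a captured line (illustrative, not real output) so it runs anywhere, whereas on a real system you would pipe the output of dpkg -l tensorrt instead.

```shell
# Extract the version column from a dpkg -l style line (illustrative input).
line="ii  tensorrt  6.3.1.0-1+cuda10.2  amd64  Meta package of TensorRT"
version=$(printf '%s\n' "$line" | awk '{print $3}')
base="${version%%-*}"   # strip the Debian revision and CUDA suffix
echo "$base"
```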
Uninstalling TensorRT
To uninstall TensorRT using the Debian package, follow these steps:
1. Uninstall libnvinfer6, which was installed using the Debian package.
sudo apt-get purge "libnvinfer*"
2. Uninstall uff-converter-tf and graphsurgeon-tf, which were also installed using the Debian packages.
sudo apt-get purge graphsurgeon-tf
The uff-converter-tf package will also be removed by the above command.
Alternatively, you can uninstall uff-converter-tf without removing graphsurgeon-tf; graphsurgeon-tf will then be marked as no longer required.
sudo apt-get purge uff-converter-tf
You can later use autoremove to uninstall graphsurgeon-tf as well.
sudo apt-get autoremove
3. Uninstall the Python TensorRT wheel file.
If using Python 2.7:
sudo pip2 uninstall tensorrt
If using Python 3.x:
sudo pip3 uninstall tensorrt
4. Uninstall the Python UFF wheel file.
If using Python 2.7:
sudo pip2 uninstall uff
If using Python 3.x:
sudo pip3 uninstall uff
5. Uninstall the Python GraphSurgeon wheel file.
If using Python 2.7:
sudo pip2 uninstall graphsurgeon
If using Python 3.x:
sudo pip3 uninstall graphsurgeon
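Steps 3 through 5 repeat the same uninstall for three wheel packages, so they can be collapsed into a loop. This sketch only prints the commands it would run; drop the echo to execute them.

```shell
# Print the pip uninstall commands for the three TensorRT-related wheels.
cmds=""
for pip in pip2 pip3; do
  for pkg in tensorrt uff graphsurgeon; do
    cmd="sudo $pip uninstall $pkg"
    echo "$cmd"
    cmds="$cmds$cmd;"
  done
done
```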
Installing PyCUDA
This section provides useful information regarding PyCUDA including how to install.
Attention: If you have to update your CUDA version on the system, do not install PyCUDA at this time. Perform the steps in Updating CUDA first, then install PyCUDA.
PyCUDA is used within Python wrappers to access NVIDIA’s CUDA APIs. Some of the key features of PyCUDA include:
• Maps all of CUDA into Python.
• Enables run-time code generation (RTCG) for flexible, fast, automatically tuned codes.
• Added robustness: automatic management of object lifetimes and automatic error checking.
• Added convenience: ready-made on-GPU linear algebra, reduction, and scan, with add-on packages available for FFT and LAPACK.
• Speed: near-zero wrapping overhead.
To install PyCUDA, first make sure nvcc is in your PATH, then issue the following command:
pip install 'pycuda>=2019.1.1'
If you encounter any issues with PyCUDA usage after installing PyCUDA with the above command, you may need to recompile it yourself. For more information, see
https://wiki.tiker.net/PyCuda/Installation/Linux.
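The "make sure nvcc is in your PATH" pre-check above can be scripted. This sketch defines a generic helper and exercises it on sh (always present) and nvcc (present only when the CUDA Toolkit bin directory is on PATH).

```shell
# Report whether a tool is reachable on PATH before attempting the install.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing (add the CUDA bin directory to PATH first)"
  fi
}
check_tool sh
check_tool nvcc
```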
Updating CUDA
Existing installations of PyCUDA will not automatically work with a newly installed CUDA Toolkit, because PyCUDA only works with the CUDA Toolkit that was present on the target system when PyCUDA was installed. PyCUDA must therefore be reinstalled after the newer version of the CUDA Toolkit is installed. The steps below are the most reliable way to ensure everything works in a compatible fashion after the CUDA Toolkit on your system has been upgraded.
1. Uninstall the existing PyCUDA installation.
2. Update CUDA. For more information, see the CUDA installation chapter in the DRIVE OS SDK Development Guide.
3. Install PyCUDA. To install PyCUDA, issue the following command:
pip install 'pycuda>=2019.1.1'
Troubleshooting
For troubleshooting support, contact your support engineer or post your questions on the NVIDIA Developer Forum.