Abstract

This TensorRT Installation Guide provides step-by-step instructions for installing TensorRT 4.0 Release Candidate (RC).

1. Overview

NVIDIA® TensorRT™ is a C++ library that facilitates high performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a network definition and optimizes it by merging tensors and layers, transforming weights, choosing efficient intermediate data formats, and selecting from a large kernel catalog based on layer parameters and measured performance.

TensorRT consists of three parts: import methods that help you express your trained deep learning model for TensorRT to optimize and run; an optimization tool that applies graph optimization and layer fusion and finds the fastest implementation of that model by leveraging a diverse collection of highly optimized kernels; and a runtime that you can use to execute this network in an inference context.

TensorRT includes an infrastructure that allows you to leverage the high-speed, reduced-precision capabilities of Pascal™ GPUs as an optional optimization.

TensorRT is built with gcc 4.8.

2. Getting Started

Ensure you are familiar with the following installation requirements and notes.
  • If you are using the TensorRT Python API and PyCUDA isn’t already installed on your system, see Installing PyCUDA. If you are testing on a Tesla V100, or if you encounter any issues with PyCUDA usage, you will almost certainly need to recompile it yourself. For more information, see Installing PyCUDA on Linux.
  • Ensure you are familiar with the Release Notes. The current version of the release notes can be found online at TensorRT Release Notes.
  • Verify that you have the CUDA Toolkit installed, release 8.0 or 9.0.
  • The TensorFlow to TensorRT model export requires TensorFlow v1.4+ with GPU acceleration enabled.
  • If a target install system has both TensorRT and one or more training frameworks installed on it, the simplest strategy is to use the same version of cuDNN for the training frameworks as the one that TensorRT ships with. If this is not possible, or for some reason strongly undesirable, be careful to properly manage the side-by-side installation of cuDNN on the single system. In some cases, depending on the training framework being used, this may not be possible without patching the training framework sources.
  • The libnvcaffe_parser.so library file from previous versions is now called libnvparsers.so in TensorRT 4.0. The installed symbolic link for libnvcaffe_parser.so is updated to point to the new libnvparsers.so library. The static library libnvcaffe_parser.a is also symbolically linked to the new libnvparsers.a.
  • The installation instructions below assume you want the full TensorRT, that is, both the C++ and Python APIs. In some environments and use cases, you may not want to install the Python functionality. In that case, simply don’t install the debian packages labeled Python or the whl files. None of the C++ API functionality depends on Python. You do need to install the UFF whl file if you want to export the UFF file from TensorFlow.
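
The CUDA Toolkit requirement above can be checked before you start. The following is a minimal sketch, not part of TensorRT: cuda_release_supported is a hypothetical helper name, and it assumes that `nvcc --version` prints a line containing "release X.Y", as current CUDA Toolkits do.

```shell
# Hypothetical helper (not shipped with TensorRT): reads `nvcc --version`
# output on stdin and succeeds only for CUDA release 8.0 or 9.0.
cuda_release_supported() {
    local release
    # nvcc prints a line such as "Cuda compilation tools, release 9.0, V9.0.176".
    release="$(sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1)"
    case "$release" in
        8.0|9.0) echo "CUDA $release is supported" ;;
        "")      echo "could not determine the CUDA release" >&2; return 2 ;;
        *)       echo "CUDA $release is not listed as supported" >&2; return 1 ;;
    esac
}

# Usage (requires nvcc on the PATH):
#   nvcc --version | cuda_release_supported
```

Reading the version text from stdin keeps the check independent of where the toolkit is installed.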

3. Downloading TensorRT

Ensure you are a member of the NVIDIA Developer Program. If not, follow the prompts to gain access.
  1. Go to: https://developer.nvidia.com/tensorrt.
  2. Click Download.
  3. Complete the TensorRT Download Survey.
  4. Select the checkbox to agree to the license terms.
  5. Follow the Quick Start Instructions and choose which method you want to install the package.
  6. Click the package you want to install. Your download begins.
  7. Verify that your download was successful by comparing the checksum in the MD5 checksum file with that of the downloaded file. If the checksums differ, the downloaded file is corrupt and needs to be downloaded again. To calculate the MD5 checksum of the downloaded file, run the following:
    $ md5sum <file>
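
The comparison in step 7 can also be scripted. This is a minimal sketch under the assumption that you have the expected checksum as a plain string; verify_md5 is a hypothetical helper name, not something provided with the download.

```shell
# Hypothetical helper: compare a file's MD5 checksum with an expected value.
verify_md5() {
    local file="$1" expected="$2" actual
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 2
    fi
    # md5sum prints "<checksum>  <file>"; keep only the checksum field.
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}

# Usage:
#   verify_md5 <file> <expected-checksum>
```

A non-zero exit status means the file should be downloaded again.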

4. Installing TensorRT

You can choose between two installation options when installing TensorRT: a debian package or a tar file.
The debian installation automatically installs any dependencies, but:
  • requires sudo or root privileges to install
  • provides no flexibility as to which location TensorRT is installed into
  • requires that the CUDA Toolkit has also been installed with a debian package.

The tar file provides more flexibility; however, you need to ensure that you have the necessary dependencies already installed.

TensorRT versions: TensorRT is a product made up of separately versioned components. The version on the product conveys important information about the significance of new features, while the library version conveys information about the compatibility or incompatibility of the API. The following table shows the versioning of the TensorRT components.

Table 1. Versioning of TensorRT components

Product or Component                    Previously Released Version  Current Version  Version Description
TensorRT product                        3.0.4                        4.0.0.x          +1.0 when significant new capabilities are added.
                                                                                      +0.1 when capabilities have been improved.
nvinfer library, headers, samples,      4.0.4                        4.1.0            +1.0 when the API changes in a non-compatible way.
and documentation                                                                     +0.1 when the API changes are backward compatible.
UFF:
  uff-converter-tf debian package       4.0.4                        4.1.0            +0.1 while we are developing the core functionality.
                                                                                      Set to 1.0 when we have all base functionality in place.
  uff.whl file                          0.2.0                        0.3.0
libnvinfer python package:
  debian packages:                      4.0.4                        4.1.0            +1.0 when the API changes in a non-compatible way.
    python-libnvinfer                                                                 +0.1 when the API changes are backward compatible.
    python-libnvinfer-dev
    python-libnvinfer-doc
    python3-libnvinfer
    python3-libnvinfer-dev
    python3-libnvinfer-doc
  tensorrt.whl file                     3.0.4                        4.0.0.x
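
As an illustration of the library versioning rule in Table 1, the sketch below treats two nvinfer library versions as API compatible when their major numbers match. It is only a reading of the "+1.0 / +0.1" convention stated in the table, and same_major_version is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when two version strings share a major number.
# Under the rule in Table 1, a change in the major number signals a
# non-compatible API change; a change in the minor number does not.
same_major_version() {
    # ${1%%.*} removes everything from the first dot onward, leaving the major number.
    [ "${1%%.*}" = "${2%%.*}" ]
}

# Usage:
#   same_major_version 4.0.4 4.1.0 && echo "backward-compatible API change"
```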

4.1. Debian Installation

This section contains instructions for a developer installation and an app server installation. Choose which installation best fits your needs.

Developer Installation: The following instructions set up a full TensorRT development environment with samples, documentation, and both the C++ and Python APIs.
Attention: If only the C++ development environment is desired, you can modify the following instructions and simply not install the Python and UFF packages.
Note: Before issuing the following commands, you'll need to replace 4.x.x.x and yyyymmdd with your specific TensorRT version and package date. The following commands are examples.
  1. Install TensorRT from the debian package, for example:
    $ sudo dpkg -i nv-tensorrt-repo-ubuntu1604-cuda9.0-rc-trt4.x.x.x-yyyymmdd_1-1_amd64.deb
    
    $ sudo apt-get update
    $ sudo apt-get install tensorrt

    If using Python 2.7:
    $ sudo apt-get install python-libnvinfer-doc swig
    The following additional packages will be installed:
    python-libnvinfer python-libnvinfer-dev swig3.0

    If using Python 3.5:
    $ sudo apt-get install python3-libnvinfer-doc
    The following additional packages will be installed:
    python3-libnvinfer python3-libnvinfer-dev

    In either case:
    $ sudo apt-get install uff-converter-tf
  2. Verify the installation:
    $ dpkg -l | grep TensorRT

    You should see something similar to the following:
    ii  libnvinfer-dev           4.1.x-1+cuda9.0   amd64  TensorRT development libraries and headers
    ii  libnvinfer-samples       4.1.x-1+cuda9.0   amd64  TensorRT samples and documentation
    ii  libnvinfer4              4.1.x-1+cuda9.0   amd64  TensorRT runtime libraries
    ii  python-libnvinfer        4.1.x-1+cuda9.0   amd64  Python bindings for TensorRT
    ii  python-libnvinfer-dev    4.1.x-1+cuda9.0   amd64  Python development package for TensorRT
    ii  python-libnvinfer-doc    4.1.x-1+cuda9.0   amd64  Documentation and samples of python bindings for TensorRT 
    ii  python3-libnvinfer       4.1.x-1+cuda9.0   amd64  Python 3 bindings for TensorRT
    ii  python3-libnvinfer-dev   4.1.x-1+cuda9.0   amd64  Python 3 development package for TensorRT
    ii  python3-libnvinfer-doc   4.1.x-1+cuda9.0   amd64  Documentation and samples of python bindings for TensorRT 
    ii  tensorrt                 4.x.x.x-1+cuda9.0 amd64  Meta package of TensorRT
    ii  uff-converter-tf         4.1.x-1+cuda9.0   amd64  UFF converter for TensorRT pack
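
The verification in step 2 can be automated. The following is a minimal sketch, assuming the standard `dpkg -l` column layout shown above (status, name, version); missing_packages is a hypothetical helper name, and the package list you pass to it is your choice.

```shell
# Hypothetical helper: reads `dpkg -l` output on stdin and prints every
# expected package (given as arguments) that is not listed as installed ("ii").
missing_packages() {
    local listing pkg
    listing="$(cat)"
    for pkg in "$@"; do
        # Field 1 is the status and field 2 the package name; dpkg may append
        # an architecture suffix such as ":amd64", which is stripped here.
        if ! printf '%s\n' "$listing" | awk -v p="$pkg" '
                { name = $2; sub(/:.*/, "", name) }
                $1 == "ii" && name == p { found = 1 }
                END { exit !found }'; then
            echo "$pkg"
        fi
    done
}

# Usage:
#   dpkg -l | missing_packages tensorrt libnvinfer4 libnvinfer-dev libnvinfer-samples
```

Empty output means that every package you named is installed.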
    
App Server Installation: When setting up servers that will host TensorRT-powered applications, you can simply install any of the following:
  • the libnvinfer package (C++), or
  • the python-libnvinfer package (Python), or
  • the python3-libnvinfer package (Python).
Issue the following commands if you want to run an application that was built with TensorRT. Install TensorRT from the debian package, for example:
$ sudo dpkg -i nv-tensorrt-repo-ubuntu1604-cuda9.0-rc-trt4.x.x.x-yyyymmdd_1-1_amd64.deb

$ sudo apt-get update
$ sudo apt-get install libnvinfer
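
On an app server you may want to confirm that the dynamic linker can find the runtime library. The following is a minimal sketch: has_shared_lib is a hypothetical helper name, and it assumes the runtime library name starts with libnvinfer.so, which you should confirm for your installation.

```shell
# Hypothetical helper: reads `ldconfig -p` output on stdin and succeeds when
# an entry whose name starts with the given prefix is present.
has_shared_lib() {
    # Each entry looks like "<name> (<flags>) => <path>"; awk strips the
    # leading tab, so $1 is the library name.
    awk -v name="$1" 'index($1, name) == 1 { found = 1 } END { exit !found }'
}

# Usage:
#   ldconfig -p | has_shared_lib libnvinfer.so && echo "runtime library found"
```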

4.2. Tar File Installation

Note: Before issuing the following commands, you'll need to replace 4.x.x.x with your specific TensorRT version. The following commands are examples.
  1. Install the following dependencies, if not already present:
    • CUDA Toolkit v8.0 or 9.0
    • cuDNN 7.0.5
    • Python 2 or Python 3
  2. Choose where you want to install. This tar file will install everything into a directory called TensorRT-4.x.x.x, where 4.x.x.x is your TensorRT version. This directory will have sub-directories like lib, include, data, etc…
  3. Unpack the tar file, for example:
    $ tar xzvf TensorRT-4.x.x.x.Ubuntu-16.04.4.x86_64-gnu.cuda-9.0.cudnn7.0.tar.gz
    
    $ ls TensorRT-4.x.x.x
    bin  data  doc  include  lib  python  samples  targets  TensorRT-Release-Notes.pdf  uff
    
  4. Add the absolute path of TensorRT lib to the environment variable $LD_LIBRARY_PATH, for example:
    $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<eg:TensorRT-4.x.x.x/lib>
  5. Install the Python TensorRT package, for example:
    $ cd TensorRT-4.x.x.x/python

    If using Python 2.7:
    $ sudo pip2 install tensorrt-4.x.x.x-cp27-cp27mu-linux_x86_64.whl

    If using Python 3.5:
    $ sudo pip3 install tensorrt-4.x.x.x-cp35-cp35m-linux_x86_64.whl
    

    In either case:
    $ which tensorrt
    /usr/local/bin/tensorrt
    
  6. Install the Python UFF package, for example:
    $ cd TensorRT-4.x.x.x/uff
    
    $ sudo pip2 install uff-0.3.0-py2.py3-none-any.whl
    $ which convert-to-uff
    /usr/local/bin/convert-to-uff
    
  7. Update the custom_plugins example to point to the location where the tar package was installed. For example, in the <PYTHON_INSTALL_PATH>/tensorrt/examples/custom_layers/tensorrtplugins/setup.py file, make the following changes:
    • Change TENSORRT_INC_DIR to point to the <TAR_INSTALL_ROOT>/include directory.
    • Change TENSORRT_LIB_DIR to point to the <TAR_INSTALL_ROOT>/lib directory.
  8. Verify the installation:
    1. Ensure that the installed files are located in the correct directories. For example, run the tree -d command to check whether all supported installed files are in place in the lib, include, data, etc… directories.
    2. Build and run one of the shipped samples, for example, sampleMNIST in the installed directory. The sample should compile and run without additional settings. For more information about sampleMNIST, see the TensorRT Developer Guide.
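
The directory check in step 8 and the LD_LIBRARY_PATH setting in step 4 can be verified together. This is a minimal sketch: check_tar_install is a hypothetical helper name, and the sub-directory list is taken from the example `ls` output in step 3, so it may differ for your package.

```shell
# Hypothetical helper: verify that an unpacked TensorRT directory contains the
# expected sub-directories and that its lib directory is on LD_LIBRARY_PATH.
check_tar_install() {
    local root="$1" dir status=0
    for dir in bin data doc include lib python samples targets uff; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $root/$dir" >&2
            status=1
        fi
    done
    # Surround the variable with colons so that every entry, including the
    # first and last, can be matched as ":<path>:".
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$root/lib:"*) ;;
        *) echo "not on LD_LIBRARY_PATH: $root/lib" >&2; status=1 ;;
    esac
    return "$status"
}

# Usage (pass the absolute path that you added to LD_LIBRARY_PATH):
#   check_tar_install /absolute/path/to/TensorRT-4.x.x.x
```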

5. Upgrading from TensorRT 3.0.x to TensorRT 4.x

When upgrading from TensorRT 3.0.x to TensorRT 4.x, ensure you are familiar with the following notes:

Using a debian file:

  • The debian packages are designed to upgrade your development environment without removing any runtime components that other packages and programs might rely on. Therefore, if you installed TensorRT 3.0.x via a debian package and you install TensorRT 4.x, your documentation, samples and headers will be updated to the TensorRT 4.x content.
  • After you upgrade, ensure you have a package called tensorrt and the corresponding version shown by the dpkg -l command is 4.x.x.x.
  • If installing a debian package on a system where the previously installed version was from a tar file, note that the debian package will not remove the previously installed files. Unless a side-by-side installation is desired, it would be best to remove the older version before installing the new version to avoid compiling against outdated headers.
  • If libcudnn6 has been installed in parallel with libcudnn7, you may need to switch the default libcudnn to libcudnn7 in order to properly build applications with TensorRT. TensorRT 4.0 does not support libcudnn6, and the behavior is unpredictable if libcudnn6 is used. You can switch to the latest libcudnn by using update-alternatives in auto mode, which chooses the last installed version of libcudnn, rather than in manual mode, for example:
    $ sudo update-alternatives --auto libcudnn
    
  • If you were previously using the machine learning debian repository, then it will conflict with the version of libcudnn7 that is expected to be installed with the local repository for TensorRT. The following commands will downgrade libcudnn7 to version 7.0.5.15, which is supported and tested with TensorRT, and hold the package at this version. Replace cuda9.0 with the appropriate CUDA version for your install.
    $ sudo apt-get install libcudnn7=7.0.5.15-1+cuda9.0 libcudnn7-dev=7.0.5.15-1+cuda9.0
    $ sudo apt-mark hold libcudnn7 libcudnn7-dev
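
The post-upgrade check on the tensorrt package version can be scripted. The following is a minimal sketch, assuming the standard `dpkg -l` column layout (status, name, version); tensorrt_major_is_4 is a hypothetical helper name.

```shell
# Hypothetical helper: reads `dpkg -l` output on stdin and succeeds when the
# installed tensorrt meta package reports a 4.x version.
tensorrt_major_is_4() {
    awk '
        $1 == "ii" && $2 == "tensorrt" {
            # The version looks like "4.x.x.x-1+cuda9.0"; keep the part before the first dot.
            split($3, v, ".")
            found = (v[1] == "4")
        }
        END { exit !found }'
}

# Usage:
#   dpkg -l | tensorrt_major_is_4 && echo "TensorRT 4.x is installed"
```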
    

Using a tar file:

  • When using the tar file installation method, install TensorRT into a new location. Tar file installations can support multiple use cases including having a full installation of TensorRT 3.0.x with headers and documentation side-by-side with a full installation of TensorRT 4.x. If the intention is to have the new version of TensorRT replace the old version, then the old version should be removed once the new version is verified.
  • If installing a tar file on a system where the previously installed version was from a debian package, note that the tar file install will not remove the previously installed packages. Unless a side-by-side installation is desired, it would be best to remove the previously installed libnvinfer4, libnvinfer-dev, and libnvinfer-samples packages to avoid confusion.

6. Uninstalling TensorRT

  1. Uninstall libnvinfer4, which was installed through the debian file.
    $ sudo apt-get purge "libnvinfer*"
  2. Uninstall uff-converter-tf, which was also installed through the debian file.
    $ sudo apt-get purge "uff-converter-tf"
  3. Uninstall the Python TensorRT package.
    If using Python 2.7:
    $ sudo pip2 uninstall tensorrt
    If using Python 3.5:
    $ sudo pip3 uninstall tensorrt
  4. Uninstall the Python UFF package.
    If using Python 2.7:
    $ sudo pip2 uninstall uff
    If using Python 3.5:
    $ sudo pip3 uninstall uff

7. Installing PyCUDA

Attention: If you have to update your CUDA version on your system, do not install PyCUDA at this time. Perform the steps in Updating CUDA instead, then install PyCUDA.
PyCUDA is used within Python wrappers to access NVIDIA’s CUDA APIs. Some of the key features of PyCUDA include:
  • Maps all of CUDA into Python.
  • Enables run-time code generation (RTCG) for flexible, fast, automatically tuned codes.
  • Added robustness: automatic management of object lifetimes, automatic error checking
  • Added convenience: comes with ready-made on-GPU linear algebra, reduction, scan.
  • Add-on packages for FFT and LAPACK available.
  • Fast. Near-zero wrapping overhead.
To install PyCUDA, issue the following command:
pip install 'pycuda>=2017.1.1'

7.1. Updating CUDA

Existing installations of PyCUDA will not automatically work with a newly installed CUDA Toolkit, because PyCUDA only works with the CUDA Toolkit that was already on the target system when PyCUDA was installed. PyCUDA must therefore be reinstalled after the newer version of the CUDA Toolkit is installed. The steps below are the most reliable method to ensure that everything works in a compatible fashion after the CUDA Toolkit on your system has been upgraded.
  1. Uninstall the existing PyCUDA installation.
  2. Update CUDA. For more information, see the CUDA Installation Guide.
  3. Install PyCUDA. To install PyCUDA, issue the following command:
    pip install 'pycuda>=2017.1.1'

8. Troubleshooting

For troubleshooting support, refer to your support engineer or post your questions on the NVIDIA Developer Forum.

Notices

Notice

THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall be limited in accordance with the NVIDIA terms and conditions of sale for the product.

THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.

NVIDIA makes no representation or warranty that the product described in this guide will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this guide, or (ii) customer product designs.

Other than the right for customer to use the information in this guide with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.

Trademarks

NVIDIA, the NVIDIA logo, and cuBLAS, CUDA, cuDNN, cuFFT, cuSPARSE, DIGITS, DGX, DGX-1, Jetson, Kepler, NVIDIA Maxwell, NCCL, NVLink, Pascal, Tegra, TensorRT, and Tesla are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.