Prerequisites#

Ensure you are familiar with the following installation requirements and notes.

  • If you use the TensorRT Python API, CUDA-Python is also required. If it is not installed on your system, refer to the NVIDIA CUDA-Python documentation for installation instructions.

  • Familiarize yourself with the NVIDIA TensorRT Release Notes for the latest features and known issues.

  • Verify that you have installed the NVIDIA CUDA Toolkit. If CUDA has not been installed, review the NVIDIA CUDA Installation Guide for instructions on installing the CUDA Toolkit. The following versions are supported:

  • cuDNN is now an optional dependency for TensorRT and is only used to speed up a few deprecated layers. On Blackwell and later GPUs, cuDNN is not supported for use with TensorRT, so it cannot accelerate these deprecated layers there. If your application or model requires cuDNN, verify that it is installed; review the NVIDIA cuDNN Installation Guide for more information. TensorRT 10.9.0 supports cuDNN 8.9.7. cuDNN is not used by the lean or dispatch runtimes.

  • cuBLAS is now an optional dependency for TensorRT and is only used to speed up a few layers. If you require cuBLAS, verify that you have it installed. Review the NVIDIA cuBLAS website for more information.

  • Some Python samples, such as efficientdet and efficientnet, require TensorFlow 2.13.1.

  • The PyTorch examples have been tested with PyTorch >= 2.0 but may work with older versions.

  • The ONNX-TensorRT parser has been tested with ONNX 1.16.0 and supports opset 20.

  • The following installation instructions assume you want both the C++ and Python APIs. In some environments and use cases, you may not want the Python functionality; if so, do not install the Debian or RPM packages labeled Python. None of the C++ API functionality depends on Python.

  • TensorRT can be installed in three different modes:

    • A full installation of TensorRT, including TensorRT plan file builder functionality. This mode is the same as the runtime provided before TensorRT 8.6.0.

    • A lean runtime installation. This installation is significantly smaller than the full installation and allows you to load and run engines built with the version-compatible builder flag. However, it does not provide the functionality to build a TensorRT plan file.

    • A dispatch runtime installation. This installation allows for deployments with minimal memory consumption. It allows you to load and run engines built with the version-compatible builder flag that include a lean runtime. However, it does not provide the functionality to build a TensorRT plan file.

  • For developers who simply want to convert ONNX models into TensorRT engines, Nsight Deep Learning Designer, a GUI-based tool, can be used without a separate installation of TensorRT. Nsight Deep Learning Designer automatically downloads the necessary TensorRT components (including CUDA, cuDNN, and cuBLAS) on demand.
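The CUDA Toolkit check above can be scripted. The following sketch is one possible approach, not an official utility: it assumes `nvcc` is on your PATH, and the helper names `parse_nvcc_release` and `installed_cuda_version` are illustrative, not part of any NVIDIA tool.

```python
import re
import shutil
import subprocess
from typing import Optional

def parse_nvcc_release(output: str) -> Optional[str]:
    """Extract the CUDA release version (e.g. '12.4') from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None

def installed_cuda_version() -> Optional[str]:
    """Return the installed CUDA Toolkit version, or None if nvcc is not on PATH."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)
```

If `installed_cuda_version()` returns None, install the CUDA Toolkit by following the NVIDIA CUDA Installation Guide before proceeding.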
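To make an engine loadable by the lean or dispatch runtime described above, it must be built with the version-compatible builder flag. The following Python sketch illustrates this under stated assumptions: the `tensorrt` package is installed, and `model.onnx` / `model.plan` are placeholder file names, not files shipped with TensorRT.

```python
# Sketch only: build a version-compatible TensorRT plan from an ONNX model.
try:
    import tensorrt as trt
except ImportError:  # TensorRT not installed; keep the sketch importable
    trt = None

def build_version_compatible_plan(onnx_path: str, plan_path: str) -> None:
    if trt is None:
        raise RuntimeError("tensorrt is not installed")
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # explicit batch is the default in TensorRT 10
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("failed to parse the ONNX model")
    config = builder.create_builder_config()
    # Without this flag, the lean and dispatch runtimes cannot load the plan.
    config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)
    plan = builder.build_serialized_network(network, config)
    with open(plan_path, "wb") as f:
        f.write(plan)
```

The full installation is only needed on the machine that runs this build step; the resulting plan file can then be deployed with the smaller lean or dispatch runtime.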