Prerequisites#

Before installing TensorRT, ensure your system meets the following requirements. This page is organized by category to help you quickly find the information you need.

Quick Checklist:

✓ NVIDIA GPU (Turing architecture or later)

✓ CUDA Toolkit 12.x or 13.x installed

✓ Appropriate GPU drivers (r535+ on Linux, r537+ on Windows)

✓ Python 3.8-3.13 (for Python API)
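
One quick way to verify these items is a short script. The following is a minimal sketch in Python, assuming nvidia-smi and nvcc are on your PATH:

```python
# Minimal prerequisite check; assumes nvidia-smi and nvcc are on PATH.
import subprocess
import sys

def run(cmd):
    """Run a command and return stdout, or a short notice if unavailable."""
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return out.stdout.strip()
    except (FileNotFoundError, subprocess.CalledProcessError):
        return f"{cmd[0]}: not found or failed"

# GPU name, driver version, and compute capability (Turing is 7.5 or higher).
print(run(["nvidia-smi", "--query-gpu=name,driver_version,compute_cap",
           "--format=csv,noheader"]))

# CUDA Toolkit version; look for "release 12.x" or "release 13.x" in the output.
print(run(["nvcc", "--version"]))

# Python interpreter version; the Python API supports 3.8-3.13.
print(sys.version)
```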

Before You Begin#

Review Release Information

Before installation, familiarize yourself with the NVIDIA TensorRT Release Notes to understand:

  • New features in this release

  • Known issues and limitations

  • Platform-specific considerations

  • Compatibility changes

Choose Your API

TensorRT provides both C++ and Python APIs:

  • C++ API - Full functionality, no Python dependency

  • Python API - Convenient for rapid prototyping and integration

  • Both - Most users install both (default)

The installation instructions assume you want both APIs. For a C++-only installation, skip the Python-specific packages.
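
After installation, a quick import confirms the Python API is usable. A minimal smoke test, assuming the tensorrt pip package:

```python
import tensorrt as trt

# Report the installed TensorRT version.
print(trt.__version__)

# Constructing a logger and builder exercises the core library through
# the Python bindings, so this fails fast on a broken installation.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
print("Python API OK")
```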

Required Software#

NVIDIA CUDA Toolkit (Required)

TensorRT requires the NVIDIA CUDA Toolkit. If not already installed, refer to the NVIDIA CUDA Installation Guide.

Supported CUDA Versions:

  • CUDA 12.x

  • CUDA 13.x

Driver Requirements:

  • Linux: NVIDIA driver r535 or later

  • Windows: NVIDIA driver r537 or later

  • CUDA 13.x: NVIDIA driver r580 or later (both platforms)

Optional Dependencies#

The following libraries are optional and only needed for specific use cases:

cuDNN (Optional)

cuDNN is now optional and only used to accelerate a few deprecated layers.

  • When needed: If your model uses deprecated layers that require cuDNN

  • Not supported: On Blackwell+ GPUs or with CUDA 13

  • Installation: Refer to the NVIDIA cuDNN Installation Guide

Note

Most modern networks do not require cuDNN. TensorRT has optimized implementations for nearly all common operations.

cuBLAS (Optional)

cuBLAS is optional and only used for a few specific layers.

  • When needed: If your model requires cuBLAS-accelerated layers

  • Installation: Refer to the NVIDIA cuBLAS website

CUDA-Python (Optional)

CUDA-Python provides Python bindings for the CUDA driver and runtime APIs, enabling direct CUDA kernel calls from Python.
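
A minimal sketch of what the bindings look like, assuming the cuda-python package is installed (older releases expose the driver and runtime APIs as cuda.cuda and cuda.cudart; newer releases move these under cuda.bindings):

```python
# Sanity-check the cuda-python bindings; each call returns an
# (error_code, result...) tuple mirroring the underlying C API.
from cuda import cuda, cudart

err, = cuda.cuInit(0)                        # initialize the driver API
err, count = cuda.cuDeviceGetCount()
err, device = cuda.cuDeviceGet(0)
err, name = cuda.cuDeviceGetName(128, device)

err, driver_ver = cudart.cudaDriverGetVersion()   # e.g., 12040 for CUDA 12.4
err, runtime_ver = cudart.cudaRuntimeGetVersion()

print(name.decode().rstrip("\x00"), count, driver_ver, runtime_ver)
```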

Framework and Model Support#

PyTorch Integration

If you plan to use TensorRT with PyTorch:

  • Tested with: PyTorch >= 2.0

  • Compatibility: May work with older versions

  • Use case: Examples and integration samples
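
A common integration path is exporting the PyTorch model to ONNX and feeding the result to the TensorRT ONNX parser. A minimal sketch, using a torchvision model purely for illustration:

```python
import torch
import torchvision

# Any torch.nn.Module works here; resnet18 is used only as an example.
model = torchvision.models.resnet18(weights=None).eval()
example_input = torch.randn(1, 3, 224, 224)

# Export to ONNX; pick an opset the TensorRT parser supports (up to 24).
torch.onnx.export(
    model,
    example_input,
    "model.onnx",
    opset_version=17,
    input_names=["input"],
    output_names=["output"],
)
```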

ONNX Model Support

The ONNX-TensorRT parser supports:

  • ONNX version: 1.20.0

  • Opset support: Up to opset 24

  • Backward compatibility: Official support is provided for opset 9 and above
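
Before parsing, you can confirm which opset a model actually uses with the onnx Python package. A minimal sketch, assuming a local model.onnx file:

```python
import onnx

# Load the model and report the opset of each imported domain; the default
# (empty) domain is the one the "up to opset 24" limit applies to.
model = onnx.load("model.onnx")
for opset in model.opset_import:
    print(opset.domain or "ai.onnx", opset.version)

# Structural validation catches malformed graphs early.
onnx.checker.check_model(model)
```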

TensorRT Installation Modes#

TensorRT offers three installation modes to suit different deployment scenarios:

Full Installation (Recommended for Development)

  • Includes: Complete builder and runtime functionality

  • Use for: Model development, optimization, and deployment

  • Size: Largest footprint (~2-3 GB)

  • Capabilities: Build engines, run inference, full API access

Lean Runtime (Recommended for Production)

  • Includes: Runtime-only functionality

  • Use for: Production deployment with pre-built engines

  • Size: Significantly smaller (~200-300 MB)

  • Capabilities: Run version-compatible engines

  • Limitations: Cannot build new engines

Dispatch Runtime (Recommended for Minimal Footprint)

  • Includes: A minimal runtime that loads version-compatible engines and dispatches execution to a lean runtime

  • Use for: Memory-constrained deployments

  • Size: Smallest footprint (~100-150 MB)

  • Capabilities: Run version-compatible engines

  • Limitations: Cannot build new engines

Tip

Development to Production Workflow:

  1. Use Full Installation during development to build and optimize models

  2. Serialize optimized engines to plan files

  3. Deploy with Lean Runtime or Dispatch Runtime in production (see the sketch below)
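
A minimal sketch of this workflow in Python, assuming an ONNX model on disk; tensorrt comes from the full installation, and tensorrt_lean is the lean runtime's Python package:

```python
import tensorrt as trt  # full installation: can build engines

# Step 1: build a version-compatible engine from an ONNX model.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
# VERSION_COMPATIBLE engines can be executed by the lean/dispatch runtimes.
config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)
engine_bytes = builder.build_serialized_network(network, config)

# Step 2: serialize the optimized engine to a plan file.
with open("model.plan", "wb") as f:
    f.write(engine_bytes)

# Step 3: load the plan with the lean runtime. In practice this runs in a
# separate process on the deployment target, not alongside the full package.
import tensorrt_lean as trt_lean  # runtime-only: cannot build engines

lean_logger = trt_lean.Logger(trt_lean.Logger.WARNING)
runtime = trt_lean.Runtime(lean_logger)
with open("model.plan", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
```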

Next Steps#

After verifying prerequisites:

  1. Proceed to Installing TensorRT to choose your installation method

  2. Select the installation mode that fits your use case

  3. Follow the step-by-step installation instructions