Prerequisites#

Before installing TensorRT, ensure your system meets the following requirements. This page is organized by category to help you quickly find the information you need.

Quick Checklist:

✓ NVIDIA GPU (Turing architecture or later)

✓ CUDA Toolkit 12.x or 13.x installed

✓ Appropriate GPU drivers (r535+ on Linux, r537+ on Windows)

✓ Python 3.8-3.13 (for Python API)
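
One quick way to verify these items is a short script. The following is a minimal sketch in Python, assuming nvidia-smi and nvcc are on your PATH:

```python
# Minimal prerequisite check; assumes nvidia-smi and nvcc are on PATH.
import subprocess
import sys

def run(cmd):
    """Run a command and return stdout, or a short notice if unavailable."""
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return out.stdout.strip()
    except (FileNotFoundError, subprocess.CalledProcessError):
        return f"{cmd[0]}: not found or failed"

# GPU name, driver version, and compute capability (Turing is 7.5 or higher).
print(run(["nvidia-smi", "--query-gpu=name,driver_version,compute_cap",
           "--format=csv,noheader"]))

# CUDA Toolkit version; look for "release 12.x" or "release 13.x" in the output.
print(run(["nvcc", "--version"]))

# Python interpreter version; the Python API supports 3.8-3.13.
print(sys.version)
```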

Before You Begin#

Review Release Information

Before installation, familiarize yourself with the NVIDIA TensorRT Release Notes to understand:

  • New features in this release

  • Known issues and limitations

  • Platform-specific considerations

  • Compatibility changes

Choose Your API

TensorRT provides both C++ and Python APIs:

  • C++ API - Full functionality, no Python dependency

  • Python API - Convenient for rapid prototyping and integration

  • Both - Most users install both (default)

The installation instructions assume you want both APIs. For a C++-only installation, skip the Python-specific packages.
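
After installation, a quick import confirms the Python API is usable. A minimal smoke test, assuming the tensorrt pip package:

```python
import tensorrt as trt

# Report the installed TensorRT version.
print(trt.__version__)

# Constructing a logger and builder exercises the core library through
# the Python bindings, so this fails fast on a broken installation.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
print("Python API OK")
```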

Required Software#

NVIDIA CUDA Toolkit (Required)

TensorRT requires the NVIDIA CUDA Toolkit. If not already installed, refer to the NVIDIA CUDA Installation Guide.

Supported CUDA Versions:

  • CUDA 12.x

  • CUDA 13.x

Driver Requirements:

  • Linux: NVIDIA driver r535 or later

  • Windows: NVIDIA driver r537 or later

  • CUDA 13.x: NVIDIA driver r580 or later (both platforms)

Optional Dependencies#

The following libraries are optional and only needed for specific use cases:

cuDNN (Optional)

cuDNN is now optional and only used to accelerate a few deprecated layers.

  • When needed: If your model uses deprecated layers that require cuDNN

  • Not supported: On Blackwell+ GPUs or with CUDA 13

  • Installation: Refer to the NVIDIA cuDNN Installation Guide

Note

Most modern networks do not require cuDNN. TensorRT has optimized implementations for nearly all common operations.

cuBLAS (Optional)

cuBLAS is optional and only used for a few specific layers.

  • When needed: If your model requires cuBLAS-accelerated layers

  • Installation: Refer to the NVIDIA cuBLAS website

CUDA-Python (Optional)

CUDA-Python provides Python bindings for the CUDA driver and runtime APIs, enabling direct CUDA kernel calls from Python.
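
A minimal sketch of what the bindings look like, assuming the cuda-python package is installed (older releases expose the driver and runtime APIs as cuda.cuda and cuda.cudart; newer releases move these under cuda.bindings):

```python
# Sanity-check the cuda-python bindings; each call returns an
# (error_code, result...) tuple mirroring the underlying C API.
from cuda import cuda, cudart

err, = cuda.cuInit(0)                        # initialize the driver API
err, count = cuda.cuDeviceGetCount()
err, device = cuda.cuDeviceGet(0)
err, name = cuda.cuDeviceGetName(128, device)

err, driver_ver = cudart.cudaDriverGetVersion()   # e.g., 12040 for CUDA 12.4
err, runtime_ver = cudart.cudaRuntimeGetVersion()

print(name.decode().rstrip("\x00"), count, driver_ver, runtime_ver)
```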

Framework and Model Support#

PyTorch Integration

If you plan to use TensorRT with PyTorch:

  • Tested with: PyTorch >= 2.0

  • Compatibility: May work with older versions

  • Use case: Examples and integration samples
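
A common integration path is exporting the PyTorch model to ONNX and feeding the result to the TensorRT ONNX parser. A minimal sketch, using a torchvision model purely for illustration:

```python
import torch
import torchvision

# Any torch.nn.Module works here; resnet18 is used only as an example.
model = torchvision.models.resnet18(weights=None).eval()
example_input = torch.randn(1, 3, 224, 224)

# Export to ONNX; pick an opset the TensorRT parser supports (up to 24).
torch.onnx.export(
    model,
    example_input,
    "model.onnx",
    opset_version=17,
    input_names=["input"],
    output_names=["output"],
)
```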

ONNX Model Support

The ONNX-TensorRT parser supports:

  • ONNX version: 1.20.0

  • Opset support: Up to opset 24

  • Backward compatibility: Official support is provided for opset 9 and above
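
Before parsing, you can confirm which opset a model actually uses with the onnx Python package. A minimal sketch, assuming a local model.onnx file:

```python
import onnx

# Load the model and report the opset of each imported domain; the default
# (empty) domain is the one the "up to opset 24" limit applies to.
model = onnx.load("model.onnx")
for opset in model.opset_import:
    print(opset.domain or "ai.onnx", opset.version)

# Structural validation catches malformed graphs early.
onnx.checker.check_model(model)
```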

TensorRT Installation Modes#

TensorRT offers three installation modes to suit different deployment scenarios:

Full Installation (Recommended for Development)

  • Includes: Complete builder and runtime functionality

  • Use for: Model development, optimization, and deployment

  • Size: Largest footprint (~2-3 GB)

  • Capabilities: Build engines, run inference, full API access

Lean Runtime (Recommended for Production)

  • Includes: Runtime-only functionality

  • Use for: Production deployment with pre-built engines

  • Size: Significantly smaller (~200-300 MB)

  • Capabilities: Run version-compatible engines

  • Limitations: Cannot build new engines

Dispatch Runtime (Recommended for Minimal Footprint)

  • Includes: A minimal runtime that loads version-compatible engines and dispatches execution to a lean runtime

  • Use for: Memory-constrained deployments

  • Size: Smallest footprint (~100-150 MB)

  • Capabilities: Run version-compatible engines

  • Limitations: Cannot build new engines

Tip

Development to Production Workflow:

  1. Use Full Installation during development to build and optimize models

  2. Serialize optimized engines to plan files

  3. Deploy with Lean Runtime or Dispatch Runtime in production (see the sketch below)
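
A minimal sketch of this workflow in Python, assuming an ONNX model on disk; tensorrt comes from the full installation, and tensorrt_lean is the lean runtime's Python package:

```python
import tensorrt as trt  # full installation: can build engines

# Step 1: build a version-compatible engine from an ONNX model.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
# VERSION_COMPATIBLE engines can be executed by the lean/dispatch runtimes.
config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)
engine_bytes = builder.build_serialized_network(network, config)

# Step 2: serialize the optimized engine to a plan file.
with open("model.plan", "wb") as f:
    f.write(engine_bytes)

# Step 3: load the plan with the lean runtime. In practice this runs in a
# separate process on the deployment target, not alongside the full package.
import tensorrt_lean as trt_lean  # runtime-only: cannot build engines

lean_logger = trt_lean.Logger(trt_lean.Logger.WARNING)
runtime = trt_lean.Runtime(lean_logger)
with open("model.plan", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
```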

Next Steps#

After verifying prerequisites:

  1. Proceed to Installing TensorRT to choose your installation method

  2. Select the installation mode that fits your use case

  3. Follow the step-by-step installation instructions