Prerequisites

Before installing TensorRT, ensure your system meets the following requirements. This page is organized by category to help you quickly find the information you need.

Quick Checklist:

  ✓ NVIDIA GPU (Turing architecture or later)

  ✓ CUDA Toolkit 12.x or 13.x installed

  ✓ Appropriate GPU drivers (r535+ on Linux, r537+ on Windows; r580+ for CUDA 13.x)

  ✓ Python 3.10-3.13 recommended (bindings are available for 3.8-3.9, but the samples do not support those versions)
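The Python item in the checklist can be verified with a few lines of standard-library code. The following is a minimal sketch, not an official tool: the function name `python_support_level` and the labels it returns are illustrative assumptions, and the version ranges are copied from the checklist above.

```python
import sys


def python_support_level(version=None):
    """Classify a (major, minor) pair against the checklist above.

    The returned labels are illustrative names, not TensorRT terminology.
    """
    major, minor = (version or sys.version_info)[:2]
    if major != 3:
        return "unsupported"
    if 10 <= minor <= 13:
        return "recommended"
    if minor in (8, 9):
        # Bindings are available, but the samples are not supported.
        return "bindings-only"
    return "unlisted"


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: "
          f"{python_support_level()}")
```

Calling `python_support_level()` with no argument classifies the running interpreter; passing a tuple such as `(3, 9)` classifies another version without having to install it.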

Before You Begin

Review Release Information

Before installation, familiarize yourself with the NVIDIA TensorRT Release Notes to understand:

  • New features in this release

  • Known issues and limitations

  • Platform-specific considerations

  • Compatibility changes

Choose Your API

TensorRT provides both C++ and Python APIs:

  • C++ API - Full functionality, no Python dependency

  • Python API - Convenient for rapid prototyping and integration

  • Both - Most users install both (default)

The installation instructions assume you want both APIs. For a C++-only installation, skip the Python-specific packages.
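To find out whether the Python API is already present on a machine, you can probe for the bindings without triggering an import error. This is a minimal sketch that assumes the bindings are importable under the module name `tensorrt` and expose a `__version__` attribute; the helper name `tensorrt_python_version` is hypothetical.

```python
import importlib
import importlib.util


def tensorrt_python_version():
    """Return the version of the TensorRT Python bindings, or None.

    None means the `tensorrt` module was not found or could not be
    loaded (for example, because a shared library is missing).
    """
    if importlib.util.find_spec("tensorrt") is None:
        return None
    try:
        trt = importlib.import_module("tensorrt")
    except (ImportError, OSError):
        return None
    return str(getattr(trt, "__version__", "unknown"))


if __name__ == "__main__":
    version = tensorrt_python_version()
    if version is None:
        print("TensorRT Python bindings not found (C++-only or not installed).")
    else:
        print(f"TensorRT Python bindings version: {version}")
```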

Required Software

NVIDIA CUDA Toolkit

TensorRT requires the NVIDIA CUDA Toolkit. If it is not already installed, refer to the NVIDIA CUDA Installation Guide.

Supported CUDA Versions: CUDA 12.x and 13.x. Refer to the TensorRT Support Matrix for the exact minor versions.

Driver Requirements:

  • Linux: NVIDIA driver r535 or later

  • Windows: NVIDIA driver r537 or later

  • CUDA 13.x: NVIDIA driver r580 or later (both platforms)

For more information, refer to the TensorRT Support Matrix.
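The driver minimums listed above can be checked programmatically. The sketch below queries the driver version through `nvidia-smi` and compares its branch number with the listed minimums. It assumes that the r535 and r537 minimums apply to CUDA 12.x, and all function and constant names are illustrative rather than part of any NVIDIA tool.

```python
import platform
import shutil
import subprocess

# Minimum driver branches copied from the list above.
# Assumption: the per-platform values apply to CUDA 12.x.
MIN_DRIVER = {"Linux": 535, "Windows": 537}
MIN_DRIVER_CUDA13 = 580


def driver_branch(version):
    """Extract the branch number, e.g. '535.104.05' -> 535."""
    return int(version.strip().split(".")[0])


def driver_ok(version, system, cuda_major=12):
    """Return True if `version` meets the minimum for `system` and CUDA major."""
    required = MIN_DRIVER_CUDA13 if cuda_major >= 13 else MIN_DRIVER.get(system)
    if required is None:
        raise ValueError(f"No driver minimum listed for platform {system!r}")
    return driver_branch(version) >= required


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()  # first GPU; all GPUs share one driver


if __name__ == "__main__":
    version = installed_driver_version()
    system = platform.system()
    if version is None:
        print("nvidia-smi not found or no driver reported.")
    elif system in MIN_DRIVER:
        print(f"Driver {version} on {system}: "
              f"{'OK' if driver_ok(version, system) else 'too old'} for CUDA 12.x")
    else:
        print(f"Driver {version}; no minimum is listed for {system}.")
```

Pass `cuda_major=13` to `driver_ok` to test against the r580 requirement instead.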

Optional Dependencies

The following libraries are optional and only needed for specific use cases:

cuBLAS (Optional)

cuBLAS is used only by a few specific layers.

  • When needed: If your model requires cuBLAS-accelerated layers

  • Installation: Refer to the NVIDIA cuBLAS website

CUDA-Python (Optional)

CUDA-Python enables direct CUDA kernel calls from Python.

NCCL (Optional)

NCCL is required only for the multi-device inference feature.

For more information, refer to the TensorRT Support Matrix.
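A quick way to see which of these optional dependencies are discoverable on a system is to probe the default library and module search paths. This sketch relies on assumptions: that the shared libraries are registered under the base names `cublas` and `nccl`, and that CUDA-Python is importable as the top-level module `cuda`. A `False` result only means the item was not found on the default search path (on Windows in particular, versioned DLL names may not match), not that it is definitely absent.

```python
import ctypes.util
import importlib.util


def optional_dependency_report():
    """Map each optional dependency to whether it was found.

    Library base names and the `cuda` module name are assumptions;
    adjust them if your installation uses different names.
    """
    return {
        "cuBLAS": ctypes.util.find_library("cublas") is not None,
        "NCCL": ctypes.util.find_library("nccl") is not None,
        "CUDA-Python": importlib.util.find_spec("cuda") is not None,
    }


if __name__ == "__main__":
    for name, found in optional_dependency_report().items():
        print(f"{name:12s} {'found' if found else 'not found'}")
```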

Framework and Model Support

PyTorch Integration

If you plan to use TensorRT with PyTorch:

  • Tested with: PyTorch >= 2.0

  • Compatibility: Older versions may work but are not tested

  • Use case: Examples and integration samples
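To confirm that an installed PyTorch falls within the tested range, compare its version string with 2.0. The sketch below is illustrative: the helper names are hypothetical, and the parser reads only the leading `major.minor` part so that local suffixes such as `+cu121` are ignored.

```python
import importlib.util
import re


def parse_major_minor(version):
    """'2.1.0+cu121' -> (2, 1). Raises ValueError on unexpected input."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_tested_pytorch(version):
    """True if `version` is within the tested range (PyTorch >= 2.0)."""
    return parse_major_minor(version) >= (2, 0)


def installed_pytorch_version():
    """Return the installed PyTorch version string, or None if absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the sketch runs without PyTorch
    return str(torch.__version__)


if __name__ == "__main__":
    version = installed_pytorch_version()
    if version is None:
        print("PyTorch is not installed.")
    else:
        status = "tested" if is_tested_pytorch(version) else "untested, may work"
        print(f"PyTorch {version}: {status}")
```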

ONNX Model Support

The ONNX-TensorRT parser supports:

  • ONNX version: 1.20.0

  • Opset support: Up to opset 25

  • Backward compatibility: Official support is provided for opset 9 and above

For more information, refer to the TensorRT Support Matrix.
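Before handing a model to the parser, it can help to check its opset against the range stated above. The following sketch assumes the `onnx` Python package is installed for the file-reading helper; the function names and status labels are illustrative, and the bounds 9 and 25 are taken from the list above.

```python
MIN_OFFICIAL_OPSET = 9   # official support starts here (from the list above)
MAX_PARSER_OPSET = 25    # highest opset listed above


def opset_status(opset):
    """Classify an opset number against the range listed above."""
    if opset > MAX_PARSER_OPSET:
        return "newer-than-supported"
    if opset < MIN_OFFICIAL_OPSET:
        return "older-than-officially-supported"
    return "supported"


def default_domain_opset(model_path):
    """Return the opset of the default ONNX domain in a model file.

    Requires the `onnx` package, imported lazily so that the rest of
    this sketch runs without it.
    """
    import onnx

    model = onnx.load(model_path)
    for entry in model.opset_import:
        if entry.domain in ("", "ai.onnx"):
            return entry.version
    return None


if __name__ == "__main__":
    for candidate in (7, 9, 17, 25, 26):
        print(candidate, opset_status(candidate))
```

With a real model, `opset_status(default_domain_opset("model.onnx"))` combines both helpers; `model.onnx` is a placeholder path.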

TensorRT Installation Modes

TensorRT offers three installation modes to suit different deployment scenarios:

Full Installation (Recommended for Development)

  • Includes: Complete builder and runtime functionality

  • Use for: Model development, optimization, and deployment

  • Size: Largest footprint (~2-3 GB)

  • Capabilities: Build engines, run inference, full API access

Lean Runtime (Recommended for Production)

  • Includes: Runtime-only functionality

  • Use for: Production deployment with pre-built engines

  • Size: Significantly smaller (~200-300 MB)

  • Capabilities: Run version-compatible engines

  • Limitations: Cannot build new engines

Dispatch Runtime (Recommended for Minimal Footprint)

  • Includes: Minimal runtime with lean runtime functionality

  • Use for: Memory-constrained deployments

  • Size: Smallest footprint (~100-150 MB)

  • Capabilities: Run version-compatible engines

  • Limitations: Cannot build new engines

Tip

Development to Production Workflow:

  1. Use Full Installation during development to build and optimize models

  2. Serialize optimized engines to plan files

  3. Deploy with Lean Runtime or Dispatch Runtime in production
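The three workflow steps can be sketched with the Python API as follows. This is a hedged outline, not a reference implementation: it assumes a TensorRT 10-style API (networks are always explicit batch), an ONNX model as input, and that the lean and dispatch runtimes are importable as `tensorrt_lean` and `tensorrt_dispatch`. The function names and file paths are hypothetical.

```python
import importlib
import importlib.util

RUNTIME_MODULES = ("tensorrt", "tensorrt_lean", "tensorrt_dispatch")


def available_runtime_modules():
    """List which TensorRT Python modules can be found on this machine."""
    return [m for m in RUNTIME_MODULES if importlib.util.find_spec(m) is not None]


def build_plan(onnx_path, plan_path):
    """Steps 1 and 2: build an engine with the full install and serialize it."""
    import tensorrt as trt  # requires the full installation

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # assumes TensorRT 10 semantics
    parser = trt.OnnxParser(network, logger)
    if not parser.parse_from_file(onnx_path):
        errors = [parser.get_error(i).desc() for i in range(parser.num_errors)]
        raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    # The lean and dispatch runtimes run version-compatible engines.
    config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed")
    with open(plan_path, "wb") as f:
        f.write(serialized)


def load_plan(plan_path, runtime_module="tensorrt_lean"):
    """Step 3: deserialize the plan with a runtime-only module."""
    trt = importlib.import_module(runtime_module)
    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(plan_path, "rb") as f:
        return runtime.deserialize_cuda_engine(f.read())


if __name__ == "__main__":
    print("Installed TensorRT modules:", available_runtime_modules() or "none")
```

On a development machine you would call `build_plan("model.onnx", "model.plan")`, copy the plan file to the target, and call `load_plan("model.plan")` there. Depending on how the plan was built and which TensorRT version loads it, the runtime may need additional settings, such as allowing engine host code, so check the version compatibility documentation for your release.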

Next Steps

After verifying prerequisites:

  1. Proceed to Installing TensorRT to choose your installation method

  2. Select the installation mode that fits your use case

  3. Follow the step-by-step installation instructions