Prerequisites
Before installing TensorRT, ensure your system meets the following requirements. This page is organized by category to help you quickly find the information you need.
Quick Checklist:
✓ NVIDIA GPU (Turing architecture or later)
✓ CUDA Toolkit 12.x or 13.x installed
✓ Appropriate GPU drivers (r535+ on Linux, r537+ on Windows)
✓ Python 3.10-3.13 recommended (3.8-3.9 bindings available but samples not supported)
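The Python item in the checklist can be verified with a few lines of code. The following is a minimal sketch, not an official tool: the function name `python_support_level` and the labels it returns are illustrative assumptions, and only the version ranges come from the checklist above.

```python
import sys


def python_support_level(version=None):
    """Classify a (major, minor) version against the checklist ranges.

    The returned labels are illustrative, not official terminology.
    """
    major, minor = (version or sys.version_info)[:2]
    if major != 3:
        return "unsupported"
    if 10 <= minor <= 13:
        return "recommended"
    if 8 <= minor <= 9:
        # Bindings are available, but the samples are not supported.
        return "bindings-only"
    # Versions the checklist does not mention at all.
    return "unlisted"


if __name__ == "__main__":
    print("Running interpreter:", python_support_level())
```

Calling the function with no argument classifies the interpreter that runs the script; passing a tuple such as `(3, 9)` classifies a version you plan to deploy on.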
Before You Begin
Review Release Information
Before installation, familiarize yourself with the NVIDIA TensorRT Release Notes to understand:
New features in this release
Known issues and limitations
Platform-specific considerations
Compatibility changes
Choose Your API
TensorRT provides both C++ and Python APIs:
C++ API - Full functionality, no Python dependency
Python API - Convenient for rapid prototyping and integration
Both - Most users install both (default)
The installation instructions assume you want both APIs. For a C++-only installation, skip the Python-specific packages.
Required Software
NVIDIA CUDA Toolkit
TensorRT requires the NVIDIA CUDA Toolkit. If not already installed, refer to the NVIDIA CUDA Installation Guide.
Supported CUDA Versions: CUDA 12.x and 13.x
Driver Requirements:
Linux: NVIDIA driver r535 or later
Windows: NVIDIA driver r537 or later
CUDA 13.x: NVIDIA driver r580 or later (both platforms)
For more information, refer to the TensorRT Support Matrix.
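The driver minimums above can be checked programmatically. This is a rough sketch under stated assumptions: it assumes the r535 and r537 minimums apply to CUDA 12.x, that the driver branch is the leading number of the version string (for example `535` in `535.104.05`), and that `nvidia-smi` is on the `PATH`. The helper names are hypothetical.

```python
import shutil
import subprocess

# Minimum driver branches quoted in this section.
MIN_DRIVER = {"linux": 535, "windows": 537}
MIN_DRIVER_CUDA13 = 580


def required_driver_branch(platform, cuda_major):
    """Return the minimum driver branch for a platform and CUDA major version."""
    if cuda_major >= 13:
        return MIN_DRIVER_CUDA13  # same minimum on both platforms
    return MIN_DRIVER[platform]


def driver_meets_minimum(driver_version, platform, cuda_major):
    """Compare the leading branch number of a driver version to the minimum."""
    branch = int(driver_version.strip().split(".")[0])
    return branch >= required_driver_branch(platform, cuda_major)


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()  # first GPU; all GPUs share one driver
```

For example, `driver_meets_minimum(installed_driver_version(), "linux", 13)` reports whether the local driver is new enough for CUDA 13.x, provided `installed_driver_version()` did not return `None`.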
Optional Dependencies
The following libraries are optional and only needed for specific use cases:
cuBLAS (Optional)
cuBLAS is used only by a few specific layers.
When needed: If your model requires cuBLAS-accelerated layers
Installation: Refer to the NVIDIA cuBLAS website
CUDA-Python (Optional)
CUDA-Python enables direct CUDA kernel calls from Python.
When needed: If you use TensorRT Python API with custom CUDA operations
Installation: Refer to the NVIDIA CUDA-Python documentation
NCCL (Optional)
NCCL is required only for the multi-device inference feature.
When needed: When using IDistCollectiveLayer (SM 80+ / Ampere and later) or multi-device attention via IAttention::setNbRanks (SM 100+ / Blackwell and later)
Installation: Refer to the NVIDIA NCCL Installation Guide. The Deep Learning Framework Containers include a compatible NCCL build.
For more information, refer to the TensorRT Support Matrix.
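To see which of these optional pieces are already present, a quick probe such as the following may help. It is only a sketch: the Python module name `cuda` (for CUDA-Python) and the library names `cublas` and `nccl` are assumptions based on the usual package layouts, and a library installed outside the system search path will be reported as missing.

```python
import ctypes.util
import importlib.util


def module_available(name):
    """Return True if a top-level Python module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def library_available(name):
    """Return True if the system linker search finds a shared library by short name."""
    return ctypes.util.find_library(name) is not None


if __name__ == "__main__":
    # Module and library names below are assumptions, not taken from this page.
    print("CUDA-Python (module 'cuda'):", module_available("cuda"))
    print("cuBLAS (library 'cublas'):", library_available("cublas"))
    print("NCCL (library 'nccl'):", library_available("nccl"))
```

A `False` result does not block installation; it only indicates that the corresponding feature would need the dependency installed first.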
Framework and Model Support
PyTorch Integration
If you plan to use TensorRT with PyTorch:
Tested with: PyTorch >= 2.0
Compatibility: May work with older versions, but these are untested
Use case: Examples and integration samples
ONNX Model Support
The ONNX-TensorRT parser supports:
ONNX version: 1.20.0
Opset support: Up to opset 25
Backward compatibility: Official support is provided for opset 9 and above
For more information, refer to the TensorRT Support Matrix.
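Before handing a model to the parser, it can be useful to confirm that its opset falls in the officially supported range quoted above (9 through 25). The sketch below assumes the `onnx` Python package is installed for the file-reading helper; the function names are hypothetical, and the range check itself has no dependencies.

```python
MIN_OPSET = 9   # lowest officially supported opset, per this section
MAX_OPSET = 25  # highest supported opset, per this section


def opset_supported(opset):
    """Return True if the opset is within the officially supported range."""
    return MIN_OPSET <= opset <= MAX_OPSET


def model_opset(path):
    """Read the default-domain opset version from an ONNX file, or None if absent.

    Requires the `onnx` package; imported lazily so the range check
    above works without it.
    """
    import onnx

    model = onnx.load(path)
    for entry in model.opset_import:
        # The default ONNX operator domain is "" (also written "ai.onnx").
        if entry.domain in ("", "ai.onnx"):
            return entry.version
    return None
```

A typical use is `opset_supported(model_opset("model.onnx"))`, where `model.onnx` stands in for your own file.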
TensorRT Installation Modes
TensorRT offers three installation modes to suit different deployment scenarios:
Full Installation (Recommended for Development)
Includes: Complete builder and runtime functionality
Use for: Model development, optimization, and deployment
Size: Largest footprint (~2-3 GB)
Capabilities: Build engines, run inference, full API access
Lean Runtime (Recommended for Production)
Includes: Runtime-only functionality
Use for: Production deployment with pre-built engines
Size: Significantly smaller (~200-300 MB)
Capabilities: Run version-compatible engines
Limitations: Cannot build new engines
Dispatch Runtime (Recommended for Minimal Footprint)
Includes: Minimal runtime with lean runtime functionality
Use for: Memory-constrained deployments
Size: Smallest footprint (~100-150 MB)
Capabilities: Run version-compatible engines
Limitations: Cannot build new engines
Tip
Development to Production Workflow:
Use Full Installation during development to build and optimize models
Serialize optimized engines to plan files
Deploy with Lean Runtime or Dispatch Runtime in production
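The workflow in this tip can be sketched with the TensorRT Python API. Treat this as an illustration under assumptions rather than a reference implementation: it assumes an ONNX model as input, a recent TensorRT release where `create_network(0)` yields an explicit-batch network, and that the lean runtime is importable as `tensorrt_lean`. The function names and the `.plan` suffix are hypothetical choices.

```python
from pathlib import Path


def default_plan_path(onnx_path):
    """Derive a plan file name from the ONNX file name (naming is illustrative)."""
    return Path(onnx_path).with_suffix(".plan")


def build_plan(onnx_path, plan_path):
    """Development step (Full Installation): build an engine and serialize it."""
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, logger)
    if not parser.parse_from_file(str(onnx_path)):
        errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
        raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    # The lean and dispatch runtimes run version-compatible engines.
    config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed")
    with open(plan_path, "wb") as f:
        f.write(serialized)


def load_plan(plan_path):
    """Production step: deserialize a plan, preferring the lean runtime if present."""
    try:
        import tensorrt_lean as trt
    except ImportError:
        import tensorrt as trt  # fall back to the full installation

    runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
    with open(plan_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    if engine is None:
        raise RuntimeError("Failed to deserialize engine")
    # Return the runtime too so it stays alive as long as the engine does.
    return runtime, engine
```

Only `build_plan` needs the full installation; `load_plan` is the part that would ship to a production machine carrying just a runtime package.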
Next Steps
After verifying prerequisites:
Proceed to Installing TensorRT to choose your installation method
Select the installation mode that fits your use case
Follow the step-by-step installation instructions