Installation Guide Overview
This guide provides complete instructions for installing, upgrading, and uninstalling TensorRT on supported platforms. Whether you’re setting up TensorRT for the first time or upgrading an existing installation, this guide will walk you through the process.
About TensorRT
NVIDIA TensorRT is a high-performance deep learning inference SDK. It includes:
- Inference Optimizer - Optimizes trained models for efficient GPU execution
- Runtime Engine - Executes optimized models with minimal latency
- C++ and Python APIs - Build and deploy inference applications
- ONNX Parser - Import models from popular training frameworks
- Mixed-Precision Support - FP32, TF32, FP16, BF16, FP8, FP4, INT8, and INT4 inference
TensorRT takes a trained network and produces a highly optimized runtime engine. It applies graph optimizations, layer fusions, and kernel auto-tuning to maximize performance on NVIDIA GPUs from the Turing architecture onward.
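As a concrete illustration of that build flow, here is a minimal sketch using the TensorRT Python API. It assumes a TensorRT 10.x installation; the `model.onnx` path, output file name, and choice of FP16 are illustrative, not prescribed by this guide:

```python
import tensorrt as trt

# Create a logger and builder; the builder drives graph optimization,
# layer fusion, and kernel auto-tuning for the target GPU.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# Explicit-batch network definition (the default in TensorRT 10.x).
network = builder.create_network(0)

# Import a trained model through the ONNX parser.
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path to a trained model
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

# Enable FP16 as one example of the mixed-precision support listed above.
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)

# Build the optimized runtime engine and serialize it to disk.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```

The serialized engine can later be deserialized with `trt.Runtime` and executed with minimal latency, which is the runtime half of the workflow described above.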
What’s in This Section
This installation guide is organized into the following documents:
- Prerequisites
Before installing TensorRT, review system requirements, supported platforms, and required dependencies.
- Installing TensorRT
Step-by-step instructions for multiple installation methods (a quick verification sketch follows this outline):
  - Python Package Index (pip) - Fastest method for Python users
  - Debian/RPM Packages - System-wide installation with automatic dependency management
  - Tar/Zip Files - Flexible installation allowing multiple versions simultaneously
  - Container Images - Pre-configured Docker containers with TensorRT
- Upgrading TensorRT
Instructions for upgrading from previous TensorRT versions while managing compatibility.
- Uninstalling TensorRT
Complete removal instructions for each installation method.
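Whichever installation method you choose, a quick smoke test of the Python bindings is useful. This is a minimal sketch, assuming the `tensorrt` wheel (or a system package that includes the Python bindings) is installed:

```python
# Check that the TensorRT Python bindings are importable and that a
# Builder can be constructed, which exercises the native library.
import tensorrt

print(tensorrt.__version__)
assert tensorrt.Builder(tensorrt.Logger())
```

If the import fails or the assertion raises, revisit the Prerequisites and Installing TensorRT documents for your platform and installation method.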