Installation Guide Overview

This guide provides complete instructions for installing, upgrading, and uninstalling TensorRT on supported platforms. Whether you’re setting up TensorRT for the first time or upgrading an existing installation, this guide will walk you through the process.

About TensorRT

NVIDIA TensorRT is a high-performance deep learning inference SDK. It includes:

  • Inference Optimizer - Optimizes trained models for efficient GPU execution

  • Runtime Engine - Executes optimized models with minimal latency

  • C++ and Python APIs - Build and deploy inference applications

  • ONNX Parser - Import models from popular training frameworks

  • Mixed-Precision Support - FP32, TF32, FP16, BF16, FP8, FP4, INT8, and INT4 inference

TensorRT takes a trained network and produces a highly optimized runtime engine. It applies graph optimizations, layer fusions, and kernel auto-tuning to maximize performance on NVIDIA GPUs from the Turing architecture onward.
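The model-to-engine flow described above can be sketched with the Python API. This is a minimal, hypothetical sketch that assumes the tensorrt package is installed and an ONNX file is available; the API calls follow recent TensorRT releases and may differ in yours.

```python
# Minimal sketch of the ONNX-to-engine build flow. Assumes the tensorrt
# Python package is installed; "model.onnx" is a hypothetical path.
from typing import Optional

try:
    import tensorrt as trt
except ImportError:
    trt = None  # TensorRT is not installed in this environment

def build_engine(onnx_path: str) -> Optional[bytes]:
    """Parse an ONNX model and return a serialized TensorRT engine."""
    if trt is None:
        print("tensorrt is not installed")
        return None
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("failed to parse the ONNX model")
    config = builder.create_builder_config()
    # Graph optimization, layer fusion, and kernel auto-tuning happen here.
    return builder.build_serialized_network(network, config)
```

The serialized engine returned here is what the runtime later deserializes and executes with minimal latency.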

What’s in This Section

This installation guide is organized into the following documents:

Prerequisites

Before installing TensorRT, review system requirements, supported platforms, and required dependencies.


Installing TensorRT

Step-by-step instructions for multiple installation methods:

  • Python Package Index (pip) - Fastest method for Python users

  • Debian/RPM Packages - System-wide installation with automatic dependency management

  • Tar/Zip Files - Flexible installation that allows multiple TensorRT versions to coexist side by side

  • Container Images - Pre-configured Docker containers with TensorRT
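Whichever method you choose, you can check afterward which installation the current Python interpreter sees. A small sketch using only the standard library; the distribution name tensorrt matches the pip wheel, while the other install methods may not register a Python distribution at all.

```python
# Check which version of a pip-installed package (e.g. tensorrt) is
# visible to this interpreter; returns None if it is not installed.
from importlib import metadata
from typing import Optional

def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution's version, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print(installed_version("tensorrt"))
```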


Upgrading TensorRT

Instructions for upgrading from previous TensorRT versions while managing compatibility.
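Before upgrading, it can help to compare the installed version against the one you intend to move to. A small illustrative helper; the version strings below are examples, not recommendations:

```python
# Compare dotted version strings numerically rather than lexically,
# so "10.0.1" correctly ranks above "8.6.1".
def needs_upgrade(current: str, target: str) -> bool:
    """Return True if the current version is older than the target."""
    def parse(version: str) -> tuple:
        return tuple(int(part) for part in version.split("."))
    return parse(current) < parse(target)

print(needs_upgrade("8.6.1", "10.0.1"))  # → True
```

A plain string comparison would get this wrong, since "8.6.1" sorts after "10.0.1" lexically.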


Uninstalling TensorRT

Complete removal instructions for each installation method.
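After removal, you can verify that no TensorRT Python distributions remain visible to the interpreter. A hypothetical post-uninstall check using only the standard library; note it only covers packages pip knows about, not files placed by the Debian/RPM or tar methods:

```python
# List any remaining distributions whose name mentions "tensorrt";
# an empty list means the Python-side removal is clean.
from importlib import metadata

leftovers = sorted(
    dist.metadata["Name"]
    for dist in metadata.distributions()
    if "tensorrt" in (dist.metadata["Name"] or "").lower()
)
print(leftovers)
```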
