Installing TensorRT

This guide provides step-by-step instructions for installing TensorRT using various methods. Choose the installation method that best fits your development environment and deployment needs.

Before You Begin: Ensure you have reviewed the Prerequisites to confirm your system meets all requirements.

What Is TensorRT, and What Does Installing It Give You?

NVIDIA TensorRT is an SDK for high-performance deep learning inference (running a trained model on new inputs to produce predictions) on NVIDIA GPUs. It compiles a trained model into a hardware-specific binary called an engine (also referred to as a plan file), then runs that engine inside your application.

Installing TensorRT puts the following on your system:

  • The TensorRT libraries (libnvinfer, libnvonnxparser, and related libraries), which your C++ or Python application links against to load and execute engines.

  • Python bindings (the tensorrt Python package), so you can build, save, and run engines from Python without writing C++.

  • The trtexec command-line tool, which builds an engine from an ONNX file, benchmarks it, and is the fastest way to confirm a fresh install actually works.

  • C++ headers (NvInfer.h and related headers), included with every method except pip, for applications that build against the C++ API.

Once installation finishes, you should be able to run trtexec --help and import tensorrt from Python. At that point, you are ready to follow the Quick Start Guide and build your first engine.
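The two checks above can be scripted. The following is a minimal sketch, not an official verification tool: the function name check_tensorrt_install and the shape of the returned report are hypothetical, and the only TensorRT-specific attribute it touches is tensorrt.__version__.

```python
import importlib.util
import shutil


def check_tensorrt_install():
    """Report whether trtexec and the tensorrt Python package are visible.

    Hypothetical helper for illustration; it only inspects PATH and the
    current Python environment and does not build or run an engine.
    """
    report = {
        # Full path to trtexec if it is on PATH, otherwise None.
        "trtexec_path": shutil.which("trtexec"),
        # True if a module named "tensorrt" can be located (not yet imported).
        "python_package_found": importlib.util.find_spec("tensorrt") is not None,
        "python_package_version": None,
    }
    if report["python_package_found"]:
        try:
            import tensorrt

            report["python_package_version"] = tensorrt.__version__
        except Exception as exc:  # for example, a missing shared library
            report["python_package_version"] = f"import failed: {exc}"
    return report


if __name__ == "__main__":
    for key, value in check_tensorrt_install().items():
        print(f"{key}: {value}")
```

A trtexec_path of None means only that trtexec is not on PATH for the current shell; with some installation methods the binary may live in a directory that has to be added to PATH manually.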

If you only need to run engines built by someone else (for example, a deployment server), the Lean or Dispatch runtimes described below are smaller alternatives to the Full Runtime.

Installation Method Comparison

Quick Comparison Table:

| Method | Best For | Requires Root | C++ Headers | Multi-Version | Installation Time |
|---|---|---|---|---|---|
| pip (Python) | Python development | No | No | Yes (venv) | ⚡ Fastest (~2 min) |
| Debian/RPM | System-wide install | Yes | Yes | No | 🔵 Fast (~5 min) |
| Tar/Zip | Multiple versions | No | Yes | Yes | 🟡 Moderate (~10 min) |
| Container (NGC) | Isolated environments | No (Docker) | Yes | Yes | ⚡ Fastest (~5 min) |

Choosing Your Installation Method

Choose pip if you:

  • Are developing primarily in Python

  • Want the fastest installation

  • Are working in a Python virtual environment

  • Do not need C++ development headers

Choose Debian/RPM if you:

  • Want system-wide installation with automatic updates

  • Need C++ development support

  • Have sudo/root access

  • Prefer standard Linux package management

Choose Tar/Zip if you:

  • Need multiple TensorRT versions simultaneously

  • Want control over installation location

  • Are installing without root privileges

  • Need C++ headers but want flexibility

Choose Container if you:

  • Want a pre-configured environment

  • Are deploying in Kubernetes or Docker

  • Need consistent environments across systems

  • Want to avoid dependency management
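The criteria above can be read as a small decision procedure. The sketch below encodes one possible ordering of those criteria; the function suggest_install_method and its parameters are hypothetical, and the priority given to containers is an assumption rather than official guidance.

```python
def suggest_install_method(*, needs_cpp_headers, has_root,
                           needs_multiple_versions, uses_containers):
    """Suggest an installation method from the criteria listed above.

    Hypothetical helper: the order of the checks is one reasonable
    reading of the guidance, not a rule stated by the guide.
    """
    if uses_containers:
        # Docker/Kubernetes deployments get a pre-configured environment.
        return "Container (NGC)"
    if not needs_cpp_headers:
        # Python-only work: pip is the quickest route and needs no root.
        return "pip"
    if needs_multiple_versions or not has_root:
        # C++ headers without root, or several versions side by side.
        return "Tar/Zip"
    # C++ headers, root access, and a single system-wide version.
    return "Debian/RPM"


if __name__ == "__main__":
    print(suggest_install_method(
        needs_cpp_headers=True,
        has_root=False,
        needs_multiple_versions=False,
        uses_containers=False,
    ))
```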

Understanding TensorRT Runtime Options

TensorRT offers three runtime configurations with different capabilities and footprint sizes:

Full Runtime (Recommended for Development)
  • Builder and runtime functionality (~2 GB)

  • Packages: tensorrt (pip), tensorrt (deb/rpm), TensorRT-* (tar/zip)

Lean Runtime (Recommended for Production Deployment)
  • Runtime-only for pre-built engines (approximately 200–300 MB; see the runtime footprint comparison in Installation Prerequisites for full sizing context)

  • Packages: tensorrt_lean (pip), tensorrt-lean (deb/rpm)

  • Note: Engines must be built with the version-compatible builder flag

Dispatch Runtime (Recommended for Minimal Footprint Deployment)
  • Minimal runtime for pre-built engines (approximately 100–150 MB; see Installation Prerequisites for the canonical size table)

  • Packages: tensorrt_dispatch (pip), tensorrt-dispatch (deb/rpm)

  • Note: Engines must be built with the version-compatible builder flag
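As a rough illustration of how the three runtimes relate in Python, the sketch below picks whichever runtime module is installed and shows where the version-compatible flag is set at build time. It assumes the importable module names match the pip package names listed above, and that the Python API exposes BuilderFlag.VERSION_COMPATIBLE and a set_flag method on the builder config; check both against the API reference for your TensorRT version. The helper names are hypothetical.

```python
import importlib

# Assumption: import names match the pip package names listed above.
RUNTIME_MODULES = ("tensorrt", "tensorrt_lean", "tensorrt_dispatch")


def import_first_available_runtime(candidates=RUNTIME_MODULES):
    """Return (name, module) for the first importable runtime.

    Returns (None, None) if none of the candidates can be imported.
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # not installed in this environment; try the next one
    return None, None


def enable_version_compatibility(trt, config):
    """Set the version-compatible flag on a builder config.

    `trt` is the full tensorrt module and `config` a builder config
    created from it; the Lean and Dispatch runtimes cannot build engines.
    """
    config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)
    return config


if __name__ == "__main__":
    name, _module = import_first_available_runtime()
    print(f"runtime module in use: {name}")
```

Engines built without this flag are the case the notes above warn about: they are not expected to load under the Lean or Dispatch runtime.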

Downloading TensorRT

Before installing with the Debian (local repo), RPM (local repo), Tar, or Zip method, you must download the TensorRT packages.

Tip

For pip installation: Skip this section. The pip method downloads packages automatically from PyPI.

Prerequisites:

  • NVIDIA Developer Program membership (free)

  • An account that you are logged in to on the NVIDIA Developer site

Download Steps:

  1. Go to https://developer.nvidia.com/tensorrt.

  2. Click Download Now.

  3. Select TensorRT version 11.0.0 (or your target version).

  4. Accept the license agreement.

  5. Download the package for your platform:

    • Linux x86-64: Debian local repo (.deb), RPM local repo (.rpm), or Tar (.tar.gz)

    • Linux ARM SBSA and JetPack: Debian local repo (.deb) or Tar (.tar.gz)

    • Windows x64: Zip (.zip)
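To see which of the package formats in step 5 applies to the current machine, the platform list can be expressed as a lookup table. This is a minimal sketch: the function package_formats_for is hypothetical, and mapping the "aarch64" machine string to the "Linux ARM SBSA and JetPack" row is an assumption.

```python
import platform

# Keys are (platform.system(), platform.machine()) pairs.
# Assumption: "aarch64" corresponds to the Linux ARM SBSA and JetPack row.
PACKAGE_FORMATS = {
    ("Linux", "x86_64"): [
        "Debian local repo (.deb)",
        "RPM local repo (.rpm)",
        "Tar (.tar.gz)",
    ],
    ("Linux", "aarch64"): ["Debian local repo (.deb)", "Tar (.tar.gz)"],
    ("Windows", "AMD64"): ["Zip (.zip)"],
}


def package_formats_for(system=None, machine=None):
    """Return the downloadable package formats for a platform.

    Defaults to the machine running the script; returns an empty list
    for platforms that are not in the list above.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return PACKAGE_FORMATS.get((system, machine), [])


if __name__ == "__main__":
    formats = package_formats_for()
    print(formats or "No downloadable package listed for this platform")
```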

Installation Methods