NVIDIA TensorRT

Documentation Home

Advanced Topics

This section covers advanced TensorRT features and configuration options.

- Version Compatibility
- Refitting an Engine
- Algorithm Selection and Reproducible Builds
- I/O Formats
- Engine Inspector
- Weight Streaming
- Cross-Platform Compatibility
- Tiling Optimization
- Multi-Device Inference (Preview Feature)


Copyright © 2021-2026, NVIDIA Corporation.

Last updated on Apr 07, 2026.