NVIDIA TensorRT Documentation#
NVIDIA TensorRT is an SDK that facilitates high-performance machine learning inference. It complements training frameworks such as TensorFlow, PyTorch, and MXNet. TensorRT focuses on running an already-trained network quickly and efficiently on NVIDIA hardware.
Quick Start#
🆕 New to NVIDIA TensorRT? → Start with the Quick Start Guide to build and deploy your first optimized inference engine in 30–60 minutes
⬆️ Upgrading from 10.11 or earlier? → See What’s New in 10.12.0 below
🔧 Need help with a specific task? → Jump to the Installing TensorRT or Troubleshooting section
🆕 What’s New in NVIDIA TensorRT 10.12.0#
Latest Release Highlights
MXFP8 Quantization Support - Block quantization across 32 high-precision elements with E8M0 scaling factor for improved model compression
Enhanced Debug Tensor Feature - Mark all unfused tensors as debug tensors without preventing fusion, with support for NumPy, string, and raw data formats
Distributive Independence Determinism - Guarantees identical outputs across the distributive axis when inputs are identical, improving reproducibility
Weak Typing APIs Deprecated - TensorRT is moving exclusively to strong typing; refer to the Strong Typing vs Weak Typing guide for migration details
Refactored Python Samples - New samples with cleaner structure: `1_run_onnx_with_tensorrt` and `2_construct_network_with_layer_apis`
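To make the MXFP8 highlight above concrete, here is a minimal NumPy sketch of the block-quantization idea: each block of 32 high-precision values shares one power-of-two (E8M0-style) scale chosen so the scaled values fit the FP8 E4M3 range. This is an illustration of the numerics only, not the TensorRT API; the function name and rounding choice are assumptions for the example.

```python
import numpy as np

E4M3_MAX = 448.0   # largest finite FP8 E4M3 value
BLOCK = 32         # MXFP8 block size

def mx_block_scales(x):
    """Compute one power-of-two (E8M0-style) scale per 32-element block.

    Illustrative sketch only: the actual MXFP8 encoding and rounding
    are handled inside TensorRT.
    """
    blocks = x.reshape(-1, BLOCK)
    amax = np.abs(blocks).max(axis=1)
    # Smallest power-of-two scale so that amax / scale fits in E4M3 range.
    exp = np.ceil(np.log2(np.maximum(amax, 1e-38) / E4M3_MAX))
    return 2.0 ** exp

x = np.linspace(-1000.0, 1000.0, 64)       # two blocks of 32 values
scales = mx_block_scales(x)
scaled = x.reshape(-1, BLOCK) / scales[:, None]
assert np.abs(scaled).max() <= E4M3_MAX    # every block now fits E4M3
```

Because E8M0 stores only an exponent, the per-block scale is restricted to powers of two, which keeps the metadata overhead at one byte per 32 elements.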
What You’ll Find Here#
🚀 Getting Started - Quick start guide, release notes, and platform support matrix
📦 Installing TensorRT - Installation requirements, prerequisites, and step-by-step setup instructions
🏗️ Architecture - TensorRT design overview, optimization capabilities, and how the inference engine works
🔧 Inference Library - C++ and Python APIs, code samples, and advanced features like quantization and dynamic shapes
⚡ Performance - Best practices for optimization and using trtexec for benchmarking
📚 API - Complete API references for C++, Python, ONNX GraphSurgeon, and Polygraphy tools
📖 Reference - Troubleshooting guides, operator support, command-line tools, and glossary
Previous Releases#
📦 Archived Releases
Earlier TensorRT 10.x releases with key highlights:
10.11.0 Release Notes - Condition-dependent shapes, large tensor support, static libraries deprecation
10.10.0 Release Notes - Enhanced large tensor handling, Blackwell GPU performance improvements
10.9.0 Release Notes - Same compute capability compatibility, AOT compilable Python plugins
10.8.0 Release Notes - Blackwell GPU support, E2M1 FP4 data type, tiling optimization
10.7.0 Release Notes - Nsight Deep Learning Designer support, engine deserialization API
10.6.0 Release Notes - Quickly Deployable Plugins (QDPs), FP8 MHA on Ada GPUs
10.5.0 Release Notes - Linux SBSA Python wheels, Volta support removed
10.4.0 Release Notes - Ubuntu 24.04 support, LLM build time improvements
10.3.0 Release Notes - Cross-platform engine support (experimental), FP8 convolution on Ada
10.2.0 Release Notes - FP8 convolution support, fine-grained refit control
10.1.0 Release Notes - Advanced weight streaming APIs, enhanced device memory management
10.0.1 Release Notes - Weight streaming, INT4 weight-only quantization, IPluginV3 framework
10.0.0 Early Access Release Notes - Initial TensorRT 10.x preview release
📦 Legacy Versions
Note
For complete version history and detailed changelogs, visit the Release Notes section or the TensorRT GitHub Releases.