NVIDIA TensorRT

  • Documentation Home


Polygraphy API

Polygraphy is a toolkit designed to assist in running and debugging deep learning models in various frameworks.

  • Interface: Polygraphy API
  • GitHub: TensorRT > tools > Polygraphy

See also

  • Troubleshooting: uses Polygraphy for model debugging and accuracy validation.
  • Optimizing Performance: references Polygraphy for benchmarking workflows.
  • ONNX GraphSurgeon API: companion tool for modifying ONNX models before optimization.



Copyright © 2021-2026, NVIDIA Corporation.

Last updated on Apr 07, 2026.