Core Concepts

TensorRT-RTX Workflow

The general TensorRT-RTX workflow consists of three steps, sketched end to end in the example after this list:

  1. Populate a tensorrt_rtx.INetworkDefinition either with a parser or by using the TensorRT-RTX Network API (see tensorrt_rtx.INetworkDefinition for more details). The tensorrt_rtx.Builder can be used to generate an empty tensorrt_rtx.INetworkDefinition.

  2. Use the tensorrt_rtx.Builder to build a tensorrt_rtx.ICudaEngine using the populated tensorrt_rtx.INetworkDefinition.

  3. Create a tensorrt_rtx.IExecutionContext from the tensorrt_rtx.ICudaEngine and use it to perform optimized inference.
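
A minimal end-to-end sketch of these steps, using the ONNX parser path, is shown below. It assumes that the tensorrt_rtx Python module mirrors the standard TensorRT Python API (tensorrt_rtx.OnnxParser, create_builder_config, build_serialized_network, tensorrt_rtx.Runtime, deserialize_cuda_engine) and uses "model.onnx" as a placeholder path:

    import tensorrt_rtx as trt_rtx

    logger = trt_rtx.Logger(trt_rtx.Logger.WARNING)

    # Step 1: populate an INetworkDefinition, here via the ONNX parser.
    builder = trt_rtx.Builder(logger)
    network = builder.create_network()
    parser = trt_rtx.OnnxParser(network, logger)
    if not parser.parse_from_file("model.onnx"):  # placeholder path
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

    # Step 2: build a serialized engine from the populated network.
    config = builder.create_builder_config()
    serialized_engine = builder.build_serialized_network(network, config)

    # Step 3: deserialize the engine and create an execution context for inference.
    runtime = trt_rtx.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(serialized_engine)
    context = engine.create_execution_context()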

Classes Overview

Logger

Most other TensorRT-RTX classes use a logger to report errors, warnings, and informative messages. TensorRT-RTX provides a basic tensorrt_rtx.Logger implementation, but for more advanced functionality you can write your own by deriving from tensorrt_rtx.ILogger.
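
For example, a minimal custom logger might look like the sketch below; it assumes tensorrt_rtx.ILogger exposes the same log(severity, msg) callback as the standard TensorRT ILogger interface:

    import tensorrt_rtx as trt_rtx

    # Sketch of a custom logger; assumes tensorrt_rtx.ILogger mirrors the
    # standard TensorRT ILogger interface (a single log(severity, msg) callback).
    class MyLogger(trt_rtx.ILogger):
        def __init__(self):
            trt_rtx.ILogger.__init__(self)

        def log(self, severity, msg):
            # Route messages to your own logging system; here they are simply printed.
            print(f"[{severity}] {msg}")

    logger = MyLogger()
    builder = trt_rtx.Builder(logger)  # usable anywhere a logger is expected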

Parsers

Parsers are used to populate a tensorrt_rtx.INetworkDefinition from a model trained in a Deep Learning framework.

Network

The tensorrt_rtx.INetworkDefinition represents a computational graph. To populate the network, TensorRT-RTX provides an ONNX parser; the network can also be populated manually using the Network API.
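
A sketch of manual population is shown below; it assumes the Network API mirrors the standard TensorRT Network API (add_input, add_activation, mark_output), and the tensor names and shapes are illustrative only:

    import tensorrt_rtx as trt_rtx

    # Sketch: build a trivial single-layer (ReLU) network by hand.
    # Assumes the Network API mirrors TensorRT; names and shapes are illustrative.
    logger = trt_rtx.Logger(trt_rtx.Logger.WARNING)
    builder = trt_rtx.Builder(logger)
    network = builder.create_network()

    input_tensor = network.add_input(name="input", dtype=trt_rtx.float32, shape=(1, 3, 224, 224))
    relu = network.add_activation(input=input_tensor, type=trt_rtx.ActivationType.RELU)
    relu.get_output(0).name = "output"
    network.mark_output(relu.get_output(0))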

Builder

The tensorrt_rtx.Builder is used to build a tensorrt_rtx.ICudaEngine. To do so, it must be provided with a populated tensorrt_rtx.INetworkDefinition.
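
Continuing from the sketch in the Network section above (which defines builder and network), the build step might look like the following; it assumes create_builder_config and build_serialized_network behave as in the standard TensorRT Python API, and "model.engine" is a placeholder file name:

    # Sketch: build a serialized engine from the already populated `network` and
    # persist it to disk so it can be deserialized later without rebuilding.
    config = builder.create_builder_config()
    serialized_engine = builder.build_serialized_network(network, config)
    if serialized_engine is None:
        raise RuntimeError("engine build failed")

    with open("model.engine", "wb") as f:  # placeholder file name
        f.write(serialized_engine)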

Engine and Context

The tensorrt_rtx.ICudaEngine is the output of the TensorRT-RTX optimizer. It is used to generate a tensorrt_rtx.IExecutionContext that can perform inference.
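
A sketch of running inference with an execution context is shown below, continuing from the workflow sketch above (which deserializes engine). It assumes the tensor-address execution API of recent TensorRT releases (set_tensor_address, execute_async_v3) is available, that the engine has one input named "input" and one output named "output" (illustrative names and shapes), and it uses PyTorch only as a convenient way to allocate GPU buffers and a CUDA stream:

    import torch  # used here only for GPU buffer and stream allocation

    # Sketch of inference; assumes `engine` is an already deserialized
    # tensorrt_rtx.ICudaEngine and that the tensor-address execution API
    # (set_tensor_address, execute_async_v3) mirrors recent TensorRT releases.
    context = engine.create_execution_context()

    input_buf = torch.randn(1, 3, 224, 224, device="cuda")   # illustrative shape
    output_buf = torch.empty(1, 3, 224, 224, device="cuda")  # illustrative shape

    context.set_tensor_address("input", input_buf.data_ptr())    # illustrative tensor name
    context.set_tensor_address("output", output_buf.data_ptr())  # illustrative tensor name

    stream = torch.cuda.Stream()
    context.execute_async_v3(stream.cuda_stream)
    stream.synchronize()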