Core Concepts

TensorRT-RTX Workflow

The general TensorRT-RTX workflow consists of three steps, sketched end to end in the code example after this list:

  1. Populate a tensorrt-rtx.INetworkDefinition either with a parser or by using the TensorRT-RTX Network API (see tensorrt-rtx.INetworkDefinition for more details). The tensorrt-rtx.Builder can be used to generate an empty tensorrt-rtx.INetworkDefinition.

  2. Use the tensorrt-rtx.Builder to build a tensorrt-rtx.ICudaEngine using the populated tensorrt-rtx.INetworkDefinition.

  3. Create a tensorrt-rtx.IExecutionContext from the tensorrt-rtx.ICudaEngine and use it to perform optimized inference.

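A minimal end-to-end sketch of these three steps, assuming the Python bindings are imported as tensorrt_rtx and follow the standard TensorRT Python API (the model path is a placeholder):

    import tensorrt_rtx as trt  # module name assumed

    logger = trt.Logger(trt.Logger.WARNING)

    # Step 1: populate an INetworkDefinition from an ONNX model.
    builder = trt.Builder(logger)
    network = builder.create_network()
    parser = trt.OnnxParser(network, logger)
    if not parser.parse_from_file("model.onnx"):  # placeholder path
        raise RuntimeError("Failed to parse the ONNX model")

    # Step 2: build a serialized engine from the populated network.
    config = builder.create_builder_config()
    serialized_engine = builder.build_serialized_network(network, config)

    # Step 3: deserialize the engine and create an execution context.
    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(serialized_engine)
    context = engine.create_execution_context()
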
Classes Overview

Logger

Most other TensorRT-RTX classes use a logger to report errors, warnings and informative messages. TensorRT-RTX provides a basic tensorrt-rtx.Logger implementation, but you can write your own implementation by deriving from tensorrt-rtx.ILogger for more advanced functionality.
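For example, a custom logger can be written by subclassing tensorrt-rtx.ILogger and overriding its log method. A minimal sketch, assuming the Python bindings are imported as tensorrt_rtx:

    import tensorrt_rtx as trt  # module name assumed

    class MyLogger(trt.ILogger):
        def __init__(self):
            trt.ILogger.__init__(self)  # the base-class initializer must be called

        def log(self, severity, msg):
            # Filter or redirect messages as needed; here everything is printed.
            print(f"[{severity}] {msg}")

    logger = MyLogger()
    builder = trt.Builder(logger)  # pass the logger to TensorRT-RTX entry points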

Parsers

Parsers are used to populate a tensorrt-rtx.INetworkDefinition from a model trained in a deep learning framework.
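A hedged sketch of parsing an ONNX model into a network definition and inspecting parse errors, assuming the parser mirrors the standard TensorRT OnnxParser API (the model path is a placeholder):

    import tensorrt_rtx as trt  # module name assumed

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network()

    # The parser writes layers and tensors directly into the network definition.
    parser = trt.OnnxParser(network, logger)
    if not parser.parse_from_file("model.onnx"):  # placeholder path
        for i in range(parser.num_errors):
            print(parser.get_error(i))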

Network

The tensorrt-rtx.INetworkDefinition represents a computational graph. In order to populate the network, TensorRT-RTX provides an ONNX parser. It is also possible to populate the network manually using the Network API.
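A minimal sketch of populating a network manually with the Network API, assuming it mirrors the standard TensorRT layer-building calls (the tensor name and shape are placeholders):

    import tensorrt_rtx as trt  # module name assumed

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network()

    # Declare a network input, add a single activation layer, and mark its output.
    x = network.add_input("x", trt.float32, (1, 3, 224, 224))  # placeholder shape
    relu = network.add_activation(x, trt.ActivationType.RELU)
    network.mark_output(relu.get_output(0))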

Builder

The tensorrt-rtx.Builder is used to build a tensorrt-rtx.ICudaEngine. In order to do so, it must be provided with a populated tensorrt-rtx.INetworkDefinition.
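A hedged sketch of the build step, assuming a build_serialized_network call as in the standard TensorRT Python API (the output file name is a placeholder):

    import tensorrt_rtx as trt  # module name assumed

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network()
    # ... populate `network` with a parser or the Network API (see above) ...

    config = builder.create_builder_config()
    serialized_engine = builder.build_serialized_network(network, config)

    # The serialized engine can be written to disk and reloaded later.
    with open("model.engine", "wb") as f:  # placeholder file name
        f.write(serialized_engine)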

Engine and Context

The tensorrt-rtx.ICudaEngine is the output of the TensorRT-RTX optimizer. It is used to generate a tensorrt-rtx.IExecutionContext that can perform inference.
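A hedged sketch of deserializing an engine and preparing it for inference, assuming the TensorRT 10-style execution API (set_tensor_address / execute_async_v3) and that device buffers and a CUDA stream are managed elsewhere (the file name is a placeholder):

    import tensorrt_rtx as trt  # module name assumed

    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)

    with open("model.engine", "rb") as f:  # placeholder file name
        engine = runtime.deserialize_cuda_engine(f.read())

    context = engine.create_execution_context()

    # Bind a device pointer to each I/O tensor, then launch inference on a stream.
    # `device_buffers` (tensor name -> device address) and `stream_handle` are
    # hypothetical and assumed to be allocated elsewhere (e.g. with cuda-python).
    # for i in range(engine.num_io_tensors):
    #     name = engine.get_tensor_name(i)
    #     context.set_tensor_address(name, device_buffers[name])
    # context.execute_async_v3(stream_handle)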