Core Concepts¶
TensorRT-RTX Workflow¶
The general TensorRT-RTX workflow consists of three steps:
1. Populate a tensorrt-rtx.INetworkDefinition, either with a parser or by using the TensorRT Network API (see tensorrt-rtx.INetworkDefinition for more details). The tensorrt-rtx.Builder can be used to generate an empty tensorrt-rtx.INetworkDefinition.
2. Use the tensorrt-rtx.Builder to build a tensorrt-rtx.ICudaEngine from the populated tensorrt-rtx.INetworkDefinition.
3. Create a tensorrt-rtx.IExecutionContext from the tensorrt-rtx.ICudaEngine and use it to perform optimized inference.
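The three steps can be sketched roughly as follows. This is a minimal illustration, not a reference implementation. It assumes the package imports as `tensorrt_rtx` (the hyphenated name used on this page is not a valid Python identifier) and that it mirrors the TensorRT Python API for `create_network`, `add_input`, `add_activation`, `build_serialized_network`, and `Runtime.deserialize_cuda_engine`. The function name, tensor name, and shape are hypothetical.

```python
def build_relu_context():
    """Sketch of the three workflow steps for a one-layer ReLU network."""
    # Assumption: the Python package is importable under this name.
    import tensorrt_rtx as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # Step 1: ask the builder for an empty network, then populate it
    # manually with the Network API (one input, one ReLU, one output).
    network = builder.create_network(0)
    x = network.add_input("x", trt.float32, (1, 3, 8, 8))
    relu = network.add_activation(x, trt.ActivationType.RELU)
    network.mark_output(relu.get_output(0))

    # Step 2: build the engine from the populated network.
    # Assumption: the build goes through a serialized plan plus a Runtime,
    # as in the TensorRT Python API.
    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed; check the logger output")
    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(serialized)
    if engine is None:
        raise RuntimeError("engine deserialization failed")

    # Step 3: create an execution context for inference.
    context = engine.create_execution_context()
    # Return the engine too, so it stays referenced while the context is used.
    return engine, context
```

The import sits inside the function so that the module defining it can be loaded on a machine without the library installed. Running inference with the returned context additionally requires device buffers, which this sketch leaves out.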
Classes Overview¶
Logger¶
Most other TensorRT-RTX classes use a logger to report errors, warnings, and informative messages. TensorRT-RTX provides a basic tensorrt-rtx.Logger implementation, but you can write your own by deriving from tensorrt-rtx.ILogger for more advanced functionality.
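As an illustration of deriving from the logger interface, the sketch below collects messages in a list instead of printing them. It assumes the package imports as `tensorrt_rtx` and that, as in the TensorRT Python API, a subclass must call the base `__init__` explicitly and override `log(severity, msg)`. The names `make_collecting_logger` and `CollectingLogger` are hypothetical.

```python
def make_collecting_logger():
    """Return a logger that stores every message it receives."""
    # Assumption: the Python package is importable under this name.
    import tensorrt_rtx as trt

    class CollectingLogger(trt.ILogger):
        def __init__(self):
            # Initialize the base class explicitly before adding state.
            trt.ILogger.__init__(self)
            self.messages = []

        def log(self, severity, msg):
            # Keep the severity alongside the text so callers can filter later.
            self.messages.append((severity, msg))

    return CollectingLogger()
```

The returned object can be passed wherever the built-in logger would be used, for example when constructing the builder, and `logger.messages` can be inspected afterwards.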
Parsers¶
Parsers are used to populate a tensorrt-rtx.INetworkDefinition from a model trained in a deep learning framework.
Network¶
The tensorrt-rtx.INetworkDefinition represents a computational graph. To populate the network, TensorRT-RTX provides an ONNX parser. It is also possible to populate the network manually using the Network API.
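Populating a network with the ONNX parser might look like the sketch below. It assumes the package imports as `tensorrt_rtx` and exposes `OnnxParser` with `parse`, `num_errors`, and `get_error` as the TensorRT Python API does. The function name and the `onnx_path` parameter are hypothetical.

```python
def populate_from_onnx(onnx_path):
    """Parse an ONNX file into a freshly created network definition."""
    # Assumption: the Python package is importable under this name.
    import tensorrt_rtx as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # empty network from the builder

    # The parser fills `network` in place from the serialized ONNX model.
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        ok = parser.parse(f.read())

    if not ok:
        # Gather the parser's error records into one readable message.
        errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
        raise RuntimeError("ONNX parsing failed: " + "; ".join(errors))

    # Return builder and parser as well, so both stay referenced for as
    # long as the caller holds the network.
    return builder, network, parser
```

The returned network can then be handed to the builder in the same way as a manually populated one.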
Builder¶
The tensorrt-rtx.Builder is used to build a tensorrt-rtx.ICudaEngine. To do so, it must be provided with a populated tensorrt-rtx.INetworkDefinition.
Engine and Context¶
The tensorrt-rtx.ICudaEngine is the output of the TensorRT-RTX optimizer. It is used to create a tensorrt-rtx.IExecutionContext that can perform inference.