Core Concepts
TensorRT-RTX Workflow
The general TensorRT-RTX workflow consists of three steps:
1. Populate a tensorrt_rtx.INetworkDefinition, either with a parser or by using the TensorRT-RTX Network API (see tensorrt_rtx.INetworkDefinition for more details). The tensorrt_rtx.Builder can be used to generate an empty tensorrt_rtx.INetworkDefinition.
2. Use the tensorrt_rtx.Builder to build a tensorrt_rtx.ICudaEngine using the populated tensorrt_rtx.INetworkDefinition.
3. Create a tensorrt_rtx.IExecutionContext from the tensorrt_rtx.ICudaEngine and use it to perform optimized inference, as shown in the sketch below.
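The sketch below walks through all three steps for an ONNX model. It assumes the tensorrt_rtx Python API mirrors TensorRT's (OnnxParser, build_serialized_network, Runtime, and deserialize_cuda_engine are all assumptions, as is the model path):

```python
import tensorrt_rtx as trt_rtx

logger = trt_rtx.Logger(trt_rtx.Logger.WARNING)

# Step 1: populate an INetworkDefinition, here via the ONNX parser.
builder = trt_rtx.Builder(logger)
network = builder.create_network()
parser = trt_rtx.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # hypothetical model path
    if not parser.parse(f.read()):
        raise RuntimeError("ONNX parsing failed")

# Step 2: build an ICudaEngine from the populated network.
config = builder.create_builder_config()
serialized_engine = builder.build_serialized_network(network, config)
runtime = trt_rtx.Runtime(logger)
engine = runtime.deserialize_cuda_engine(serialized_engine)

# Step 3: create an IExecutionContext for optimized inference.
context = engine.create_execution_context()
# ... bind device buffers and launch inference (see Engine and Context below) ...
```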
Classes Overview
Logger
Most other TensorRT-RTX classes use a logger to report errors, warnings, and informational messages. TensorRT-RTX provides a basic tensorrt_rtx.Logger implementation, but for more advanced functionality you can write your own by deriving from tensorrt_rtx.ILogger, as in the sketch below.
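A minimal sketch of a custom logger, assuming tensorrt_rtx.ILogger follows TensorRT's interface (a log(severity, msg) callback and a nested Severity enum):

```python
import tensorrt_rtx as trt_rtx

class FilteringLogger(trt_rtx.ILogger):
    """Drops VERBOSE messages and prefixes everything else."""

    def __init__(self):
        trt_rtx.ILogger.__init__(self)  # base-class init is required when subclassing

    def log(self, severity, msg):
        if severity != trt_rtx.ILogger.Severity.VERBOSE:
            print(f"[TensorRT-RTX] {severity}: {msg}")

logger = FilteringLogger()  # can be passed wherever a logger is expected
```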
Parsers
Parsers are used to populate a tensorrt_rtx.INetworkDefinition from a model trained in a Deep Learning framework.
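For example, populating a network from an ONNX model; the OnnxParser class and its num_errors/get_error diagnostics are assumed to match TensorRT's ONNX parser:

```python
import tensorrt_rtx as trt_rtx

logger = trt_rtx.Logger(trt_rtx.Logger.WARNING)
builder = trt_rtx.Builder(logger)
network = builder.create_network()
parser = trt_rtx.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # hypothetical model file
    ok = parser.parse(f.read())

if not ok:
    # Surface every parser diagnostic before giving up.
    for i in range(parser.num_errors):
        print(parser.get_error(i))
    raise RuntimeError("failed to parse model.onnx")
```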
Network
The tensorrt_rtx.INetworkDefinition represents a computational graph. To populate the network, TensorRT-RTX provides a suite of parsers for a variety of Deep Learning frameworks. It is also possible to populate the network manually using the Network API.
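As a sketch of the manual route, the Network API can define a graph layer by layer; add_input, add_activation, and the enum names below mirror TensorRT's network API and are assumptions for tensorrt_rtx:

```python
import tensorrt_rtx as trt_rtx

logger = trt_rtx.Logger(trt_rtx.Logger.WARNING)
builder = trt_rtx.Builder(logger)
network = builder.create_network()

# A one-layer graph: input -> ReLU -> output.
x = network.add_input("input", trt_rtx.float32, (1, 3, 224, 224))
relu = network.add_activation(x, trt_rtx.ActivationType.RELU)
relu.get_output(0).name = "output"
network.mark_output(relu.get_output(0))
```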
Builder
The tensorrt_rtx.Builder is used to build a tensorrt_rtx.ICudaEngine. To do so, it must be provided with a populated tensorrt_rtx.INetworkDefinition.
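A sketch of the build step, assuming a build_serialized_network entry point as in TensorRT (the output path is hypothetical):

```python
import tensorrt_rtx as trt_rtx

logger = trt_rtx.Logger(trt_rtx.Logger.WARNING)
builder = trt_rtx.Builder(logger)
network = builder.create_network()
# ... populate `network` with a parser or the Network API ...

config = builder.create_builder_config()
serialized_engine = builder.build_serialized_network(network, config)

# Persist the engine so later runs can skip the build step.
with open("model.engine", "wb") as f:  # hypothetical output path
    f.write(serialized_engine)
```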
Engine and Context
The tensorrt_rtx.ICudaEngine is the output of the TensorRT-RTX optimizer. It is used to generate a tensorrt_rtx.IExecutionContext that can perform inference.
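A sketch of this final step, deserializing an engine and preparing an execution context; the name-based binding calls (set_tensor_address, execute_async_v3) mirror modern TensorRT and are assumptions here:

```python
import tensorrt_rtx as trt_rtx

logger = trt_rtx.Logger(trt_rtx.Logger.WARNING)
runtime = trt_rtx.Runtime(logger)

with open("model.engine", "rb") as f:  # hypothetical engine file
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Inference binds device buffers by tensor name and launches on a CUDA
# stream; buffer allocation (e.g. with cuda-python) is omitted here.
# context.set_tensor_address("input", input_device_ptr)
# context.set_tensor_address("output", output_device_ptr)
# context.execute_async_v3(stream_handle)
```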