Core Concepts¶
TensorRT Workflow¶
The general TensorRT workflow consists of 3 steps:
1. Populate a tensorrt.INetworkDefinition either with a parser or by using the TensorRT Network API (see tensorrt.INetworkDefinition for more details). The tensorrt.Builder can be used to generate an empty tensorrt.INetworkDefinition.
2. Use the tensorrt.Builder to build a tensorrt.ICudaEngine using the populated tensorrt.INetworkDefinition.
3. Create a tensorrt.IExecutionContext from the tensorrt.ICudaEngine and use it to perform optimized inference.
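The three steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation. It assumes TensorRT 8 or later, where the engine is produced through `build_serialized_network` and then deserialized by a `tensorrt.Runtime`, and it assumes the model is available as an ONNX file. The function name `build_engine_from_onnx` and its `onnx_path` argument are hypothetical.

```python
def build_engine_from_onnx(onnx_path):
    """Parse an ONNX file and return (engine, context). Illustrative only."""
    # Imported lazily so this module can be loaded where TensorRT is absent.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # Step 1: create an empty network and populate it with a parser.
    # Assumption: explicit-batch networks, as expected by the ONNX parser.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    # Step 2: build an engine from the populated network.
    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed")
    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(serialized)

    # Step 3: create an execution context used to run inference.
    context = engine.create_execution_context()
    return engine, context
```

Older releases build the engine directly from the builder instead of going through a serialized plan, so the calls in step 2 may need adjusting for the installed version.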
Classes Overview¶
Logger¶
Most other TensorRT classes use a logger to report errors, warnings and informative messages. TensorRT provides a basic tensorrt.Logger implementation, but you can write your own implementation by deriving from tensorrt.ILogger for more advanced functionality.
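As a rough sketch of deriving from tensorrt.ILogger, the example below collects messages in a list instead of printing them. The names `make_collecting_logger` and `CollectingLogger` are hypothetical, and the class is defined inside a factory function only so that the TensorRT import can stay lazy.

```python
def make_collecting_logger():
    """Return a logger that stores (severity, message) pairs in a list."""
    # Imported lazily so this module can be loaded where TensorRT is absent.
    import tensorrt as trt

    class CollectingLogger(trt.ILogger):
        def __init__(self):
            # The base-class initializer has to be called explicitly.
            trt.ILogger.__init__(self)
            self.messages = []

        def log(self, severity, msg):
            # TensorRT calls this method for every message it reports.
            self.messages.append((severity, msg))

    return CollectingLogger()
```

The returned object can be passed wherever a logger is expected, for example `trt.Builder(make_collecting_logger())`. The logger should be kept alive for as long as the objects created with it are in use.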
Parsers¶
Parsers are used to populate a tensorrt.INetworkDefinition from a model trained in a Deep Learning framework.
Network¶
The tensorrt.INetworkDefinition represents a computational graph. In order to populate the network, TensorRT provides a suite of parsers for a variety of Deep Learning frameworks. It is also possible to populate the network manually using the Network API.
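To illustrate populating a network manually, here is a minimal sketch that builds a one-layer graph (a single ReLU) with the Network API. The function name `build_relu_network`, the tensor names, and the input shape are all assumptions chosen for illustration.

```python
def build_relu_network(builder):
    """Create a network with one input, one ReLU layer, and one output."""
    # Imported lazily so this module can be loaded where TensorRT is absent.
    import tensorrt as trt

    # Assumption: an explicit-batch network.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)

    # Declare the graph input: name, data type, shape (hypothetical values).
    x = network.add_input("input", trt.float32, (1, 3, 8, 8))

    # Add a layer and take its first output tensor.
    relu = network.add_activation(x, trt.ActivationType.RELU)
    y = relu.get_output(0)
    y.name = "output"

    # Mark the tensor as a network output so the builder keeps it.
    network.mark_output(y)
    return network
```

A real model would chain many more layers in the same way. For anything beyond small graphs, a parser is usually the more practical route.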
Builder¶
The tensorrt.Builder is used to build a tensorrt.ICudaEngine. To do so, it must be given a populated tensorrt.INetworkDefinition.
Engine and Context¶
The tensorrt.ICudaEngine is the output of the TensorRT optimizer. It is used to generate a tensorrt.IExecutionContext that can perform inference.