Core Concepts
TensorRT Workflow
The general TensorRT workflow consists of 3 steps:
- Populate a tensorrt.INetworkDefinition either with a parser or by using the TensorRT Network API (see tensorrt.INetworkDefinition for more details). The tensorrt.Builder can be used to generate an empty tensorrt.INetworkDefinition.
- Use the tensorrt.Builder to build a tensorrt.ICudaEngine using the populated tensorrt.INetworkDefinition.
- Create a tensorrt.IExecutionContext from the tensorrt.ICudaEngine and use it to perform optimized inference.
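The three steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes TensorRT 8.x or later and an ONNX model on disk. The function name build_and_create_context and the onnx_path parameter are hypothetical, and builder method names such as build_serialized_network differ in older releases.

```python
def build_and_create_context(onnx_path):
    """Hypothetical helper: ONNX file -> (engine, execution context)."""
    # Imported inside the function so this file can be loaded on machines
    # without TensorRT installed (an assumption made for this sketch).
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # Step 1: populate an INetworkDefinition, here with the ONNX parser.
    # The EXPLICIT_BATCH flag is required by the ONNX parser in TensorRT 8.x
    # and is deprecated (ignored) in later releases.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parsing failed: " + "; ".join(errors))

    # Step 2: use the Builder to produce an engine from the populated network.
    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed; see the logger output.")
    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(serialized)

    # Step 3: create an IExecutionContext for running inference.
    context = engine.create_execution_context()
    return engine, context
```

Running inference with the returned context additionally requires device buffers for the inputs and outputs, which are outside the scope of this sketch.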
Classes Overview
Logger
Most other TensorRT classes use a logger to report errors, warnings and informative messages. TensorRT provides a basic tensorrt.Logger implementation, but you can write your own implementation by deriving from tensorrt.ILogger for more advanced functionality.
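As an illustration of deriving from tensorrt.ILogger, the sketch below collects messages in a list instead of printing them. The names make_collecting_logger, CollectingLogger and records are hypothetical; the sketch assumes the Python pattern of calling the base-class initializer explicitly and overriding log(severity, msg).

```python
def make_collecting_logger():
    """Hypothetical factory returning a logger that stores messages in memory."""
    # Imported inside the function so this file can be loaded on machines
    # without TensorRT installed (an assumption made for this sketch).
    import tensorrt as trt

    class CollectingLogger(trt.ILogger):
        def __init__(self):
            # The base initializer must be called explicitly.
            trt.ILogger.__init__(self)
            self.records = []

        def log(self, severity, msg):
            # Called by TensorRT for every message; keep the severity and text.
            self.records.append((severity, msg))

    return CollectingLogger()
```

The returned object can be passed wherever a logger is expected, for example trt.Builder(make_collecting_logger()).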
Parsers
Parsers are used to populate a tensorrt.INetworkDefinition from a model trained in a Deep Learning framework.
Network
The tensorrt.INetworkDefinition represents a computational graph. In order to populate the network, TensorRT provides a suite of parsers for a variety of Deep Learning frameworks. It is also possible to populate the network manually using the Network API.
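Populating a network manually with the Network API might look like the following minimal sketch, which builds a graph containing a single ReLU activation. The function name populate_relu_network, the tensor names and the input shape are hypothetical choices for illustration; the network argument is assumed to be an empty tensorrt.INetworkDefinition obtained from a tensorrt.Builder.

```python
def populate_relu_network(network):
    """Hypothetical helper: add input -> ReLU -> output to an empty network."""
    # Imported inside the function so this file can be loaded on machines
    # without TensorRT installed (an assumption made for this sketch).
    import tensorrt as trt

    # Declare a network input; the name and shape are illustrative only.
    inp = network.add_input(name="input", dtype=trt.float32, shape=(1, 3, 224, 224))

    # Add a ReLU activation layer that consumes the input tensor.
    relu = network.add_activation(input=inp, type=trt.ActivationType.RELU)

    # Name the layer's result and mark it as an output of the network.
    out = relu.get_output(0)
    out.name = "output"
    network.mark_output(out)
    return network
```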
Builder
The tensorrt.Builder is used to build a tensorrt.ICudaEngine. In order to do so, it must be provided a populated tensorrt.INetworkDefinition.
Engine and Context
The tensorrt.ICudaEngine is the output of the TensorRT optimizer. It is used to generate a tensorrt.IExecutionContext that can perform inference.