tensorrt.utils

The utils package handles much of the boilerplate needed to create a TensorRT engine.
Parsing Utilities

Caffe Model to TensorRT Engine
tensorrt.utils.caffe_to_trt_engine(logger, deploy_file, model_file, max_batch_size, max_workspace_size, output_layers, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a Caffe model and creates an engine for inference.

Takes a Caffe prototxt and caffemodel file, the name(s) of the output layer(s), and engine settings, and creates an engine that can be used for inference.
Parameters:
- logger (tensorrt.infer.Logger) – A logger to monitor the progress of building the engine
- deploy_file (str) – Path to the Caffe prototxt file
- model_file (str) – Path to the Caffe caffemodel file
- max_batch_size (int) – Maximum batch size allowed for the engine
- max_workspace_size (int) – Maximum workspace size (maxWorkspaceSize) available to the engine
- output_layers ([str]) – List of output layer names
- datatype (tensorrt.infer.DataType) – Operating data type of the engine; can be FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
- plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
- calibrator (INT8 calibrator) – Currently unsupported in Python. Default: None
Returns: An engine that can be used to execute inference
Return type: tensorrt.infer.CudaEngine
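A minimal usage sketch, assuming CUDA-capable hardware and the legacy TensorRT Python bindings are available; the model paths and the "prob" output layer name are placeholders:

```python
import tensorrt as trt

# A console logger is required to monitor engine construction.
G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Build an FP32 engine from a Caffe deploy/caffemodel pair.
engine = trt.utils.caffe_to_trt_engine(
    G_LOGGER,
    "deploy.prototxt",         # deploy_file (placeholder path)
    "model.caffemodel",        # model_file (placeholder path)
    1,                         # max_batch_size
    1 << 20,                   # max_workspace_size (1 MiB)
    ["prob"],                  # output_layers (placeholder name)
    trt.infer.DataType.FLOAT)  # datatype
```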
UFF Model Stream to TensorRT Engine

tensorrt.utils.uff_to_trt_engine(logger, stream, parser, max_batch_size, max_workspace_size, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a UFF model stream and generates an engine.

Takes a UFF stream (created with a UFF exporter) and generates a TensorRT engine that can then be saved or executed.
Parameters:
- logger (tensorrt.infer.Logger) – Logging system for the application
- stream (str [Python 2] / bytes [Python 3]) – Serialized UFF graph
- parser (tensorrt.parsers.uffparser.UffParser) – UFF parser
- max_batch_size (int) – Maximum batch size
- max_workspace_size (int) – Maximum workspace size
- datatype (tensorrt.infer.DataType) – Operating data type of the engine; can be FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
- plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
- calibrator (tensorrt.infer.Int8Calibrator) – Currently unsupported. Default: None
Returns: A TensorRT engine that can be saved or executed
Return type: tensorrt.infer.CudaEngine
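A sketch of the typical flow, assuming a frozen TensorFlow graph converted with the separate uff package; the file path, input/output tensor names, and input shape are placeholders:

```python
import uff
import tensorrt as trt
from tensorrt.parsers import uffparser

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Serialize a frozen TensorFlow model to an in-memory UFF stream.
uff_model = uff.from_tensorflow_frozen_model("frozen.pb", ["fc2/Relu"])

# Describe the network inputs and outputs to the parser.
parser = uffparser.create_uff_parser()
parser.register_input("Placeholder", (1, 28, 28), 0)
parser.register_output("fc2/Relu")

engine = trt.utils.uff_to_trt_engine(
    G_LOGGER, uff_model, parser,
    1,        # max_batch_size
    1 << 20)  # max_workspace_size
```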
UFF File to TensorRT Engine

tensorrt.utils.uff_file_to_trt_engine(logger, uff_file, parser, max_batch_size, max_workspace_size, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a UFF file and generates an engine.

Takes a UFF file (created with a UFF exporter) and generates a TensorRT engine that can then be saved or executed.
Parameters:
- logger (tensorrt.infer.Logger) – Logging system for the application
- uff_file (str) – Path to the UFF file
- parser (tensorrt.parsers.uffparser.UffParser) – UFF parser
- max_batch_size (int) – Maximum batch size
- max_workspace_size (int) – Maximum workspace size
- datatype (tensorrt.infer.DataType) – Operating data type of the engine; can be FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
- plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
- calibrator (tensorrt.infer.Int8Calibrator) – Currently unsupported. Default: None
Returns: A TensorRT engine that can be saved or executed
Return type: tensorrt.infer.CudaEngine
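This works like uff_to_trt_engine but reads the UFF graph from disk instead of memory. A sketch, with placeholder file path, tensor names, and input shape:

```python
import tensorrt as trt
from tensorrt.parsers import uffparser

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Register the placeholder input shape and output node with the parser.
parser = uffparser.create_uff_parser()
parser.register_input("Placeholder", (1, 28, 28), 0)
parser.register_output("fc2/Relu")

# Build directly from a .uff file on disk.
engine = trt.utils.uff_file_to_trt_engine(
    G_LOGGER, "model.uff", parser,
    1,        # max_batch_size
    1 << 20)  # max_workspace_size
```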
Saving and Loading Models

Load Engine from File
tensorrt.utils.load_engine(logger, filepath, plugins=None)

Loads a saved engine file.

Creates an engine from a file containing a serialized engine.
Parameters:
- logger (tensorrt.infer.Logger) – A logger to monitor the progress of building the engine
- filepath (str) – Path to the engine file
- plugins (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
Returns: An engine that can be used to execute inference
Return type: tensorrt.infer.CudaEngine
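A short sketch of loading a previously saved engine; the file path is a placeholder:

```python
import tensorrt as trt

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Deserialize an engine saved earlier with write_engine_to_file.
engine = trt.utils.load_engine(G_LOGGER, "model.engine")
```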
Save Engine to File

tensorrt.utils.write_engine_to_file(filepath, engine)

Writes an engine to a file.

Takes a serialized engine and writes it to a file so it can be loaded later.
Parameters: - filepath str (-) – Path to engine file
- engine tensorrt.infer.CudaEngine (-) – An engine that can be used to execute inference
Returns: Whether the file was written or not
Return type: bool
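A sketch of saving an engine built earlier (e.g. with caffe_to_trt_engine); note the description above says the function takes a serialized engine, so the result of engine.serialize() is passed rather than the engine object itself. The path is a placeholder:

```python
import tensorrt as trt

# `engine` is assumed to be a tensorrt.infer.CudaEngine built earlier.
ok = trt.utils.write_engine_to_file("model.engine", engine.serialize())
# `ok` is a bool: whether the file was written successfully.
```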
Load Weights from File

tensorrt.utils.load_weights(filepath)

Loads model weights from a file.

Loads weights from a .wts file into a dictionary mapping layer names to the associated weights, encoded as tensorrt.infer.Weights objects.
Parameters: filepath (str) – Path to the weights file
Returns: Dictionary of layer names and associated weights
Return type: dict {str, tensorrt.infer.Weights}
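A short sketch of inspecting a loaded weights file; the .wts path is a placeholder:

```python
import tensorrt as trt

# Returns a dict mapping layer names to tensorrt.infer.Weights objects.
weights = trt.utils.load_weights("model.wts")
for layer_name in weights:
    print(layer_name)
```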