tensorrt.utils

The utils package handles much of the boilerplate needed to create a TensorRT engine.

Parsing Utilities

Caffe Model to TensorRT Engine

tensorrt.utils.caffe_to_trt_engine(logger, deploy_file, model_file, max_batch_size, max_workspace_size, output_layers, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a Caffe model and creates an engine for inference

Takes a Caffe model prototxt and caffemodel, the name(s) of the output layer(s), and engine settings to create an engine that can be used for inference

Parameters:
  • logger (tensorrt.infer.Logger) – A logger is needed to monitor the progress of building the engine
  • deploy_file (str) – Path to the Caffe model prototxt file
  • model_file (str) – Path to the Caffe caffemodel file
  • max_batch_size (int) – Maximum batch size allowed for the engine
  • max_workspace_size (int) – Maximum workspace size, in bytes, available to the engine at build time
  • output_layers ([str]) – List of output layer names
  • datatype (tensorrt.infer.DataType) – Operating data type of the engine: FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
  • plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
  • calibrator (tensorrt.infer.Int8Calibrator) – INT8 calibrator (currently unsupported in Python). Default: None
Returns:

An engine that can be used to execute inference

Return type:

  • tensorrt.infer.CudaEngine
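
A minimal usage sketch, following the logger pattern from the legacy Python samples; the file paths and output layer name below are placeholders:

    import tensorrt as trt

    # Console logger at ERROR severity
    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    # Build an FP32 engine from a Caffe deploy file and weights file
    engine = trt.utils.caffe_to_trt_engine(G_LOGGER,
                                           "deploy.prototxt",   # placeholder path
                                           "model.caffemodel",  # placeholder path
                                           1,                   # max_batch_size
                                           1 << 20,             # max_workspace_size (1 MiB)
                                           ["prob"],            # placeholder output layer
                                           trt.infer.DataType.FLOAT)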

UFF Model Stream to TensorRT Engine

tensorrt.utils.uff_to_trt_engine(logger, stream, parser, max_batch_size, max_workspace_size, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a UFF Model Stream and generates an engine

Takes a UFF Stream (created with a UFF exporter) and generates a TensorRT engine that can then be saved or executed

Parameters:
  • logger (tensorrt.infer.Logger) – Logging system for the application
  • stream (str [Py2] / bytes [Py3]) – Serialized UFF graph
  • parser (tensorrt.parsers.uffparser.UffParser) – UFF parser
  • max_batch_size (int) – Maximum batch size
  • max_workspace_size (int) – Maximum workspace size, in bytes
  • datatype (tensorrt.infer.DataType) – Operating data type of the engine: FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
  • plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
  • calibrator (tensorrt.infer.Int8Calibrator) – Currently unsupported. Default: None
Returns:

TensorRT engine to be used or executed

Return type:

  • tensorrt.infer.CudaEngine
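
A minimal sketch, assuming uff_model holds a serialized graph produced by a UFF exporter (e.g. uff.from_tensorflow); the input/output tensor names and shape are placeholders:

    import tensorrt as trt
    from tensorrt.parsers import uffparser

    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    # Register the graph inputs/outputs before parsing
    parser = uffparser.create_uff_parser()
    parser.register_input("input", (1, 28, 28), 0)  # placeholder name/shape, 0 = NCHW
    parser.register_output("output")                # placeholder name

    # uff_model: the bytes returned by the UFF exporter
    engine = trt.utils.uff_to_trt_engine(G_LOGGER,
                                         uff_model,
                                         parser,
                                         1,        # max_batch_size
                                         1 << 20)  # max_workspace_size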

UFF File to TensorRT Engine

tensorrt.utils.uff_file_to_trt_engine(logger, uff_file, parser, max_batch_size, max_workspace_size, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a UFF file and generates an engine

Takes a UFF file (created with a UFF exporter) and generates a TensorRT engine that can then be saved or executed

Parameters:
  • logger (tensorrt.infer.Logger) – Logging system for the application
  • uff_file (str) – Path to the UFF file
  • parser (tensorrt.parsers.uffparser.UffParser) – UFF parser
  • max_batch_size (int) – Maximum batch size
  • max_workspace_size (int) – Maximum workspace size, in bytes
  • datatype (tensorrt.infer.DataType) – Operating data type of the engine: FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
  • plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
  • calibrator (tensorrt.infer.Int8Calibrator) – Currently unsupported. Default: None
Returns:

TensorRT engine to be used or executed

Return type:

  • tensorrt.infer.CudaEngine
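
Usage mirrors uff_to_trt_engine, except a path to a .uff file replaces the in-memory stream (a sketch; the path and tensor names are placeholders):

    import tensorrt as trt
    from tensorrt.parsers import uffparser

    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    parser = uffparser.create_uff_parser()
    parser.register_input("input", (1, 28, 28), 0)  # placeholder name/shape
    parser.register_output("output")                # placeholder name

    engine = trt.utils.uff_file_to_trt_engine(G_LOGGER,
                                              "model.uff",  # placeholder path
                                              parser,
                                              1,            # max_batch_size
                                              1 << 20)      # max_workspace_size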

Saving and Loading Models

Load Engine from File

tensorrt.utils.load_engine(logger, filepath, plugins=None)

Load a saved engine file

Creates an engine from a file containing a serialized engine

Parameters:
  • logger (tensorrt.infer.Logger) – A logger is needed to monitor the progress of loading the engine
  • filepath (str) – Path to the engine file
  • plugins (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
Returns:

An engine that can be used to execute inference

Return type:

  • tensorrt.infer.CudaEngine
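
A sketch of reloading a previously saved engine ("model.engine" is a placeholder path):

    import tensorrt as trt

    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    # Deserialize an engine written earlier with write_engine_to_file
    engine = trt.utils.load_engine(G_LOGGER, "model.engine")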

Save Engine to File

tensorrt.utils.write_engine_to_file(filepath, engine)

Write an engine to a file

Takes a serialized engine and writes it to a file so it can be loaded later

Parameters:
  • filepath (str) – Path to the engine file
  • engine (tensorrt.infer.CudaEngine) – An engine that can be used to execute inference
Returns:

Whether the file was written or not

Return type:

  • bool
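
A sketch of saving an engine built by one of the parsing utilities above; the legacy samples pass the result of engine.serialize() here, and "model.engine" is a placeholder path:

    import tensorrt as trt

    # `engine` is a tensorrt.infer.CudaEngine from one of the builders above
    ok = trt.utils.write_engine_to_file("model.engine", engine.serialize())
    assert ok, "engine file was not written"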

Load Weights from File

tensorrt.utils.load_weights(filepath)

Load model weights from file

Loads weights from a .wts file into a dictionary mapping layer names to their associated weights, encoded as tensorrt.infer.Weights objects

Parameters:
  • filepath (str) – Path to the weights file
Returns:

Dictionary of layer names and associated weights

Return type:

  • dict {str: tensorrt.infer.Weights}
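
A sketch, assuming a weights dump in the .wts format ("model.wts" is a placeholder path):

    import tensorrt as trt

    # Maps layer name -> tensorrt.infer.Weights
    weights = trt.utils.load_weights("model.wts")
    for name in weights:
        print(name)  # inspect which layers carry weights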