tensorrt.utils

The utils package handles much of the boilerplate needed to create a TensorRT engine.

Parsing Utilities

Caffe Model to TensorRT Engine

tensorrt.utils.caffe_to_trt_engine(logger, deploy_file, model_file, max_batch_size, max_workspace_size, output_layers, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a Caffe model and creates an engine for inference

Takes a Caffe model prototxt and caffemodel, the name(s) of the output layer(s), and engine settings to create an engine that can be used for inference

Parameters:
  • logger (tensorrt.infer.Logger) – A logger is needed to monitor the progress of building the engine
  • deploy_file (str) – Path to the Caffe model prototxt file
  • model_file (str) – Path to the Caffe caffemodel file
  • max_batch_size (int) – Maximum batch size allowed for the engine
  • max_workspace_size (int) – Maximum workspace size, in bytes, available to the engine at build time
  • output_layers ([str]) – List of output layer names
  • datatype (tensorrt.infer.DataType) – Operating data type of the engine: FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
  • plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
  • calibrator (tensorrt.infer.Int8Calibrator) – INT8 calibrator (currently unsupported in Python). Default: None
Returns:

An engine that can be used to execute inference

Return type:

  • tensorrt.infer.CudaEngine
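
A minimal usage sketch, following the logger pattern from the legacy Python samples; the file paths and output layer name below are placeholders:

    import tensorrt as trt

    # Console logger at ERROR severity
    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    # Build an FP32 engine from a Caffe deploy file and weights file
    engine = trt.utils.caffe_to_trt_engine(G_LOGGER,
                                           "deploy.prototxt",   # placeholder path
                                           "model.caffemodel",  # placeholder path
                                           1,                   # max_batch_size
                                           1 << 20,             # max_workspace_size (1 MiB)
                                           ["prob"],            # placeholder output layer
                                           trt.infer.DataType.FLOAT)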

UFF Model Stream to TensorRT Engine

tensorrt.utils.uff_to_trt_engine(logger, stream, parser, max_batch_size, max_workspace_size, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a UFF Model Stream and generates an engine

Takes a UFF Stream (created with a UFF exporter) and generates a TensorRT engine that can then be saved or executed

Parameters:
  • logger (tensorrt.infer.Logger) – Logging system for the application
  • stream (str [Py2] / bytes [Py3]) – Serialized UFF graph
  • parser (tensorrt.parsers.uffparser.UffParser) – UFF parser
  • max_batch_size (int) – Maximum batch size
  • max_workspace_size (int) – Maximum workspace size, in bytes
  • datatype (tensorrt.infer.DataType) – Operating data type of the engine: FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
  • plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
  • calibrator (tensorrt.infer.Int8Calibrator) – Currently unsupported. Default: None
Returns:

TensorRT engine to be used or executed

Return type:

  • tensorrt.infer.CudaEngine
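
A minimal sketch, assuming uff_model holds a serialized graph produced by a UFF exporter (e.g. uff.from_tensorflow); the input/output tensor names and shape are placeholders:

    import tensorrt as trt
    from tensorrt.parsers import uffparser

    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    # Register the graph inputs/outputs before parsing
    parser = uffparser.create_uff_parser()
    parser.register_input("input", (1, 28, 28), 0)  # placeholder name/shape, 0 = NCHW
    parser.register_output("output")                # placeholder name

    # uff_model: the bytes returned by the UFF exporter
    engine = trt.utils.uff_to_trt_engine(G_LOGGER,
                                         uff_model,
                                         parser,
                                         1,        # max_batch_size
                                         1 << 20)  # max_workspace_size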

UFF File to TensorRT Engine

tensorrt.utils.uff_file_to_trt_engine(logger, uff_file, parser, max_batch_size, max_workspace_size, datatype=<DataType.FLOAT: 0>, plugin_factory=None, calibrator=None)

Parses a UFF file and generates an engine

Takes a UFF file (created with a UFF exporter) and generates a TensorRT engine that can then be saved or executed

Parameters:
  • logger (tensorrt.infer.Logger) – Logging system for the application
  • uff_file (str) – Path to the UFF file
  • parser (tensorrt.parsers.uffparser.UffParser) – UFF parser
  • max_batch_size (int) – Maximum batch size
  • max_workspace_size (int) – Maximum workspace size, in bytes
  • datatype (tensorrt.infer.DataType) – Operating data type of the engine: FP32, FP16 if supported on the platform, or INT8 with a calibrator. Default: tensorrt.infer.DataType.FLOAT
  • plugin_factory (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
  • calibrator (tensorrt.infer.Int8Calibrator) – Currently unsupported. Default: None
Returns:

TensorRT engine to be used or executed

Return type:

  • tensorrt.infer.CudaEngine
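
Usage mirrors uff_to_trt_engine, except a path to a .uff file replaces the in-memory stream (a sketch; the path and tensor names are placeholders):

    import tensorrt as trt
    from tensorrt.parsers import uffparser

    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    parser = uffparser.create_uff_parser()
    parser.register_input("input", (1, 28, 28), 0)  # placeholder name/shape
    parser.register_output("output")                # placeholder name

    engine = trt.utils.uff_file_to_trt_engine(G_LOGGER,
                                              "model.uff",  # placeholder path
                                              parser,
                                              1,            # max_batch_size
                                              1 << 20)      # max_workspace_size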

Saving and Loading Models

Load Engine from File

tensorrt.utils.load_engine(logger, filepath, plugins=None)

Load a saved engine file

Creates an engine from a file containing a serialized engine

Parameters:
  • logger (tensorrt.infer.Logger) – A logger is needed to monitor the progress of loading the engine
  • filepath (str) – Path to the engine file
  • plugins (tensorrt.infer.PluginFactory) – Custom layer factory. Default: None
Returns:

An engine that can be used to execute inference

Return type:

  • tensorrt.infer.CudaEngine
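
A sketch of reloading a previously saved engine ("model.engine" is a placeholder path):

    import tensorrt as trt

    G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    # Deserialize an engine written earlier with write_engine_to_file
    engine = trt.utils.load_engine(G_LOGGER, "model.engine")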

Save Engine to File

tensorrt.utils.write_engine_to_file(filepath, engine)

Write an engine to a file

Takes a serialized engine and writes it to a file so it can be loaded later

Parameters:
  • filepath (str) – Path to the engine file
  • engine (tensorrt.infer.CudaEngine) – An engine that can be used to execute inference
Returns:

Whether the file was written or not

Return type:

  • bool
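
A sketch of saving an engine built by one of the parsing utilities above; the legacy samples pass the result of engine.serialize() here, and "model.engine" is a placeholder path:

    import tensorrt as trt

    # `engine` is a tensorrt.infer.CudaEngine from one of the builders above
    ok = trt.utils.write_engine_to_file("model.engine", engine.serialize())
    assert ok, "engine file was not written"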

Load Weights from File

tensorrt.utils.load_weights(filepath)

Load model weights from file

Loads weights from a .wts file into a dictionary mapping layer names to their associated weights, encoded as tensorrt.infer.Weights objects

Parameters:
  • filepath (str) – Path to the weights file
Returns:

Dictionary of layer names and associated weights

Return type:

  • dict {str: tensorrt.infer.Weights}
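
A sketch, assuming a weights dump in the .wts format ("model.wts" is a placeholder path):

    import tensorrt as trt

    # Maps layer name -> tensorrt.infer.Weights
    weights = trt.utils.load_weights("model.wts")
    for name in weights:
        print(name)  # inspect which layers carry weights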