Loaders

Module: polygraphy.tools.args

class TrtLoadPluginsArgs[source]

Bases: BaseArgs

TensorRT Plugin Loading: loading TensorRT plugins.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

plugins

Path(s) to plugin libraries.

Type:

List[str]

add_to_script_impl(script, loader_name: str)[source]
Parameters:

loader_name (str) – The name of the loader which should be consumed by the LoadPlugins loader.
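
For illustration, the sketch below wires this group into a standalone argparse parser. It assumes that BaseArgs exposes public add_parser_args() and parse() wrappers around the *_impl methods documented here, and that the option is spelled --plugins to match the plugins attribute it populates; neither name is confirmed by this page.

    import argparse

    from polygraphy.tools.args import TrtLoadPluginsArgs

    # Assumed public wrappers around the *_impl methods documented here.
    plugin_args = TrtLoadPluginsArgs()
    parser = argparse.ArgumentParser()
    plugin_args.add_parser_args(parser)

    # "--plugins" is the assumed CLI spelling of the `plugins` attribute.
    args = parser.parse_args(["--plugins", "libcustom_plugin.so"])
    plugin_args.parse(args)

    print(plugin_args.plugins)  # e.g. ['libcustom_plugin.so']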

class TrtOnnxFlagArgs[source]

Bases: BaseArgs

ONNX-TRT Parser Flags: setting flags for TensorRT’s ONNX parser.

Depends on:

  • TrtConfigArgs: if NATIVE_INSTANCENORM should be automatically enabled in version-compatible (VC) or hardware-compatible (HC) mode

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

flags

Flags to pass to the ONNX parser.

Type:

List[str]

get_flags()[source]

Updates and returns the ONNX parser flags as necessary. This must only be called from within add_to_script_impl(); the flags attribute should not be accessed directly.
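
To illustrate the constraint above, here is a hypothetical argument group that consumes the flags from within its own add_to_script_impl(). The self.arg_groups lookup follows Polygraphy's argument-group dependency pattern; the class, method body, and import path are assumptions for the sketch, not verbatim library code.

    from polygraphy.tools.args import TrtOnnxFlagArgs
    from polygraphy.tools.args.base import BaseArgs  # assumed import path

    class MyNetworkLoaderArgs(BaseArgs):
        def add_to_script_impl(self, script):
            # get_flags() may update the flags (e.g. enabling
            # NATIVE_INSTANCENORM in VC/HC mode via TrtConfigArgs), so it
            # is queried here rather than reading `flags` directly.
            parser_flags = self.arg_groups[TrtOnnxFlagArgs].get_flags()
            ...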

class TrtLoadNetworkArgs(allow_custom_outputs: bool | None = None, allow_onnx_loading: bool | None = None, allow_tensor_formats: bool | None = None)[source]

Bases: BaseArgs

TensorRT Network Loading: loading TensorRT networks.

Depends on:

  • ModelArgs

  • TrtLoadPluginsArgs

  • OnnxLoadArgs: if allow_onnx_loading == True

  • TrtOnnxFlagArgs

Parameters:
  • allow_custom_outputs (bool) – Whether to allow marking custom output tensors. Defaults to True.

  • allow_onnx_loading (bool) – Whether to allow parsing networks from an ONNX model. Defaults to True.

  • allow_tensor_formats (bool) – Whether to allow tensor formats and related options to be set. Defaults to False.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

outputs

Names of output tensors.

Type:

List[str]

exclude_outputs

Names of tensors which should be unmarked as outputs.

Type:

List[str]

trt_network_func_name

The name of the function in a custom network script that creates the network.

Type:

str

layer_precisions

Layer names mapped to their desired compute precision, in string form.

Type:

Dict[str, str]

tensor_datatypes

Tensor names mapped to their desired data types, in string form.

Type:

Dict[str, str]

tensor_formats

Tensor names mapped to their desired formats, in string form.

Type:

Dict[str, List[str]]

postprocess_scripts

A list of tuples specifying a path to a network postprocessing script and the name of the postprocessing function.

Type:

List[Tuple[str, str]]

strongly_typed

Whether to mark the network as being strongly typed.

Type:

bool

mark_debug

Names of tensors which should be marked as debug tensors.

Type:

List[str]

load_network()[source]

Loads a TensorRT Network model according to arguments provided on the command-line.

Returns:

tensorrt.INetworkDefinition
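
A minimal sketch of calling this method, assuming the dependency groups listed above have been constructed, registered, and parsed alongside this one (normally handled by Polygraphy's tool framework), with an ONNX model path supplied on the command line:

    from polygraphy.tools.args import TrtLoadNetworkArgs

    network_args = TrtLoadNetworkArgs()
    # ... add_parser_args()/parse() wiring as in the TrtLoadPluginsArgs
    # sketch above, with the dependency groups registered ...

    network = network_args.load_network()  # tensorrt.INetworkDefinition
    print(network.num_layers)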

class TrtSaveEngineBytesArgs(output_opt: str | None = None, output_short_opt: str | None = None)[source]

Bases: BaseArgs

TensorRT Engine Saving: saving TensorRT engines.

Saves a serialized engine. This should be preferred over TrtSaveEngineArgs() since, as of TensorRT 8.6, version-compatible engines cannot be re-serialized once they have been deserialized.

Parameters:
  • output_opt (str) – The name of the output path option. Defaults to “output”. Use a value of False to disable the option.

  • output_short_opt (str) – The short option to use for the output path. Defaults to “-o”. Use a value of False to disable the short option.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

path

The path at which to save the TensorRT engine.

Type:

str

add_to_script_impl(script, loader_name)[source]
Parameters:

loader_name (str) – The name of the loader which will generate the serialized engine.

Returns:

The name of the loader added to the script.

Return type:

str

save_engine_bytes(engine_bytes, path=None)[source]

Saves a serialized TensorRT engine according to arguments provided on the command-line.

Parameters:
  • engine_bytes (bytes) – The serialized TensorRT engine to save.

  • path (str) – The path at which to save the engine. If no path is provided, it is determined from command-line arguments.

Returns:

The serialized engine that was saved.

Return type:

bytes
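
A minimal sketch, assuming save_args is a TrtSaveEngineBytesArgs instance that has already parsed its command-line arguments and engine is a built tensorrt.ICudaEngine:

    # ICudaEngine.serialize() returns an IHostMemory buffer; bytes() copies
    # it into the `bytes` object this method expects.
    engine_bytes = bytes(engine.serialize())

    # Passing `path` overrides any path parsed from the command line; the
    # same serialized engine is returned, so the call can be chained.
    saved = save_args.save_engine_bytes(engine_bytes, path="engine.plan")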

TrtSaveEngineArgs

alias of Deprecated

class TrtLoadEngineBytesArgs(allow_saving: bool | None = None)[source]

Bases: BaseArgs

TensorRT Engine: loading or building TensorRT engines.

Depends on:

  • ModelArgs

  • TrtLoadPluginsArgs

  • TrtLoadNetworkArgs: if support for building engines is required

  • TrtConfigArgs: if support for building engines is required

  • TrtSaveEngineBytesArgs: if allow_saving == True

Parameters:

allow_saving (bool) – Whether to allow loaded models to be saved. Defaults to False.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

save_timing_cache

Path at which to save the tactic timing cache.

Type:

str

add_to_script_impl(script, network_name=None)[source]
Parameters:

network_name (str) – The name of a variable in the script pointing to a network loader.

load_engine_bytes(network=None)[source]

Loads a TensorRT engine according to arguments provided on the command-line.

Parameters:

network (Tuple[trt.Builder, trt.INetworkDefinition, Optional[parser]]) – A tuple containing a TensorRT builder, network and optionally parser.

Returns:

The engine.

Return type:

tensorrt.ICudaEngine
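
A minimal sketch, assuming engine_args is a TrtLoadEngineBytesArgs instance whose dependency groups have been registered and parsed:

    # With no `network` argument, the model and builder-configuration
    # command-line arguments determine what is loaded or built.
    engine = engine_args.load_engine_bytes()

    # Alternatively, pass an existing (builder, network, parser) tuple:
    # engine = engine_args.load_engine_bytes(
    #     network=(builder, network, parser)
    # )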

class TrtLoadEngineArgs[source]

Bases: BaseArgs

TensorRT Engine: loading TensorRT engines.

Depends on:

  • TrtLoadEngineBytesArgs

  • TrtLoadPluginsArgs

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

load_runtime

Path from which to load a runtime that can be used to load a version-compatible engine which excludes the lean runtime.

Type:

str

add_to_script_impl(script, network_name=None)[source]
Parameters:

network_name (str) – The name of a variable in the script pointing to a network loader.

load_engine(network=None)[source]

Loads a TensorRT engine according to arguments provided on the command-line.

Parameters:

network (Tuple[trt.Builder, trt.INetworkDefinition, Optional[parser]]) – A tuple containing a TensorRT builder, network and optionally parser.

Returns:

The engine.

Return type:

tensorrt.ICudaEngine
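
As above, a minimal sketch assuming engine_args is a parsed TrtLoadEngineArgs instance with its dependency groups in place:

    # Deserializes the engine produced by TrtLoadEngineBytesArgs, honoring
    # the `load_runtime` path for version-compatible engines that exclude
    # the lean runtime.
    engine = engine_args.load_engine()  # tensorrt.ICudaEngine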

class TrtConfigArgs(precision_constraints_default: bool | None = None, allow_random_data_calib_warning: bool | None = None, allow_custom_input_shapes: bool | None = None, allow_engine_capability: bool | None = None, allow_tensor_formats: bool | None = None)[source]

Bases: BaseArgs

TensorRT Builder Configuration: creating the TensorRT BuilderConfig.

Depends on:

  • DataLoaderArgs

  • ModelArgs: if allow_custom_input_shapes == True

Parameters:
  • precision_constraints_default (str) – The default value to use for the precision constraints option. Defaults to “none”.

  • allow_random_data_calib_warning (bool) – Whether to issue a warning when randomly generated data is being used for calibration. Defaults to True.

  • allow_custom_input_shapes (bool) – Whether to allow custom input shapes when randomly generating data. Defaults to True.

  • allow_engine_capability (bool) – Whether to allow engine capability to be specified. Defaults to False.

  • allow_tensor_formats (bool) – Whether to allow tensor formats and related options to be set. Defaults to False.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

profile_dicts

A list of profiles, where each profile is a dictionary that maps input names to a tuple of (min, opt, max) shapes.

Type:

List[OrderedDict[str, Tuple[Shape]]]

tf32

Whether to enable TF32.

Type:

bool

fp16

Whether to enable FP16.

Type:

bool

bf16

Whether to enable BF16.

Type:

bool

fp8

Whether to enable FP8.

Type:

bool

int8

Whether to enable INT8.

Type:

bool

precision_constraints

The precision constraints to apply.

Type:

str

restricted

Whether to enable safety scope checking in the builder.

Type:

bool

calibration_cache

Path to the calibration cache.

Type:

str

calibration_base_class

The name of the base class to use for the calibrator.

Type:

str

sparse_weights

Whether to enable sparse weights.

Type:

bool

load_timing_cache

Path from which to load a timing cache.

Type:

str

load_tactics

Path from which to load a tactic replay file.

Type:

str

save_tactics

Path at which to save a tactic replay file.

Type:

str

tactic_sources

Strings representing enum values of the tactic sources to enable.

Type:

List[str]

trt_config_script

Path to a custom TensorRT config script.

Type:

str

trt_config_func_name

Name of the function in the custom config script that creates the config.

Type:

str

trt_config_postprocess_script

Path to a TensorRT config postprocessing script.

Type:

str

trt_config_postprocess_func_name

Name of the function in the config postprocessing script that applies the post-processing.

Type:

str

use_dla

Whether to enable DLA.

Type:

bool

allow_gpu_fallback

Whether to allow GPU fallback when DLA is enabled.

Type:

bool

memory_pool_limits

Mapping of strings representing memory pool enum values to memory limits in bytes.

Type:

Dict[str, int]

engine_capability

The desired engine capability.

Type:

str

direct_io

Whether to disallow reformatting layers at network input/output tensors which have user-specified formats.

Type:

bool

preview_features

Names of preview features to enable.

Type:

List[str]

refittable

Whether the engine should be refittable.

Type:

bool

strip_plan

Whether the engine should be built with the refittable weights stripped.

Type:

bool

builder_optimization_level

The builder optimization level.

Type:

int

hardware_compatibility_level

A string representing a hardware compatibility level enum value.

Type:

str

profiling_verbosity

A string representing a profiling verbosity level enum value.

Type:

str

max_aux_streams

The maximum number of auxiliary streams that TensorRT is allowed to use.

Type:

int

version_compatible

Whether or not to build a forward-compatible (version-compatible) TensorRT engine.

Type:

bool

exclude_lean_runtime

Whether to exclude the lean runtime from a version-compatible plan.

Type:

bool

quantization_flags

Names of quantization flags to enable.

Type:

List[str]

error_on_timing_cache_miss

Whether to emit an error when a tactic being timed is not present in the timing cache.

Type:

bool

disable_compilation_cache

Whether to disable caching JIT-compiled code.

Type:

bool

weight_streaming

Whether to enable weight streaming for the TensorRT engine.

Type:

bool

create_config(builder, network)[source]

Creates a TensorRT BuilderConfig according to arguments provided on the command-line.

Parameters:
  • builder (trt.Builder) – The TensorRT builder to use to create the configuration.

  • network (trt.INetworkDefinition) – The TensorRT network for which to create the config. The network is used to automatically create a default optimization profile if none are provided.

Returns:

The TensorRT builder configuration.

Return type:

trt.IBuilderConfig
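
A minimal sketch of calling this method directly, assuming config_args is a TrtConfigArgs instance that has already parsed its command-line arguments; the builder and network are created with TensorRT's own API, which Polygraphy's network-loading groups would normally handle:

    import tensorrt as trt

    builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    # ... populate `network`, e.g. with TensorRT's ONNX parser ...

    # If no optimization profiles were specified on the command line, a
    # default profile is derived from `network` automatically.
    config = config_args.create_config(builder, network)  # trt.IBuilderConfig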