Loaders

Module: polygraphy.tools.args

class TrtLoadPluginsArgs[source]

Bases: polygraphy.tools.args.base.BaseArgs

TensorRT Plugin Loading: loading TensorRT plugins.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

plugins

Path(s) to plugin libraries.

Type

List[str]

add_to_script_impl(script, loader_name: str)[source]
Parameters

loader_name (str) – The name of the loader which should be consumed by the LoadPlugins loader.

class TrtOnnxFlagArgs[source]

Bases: polygraphy.tools.args.base.BaseArgs

ONNX-TRT Parser Flags: setting flags for TensorRT’s ONNX parser.

Depends on:

  • TrtConfigArgs: to determine whether NATIVE_INSTANCENORM should be automatically enabled in version-compatible (VC) or hardware-compatible (HC) mode

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

flags

Flags for the ONNX parser.

Type

List[str]

get_flags()[source]

Updates the ONNX parser flags as necessary and returns them. Call this only from add_to_script_impl; do not access the flags attribute directly.
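
As a rough illustration of the "update as necessary" step, the sketch below appends NATIVE_INSTANCENORM when version or hardware compatibility is requested. The helper name resolve_parser_flags and its parameters are hypothetical stand-ins, not Polygraphy's implementation. Only the flag name and the VC/HC condition come from the description above, and treating a non-None hardware compatibility level as "HC mode" is an assumption.

```python
from typing import List, Optional


def resolve_parser_flags(
    flags: Optional[List[str]],
    version_compatible: bool = False,
    hardware_compatibility_level: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: return parser flags, adding NATIVE_INSTANCENORM in VC/HC mode."""
    resolved = list(flags or [])  # copy so the caller's list is not mutated
    # Assumption: "HC mode" means a hardware compatibility level was specified.
    needs_native_instancenorm = version_compatible or hardware_compatibility_level is not None
    already_set = any(flag.upper() == "NATIVE_INSTANCENORM" for flag in resolved)
    if needs_native_instancenorm and not already_set:
        resolved.append("NATIVE_INSTANCENORM")
    return resolved
```

Computing the flags lazily, at script-generation time, is one way to ensure the builder configuration options have already been parsed when the decision is made.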

class TrtLoadNetworkArgs(allow_custom_outputs: Optional[bool] = None, allow_onnx_loading: Optional[bool] = None, allow_tensor_formats: Optional[bool] = None)[source]

Bases: polygraphy.tools.args.base.BaseArgs

TensorRT Network Loading: loading TensorRT networks.

Depends on:

  • ModelArgs

  • TrtLoadPluginsArgs

  • OnnxLoadArgs: if allow_onnx_loading == True

  • TrtOnnxFlagArgs

Parameters
  • allow_custom_outputs (bool) – Whether to allow marking custom output tensors. Defaults to True.

  • allow_onnx_loading (bool) – Whether to allow parsing networks from an ONNX model. Defaults to True.

  • allow_tensor_formats (bool) – Whether to allow tensor formats and related options to be set. Defaults to False.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

outputs

Names of output tensors.

Type

List[str]

exclude_outputs

Names of tensors which should be unmarked as outputs.

Type

List[str]

trt_network_func_name

The name of the function in a custom network script that creates the network.

Type

str

layer_precisions

Layer names mapped to their desired compute precision, in string form.

Type

Dict[str, str]

tensor_datatypes

Tensor names mapped to their desired data types, in string form.

Type

Dict[str, str]

tensor_formats

Tensor names mapped to their desired formats, in string form.

Type

Dict[str, List[str]]
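
To make the three mapping shapes above concrete, here is a small sketch with made-up layer and tensor names. The precision, data type, and format strings are illustrative assumptions; the exact spellings Polygraphy accepts are not stated in this reference, and check_shapes is a hypothetical helper.

```python
from typing import Dict, List

# Hypothetical values: names such as "conv1" and "input0" are invented for illustration.
layer_precisions: Dict[str, str] = {"conv1": "float16", "fc2": "float32"}
tensor_datatypes: Dict[str, str] = {"input0": "float16"}
tensor_formats: Dict[str, List[str]] = {"input0": ["linear", "chw4"]}


def check_shapes(precisions, datatypes, formats) -> bool:
    """Return True if the mappings match the documented types."""
    # layer_precisions and tensor_datatypes map a name to a single string.
    str_maps_ok = all(
        isinstance(key, str) and isinstance(value, str)
        for mapping in (precisions, datatypes)
        for key, value in mapping.items()
    )
    # tensor_formats maps a name to a list of strings.
    formats_ok = all(
        isinstance(key, str)
        and isinstance(value, list)
        and all(isinstance(fmt, str) for fmt in value)
        for key, value in formats.items()
    )
    return str_maps_ok and formats_ok
```

Note that tensor_formats differs from the other two: each tensor maps to a list of format strings rather than a single string.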

postprocess_scripts

A list of tuples specifying a path to a network postprocessing script and the name of the postprocessing function.

Type

List[Tuple[str, str]]
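
The following sketch shows one way a list of (path, function name) tuples could be produced from command-line strings. The "path:func" input syntax, the helper name parse_script_specs, and the default function name are all assumptions made for illustration; this is not Polygraphy's parser.

```python
from typing import List, Tuple


def parse_script_specs(specs: List[str], default_func: str = "postprocess") -> List[Tuple[str, str]]:
    """Hypothetical parser: split 'path[:func]' strings into (path, func) tuples."""
    parsed = []
    for spec in specs:
        # rpartition splits on the last colon in the string.
        path, sep, func = spec.rpartition(":")
        if not sep:
            # No colon at all: treat the whole string as the path.
            path, func = spec, default_func
        parsed.append((path, func))
    return parsed


print(parse_script_specs(["fix_shapes.py:apply", "cleanup.py"]))
```

Splitting on the last colon is a simplification: a bare path that itself contains a colon (for example, a Windows drive letter) would be split incorrectly without additional handling.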

strongly_typed

Whether to mark the network as being strongly typed.

Type

bool

load_network()[source]

Loads a TensorRT network according to arguments provided on the command-line.

Returns

tensorrt.INetworkDefinition

class TrtSaveEngineBytesArgs(output_opt: Optional[str] = None, output_short_opt: Optional[str] = None)[source]

Bases: polygraphy.tools.args.base.BaseArgs

TensorRT Engine Saving: saving TensorRT engines.

Saves a serialized engine. Prefer this over TrtSaveEngineArgs(): as of TensorRT 8.6, version-compatible engines cannot be re-serialized once they have been deserialized.

Parameters
  • output_opt (str) – The name of the output path option. Defaults to “output”. Use a value of False to disable the option.

  • output_short_opt (str) – The short option to use for the output path. Defaults to “-o”. Use a value of False to disable the short option.
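
The effect of these two parameters can be sketched with plain argparse. The helper add_output_option is hypothetical and only mimics the documented behavior (a long option, an optional short option, and False to disable either); it is not how Polygraphy registers its options internally.

```python
import argparse


def add_output_option(parser, output_opt="output", output_short_opt="-o"):
    """Hypothetical helper: register the output path option unless it is disabled."""
    if output_opt is False:
        return  # the option is disabled entirely
    names = [f"--{output_opt}"]
    if output_short_opt is not False:
        names.insert(0, output_short_opt)  # short form first, as is conventional
    parser.add_argument(*names, dest="path", default=None, help="Path at which to save the engine")


parser = argparse.ArgumentParser()
add_output_option(parser)
args = parser.parse_args(["-o", "model.engine"])
print(args.path)  # prints: model.engine
```

Passing output_short_opt=False in this sketch leaves only the long form, which can help when another argument group already claims the short option.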

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

path

The path at which to save the TensorRT engine.

Type

str

add_to_script_impl(script, loader_name)[source]
Parameters

loader_name (str) – The name of the loader which will generate the serialized engine.

Returns

The name of the loader added to the script.

Return type

str

save_engine_bytes(engine_bytes, path=None)[source]

Saves a serialized TensorRT engine according to arguments provided on the command-line.

Parameters
  • engine_bytes (bytes) – The serialized TensorRT engine to save.

  • path (str) – The path at which to save the engine. If no path is provided, it is determined from command-line arguments.

Returns

The serialized engine that was saved.

Return type

bytes
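
A minimal stand-in for the save-and-return contract described above, using only the standard library. The function save_bytes and its default_path parameter are hypothetical; default_path plays the role of the path parsed from the command line, and skipping the write when no path is available is an assumption.

```python
import os
import tempfile
from typing import Optional


def save_bytes(engine_bytes: bytes, path: Optional[str] = None, default_path: Optional[str] = None) -> bytes:
    """Stand-in for the documented behavior: write bytes to a path and return them."""
    # An explicit path takes priority over the one taken from command-line arguments.
    target = path if path is not None else default_path
    if target is not None:
        with open(target, "wb") as f:
            f.write(engine_bytes)
    return engine_bytes


with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "model.engine")
    returned = save_bytes(b"\x00fake-engine", path=out)
    with open(out, "rb") as f:
        on_disk = f.read()
```

Returning the same bytes that were written lets a caller chain the save step into a pipeline without holding a separate reference to the engine.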

TrtSaveEngineArgs

alias of polygraphy.mod.exporter.deprecate.<locals>.deprecate_impl.<locals>.Deprecated

class TrtLoadEngineBytesArgs(allow_saving: Optional[bool] = None)[source]

Bases: polygraphy.tools.args.base.BaseArgs

TensorRT Engine: loading or building TensorRT engines.

Depends on:

  • ModelArgs

  • TrtLoadPluginsArgs

  • TrtLoadNetworkArgs: if support for building engines is required

  • TrtConfigArgs: if support for building engines is required

  • TrtSaveEngineBytesArgs: if allow_saving == True

Parameters

allow_saving (bool) – Whether to allow loaded models to be saved. Defaults to False.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

save_timing_cache

Path at which to save the tactic timing cache.

Type

str

add_to_script_impl(script, network_name=None)[source]
Parameters

network_name (str) – The name of a variable in the script pointing to a network loader.

load_engine_bytes(network=None)[source]

Loads a TensorRT engine according to arguments provided on the command-line.

Parameters

network (Tuple[trt.Builder, trt.INetworkDefinition, Optional[parser]]) – A tuple containing a TensorRT builder, network and optionally parser.

Returns

The engine.

Return type

tensorrt.ICudaEngine

class TrtLoadEngineArgs[source]

Bases: polygraphy.tools.args.base.BaseArgs

TensorRT Engine: loading TensorRT engines.

Depends on:

  • TrtLoadEngineBytesArgs

  • TrtLoadPluginsArgs

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

load_runtime

Path from which to load a runtime that can be used to load a version-compatible engine that excludes the lean runtime.

Type

str

add_to_script_impl(script, network_name=None)[source]
Parameters

network_name (str) – The name of a variable in the script pointing to a network loader.

load_engine(network=None)[source]

Loads a TensorRT engine according to arguments provided on the command-line.

Parameters

network (Tuple[trt.Builder, trt.INetworkDefinition, Optional[parser]]) – A tuple containing a TensorRT builder, network and optionally parser.

Returns

The engine.

Return type

tensorrt.ICudaEngine

class TrtConfigArgs(precision_constraints_default: Optional[bool] = None, allow_random_data_calib_warning: Optional[bool] = None, allow_custom_input_shapes: Optional[bool] = None, allow_engine_capability: Optional[bool] = None, allow_tensor_formats: Optional[bool] = None)[source]

Bases: polygraphy.tools.args.base.BaseArgs

TensorRT Builder Configuration: creating the TensorRT BuilderConfig.

Depends on:

  • DataLoaderArgs

  • ModelArgs: if allow_custom_input_shapes == True

Parameters
  • precision_constraints_default (str) – The default value to use for the precision constraints option. Defaults to “none”.

  • allow_random_data_calib_warning (bool) – Whether to issue a warning when randomly generated data is being used for calibration. Defaults to True.

  • allow_custom_input_shapes (bool) – Whether to allow custom input shapes when randomly generating data. Defaults to True.

  • allow_engine_capability (bool) – Whether to allow engine capability to be specified. Defaults to False.

  • allow_tensor_formats (bool) – Whether to allow tensor formats and related options to be set. Defaults to False.

parse_impl(args)[source]

Parses command-line arguments and populates the following attributes:

profile_dicts

A list of profiles where each profile is a dictionary that maps input names to a tuple of (min, opt, max) shapes.

Type

List[OrderedDict[str, Tuple[Shape]]]
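
For illustration, a single-profile value of this attribute might look like the sketch below. The input name "input0" and the shapes are invented, Shape is modeled here as a plain tuple of ints, and is_valid_profile is a hypothetical helper that checks the usual TensorRT expectation that min <= opt <= max in every dimension.

```python
from collections import OrderedDict
from typing import Tuple

Shape = Tuple[int, ...]  # simplification: a shape is modeled as a tuple of ints

profile_dicts = [
    OrderedDict(
        [
            # input name -> (min, opt, max) shapes
            ("input0", ((1, 3, 224, 224), (8, 3, 224, 224), (16, 3, 224, 224))),
        ]
    )
]


def is_valid_profile(profile) -> bool:
    """Check that each (min, opt, max) triple has equal rank and min <= opt <= max per dimension."""
    for min_shape, opt_shape, max_shape in profile.values():
        if not (len(min_shape) == len(opt_shape) == len(max_shape)):
            return False
        if any(not (lo <= mid <= hi) for lo, mid, hi in zip(min_shape, opt_shape, max_shape)):
            return False
    return True
```

Each entry in the list corresponds to one optimization profile, so a model with several dynamic inputs would list all of them inside the same dictionary.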

tf32

Whether to enable TF32.

Type

bool

fp16

Whether to enable FP16.

Type

bool

bf16

Whether to enable BF16.

Type

bool

fp8

Whether to enable FP8.

Type

bool

int8

Whether to enable INT8.

Type

bool

precision_constraints

The precision constraints to apply.

Type

str

restricted

Whether to enable safety scope checking in the builder.

Type

bool

calibration_cache

Path to the calibration cache.

Type

str

calibration_base_class

The name of the base class to use for the calibrator.

Type

str

sparse_weights

Whether to enable sparse weights.

Type

bool

load_timing_cache

Path from which to load a timing cache.

Type

str

load_tactics

Path from which to load a tactic replay file.

Type

str

save_tactics

Path at which to save a tactic replay file.

Type

str

tactic_sources

Strings representing enum values of the tactic sources to enable.

Type

List[str]
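
Several attributes in this class hold "strings representing enum values". The sketch below shows one plausible way such strings could be resolved to enum members. FakeTacticSource is a stand-in defined here so the example runs without TensorRT, and to_enum_values is a hypothetical helper; neither reflects Polygraphy's actual lookup code.

```python
import enum
from typing import List


class FakeTacticSource(enum.Enum):
    """Stand-in for a TensorRT enum; the members here are for illustration only."""

    CUBLAS = 0
    CUBLAS_LT = 1
    CUDNN = 2


def to_enum_values(names: List[str], enum_cls=FakeTacticSource):
    """Look up each string by member name, ignoring case."""
    resolved = []
    for name in names:
        try:
            resolved.append(enum_cls[name.upper()])
        except KeyError:
            valid = ", ".join(member.name for member in enum_cls)
            raise ValueError(f"Unknown value {name!r}; expected one of: {valid}") from None
    return resolved
```

Keeping the values as strings until the configuration is built means the argument group can be parsed without importing TensorRT.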

trt_config_script

Path to a custom TensorRT config script.

Type

str

trt_config_func_name

Name of the function in the custom config script that creates the config.

Type

str

trt_config_postprocess_script

Path to a TensorRT config postprocessing script.

Type

str

trt_config_postprocess_func_name

Name of the function in the config postprocessing script that applies the post-processing.

Type

str

use_dla

Whether to enable DLA.

Type

bool

allow_gpu_fallback

Whether to allow GPU fallback when DLA is enabled.

Type

bool

memory_pool_limits

Mapping of strings representing memory pool enum values to memory limits in bytes.

Type

Dict[str, int]
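
A sketch of how "pool:size" strings could be turned into this mapping. The input syntax, the K/M/G suffixes (treated here as binary multiples), and the helper names to_num_bytes and parse_pool_limits are assumptions for illustration; the pool names are examples only.

```python
from typing import Dict, List

# Assumption: suffixes are binary multiples (K = 2**10, M = 2**20, G = 2**30).
_SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def to_num_bytes(text: str) -> int:
    """Convert strings such as '16M' or '1024' to a byte count."""
    text = text.strip().upper()
    if text and text[-1] in _SUFFIXES:
        return int(float(text[:-1]) * _SUFFIXES[text[-1]])
    return int(float(text))


def parse_pool_limits(items: List[str]) -> Dict[str, int]:
    """Hypothetical parser: map each 'pool:size' string to {pool: bytes}."""
    limits = {}
    for item in items:
        pool, sep, size = item.partition(":")
        if not sep:
            raise ValueError(f"Expected 'pool:size', got {item!r}")
        limits[pool] = to_num_bytes(size)
    return limits


print(parse_pool_limits(["workspace:16M", "dla_local_dram:1G"]))
# prints: {'workspace': 16777216, 'dla_local_dram': 1073741824}
```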

engine_capability

The desired engine capability.

Type

str

direct_io

Whether to disallow reformatting layers at network input/output tensors which have user-specified formats.

Type

bool

preview_features

Names of preview features to enable.

Type

List[str]

refittable

Whether the engine should be refittable.

Type

bool

builder_optimization_level

The builder optimization level.

Type

int

hardware_compatibility_level

A string representing a hardware compatibility level enum value.

Type

str

profiling_verbosity

A string representing a profiling verbosity level enum value.

Type

str

max_aux_streams

The maximum number of auxiliary streams that TensorRT is allowed to use.

Type

int

version_compatible

Whether to build a TensorRT forward-compatible engine.

Type

bool

exclude_lean_runtime

Whether to exclude the lean runtime from a version-compatible plan.

Type

bool

quantization_flags

Names of quantization flags to enable.

Type

List[str]

error_on_timing_cache_miss

Whether to emit an error when a tactic being timed is not present in the timing cache.

Type

bool

disable_compilation_cache

Whether to disable caching JIT-compiled code.

Type

bool

create_config(builder, network)[source]

Creates a TensorRT BuilderConfig according to arguments provided on the command-line.

Parameters
  • builder (trt.Builder) – The TensorRT builder to use to create the configuration.

  • network (trt.INetworkDefinition) – The TensorRT network for which to create the config. The network is used to automatically create a default optimization profile if none are provided.

Returns

The TensorRT builder configuration.

Return type

trt.IBuilderConfig