Loaders

Module: polygraphy.backend.onnx

class GsFromOnnx(model)[source]

Bases: BaseLoader

Functor that creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.

Creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.

Parameters:

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

call_impl()[source]
Returns:

The ONNX-GraphSurgeon representation of the ONNX model

Return type:

onnx_graphsurgeon.Graph

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.
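
The functor convention shared by every loader on this page can be sketched in plain Python. The names `SketchLoader`, `resolve`, `CountNodes`, and `count_nodes` below are hypothetical stand-ins invented for illustration. They are not Polygraphy's `BaseLoader` or any real loader. The sketch only shows how `__call__` forwards to `call_impl`, how a loader can accept either a value or a callable that returns one, and how an immediately evaluated functional variant relates to its class.

```python
class SketchLoader:
    """Hypothetical stand-in for a loader base class (not Polygraphy's BaseLoader)."""

    def call_impl(self, *args, **kwargs):
        raise NotImplementedError

    def __call__(self, *args, **kwargs):
        # Public entry point: forward all arguments to call_impl.
        return self.call_impl(*args, **kwargs)


def resolve(obj):
    """Return obj itself, or its result if obj is a callable (such as another loader)."""
    return obj() if callable(obj) else obj


class CountNodes(SketchLoader):
    """Hypothetical loader that counts the entries of a 'model' given directly or lazily."""

    def __init__(self, model):
        self._model = model  # nothing is evaluated at construction time

    def call_impl(self):
        return len(resolve(self._model))


def count_nodes(model):
    """Immediately evaluated functional variant of CountNodes."""
    return CountNodes(model)()


calls = []


def load_model():
    calls.append("loaded")
    return ["Conv", "Relu", "Add"]


lazy = CountNodes(load_model)  # load_model has not run yet
assert calls == []
print(lazy())                  # 3
print(count_nodes(["Conv"]))   # 1
```

Deferring the work until the loader is called is what lets one loader be passed as the `model` argument of another.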

gs_from_onnx(model)

Immediately evaluated functional variant of GsFromOnnx.

Creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.

Parameters:

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

Returns:

The ONNX-GraphSurgeon representation of the ONNX model

Return type:

onnx_graphsurgeon.Graph

class OnnxFromPath(path, external_data_dir=None, ignore_external_data=None)[source]

Bases: BaseLoader

Functor that loads an ONNX model from a file.

Loads an ONNX model from a file.

Parameters:
  • path (str) – The path from which to load the model.

  • external_data_dir (str) – The directory where external data for the model is stored.

  • ignore_external_data (bool) – Whether to ignore any external data and just load the model structure without any weights. The model will be usable only for purposes that don’t require weights, such as extracting subgraphs or inspecting model structure. This can be useful in cases where external data is not available. Defaults to False.

call_impl()[source]
Returns:

The ONNX model

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

onnx_from_path(path, external_data_dir=None, ignore_external_data=None)

Immediately evaluated functional variant of OnnxFromPath.

Loads an ONNX model from a file.

Parameters:
  • path (str) – The path from which to load the model.

  • external_data_dir (str) – The directory where external data for the model is stored.

  • ignore_external_data (bool) – Whether to ignore any external data and just load the model structure without any weights. The model will be usable only for purposes that don’t require weights, such as extracting subgraphs or inspecting model structure. This can be useful in cases where external data is not available. Defaults to False.

Returns:

The ONNX model

Return type:

onnx.ModelProto

class OnnxFromTfGraph(graph, opset=None, optimize=None)[source]

Bases: BaseLoader

Functor that loads a TensorFlow graph and converts it to ONNX using the tf2onnx converter.

Converts a TensorFlow model into ONNX.

Parameters:
  • graph (Union[Tuple[tf.Graph, Sequence[str]], Callable() -> Tuple[tf.Graph, Sequence[str]]]) – A tuple containing a TensorFlow graph and output names or a callable that returns one.

  • opset (int) – The ONNX opset to use during conversion.

  • optimize (bool) – Whether to use tf2onnx’s graph optimization pass.

call_impl()[source]
Returns:

The ONNX model.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

onnx_from_tf_graph(graph, opset=None, optimize=None)

Immediately evaluated functional variant of OnnxFromTfGraph.

Converts a TensorFlow model into ONNX.

Parameters:
  • graph (Union[Tuple[tf.Graph, Sequence[str]], Callable() -> Tuple[tf.Graph, Sequence[str]]]) – A tuple containing a TensorFlow graph and output names or a callable that returns one.

  • opset (int) – The ONNX opset to use during conversion.

  • optimize (bool) – Whether to use tf2onnx’s graph optimization pass.

Returns:

The ONNX model.

Return type:

onnx.ModelProto

class ModifyOutputs(model, outputs=None, exclude_outputs=None, copy=None)[source]

Bases: BaseLoadOnnxCopy

Functor that modifies the outputs of an ONNX model.

Modifies outputs of an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the existing model outputs. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked.

  • exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

call_impl()[source]
Returns:

The ONNX model with modified outputs.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

modify_outputs(model, outputs=None, exclude_outputs=None, copy=None)

Immediately evaluated functional variant of ModifyOutputs.

Modifies outputs of an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the existing model outputs. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked.

  • exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

Returns:

The ONNX model with modified outputs.

Return type:

onnx.ModelProto
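
How `outputs` and `exclude_outputs` combine can be sketched without any ONNX dependency. The helper `select_outputs` and the `MARK_ALL` sentinel below are hypothetical and operate on plain lists of tensor names. They model only the behavior documented above (override, mark everything, then exclude) and are not Polygraphy's implementation.

```python
MARK_ALL = object()  # hypothetical sentinel standing in for constants.MARK_ALL


def select_outputs(all_tensors, current_outputs, outputs=None, exclude_outputs=None):
    """Return the output names after applying `outputs`, then `exclude_outputs`."""
    if outputs is MARK_ALL:
        selected = list(all_tensors)      # every tensor becomes an output
    elif outputs is not None:
        selected = list(outputs)          # override the existing outputs
    else:
        selected = list(current_outputs)  # keep the existing outputs
    excluded = set(exclude_outputs or [])
    return [name for name in selected if name not in excluded]


tensors = ["conv_out", "relu_out", "pool_out", "y"]

print(select_outputs(tensors, ["y"], outputs=MARK_ALL, exclude_outputs=["pool_out"]))
# ['conv_out', 'relu_out', 'y']
print(select_outputs(tensors, ["y"], outputs=["relu_out"]))
# ['relu_out']
print(select_outputs(tensors, ["y"]))
# ['y']
```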

class ConvertToFp16(model, copy=None)[source]

Bases: BaseLoadOnnxCopy

Functor that converts all floating point tensors in the model to 16-bit precision. This is not needed in order to use TensorRT’s fp16 precision, but may be useful for other backends.

Converts all floating point tensors in the model to 16-bit precision.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

call_impl()[source]
Returns:

The modified ONNX model.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

convert_to_fp16(model, copy=None)

Immediately evaluated functional variant of ConvertToFp16.

Converts all floating point tensors in the model to 16-bit precision.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

Returns:

The modified ONNX model.

Return type:

onnx.ModelProto

class FoldConstants(model, num_passes=None, do_shape_inference=None, partitioning=None, fold_shapes=None, copy=None, error_ok=None, size_threshold=None, allow_onnxruntime_shape_inference=None)[source]

Bases: BaseLoadOnnxCopy

Functor that folds constants in an ONNX model.

Fold constants in an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • num_passes (int) – The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. By default, Polygraphy will automatically determine the number of passes required.

  • do_shape_inference (bool) – Whether to run shape inference in the model between passes. This enables the loader to fold Shape nodes. Only effective if fold_shapes is True. Defaults to True.

  • partitioning (Union[str, None]) –

    Whether and how to partition the graph so that errors in folding one part of a model do not affect other parts. Available modes are:

    • None: Do not partition the graph. If inference fails, no constants are folded.

    • 'basic': Partition the graph. If inference fails in one partition, other partitions will remain unaffected.

    • 'recursive': Partition the graph recursively. If inference fails in a partition, the partition will be further partitioned.

    Defaults to None.

  • fold_shapes (bool) – Whether to fold Shape nodes in the graph. This requires shapes to be inferred in the graph, and can only fold static shapes. Defaults to True.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

  • error_ok (bool) – Whether to suppress errors during constant folding. If this is set to False, errors will be re-raised. Defaults to True.

  • size_threshold (int) – The maximum size threshold, in bytes, for which to fold constants. Any tensors larger than this value will not be folded. Set to None to disable the size threshold and always fold constants. For example, some models may apply ops like Tile or Expand to constants, which can result in very large tensors. Rather than pre-computing those constants and bloating the model size, it may be desirable to skip folding them and allow them to be computed at runtime. Defaults to None.

  • allow_onnxruntime_shape_inference (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Has no effect if do_shape_inference is False. Defaults to True.

call_impl()[source]
Returns:

The new ONNX model with constants folded.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

fold_constants(model, num_passes=None, do_shape_inference=None, partitioning=None, fold_shapes=None, copy=None, error_ok=None, size_threshold=None, allow_onnxruntime_shape_inference=None)

Immediately evaluated functional variant of FoldConstants.

Fold constants in an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • num_passes (int) – The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. By default, Polygraphy will automatically determine the number of passes required.

  • do_shape_inference (bool) – Whether to run shape inference in the model between passes. This enables the loader to fold Shape nodes. Only effective if fold_shapes is True. Defaults to True.

  • partitioning (Union[str, None]) –

    Whether and how to partition the graph so that errors in folding one part of a model do not affect other parts. Available modes are:

    • None: Do not partition the graph. If inference fails, no constants are folded.

    • 'basic': Partition the graph. If inference fails in one partition, other partitions will remain unaffected.

    • 'recursive': Partition the graph recursively. If inference fails in a partition, the partition will be further partitioned.

    Defaults to None.

  • fold_shapes (bool) – Whether to fold Shape nodes in the graph. This requires shapes to be inferred in the graph, and can only fold static shapes. Defaults to True.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

  • error_ok (bool) – Whether to suppress errors during constant folding. If this is set to False, errors will be re-raised. Defaults to True.

  • size_threshold (int) – The maximum size threshold, in bytes, for which to fold constants. Any tensors larger than this value will not be folded. Set to None to disable the size threshold and always fold constants. For example, some models may apply ops like Tile or Expand to constants, which can result in very large tensors. Rather than pre-computing those constants and bloating the model size, it may be desirable to skip folding them and allow them to be computed at runtime. Defaults to None.

  • allow_onnxruntime_shape_inference (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Has no effect if do_shape_inference is False. Defaults to True.

Returns:

The new ONNX model with constants folded.

Return type:

onnx.ModelProto
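
The effect of `size_threshold` can be illustrated with simple arithmetic. The helper `should_fold` below is hypothetical and assumes the size of a folded tensor is its element count times its element size in bytes. It is a sketch of the documented rule (tensors larger than the threshold are not folded; None disables the check), not the library's code.

```python
from math import prod


def should_fold(shape, itemsize, size_threshold=None):
    """Decide whether a computed constant of the given shape is small enough to fold."""
    if size_threshold is None:
        return True  # no threshold: always fold
    return prod(shape) * itemsize <= size_threshold


# A 1x64 float32 constant tiled 4096 times along axis 0 becomes 4096x64.
tiled_shape = (4096, 64)
threshold = 512 * 1024  # 512 KiB, an arbitrary illustrative value

print(prod(tiled_shape) * 4)                   # 1048576 bytes (1 MiB)
print(should_fold(tiled_shape, 4, threshold))  # False
print(should_fold((1, 64), 4, threshold))      # True
print(should_fold(tiled_shape, 4, None))       # True
```

In this example the original constant occupies 256 bytes, while its tiled result would add 1 MiB to the model, which is the kind of bloat the threshold is meant to avoid.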

class SetUpperBound(model, upper_bounds, copy=None)[source]

Bases: BaseLoadOnnxCopy

Functor that sets upper bounds for tensors with unbounded DDS in an ONNX model.

Requires that the model has been constant folded and has shapes inferred.

Set upper bounds for tensors with unbounded DDS in an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • upper_bounds (Union[int, Dict[str, int]]) – The upper bounds for tensors with unbounded DDS. If a single integer is provided, it will be used as the default upper bound for all tensors with unbounded DDS. This can also be provided on a per-tensor basis using a dictionary. In that case, use an empty string ("") as the key to specify the default upper bound for tensors not explicitly listed.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

call_impl()[source]
Returns:

The new ONNX model.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

set_upper_bound(model, upper_bounds, copy=None)

Immediately evaluated functional variant of SetUpperBound.

Set upper bounds for tensors with unbounded DDS in an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • upper_bounds (Union[int, Dict[str, int]]) – The upper bounds for tensors with unbounded DDS. If a single integer is provided, it will be used as the default upper bound for all tensors with unbounded DDS. This can also be provided on a per-tensor basis using a dictionary. In that case, use an empty string ("") as the key to specify the default upper bound for tensors not explicitly listed.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

Returns:

The new ONNX model.

Return type:

onnx.ModelProto
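
The two accepted forms of `upper_bounds` can be sketched as a lookup. The helper `resolve_upper_bound` and the tensor names are hypothetical. Returning None when a dictionary has neither the tensor name nor the empty-string key is an assumption of this sketch, since the text above does not say what happens in that case.

```python
def resolve_upper_bound(upper_bounds, tensor_name):
    """Pick the bound for one tensor from an int or a per-tensor dictionary."""
    if isinstance(upper_bounds, int):
        return upper_bounds  # a single integer applies to every tensor
    if tensor_name in upper_bounds:
        return upper_bounds[tensor_name]  # explicit per-tensor bound
    # The empty-string key holds the default; None here is this sketch's assumption.
    return upper_bounds.get("")


bounds = {"nms_out": 200, "": 1000}

print(resolve_upper_bound(500, "nms_out"))      # 500
print(resolve_upper_bound(bounds, "nms_out"))   # 200
print(resolve_upper_bound(bounds, "topk_idx"))  # 1000
```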

class InferShapes(model, error_ok=None, external_data_dir=None, save_to_disk_threshold_bytes=None, allow_onnxruntime=None)[source]

Bases: BaseLoader

Functor that runs shape inference on an ONNX model.

Run shape inference on an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto, str, Callable() -> str]) – An ONNX model or a callable that returns one, or a path to a model. Supports models larger than the 2 GB protobuf limit.

  • error_ok (bool) – Whether errors during shape inference should be suppressed. Defaults to True.

  • external_data_dir (str) – The directory where external data for the model is stored. Only used if the model is provided via a path rather than a loader.

  • save_to_disk_threshold_bytes (int) – The size in bytes above which a ModelProto will be serialized to the disk before running shape inference. This can be used to work around the 2 GB protobuf limitation. Defaults to 2 GB.

  • allow_onnxruntime (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Defaults to True.

call_impl()[source]
Returns:

The new ONNX model with shapes inferred.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

infer_shapes(model, error_ok=None, external_data_dir=None, save_to_disk_threshold_bytes=None, allow_onnxruntime=None)

Immediately evaluated functional variant of InferShapes.

Run shape inference on an ONNX model.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto, str, Callable() -> str]) – An ONNX model or a callable that returns one, or a path to a model. Supports models larger than the 2 GB protobuf limit.

  • error_ok (bool) – Whether errors during shape inference should be suppressed. Defaults to True.

  • external_data_dir (str) – The directory where external data for the model is stored. Only used if the model is provided via a path rather than a loader.

  • save_to_disk_threshold_bytes (int) – The size in bytes above which a ModelProto will be serialized to the disk before running shape inference. This can be used to work around the 2 GB protobuf limitation. Defaults to 2 GB.

  • allow_onnxruntime (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Defaults to True.

Returns:

The new ONNX model with shapes inferred.

Return type:

onnx.ModelProto

class ExtractSubgraph(model, input_metadata=None, output_metadata=None, check_meta=None)[source]

Bases: BaseLoader

Functor that extracts a subgraph from an ONNX model.

Extracts a subgraph from an ONNX model.

Parameters:
  • model (Union[Union[onnx.ModelProto, onnx_graphsurgeon.Graph], Callable() -> Union[onnx.ModelProto, onnx_graphsurgeon.Graph]]) – An ONNX model or ONNX-GraphSurgeon Graph or a callable that returns one.

  • input_metadata (TensorMetadata) – Metadata for the inputs of the subgraph. Name, shape, and data type are required. If not provided, the graph inputs are not modified.

  • output_metadata (TensorMetadata) – Metadata for the outputs of the subgraph. Name and data type are required. If not provided, the graph outputs are not modified.

  • check_meta (bool) – Whether to check that the provided input and output metadata include all the expected fields. Defaults to True.

call_impl()[source]
Returns:

The new ONNX model or ONNX-GraphSurgeon Graph.

Return type:

Union[onnx.ModelProto, onnx_graphsurgeon.Graph]

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

extract_subgraph(model, input_metadata=None, output_metadata=None, check_meta=None)

Immediately evaluated functional variant of ExtractSubgraph.

Extracts a subgraph from an ONNX model.

Parameters:
  • model (Union[Union[onnx.ModelProto, onnx_graphsurgeon.Graph], Callable() -> Union[onnx.ModelProto, onnx_graphsurgeon.Graph]]) – An ONNX model or ONNX-GraphSurgeon Graph or a callable that returns one.

  • input_metadata (TensorMetadata) – Metadata for the inputs of the subgraph. Name, shape, and data type are required. If not provided, the graph inputs are not modified.

  • output_metadata (TensorMetadata) – Metadata for the outputs of the subgraph. Name and data type are required. If not provided, the graph outputs are not modified.

  • check_meta (bool) – Whether to check that the provided input and output metadata include all the expected fields. Defaults to True.

Returns:

The new ONNX model or ONNX-GraphSurgeon Graph.

Return type:

Union[onnx.ModelProto, onnx_graphsurgeon.Graph]
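
A possible way to call `extract_subgraph` is sketched below. The function name `extract_middle`, the tensor names `conv_out` and `relu_out`, and the shape are placeholders chosen for illustration. The sketch assumes `TensorMetadata` is importable from `polygraphy.common` and that its `add` method takes a name, data type, and shape. Imports are placed inside the function so the snippet can be loaded without Polygraphy installed.

```python
def extract_middle(model_path, out_path):
    """Extract the subgraph between two tensors and save it (names are placeholders)."""
    # Imported lazily so this file can be imported without Polygraphy installed.
    import numpy as np
    from polygraphy.backend.onnx import extract_subgraph, onnx_from_path, save_onnx
    from polygraphy.common import TensorMetadata

    model = onnx_from_path(model_path)

    # Subgraph inputs need a name, a data type, and a shape.
    inputs = TensorMetadata()
    inputs.add("conv_out", dtype=np.float32, shape=(1, 64, 56, 56))

    # Subgraph outputs need a name and a data type; the shape is left as None here.
    outputs = TensorMetadata()
    outputs.add("relu_out", dtype=np.float32, shape=None)

    subgraph = extract_subgraph(model, input_metadata=inputs, output_metadata=outputs)
    return save_onnx(subgraph, out_path)
```

Calling `extract_middle("model.onnx", "subgraph.onnx")` would require a model that actually contains tensors with those names.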

class SaveOnnx(model, path, external_data_path=None, size_threshold=None, all_tensors_to_one_file=None)[source]

Bases: BaseLoader

Functor that saves an ONNX model to the specified path.

Saves an ONNX model to the specified path.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • path (str) – Path at which to write the ONNX model.

  • external_data_path (str) – Path to save external data. This is always a relative path; external data is always written to the same directory as the model. Set to an empty string to use the default path. Set to None to disable. Defaults to None if the model is within the protobuf size threshold and an empty string otherwise.

  • size_threshold (int) – Tensor size threshold, in bytes, above which tensor data will be stored in the external file. Tensors smaller than this threshold will remain in the ONNX file. Has no effect if external_data_path is not set. Defaults to 1024.

  • all_tensors_to_one_file (bool) – Whether to write all tensors to one file when saving external data. Has no effect if external_data_path is not set. Defaults to True.

call_impl()[source]
Returns:

The model, after saving it.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

save_onnx(model, path, external_data_path=None, size_threshold=None, all_tensors_to_one_file=None)

Immediately evaluated functional variant of SaveOnnx.

Saves an ONNX model to the specified path.

Parameters:
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • path (str) – Path at which to write the ONNX model.

  • external_data_path (str) – Path to save external data. This is always a relative path; external data is always written to the same directory as the model. Set to an empty string to use the default path. Set to None to disable. Defaults to None if the model is within the protobuf size threshold and an empty string otherwise.

  • size_threshold (int) – Tensor size threshold, in bytes, above which tensor data will be stored in the external file. Tensors smaller than this threshold will remain in the ONNX file. Has no effect if external_data_path is not set. Defaults to 1024.

  • all_tensors_to_one_file (bool) – Whether to write all tensors to one file when saving external data. Has no effect if external_data_path is not set. Defaults to True.

Returns:

The model, after saving it.

Return type:

onnx.ModelProto
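
Because each loader accepts a callable that returns a model, the class variants can be chained into a deferred pipeline. The sketch below uses only the loaders and parameters documented on this page. The function name `build_fold_and_save`, the paths passed to it, and the 16 MiB threshold are illustrative assumptions. Imports sit inside the function so the snippet can be loaded without Polygraphy installed.

```python
def build_fold_and_save(model_path, out_path):
    """Build a deferred load -> fold -> save pipeline; nothing runs until it is called."""
    # Imported lazily so this file can be imported without Polygraphy installed.
    from polygraphy.backend.onnx import FoldConstants, OnnxFromPath, SaveOnnx

    load = OnnxFromPath(model_path)
    fold = FoldConstants(
        load,
        partitioning="recursive",  # isolate folding failures to the smallest partition
        size_threshold=16 << 20,   # skip folding results larger than 16 MiB (arbitrary)
    )
    return SaveOnnx(fold, path=out_path)
```

Invoking the returned loader, for example `build_fold_and_save("model.onnx", "folded.onnx")()`, would load the model, fold its constants, write the result, and return the saved model.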

class BytesFromOnnx(model)[source]

Bases: BaseLoader

Functor that serializes an ONNX model.

Serializes an ONNX model.

Parameters:

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

call_impl()[source]
Returns:

The serialized model.

Return type:

bytes

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

bytes_from_onnx(model)

Immediately evaluated functional variant of BytesFromOnnx.

Serializes an ONNX model.

Parameters:

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

Returns:

The serialized model.

Return type:

bytes

class OnnxFromBytes(serialized_onnx)[source]

Bases: BaseLoader

Functor that deserializes an ONNX model.

Deserializes an ONNX model.

Parameters:

serialized_onnx (Union[bytes, Callable() -> bytes]) – A serialized ONNX model or a callable that returns one.

call_impl()[source]
Returns:

The ONNX model.

Return type:

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

onnx_from_bytes(serialized_onnx)

Immediately evaluated functional variant of OnnxFromBytes.

Deserializes an ONNX model.

Parameters:

serialized_onnx (Union[bytes, Callable() -> bytes]) – A serialized ONNX model or a callable that returns one.

Returns:

The ONNX model.

Return type:

onnx.ModelProto