Loaders

Module: polygraphy.backend.onnx

class GsFromOnnx(model)

Bases: polygraphy.backend.base.loader.BaseLoader

Functor that creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.

Creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.

Parameters

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

call_impl()
Returns

The ONNX-GraphSurgeon representation of the ONNX model

Return type

onnx_graphsurgeon.Graph

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.
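
The deferred-evaluation contract described here (construct the loader now, evaluate it only when it is called, and accept either a value or a callable that returns one) can be illustrated with a small sketch that does not depend on Polygraphy. The names `SketchLoader`, `Doubler`, `doubler`, and `make_value` are hypothetical stand-ins invented for this illustration; they mimic the documented behavior and are not the library's implementation.

```python
class SketchLoader:
    """Hypothetical stand-in for a loader base class (not Polygraphy's BaseLoader)."""

    def call_impl(self, *args, **kwargs):
        raise NotImplementedError

    def __call__(self, *args, **kwargs):
        # The public entry point simply forwards its arguments to call_impl.
        return self.call_impl(*args, **kwargs)


class Doubler(SketchLoader):
    """Toy loader: doubles a value, or the result of a callable that returns one."""

    def __init__(self, value):
        self._value = value  # Nothing is evaluated at construction time.

    def call_impl(self):
        value = self._value() if callable(self._value) else self._value
        return value * 2


def doubler(value):
    """Immediately evaluated functional variant of the toy Doubler."""
    return Doubler(value)()


calls = []


def make_value():
    calls.append("evaluated")
    return 21


lazy = Doubler(make_value)       # Builds the loader; make_value has not run yet.
assert calls == []
result = lazy()                  # Evaluation happens here.
chained = Doubler(Doubler(3))()  # Loaders are callables, so they can be chained.
```

Because a loader is itself a callable, it can be passed wherever another loader expects a "callable that returns one", which is what makes chaining possible in this sketch.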

class OnnxFromPath(path, external_data_dir=None)

Bases: polygraphy.backend.base.loader.BaseLoader

Functor that loads an ONNX model from a file.

Loads an ONNX model from a file.

Parameters
  • path (str) – The path from which to load the model.

  • external_data_dir (str) – The directory where external data for the model is stored.

call_impl()
Returns

The ONNX model

Return type

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

class OnnxFromTfGraph(graph, opset=None, optimize=None, fold_constant=None)

Bases: polygraphy.backend.base.loader.BaseLoader

Functor that loads a TensorFlow graph and converts it to ONNX using the tf2onnx converter.

Converts a TensorFlow model into ONNX.

Parameters
  • graph (Union[Tuple[tf.Graph, Sequence[str]], Callable() -> Tuple[tf.Graph, Sequence[str]]]) – A tuple containing a TensorFlow graph and output names or a callable that returns one.

  • opset (int) – The ONNX opset to use during conversion.

  • optimize (bool) – Whether to use tf2onnx’s graph optimization pass.

  • fold_constant (bool) – Whether to fold constants in the TensorFlow Graph. Requires that optimize is also enabled. Defaults to True.

call_impl()
Returns

The ONNX model.

Return type

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

class ModifyOutputs(model, outputs=None, exclude_outputs=None, copy=None)

Bases: polygraphy.backend.onnx.loader.BaseLoadOnnxCopy

Functor that modifies the outputs of an ONNX model.

Modifies outputs of an ONNX model.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the existing model outputs. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked.

  • exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

call_impl()
Returns

The ONNX model with modified outputs.

Return type

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.
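
As a hedged sketch of how `outputs` and `exclude_outputs` combine, the helper below builds a loader that marks every tensor as an output except a chosen few. The helper name `mark_all_except` and its parameters are hypothetical. The imports are deferred into the function body so that defining the helper does not require Polygraphy to be installed.

```python
def mark_all_except(model_path, excluded):
    """Return a lazy loader that marks all tensors as outputs except `excluded`.

    `model_path` and `excluded` are illustrative parameters, not part of the API.
    """
    from polygraphy import constants
    from polygraphy.backend.onnx import ModifyOutputs, OnnxFromPath

    load_model = OnnxFromPath(model_path)  # Lazy: nothing is read from disk yet.
    return ModifyOutputs(
        load_model,
        outputs=constants.MARK_ALL,      # Mark every tensor in the network...
        exclude_outputs=list(excluded),  # ...then omit the ones named here.
    )
```

Calling the returned loader, for example `mark_all_except("model.onnx", ["debug_tensor"])()`, would load the file and return the modified `onnx.ModelProto`; both names in that call are placeholders.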

class ConvertToFp16(model, copy=None)

Bases: polygraphy.backend.onnx.loader.BaseLoadOnnxCopy

Functor that converts all floating point tensors in the model to 16-bit precision. This is not needed in order to use TensorRT’s fp16 precision, but may be useful for other backends.

Converts all floating point tensors in the model to 16-bit precision.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

call_impl()
Returns

The modified ONNX model.

Return type

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

class FoldConstants(model, num_passes=None, do_shape_inference=None, partitioning=None, fold_shapes=None, copy=None, error_ok=None)

Bases: polygraphy.backend.onnx.loader.BaseLoadOnnxCopy

Functor that folds constants in an ONNX model.

Folds constants in an ONNX model.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • num_passes (int) – The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. By default, Polygraphy will automatically determine the number of passes required.

  • do_shape_inference (bool) – Whether to run shape inference in the model between passes. This enables the loader to fold Shape nodes. Only effective if fold_shapes is True. Defaults to True.

  • partitioning (Union[str, None]) –

    Whether and how to partition the graph so that errors in folding one part of a model do not affect other parts. Available modes are:

    • None: Do not partition the graph. If inference fails, no constants are folded.

    • 'basic': Partition the graph. If inference fails in one partition, other partitions will remain unaffected.

    • 'recursive': Partition the graph recursively. If inference fails in a partition, the partition will be further partitioned.

    Defaults to None.

  • fold_shapes (bool) – Whether to fold Shape nodes in the graph. This requires shapes to be inferred in the graph, and can only fold static shapes. Defaults to True.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

  • error_ok (bool) – Whether to suppress errors during constant folding. If this is set to False, errors will be re-raised. Defaults to True.

call_impl()
Returns

The new ONNX model with constants folded.

Return type

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.
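
The difference between the three `partitioning` modes can be illustrated with a toy model in plain Python. This is only a sketch of the documented behavior: the function `fold_partitioned`, the `try_fold` callback, and the choice to split a list of node names into halves are all assumptions made for illustration, not Polygraphy's actual partitioning scheme.

```python
def fold_partitioned(nodes, try_fold, partitioning=None):
    """Return the nodes that were folded, given a `try_fold(part) -> bool` callback.

    Toy model only: "partitions" here are halves of a list, which is an
    assumption of this sketch rather than Polygraphy's partitioning scheme.
    """
    nodes = list(nodes)
    if not nodes:
        return []

    if partitioning is None:
        # One attempt over the whole graph: all or nothing.
        return nodes if try_fold(nodes) else []

    mid = len(nodes) // 2
    halves = [part for part in (nodes[:mid], nodes[mid:]) if part]

    if partitioning == "basic":
        # Each partition is tried once; a failure does not affect the others.
        folded = []
        for part in halves:
            if try_fold(part):
                folded.extend(part)
        return folded

    if partitioning == "recursive":
        if try_fold(nodes):
            return nodes
        if len(nodes) == 1:
            return []  # A single failing node cannot be split further.
        folded = []
        for part in halves:
            folded.extend(fold_partitioned(part, try_fold, "recursive"))
        return folded

    raise ValueError(f"Unknown partitioning mode: {partitioning!r}")


def try_fold(part):
    # Pretend that any partition containing "bad" fails inference.
    return "bad" not in part


graph = ["a", "b", "bad", "c"]
unpartitioned = fold_partitioned(graph, try_fold, None)
basic = fold_partitioned(graph, try_fold, "basic")
recursive = fold_partitioned(graph, try_fold, "recursive")
```

In this toy graph, the unpartitioned run folds nothing, the basic run folds only the half that does not contain the failing node, and the recursive run isolates the failing node and folds everything else.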

class InferShapes(model, error_ok=None, external_data_dir=None, save_to_disk_threshold_bytes=None)

Bases: polygraphy.backend.base.loader.BaseLoader

Functor that runs shape inference on an ONNX model.

Runs shape inference on an ONNX model.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model, a callable that returns one, or a path to a model. Supports models larger than the 2 GiB protobuf limit.

  • error_ok (bool) – Whether errors during shape inference should be suppressed. Defaults to True.

  • external_data_dir (str) – The directory where external data for the model is stored. Only used if the model is provided via a path rather than a loader.

  • save_to_disk_threshold_bytes (int) – The size in bytes above which a ModelProto will be serialized to the disk before running shape inference. This can be used to work around the 2 GiB protobuf limitation. Defaults to ~2 GiB.

call_impl()
Returns

The new ONNX model with shapes inferred.

Return type

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

class ExtractSubgraph(model, input_metadata=None, output_metadata=None, check_meta=None)

Bases: polygraphy.backend.base.loader.BaseLoader

Functor that extracts a subgraph from an ONNX model.

Extracts a subgraph from an ONNX model.

Parameters
  • model (Union[Union[onnx.ModelProto, onnx_graphsurgeon.Graph], Callable() -> Union[onnx.ModelProto, onnx_graphsurgeon.Graph]]) – An ONNX model or ONNX-GraphSurgeon Graph or a callable that returns one.

  • input_metadata (TensorMetadata) – Metadata for the inputs of the subgraph. Name, shape, and data type are required. If not provided, the graph inputs are not modified.

  • output_metadata (TensorMetadata) – Metadata for the outputs of the subgraph. Name and data type are required. If not provided, the graph outputs are not modified.

  • check_meta (bool) – Whether to check that the provided input and output metadata include all the expected fields. Defaults to True.

call_impl()
Returns

The new ONNX model or ONNX-GraphSurgeon Graph.

Return type

Union[onnx.ModelProto, onnx_graphsurgeon.Graph]

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.
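
A minimal sketch of building the metadata arguments follows. It assumes that `TensorMetadata` can be imported from `polygraphy.common` and that its `add` method takes a name, a data type, and a shape and returns the metadata object; the helper name `extract_between`, its parameters, and the `float32` data type are hypothetical choices for illustration. Imports are deferred so that defining the helper does not require the packages to be installed.

```python
def extract_between(model_path, input_name, input_shape, output_name):
    """Return a lazy loader that extracts the subgraph between two tensors.

    All parameter names are illustrative; float32 is assumed for both tensors.
    """
    import numpy as np
    from polygraphy.backend.onnx import ExtractSubgraph, OnnxFromPath
    from polygraphy.common import TensorMetadata  # Assumed import location.

    # Inputs require name, shape, and data type.
    input_meta = TensorMetadata().add(input_name, dtype=np.float32, shape=input_shape)
    # Outputs require only name and data type, so the shape is left as None.
    output_meta = TensorMetadata().add(output_name, dtype=np.float32, shape=None)

    return ExtractSubgraph(
        OnnxFromPath(model_path),
        input_metadata=input_meta,
        output_metadata=output_meta,
    )
```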

class SaveOnnx(model, path, external_data_path=None, size_threshold=None, all_tensors_to_one_file=None)

Bases: polygraphy.backend.base.loader.BaseLoader

Functor that saves an ONNX model to the specified path.

Saves an ONNX model to the specified path.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • path (str) – Path at which to write the ONNX model.

  • external_data_path (str) – Path to save external data. This is always a relative path; external data is always written to the same directory as the model. Set to an empty string to use the default path. Set to None to disable. Defaults to None.

  • size_threshold (int) – Tensor size threshold, in bytes, above which tensor data will be stored in the external file. Tensors smaller than this threshold will remain in the ONNX file. Has no effect if external_data_path is not set. Defaults to 1024.

  • all_tensors_to_one_file (bool) – Whether to write all tensors to one file when saving external data. Has no effect if external_data_path is not set. Defaults to True.

call_impl()
Returns

The model, after saving it.

Return type

onnx.ModelProto

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.
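
The interaction between `external_data_path` and `size_threshold` can be sketched as follows. The helper name `save_with_external_data` and its parameters are hypothetical; the keyword arguments passed to `SaveOnnx` are the ones documented above. Imports are deferred so that defining the helper does not require Polygraphy to be installed.

```python
def save_with_external_data(model_path, out_path, threshold_bytes=1024):
    """Return a lazy loader that re-saves a model with external tensor data.

    `model_path`, `out_path`, and `threshold_bytes` are illustrative parameters.
    """
    from polygraphy.backend.onnx import OnnxFromPath, SaveOnnx

    return SaveOnnx(
        OnnxFromPath(model_path),
        path=out_path,
        external_data_path="",           # Empty string: use the default path.
        size_threshold=threshold_bytes,  # Smaller tensors stay in the ONNX file.
        all_tensors_to_one_file=True,
    )
```

Passing `external_data_path=None` instead would disable external data, in which case the threshold and the single-file option have no effect.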

class BytesFromOnnx(model)

Bases: polygraphy.backend.base.loader.BaseLoader

Functor that serializes an ONNX model.

Serializes an ONNX model.

Parameters

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

call_impl()
Returns

The serialized model.

Return type

bytes

__call__(*args, **kwargs)

Invokes the loader by forwarding arguments to call_impl.

Note: call_impl should not be called directly - use this function instead.

gs_from_onnx(model)

Immediately evaluated functional variant of GsFromOnnx.

Creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.

Parameters

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

Returns

The ONNX-GraphSurgeon representation of the ONNX model

Return type

onnx_graphsurgeon.Graph
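
To show what "immediately evaluated" means in practice, the sketch below obtains a graph in two ways: by constructing the loader and calling it later, and by calling the functional variant directly. The helper name `graph_two_ways` is hypothetical, and `model` is assumed to be an `onnx.ModelProto` or a callable that returns one. Imports are deferred so that defining the helper does not require Polygraphy to be installed.

```python
def graph_two_ways(model):
    """Build an ONNX-GraphSurgeon graph via the loader and via the function."""
    from polygraphy.backend.onnx import GsFromOnnx, gs_from_onnx

    deferred = GsFromOnnx(model)   # Constructs the loader; no conversion yet.
    graph_from_loader = deferred() # The conversion happens when it is called.

    graph_from_function = gs_from_onnx(model)  # Converts right away.
    return graph_from_loader, graph_from_function
```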

onnx_from_path(path, external_data_dir=None)

Immediately evaluated functional variant of OnnxFromPath.

Loads an ONNX model from a file.

Parameters
  • path (str) – The path from which to load the model.

  • external_data_dir (str) – The directory where external data for the model is stored.

Returns

The ONNX model

Return type

onnx.ModelProto

onnx_from_tf_graph(graph, opset=None, optimize=None, fold_constant=None)

Immediately evaluated functional variant of OnnxFromTfGraph.

Converts a TensorFlow model into ONNX.

Parameters
  • graph (Union[Tuple[tf.Graph, Sequence[str]], Callable() -> Tuple[tf.Graph, Sequence[str]]]) – A tuple containing a TensorFlow graph and output names or a callable that returns one.

  • opset (int) – The ONNX opset to use during conversion.

  • optimize (bool) – Whether to use tf2onnx’s graph optimization pass.

  • fold_constant (bool) – Whether to fold constants in the TensorFlow Graph. Requires that optimize is also enabled. Defaults to True.

Returns

The ONNX model.

Return type

onnx.ModelProto

modify_outputs(model, outputs=None, exclude_outputs=None, copy=None)

Immediately evaluated functional variant of ModifyOutputs.

Modifies outputs of an ONNX model.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the existing model outputs. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked.

  • exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

Returns

The ONNX model with modified outputs.

Return type

onnx.ModelProto

convert_to_fp16(model, copy=None)

Immediately evaluated functional variant of ConvertToFp16.

Converts all floating point tensors in the model to 16-bit precision.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

Returns

The modified ONNX model.

Return type

onnx.ModelProto

fold_constants(model, num_passes=None, do_shape_inference=None, partitioning=None, fold_shapes=None, copy=None, error_ok=None)

Immediately evaluated functional variant of FoldConstants.

Folds constants in an ONNX model.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • num_passes (int) – The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. By default, Polygraphy will automatically determine the number of passes required.

  • do_shape_inference (bool) – Whether to run shape inference in the model between passes. This enables the loader to fold Shape nodes. Only effective if fold_shapes is True. Defaults to True.

  • partitioning (Union[str, None]) –

    Whether and how to partition the graph so that errors in folding one part of a model do not affect other parts. Available modes are:

    • None: Do not partition the graph. If inference fails, no constants are folded.

    • 'basic': Partition the graph. If inference fails in one partition, other partitions will remain unaffected.

    • 'recursive': Partition the graph recursively. If inference fails in a partition, the partition will be further partitioned.

    Defaults to None.

  • fold_shapes (bool) – Whether to fold Shape nodes in the graph. This requires shapes to be inferred in the graph, and can only fold static shapes. Defaults to True.

  • copy (bool) – Whether to create a copy of the model first. Defaults to False.

  • error_ok (bool) – Whether to suppress errors during constant folding. If this is set to False, errors will be re-raised. Defaults to True.

Returns

The new ONNX model with constants folded.

Return type

onnx.ModelProto

infer_shapes(model, error_ok=None, external_data_dir=None, save_to_disk_threshold_bytes=None)

Immediately evaluated functional variant of InferShapes.

Runs shape inference on an ONNX model.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model, a callable that returns one, or a path to a model. Supports models larger than the 2 GiB protobuf limit.

  • error_ok (bool) – Whether errors during shape inference should be suppressed. Defaults to True.

  • external_data_dir (str) – The directory where external data for the model is stored. Only used if the model is provided via a path rather than a loader.

  • save_to_disk_threshold_bytes (int) – The size in bytes above which a ModelProto will be serialized to the disk before running shape inference. This can be used to work around the 2 GiB protobuf limitation. Defaults to ~2 GiB.

Returns

The new ONNX model with shapes inferred.

Return type

onnx.ModelProto

extract_subgraph(model, input_metadata=None, output_metadata=None, check_meta=None)

Immediately evaluated functional variant of ExtractSubgraph.

Extracts a subgraph from an ONNX model.

Parameters
  • model (Union[Union[onnx.ModelProto, onnx_graphsurgeon.Graph], Callable() -> Union[onnx.ModelProto, onnx_graphsurgeon.Graph]]) – An ONNX model or ONNX-GraphSurgeon Graph or a callable that returns one.

  • input_metadata (TensorMetadata) – Metadata for the inputs of the subgraph. Name, shape, and data type are required. If not provided, the graph inputs are not modified.

  • output_metadata (TensorMetadata) – Metadata for the outputs of the subgraph. Name and data type are required. If not provided, the graph outputs are not modified.

  • check_meta (bool) – Whether to check that the provided input and output metadata include all the expected fields. Defaults to True.

Returns

The new ONNX model or ONNX-GraphSurgeon Graph.

Return type

Union[onnx.ModelProto, onnx_graphsurgeon.Graph]

save_onnx(model, path, external_data_path=None, size_threshold=None, all_tensors_to_one_file=None)

Immediately evaluated functional variant of SaveOnnx.

Saves an ONNX model to the specified path.

Parameters
  • model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

  • path (str) – Path at which to write the ONNX model.

  • external_data_path (str) – Path to save external data. This is always a relative path; external data is always written to the same directory as the model. Set to an empty string to use the default path. Set to None to disable. Defaults to None.

  • size_threshold (int) – Tensor size threshold, in bytes, above which tensor data will be stored in the external file. Tensors smaller than this threshold will remain in the ONNX file. Has no effect if external_data_path is not set. Defaults to 1024.

  • all_tensors_to_one_file (bool) – Whether to write all tensors to one file when saving external data. Has no effect if external_data_path is not set. Defaults to True.

Returns

The model, after saving it.

Return type

onnx.ModelProto
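
Because the functional variants evaluate immediately, they can be composed like ordinary function calls. The following is a minimal sketch of a load, fold, and save sequence; the helper name `fold_and_save` and its parameters are hypothetical, and `error_ok=False` is chosen here only so that folding errors would surface rather than be suppressed. Imports are deferred so that defining the helper does not require Polygraphy to be installed.

```python
def fold_and_save(model_path, out_path):
    """Load a model, fold its constants, and write the result to `out_path`.

    `model_path` and `out_path` are illustrative parameters.
    """
    from polygraphy.backend.onnx import fold_constants, onnx_from_path, save_onnx

    model = onnx_from_path(model_path)              # Reads the file right away.
    folded = fold_constants(model, error_ok=False)  # Re-raise folding errors.
    return save_onnx(folded, path=out_path)         # Returns the saved model.
```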

bytes_from_onnx(model)

Immediately evaluated functional variant of BytesFromOnnx.

Serializes an ONNX model.

Parameters

model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one.

Returns

The serialized model.

Return type

bytes
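
As a hedged illustration of what the serialized bytes can be used for, the sketch below serializes a model and parses the bytes back with the `onnx` package. The helper name `round_trip` is hypothetical, and `model` is assumed to be an `onnx.ModelProto` or a callable that returns one. Imports are deferred so that defining the helper does not require the packages to be installed.

```python
def round_trip(model):
    """Serialize a model to bytes and parse those bytes back into a ModelProto."""
    import onnx
    from polygraphy.backend.onnx import bytes_from_onnx

    data = bytes_from_onnx(model)             # Serialized model as bytes.
    return onnx.load_model_from_string(data)  # Parse the bytes back with onnx.
```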