Loaders
Module: polygraphy.backend.onnx
- class GsFromOnnx(model)
- Bases: BaseLoader
- Functor that creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
 - call_impl()
- Returns:
- The ONNX-GraphSurgeon representation of the ONNX model 
- Return type:
- onnx_graphsurgeon.Graph 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- gs_from_onnx(model)
- Immediately evaluated functional variant of GsFromOnnx. Creates an ONNX-GraphSurgeon graph from an ONNX ModelProto.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- Returns:
- The ONNX-GraphSurgeon representation of the ONNX model 
- Return type:
- onnx_graphsurgeon.Graph 
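The relationship between a loader class and its lowercase functional variant can be illustrated with a small, self-contained sketch. This is not Polygraphy's implementation: `SketchLoader`, `UppercaseLoader`, `resolve`, and `uppercase` are hypothetical names chosen for illustration, and only the `call_impl`/`__call__` split and the "value or callable" argument convention mirror the descriptions above.

```python
class SketchLoader:
    """Hypothetical stand-in for a loader base class (not Polygraphy's BaseLoader)."""

    def call_impl(self, *args, **kwargs):
        raise NotImplementedError

    def __call__(self, *args, **kwargs):
        # Users invoke the loader object itself; the call is forwarded to call_impl.
        return self.call_impl(*args, **kwargs)


def resolve(obj):
    """Accept either a value or a zero-argument callable that returns one."""
    return obj() if callable(obj) else obj


class UppercaseLoader(SketchLoader):
    """Hypothetical loader: construction only records arguments."""

    def __init__(self, text):
        self._text = text

    def call_impl(self):
        return resolve(self._text).upper()


def uppercase(text):
    """Immediately evaluated variant: build the loader and call it at once."""
    return UppercaseLoader(text)()


lazy = UppercaseLoader(lambda: "onnx")  # nothing is evaluated yet
print(lazy())             # the work happens when the loader is called
print(uppercase("onnx"))  # evaluated immediately
```

Under this reading, the class form is useful when a pipeline should be described first and run later, while the function form suits one-off calls.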
 
- class OnnxFromPath(path, external_data_dir=None, ignore_external_data=None)
- Bases: BaseLoader
- Functor that loads an ONNX model from a file.
- Parameters:
- path (str) – The path from which to load the model. 
- external_data_dir (str) – The directory where external data for the model is stored. 
- ignore_external_data (bool) – Whether to ignore any external data and just load the model structure without any weights. The model will be usable only for purposes that don’t require weights, such as extracting subgraphs or inspecting model structure. This can be useful in cases where external data is not available. Defaults to False. 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- onnx_from_path(path, external_data_dir=None, ignore_external_data=None)
- Immediately evaluated functional variant of OnnxFromPath. Loads an ONNX model from a file.
- Parameters:
- path (str) – The path from which to load the model. 
- external_data_dir (str) – The directory where external data for the model is stored. 
- ignore_external_data (bool) – Whether to ignore any external data and just load the model structure without any weights. The model will be usable only for purposes that don’t require weights, such as extracting subgraphs or inspecting model structure. This can be useful in cases where external data is not available. Defaults to False. 
 
- Returns:
- The ONNX model 
- Return type:
- onnx.ModelProto 
 
- class OnnxFromTfGraph(graph, opset=None, optimize=None)
- Bases: BaseLoader
- Functor that loads a TensorFlow graph and converts it to ONNX using the tf2onnx converter.
- Parameters:
- graph (Union[Tuple[tf.Graph, Sequence[str]], Callable() -> Tuple[tf.Graph, Sequence[str]]]) – A tuple containing a TensorFlow graph and output names or a callable that returns one. 
- opset (int) – The ONNX opset to use during conversion. 
- optimize (bool) – Whether to use tf2onnx’s graph optimization pass. 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- onnx_from_tf_graph(graph, opset=None, optimize=None)
- Immediately evaluated functional variant of OnnxFromTfGraph. Converts a TensorFlow model into ONNX.
- Parameters:
- graph (Union[Tuple[tf.Graph, Sequence[str]], Callable() -> Tuple[tf.Graph, Sequence[str]]]) – A tuple containing a TensorFlow graph and output names or a callable that returns one. 
- opset (int) – The ONNX opset to use during conversion. 
- optimize (bool) – Whether to use tf2onnx’s graph optimization pass. 
 
- Returns:
- The ONNX model. 
- Return type:
- onnx.ModelProto 
 
- class ModifyOutputs(model, outputs=None, exclude_outputs=None, copy=None)
- Bases: BaseLoadOnnxCopy
- Functor that modifies the outputs of an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the existing model outputs. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked. 
- exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- modify_outputs(model, outputs=None, exclude_outputs=None, copy=None)
- Immediately evaluated functional variant of ModifyOutputs. Modifies the outputs of an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the existing model outputs. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked. 
- exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
 
- Returns:
- The ONNX model with modified outputs. 
- Return type:
- onnx.ModelProto 
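The interaction between outputs and exclude_outputs described above can be sketched in plain Python. This is only an illustration of the documented behavior on lists of tensor names, not the library's code: `MARK_ALL` here is a local stand-in sentinel rather than `constants.MARK_ALL`, and `select_outputs` is a hypothetical helper.

```python
MARK_ALL = object()  # local stand-in for constants.MARK_ALL (assumption)


def select_outputs(all_tensors, current_outputs, outputs=None, exclude_outputs=None):
    """Return the tensor names that would end up marked as outputs."""
    if outputs is MARK_ALL:
        selected = list(all_tensors)       # mark every tensor in the network
    elif outputs is not None:
        selected = list(outputs)           # override the existing outputs
    else:
        selected = list(current_outputs)   # keep the existing outputs
    excluded = set(exclude_outputs or [])
    return [name for name in selected if name not in excluded]


tensors = ["conv_out", "relu_out", "logits"]
print(select_outputs(tensors, ["logits"], outputs=MARK_ALL, exclude_outputs=["relu_out"]))
```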
 
- class ConvertToFp16(model, copy=None)
- Bases: BaseLoadOnnxCopy
- Functor that converts all floating-point tensors in the model to 16-bit precision. This is not needed in order to use TensorRT’s fp16 precision, but may be useful for other backends.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- convert_to_fp16(model, copy=None)
- Immediately evaluated functional variant of ConvertToFp16. Converts all floating-point tensors in the model to 16-bit precision.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
 
- Returns:
- The modified ONNX model. 
- Return type:
- onnx.ModelProto 
 
- class FoldConstants(model, num_passes=None, do_shape_inference=None, partitioning=None, fold_shapes=None, copy=None, error_ok=None, size_threshold=None, allow_onnxruntime_shape_inference=None)
- Bases: BaseLoadOnnxCopy
- Functor that folds constants in an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- num_passes (int) – The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. By default, Polygraphy will automatically determine the number of passes required. 
- do_shape_inference (bool) – Whether to run shape inference in the model between passes. This enables the loader to fold Shape nodes. Only effective if fold_shapes is True. Defaults to True. 
- partitioning (Union[str, None]) – Whether and how to partition the graph so that errors in folding one part of a model do not affect other parts. Available modes are:
 - None: Do not partition the graph. If inference fails, no constants are folded.
 - 'basic': Partition the graph. If inference fails in one partition, other partitions remain unaffected.
 - 'recursive': Partition the graph recursively. If inference fails in a partition, that partition is partitioned further.
 Defaults to None.
- fold_shapes (bool) – Whether to fold Shape nodes in the graph. This requires shapes to be inferred in the graph, and can only fold static shapes. Defaults to True. 
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
- error_ok (bool) – Whether to suppress errors during constant folding. If this is set to False, errors will be re-raised. Defaults to True.
- size_threshold (int) – The maximum size threshold, in bytes, for which to fold constants. Any tensors larger than this value will not be folded. Set to None to disable the size threshold and always fold constants. For example, some models may apply ops like Tile or Expand to constants, which can result in very large tensors. Rather than pre-computing those constants and bloating the model size, it may be desirable to skip folding them and allow them to be computed at runtime. Defaults to None.
- allow_onnxruntime_shape_inference (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Has no effect if do_shape_inference is False. Defaults to True.
 
 - call_impl()
- Returns:
- The new ONNX model with constants folded. 
- Return type:
- onnx.ModelProto 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- fold_constants(model, num_passes=None, do_shape_inference=None, partitioning=None, fold_shapes=None, copy=None, error_ok=None, size_threshold=None, allow_onnxruntime_shape_inference=None)
- Immediately evaluated functional variant of FoldConstants. Folds constants in an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- num_passes (int) – The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. By default, Polygraphy will automatically determine the number of passes required. 
- do_shape_inference (bool) – Whether to run shape inference in the model between passes. This enables the loader to fold Shape nodes. Only effective if fold_shapes is True. Defaults to True. 
- partitioning (Union[str, None]) – Whether and how to partition the graph so that errors in folding one part of a model do not affect other parts. Available modes are:
 - None: Do not partition the graph. If inference fails, no constants are folded.
 - 'basic': Partition the graph. If inference fails in one partition, other partitions remain unaffected.
 - 'recursive': Partition the graph recursively. If inference fails in a partition, that partition is partitioned further.
 Defaults to None.
- fold_shapes (bool) – Whether to fold Shape nodes in the graph. This requires shapes to be inferred in the graph, and can only fold static shapes. Defaults to True. 
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
- error_ok (bool) – Whether to suppress errors during constant folding. If this is set to False, errors will be re-raised. Defaults to True.
- size_threshold (int) – The maximum size threshold, in bytes, for which to fold constants. Any tensors larger than this value will not be folded. Set to None to disable the size threshold and always fold constants. For example, some models may apply ops like Tile or Expand to constants, which can result in very large tensors. Rather than pre-computing those constants and bloating the model size, it may be desirable to skip folding them and allow them to be computed at runtime. Defaults to None.
- allow_onnxruntime_shape_inference (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Has no effect if do_shape_inference is False. Defaults to True.
 
- Returns:
- The new ONNX model with constants folded. 
- Return type:
- onnx.ModelProto 
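The three partitioning modes can be contrasted with a toy model. The sketch below is an assumption-heavy illustration, not Polygraphy's algorithm: it treats the graph as a flat list of node names, splits it in half rather than along real graph structure, and uses a hypothetical `try_fold` callback that raises when folding a partition fails.

```python
def fold_with_partitioning(nodes, try_fold, partitioning=None):
    """Return the nodes that were folded under the given (toy) partitioning mode."""

    def attempt(part):
        # Returns the folded nodes, or None if folding this partition failed.
        try:
            try_fold(part)
            return list(part)
        except Exception:
            return None

    def split(part):
        mid = len(part) // 2
        return part[:mid], part[mid:]

    if partitioning is None:
        # One failure anywhere means nothing is folded.
        return attempt(nodes) or []

    if partitioning == "basic":
        # A failure in one partition leaves the other unaffected.
        folded = []
        for part in split(nodes):
            folded.extend(attempt(part) or [])
        return folded

    if partitioning == "recursive":
        # A failing partition is split again until it succeeds or is a single node.
        def recurse(part):
            if not part:
                return []
            result = attempt(part)
            if result is not None:
                return result
            if len(part) == 1:
                return []
            left, right = split(part)
            return recurse(left) + recurse(right)

        return recurse(nodes)

    raise ValueError(f"Unknown partitioning mode: {partitioning!r}")


def demo_try_fold(part):
    # Hypothetical failure condition: any partition containing "bad" cannot be folded.
    if "bad" in part:
        raise RuntimeError("inference failed")


demo_nodes = ["a", "b", "bad", "d"]
for mode in (None, "basic", "recursive"):
    print(mode, fold_with_partitioning(demo_nodes, demo_try_fold, mode))
```

In this toy setting, finer partitioning salvages more of the graph at the cost of more folding attempts.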
 
- class SetUpperBound(model, upper_bounds, copy=None)
- Bases: BaseLoadOnnxCopy
- Functor that sets upper bounds for tensors with unbounded data-dependent shapes (DDS) in an ONNX model. Requires that the model has been constant folded and has shapes inferred.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- upper_bounds (Union[int, Dict[str, int]]) – The upper bounds for tensors with unbounded DDS. If a single integer is provided, it will be used as the default upper bound for all tensors with unbounded DDS. This can also be provided on a per-tensor basis using a dictionary. In that case, use an empty string ("") as the key to specify the default upper bound for tensors not explicitly listed.
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- set_upper_bound(model, upper_bounds, copy=None)
- Immediately evaluated functional variant of SetUpperBound. Sets upper bounds for tensors with unbounded DDS in an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- upper_bounds (Union[int, Dict[str, int]]) – The upper bounds for tensors with unbounded DDS. If a single integer is provided, it will be used as the default upper bound for all tensors with unbounded DDS. This can also be provided on a per-tensor basis using a dictionary. In that case, use an empty string ("") as the key to specify the default upper bound for tensors not explicitly listed.
- copy (bool) – Whether to create a copy of the model first. Defaults to False. 
 
- Returns:
- The new ONNX model. 
- Return type:
- onnx.ModelProto 
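The two accepted forms of upper_bounds can be read as a simple lookup rule. The helper below is a hypothetical sketch of that rule only: `resolve_upper_bound` is not part of the library, the tensor names are made up, and returning None when a dictionary has no "" key is an assumption made here, since the text above does not say what happens in that case.

```python
def resolve_upper_bound(upper_bounds, tensor_name):
    """Look up the bound for one tensor from an int or a per-tensor dict."""
    if isinstance(upper_bounds, int):
        # A single integer applies to every tensor with unbounded DDS.
        return upper_bounds
    if tensor_name in upper_bounds:
        return upper_bounds[tensor_name]
    # The empty-string key holds the default for tensors not listed explicitly.
    return upper_bounds.get("")


bounds = {"nms_boxes": 200, "": 1000}  # hypothetical tensor name and values
print(resolve_upper_bound(bounds, "nms_boxes"))
print(resolve_upper_bound(bounds, "other_tensor"))
print(resolve_upper_bound(500, "other_tensor"))
```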
 
- class InferShapes(model, error_ok=None, external_data_dir=None, save_to_disk_threshold_bytes=None, allow_onnxruntime=None)
- Bases: BaseLoader
- Functor that runs shape inference on an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto, str, Callable() -> str]) – An ONNX model or a callable that returns one, or a path to a model. Supports models larger than the 2 GB protobuf limit. 
- error_ok (bool) – Whether errors during shape inference should be suppressed. Defaults to True. 
- external_data_dir (str) – The directory where external data for the model is stored. Only used if the model is provided via a path rather than a loader. 
- save_to_disk_threshold_bytes (int) – The size in bytes above which a ModelProto will be serialized to the disk before running shape inference. This can be used to work around the 2 GB protobuf limitation. Defaults to 2 GB. 
- allow_onnxruntime (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Defaults to True. 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- infer_shapes(model, error_ok=None, external_data_dir=None, save_to_disk_threshold_bytes=None, allow_onnxruntime=None)
- Immediately evaluated functional variant of InferShapes. Runs shape inference on an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto, str, Callable() -> str]) – An ONNX model or a callable that returns one, or a path to a model. Supports models larger than the 2 GB protobuf limit. 
- error_ok (bool) – Whether errors during shape inference should be suppressed. Defaults to True. 
- external_data_dir (str) – The directory where external data for the model is stored. Only used if the model is provided via a path rather than a loader. 
- save_to_disk_threshold_bytes (int) – The size in bytes above which a ModelProto will be serialized to the disk before running shape inference. This can be used to work around the 2 GB protobuf limitation. Defaults to 2 GB. 
- allow_onnxruntime (bool) – Allow ONNX-Runtime’s shape inference to be used if available instead of ONNX’s shape inference utilities. The former may provide performance or memory usage benefits. Defaults to True. 
 
- Returns:
- The new ONNX model with shapes inferred. 
- Return type:
- onnx.ModelProto 
 
- class ExtractSubgraph(model, input_metadata=None, output_metadata=None, check_meta=None)
- Bases: BaseLoader
- Functor that extracts a subgraph from an ONNX model.
- Parameters:
- model (Union[Union[onnx.ModelProto, onnx_graphsurgeon.Graph], Callable() -> Union[onnx.ModelProto, onnx_graphsurgeon.Graph]]) – An ONNX model or ONNX-GraphSurgeon Graph or a callable that returns one. 
- input_metadata (TensorMetadata) – Metadata for the inputs of the subgraph. Name, shape, and data type are required. If not provided, the graph inputs are not modified.
- output_metadata (TensorMetadata) – Metadata for the outputs of the subgraph. Name and data type are required. If not provided, the graph outputs are not modified. 
- check_meta (bool) – Whether to check that the provided input and output metadata include all the expected fields. Defaults to True. 
 
 - call_impl()
- Returns:
- The new ONNX model or ONNX-GraphSurgeon Graph. 
- Return type:
- Union[onnx.ModelProto, onnx_graphsurgeon.Graph] 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- extract_subgraph(model, input_metadata=None, output_metadata=None, check_meta=None)
- Immediately evaluated functional variant of ExtractSubgraph. Extracts a subgraph from an ONNX model.
- Parameters:
- model (Union[Union[onnx.ModelProto, onnx_graphsurgeon.Graph], Callable() -> Union[onnx.ModelProto, onnx_graphsurgeon.Graph]]) – An ONNX model or ONNX-GraphSurgeon Graph or a callable that returns one. 
- input_metadata (TensorMetadata) – Metadata for the inputs of the subgraph. Name, shape, and data type are required. If not provided, the graph inputs are not modified.
- output_metadata (TensorMetadata) – Metadata for the outputs of the subgraph. Name and data type are required. If not provided, the graph outputs are not modified. 
- check_meta (bool) – Whether to check that the provided input and output metadata include all the expected fields. Defaults to True. 
 
- Returns:
- The new ONNX model or ONNX-GraphSurgeon Graph. 
- Return type:
- Union[onnx.ModelProto, onnx_graphsurgeon.Graph] 
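A possible way to call extract_subgraph is sketched below. It assumes that Polygraphy, ONNX, and NumPy are installed, that `TensorMetadata` is importable from `polygraphy.common` and exposes a chainable `add(name, dtype, shape)` method, and that float32 is the right data type for the chosen tensors; the function name `extract_between` and all of its arguments are hypothetical and must be replaced with values from a real model.

```python
def extract_between(model_path, input_name, input_shape, output_name):
    """Extract the subgraph between two named tensors (illustrative sketch)."""
    # Imports are kept local so that defining this function has no side effects.
    import numpy as np
    from polygraphy.backend.onnx import extract_subgraph, onnx_from_path
    from polygraphy.common import TensorMetadata

    # ignore_external_data=True loads only the structure, which the text above
    # describes as sufficient for extracting subgraphs.
    model = onnx_from_path(model_path, ignore_external_data=True)

    # Inputs need name, shape, and data type; outputs need name and data type.
    input_meta = TensorMetadata().add(input_name, dtype=np.float32, shape=input_shape)
    output_meta = TensorMetadata().add(output_name, dtype=np.float32, shape=None)

    return extract_subgraph(model, input_metadata=input_meta, output_metadata=output_meta)
```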
 
- class SaveOnnx(model, path, external_data_path=None, size_threshold=None, all_tensors_to_one_file=None)
- Bases: BaseLoader
- Functor that saves an ONNX model to the specified path.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- path (str) – Path at which to write the ONNX model. 
- external_data_path (str) – Path to save external data. This is always a relative path; external data is always written to the same directory as the model. Set to an empty string to use the default path. Set to None to disable. Defaults to None if the model is within the protobuf size threshold and an empty string otherwise. 
- size_threshold (int) – Tensor size threshold, in bytes, above which tensor data will be stored in the external file. Tensors smaller than this threshold will remain in the ONNX file. Has no effect if external_data_path is not set. Defaults to 1024.
- all_tensors_to_one_file (bool) – Whether to write all tensors to one file when saving external data. Has no effect if external_data_path is not set. Defaults to True. 
 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- save_onnx(model, path, external_data_path=None, size_threshold=None, all_tensors_to_one_file=None)
- Immediately evaluated functional variant of SaveOnnx. Saves an ONNX model to the specified path.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- path (str) – Path at which to write the ONNX model. 
- external_data_path (str) – Path to save external data. This is always a relative path; external data is always written to the same directory as the model. Set to an empty string to use the default path. Set to None to disable. Defaults to None if the model is within the protobuf size threshold and an empty string otherwise. 
- size_threshold (int) – Tensor size threshold, in bytes, above which tensor data will be stored in the external file. Tensors smaller than this threshold will remain in the ONNX file. Has no effect if external_data_path is not set. Defaults to 1024.
- all_tensors_to_one_file (bool) – Whether to write all tensors to one file when saving external data. Has no effect if external_data_path is not set. Defaults to True. 
 
- Returns:
- The model, after saving it. 
- Return type:
- onnx.ModelProto 
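Because each loader accepts either a model or a callable that returns one, loaders can be nested into a deferred pipeline. The sketch below assumes Polygraphy and ONNX are installed; `build_fold_and_save` and both path arguments are hypothetical. Nothing is read or written until the returned loader is called.

```python
def build_fold_and_save(model_path, output_path):
    """Compose load -> fold constants -> save as one lazily evaluated loader."""
    # Imports are kept local so that defining this function has no side effects.
    from polygraphy.backend.onnx import FoldConstants, OnnxFromPath, SaveOnnx

    load = OnnxFromPath(model_path)          # not loaded yet
    fold = FoldConstants(load)               # receives the loader, not a model
    save = SaveOnnx(fold, path=output_path)  # writes the file when invoked
    return save
```

Calling the result, for example `build_fold_and_save("model.onnx", "folded.onnx")()` with placeholder file names, would run the whole chain and return the saved model.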
 
- class BytesFromOnnx(model)
- Bases: BaseLoader
- Functor that serializes an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- bytes_from_onnx(model)
- Immediately evaluated functional variant of BytesFromOnnx. Serializes an ONNX model.
- Parameters:
- model (Union[onnx.ModelProto, Callable() -> onnx.ModelProto]) – An ONNX model or a callable that returns one. 
- Returns:
- The serialized model. 
- Return type:
- bytes 
 
- class OnnxFromBytes(serialized_onnx)
- Bases: BaseLoader
- Functor that deserializes an ONNX model.
- Parameters:
- serialized_onnx (Union[bytes, Callable() -> bytes]) – A serialized ONNX model or a callable that returns one. 
 - __call__(*args, **kwargs)
- Invokes the loader by forwarding arguments to call_impl. Note: call_impl should not be called directly; use this function instead.
 
- onnx_from_bytes(serialized_onnx)
- Immediately evaluated functional variant of OnnxFromBytes. Deserializes an ONNX model.
- Parameters:
- serialized_onnx (Union[bytes, Callable() -> bytes]) – A serialized ONNX model or a callable that returns one. 
- Returns:
- The ONNX model. 
- Return type:
- onnx.ModelProto
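Serialization and deserialization are inverse operations, so a round trip is a convenient sanity check. The sketch below assumes Polygraphy and ONNX are installed; `round_trip` and its `model_path` argument are hypothetical.

```python
def round_trip(model_path):
    """Load a model, serialize it to bytes, and deserialize it again."""
    # Imports are kept local so that defining this function has no side effects.
    from polygraphy.backend.onnx import bytes_from_onnx, onnx_from_bytes, onnx_from_path

    model = onnx_from_path(model_path)
    serialized = bytes_from_onnx(model)     # bytes
    restored = onnx_from_bytes(serialized)  # onnx.ModelProto
    return model, restored
```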