Runners
Module: polygraphy.backend.onnxrt
- class OnnxrtRunner(sess, name=None)
- Bases: BaseRunner
- Runs inference using an ONNX-Runtime inference session.
- Parameters:
- sess (Union[onnxruntime.InferenceSession, Callable() -> onnxruntime.InferenceSession]) – An ONNX-Runtime inference session or a callable that returns one. 
 - infer_impl(feed_dict)
- Implementation for running inference with ONNX-Runtime. Do not call this method directly; use infer() instead, which will forward unrecognized arguments to this method.
- Parameters:
- feed_dict (OrderedDict[str, Union[numpy.ndarray, torch.Tensor]]) – A mapping of input tensor names to corresponding input NumPy arrays or PyTorch tensors. If PyTorch tensors are provided in the feed_dict, the outputs are also returned as PyTorch tensors. 
- Returns:
- A mapping of output tensor names to corresponding output NumPy arrays or PyTorch tensors. 
- Return type:
- OrderedDict[str, Union[numpy.ndarray, torch.Tensor]] 
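A minimal usage sketch of the `sess` parameter in its callable form. It assumes Polygraphy's `SessionFromOnnx` loader from the same `polygraphy.backend.onnxrt` module (not described in this section) and a hypothetical helper name, `run_onnx_model`. The imports are placed inside the function so the snippet can be loaded even where Polygraphy is not installed.

```python
def run_onnx_model(model_path, feed_dict):
    """Run one inference with OnnxrtRunner and return a copy of the outputs.

    `run_onnx_model` is a hypothetical helper name used for illustration.
    """
    # Imported lazily so this module loads without Polygraphy installed.
    import copy

    from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx

    # Assumption: SessionFromOnnx is a callable loader, so it can be passed
    # as `sess` in place of an already-built onnxruntime.InferenceSession.
    build_session = SessionFromOnnx(model_path)

    with OnnxrtRunner(build_session) as runner:
        # Call infer(), not infer_impl(), as the documentation advises.
        outputs = runner.infer(feed_dict=feed_dict)
        # Copy before leaving the context, since runners may reuse buffers.
        return copy.deepcopy(outputs)
```

The helper would be called with a path to an ONNX model and a mapping of input names to arrays, for example `run_onnx_model("model.onnx", {"x": x})`, where both the path and the input name are placeholders.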
 
 - __enter__()
- Activate the runner for inference. For example, this may involve allocating CPU or GPU memory. 
 - __exit__(exc_type, exc_value, traceback)
- Deactivate the runner. For example, this may involve freeing CPU or GPU memory. 
 - activate()
- Activate the runner for inference. For example, this may involve allocating CPU or GPU memory.
- Generally, you should use a context manager instead of manually activating and deactivating. For example:
    with RunnerType(...) as runner:
        runner.infer(...)
 - deactivate()
- Deactivate the runner. For example, this may involve freeing CPU or GPU memory.
- Generally, you should use a context manager instead of manually activating and deactivating. For example:
    with RunnerType(...) as runner:
        runner.infer(...)
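The reason a context manager is preferred over manual activate()/deactivate() calls can be sketched with a stand-in class. `ToyRunner` below is hypothetical and only mimics the protocol described above; it is not Polygraphy's implementation.

```python
class ToyRunner:
    """Hypothetical stand-in that mimics the activate/deactivate protocol."""

    def __init__(self):
        self.is_active = False
        self.events = []

    def activate(self):
        self.events.append("activate")
        self.is_active = True

    def deactivate(self):
        self.events.append("deactivate")
        self.is_active = False

    def __enter__(self):
        self.activate()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.deactivate()
        # Implicitly returns None, so any exception keeps propagating.

    def infer(self, feed_dict):
        if not self.is_active:
            raise RuntimeError("infer() called on an inactive runner")
        return dict(feed_dict)


# Context-manager form: deactivate() runs even when the body raises.
runner = ToyRunner()
try:
    with runner:
        runner.infer({"x": 1})
        raise ValueError("simulated failure during inference")
except ValueError:
    pass

# Manual form: try/finally is needed to get the same guarantee.
manual = ToyRunner()
manual.activate()
try:
    manual_out = manual.infer({"x": 2})
finally:
    manual.deactivate()
```

In the manual form, forgetting the `try`/`finally` would leave the runner active after an error, which is the failure mode the context manager avoids.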
 - get_input_metadata(use_numpy_dtypes=None)
- Returns information about the inputs of the model. Shapes here may include dynamic dimensions, represented by None. Must be called only after activate() and before deactivate().
- Parameters:
- use_numpy_dtypes (bool) – [DEPRECATED] Whether to return NumPy data types instead of Polygraphy DataTypes. This is provided to retain backwards compatibility. In the future, this parameter will be removed and Polygraphy DataTypes will always be returned. These can be converted to NumPy data types by calling the numpy() method. Defaults to True.
- Returns:
- Input names, shapes, and data types. 
- Return type:
 
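Because dynamic dimensions are reported as None, a shape taken from the input metadata usually has to be made concrete before input data can be allocated. The sketch below uses a plain `OrderedDict` of `name -> (dtype, shape)` as a hypothetical stand-in for the returned metadata; the tensor names, dtype strings, and the `resolve_shape` helper are illustrative assumptions.

```python
from collections import OrderedDict

# Hypothetical metadata, laid out as name -> (dtype, shape).
# None marks a dynamic dimension, as described above.
input_metadata = OrderedDict(
    [
        ("image", ("float32", (None, 3, 224, 224))),
        ("mask", ("bool", (None, None))),
    ]
)


def resolve_shape(shape, default_dim=1):
    """Replace every dynamic (None) dimension with default_dim."""
    return tuple(default_dim if dim is None else dim for dim in shape)


# Pick a concrete size of 4 for every dynamic dimension.
concrete_shapes = OrderedDict(
    (name, resolve_shape(shape, default_dim=4))
    for name, (dtype, shape) in input_metadata.items()
)
```

With real metadata, the concrete shapes would then be used to build the arrays placed in `feed_dict`.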
 - infer(feed_dict, check_inputs=True, *args, **kwargs)
- Runs inference using the provided feed_dict.
- Must be called only after activate() and before deactivate().
- NOTE: Some runners may accept additional parameters in infer(). For details on these, see the documentation for their infer_impl() methods.
- Parameters:
- feed_dict (OrderedDict[str, numpy.ndarray]) – A mapping of input tensor names to corresponding input NumPy arrays. 
- check_inputs (bool) – Whether to check that the provided feed_dict includes the expected inputs with the expected data types and shapes. Disabling this may improve performance. Defaults to True.
 
 - inference_time
- The time required to run inference in seconds.
- Type:
- float 
 
 - Returns:
- A mapping of output tensor names to their corresponding NumPy arrays.
- IMPORTANT: Runners may reuse these output buffers. Thus, if you need to save outputs from multiple inferences, you should make a copy with copy.deepcopy(outputs).
- Return type:
- OrderedDict[str, numpy.ndarray] 
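The buffer-reuse warning above can be illustrated with a toy runner that writes every result into one shared buffer. `BufferReusingRunner` is hypothetical, and a Python list stands in for a preallocated NumPy array so that the sketch needs only the standard library.

```python
import copy
from collections import OrderedDict


class BufferReusingRunner:
    """Hypothetical runner that writes every result into one shared buffer."""

    def __init__(self):
        # Stands in for a preallocated NumPy output array.
        self._out = [0.0, 0.0]

    def infer(self, feed_dict):
        x = feed_dict["x"]
        # Overwrite the buffer in place instead of allocating a new output.
        self._out[0] = x * 2.0
        self._out[1] = x * 3.0
        return OrderedDict([("y", self._out)])


runner = BufferReusingRunner()

# Without copying, every saved result aliases the same buffer,
# so earlier results are silently overwritten by later inferences.
aliased = [runner.infer({"x": value}) for value in (1.0, 2.0)]

# With copy.deepcopy, each saved result keeps its own data.
copied = [copy.deepcopy(runner.infer({"x": value})) for value in (1.0, 2.0)]
```

After this runs, both entries of `aliased` refer to the buffer contents from the last inference, while `copied` preserves the result of each call.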
 
 - last_inference_time()
- Returns the total inference time in seconds required during the last call to infer().
- Must be called only after activate() and before deactivate().
- Returns:
- The time in seconds, or None if runtime was not measured by the runner. 
- Return type:
- float 
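How infer(), infer_impl(), and last_inference_time() relate can be sketched as follows. `TimedRunner` is a hypothetical simplification: where the timing is actually recorded in Polygraphy is not stated in this section, so the placement inside `infer()` and the `scale` argument are assumptions made for illustration only.

```python
import time


class TimedRunner:
    """Hypothetical runner: infer() forwards extra arguments to infer_impl()."""

    def __init__(self):
        # None until an inference has been timed.
        self.inference_time = None

    def infer_impl(self, feed_dict, scale=1):
        # `scale` is an illustrative runner-specific extra parameter.
        return {name: value * scale for name, value in feed_dict.items()}

    def infer(self, feed_dict, *args, **kwargs):
        start = time.perf_counter()
        # Arguments that infer() does not recognize go to infer_impl().
        outputs = self.infer_impl(feed_dict, *args, **kwargs)
        self.inference_time = time.perf_counter() - start
        return outputs

    def last_inference_time(self):
        return self.inference_time


runner = TimedRunner()
before = runner.last_inference_time()
outputs = runner.infer({"x": 2}, scale=5)
elapsed = runner.last_inference_time()

# Handle the documented None case before formatting the value.
report = "not measured" if elapsed is None else f"{elapsed:.6f} s"
```

Checking for None before formatting matters because the documented return value is either a time in seconds or None when the runner did not measure runtime.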
 
 - is_active
- Whether this runner has been activated, either via a context manager or by calling activate().
- Type:
- bool