Loaders
Module: polygraphy.backend.trt
class LoadPlugins(plugins=None, obj=None)
Bases: polygraphy.backend.base.loader.BaseLoader
A passthrough loader that loads plugins from the specified paths. Passthrough here means that it can be used to wrap any other loader. The purpose of wrapping another loader is that you can control the order of execution when lazily evaluating.
For immediate evaluation, use load_plugins instead:
load_plugins(plugins=["/path/to/my/plugin.so", "/path/to/my/other_plugin.so"])
Loads plugins from the specified paths.
- Parameters
plugins (List[str]) – A list of paths to plugin libraries to load before inference.
obj (BaseLoader) – An object or callable to return or call respectively. If obj is callable, extra parameters will be forwarded to obj. If obj is not callable, it will be returned.
call_impl(*args, **kwargs)
- Returns
The provided obj argument, or its return value if it is callable. Returns None if obj was not set.
- Return type
object
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
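For example, LoadPlugins can wrap another lazily evaluated loader so that the plugin libraries are loaded before it runs. A minimal sketch; the plugin and model paths are placeholders:
from polygraphy.backend.trt import LoadPlugins, NetworkFromOnnxPath

# Plugins are loaded first; the wrapped loader runs when this loader is called.
load_network = LoadPlugins(
    plugins=["/path/to/my/plugin.so"],
    obj=NetworkFromOnnxPath("/path/to/model.onnx"),
)
builder, network, parser = load_network()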
class CreateNetwork(explicit_precision=None, explicit_batch=None)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that creates an empty TensorRT network.
Creates an empty TensorRT network.
- Parameters
explicit_precision (bool) – Whether to create the network with explicit precision enabled. Defaults to False.
explicit_batch (bool) – Whether to create the network with explicit batch mode. Defaults to True.
call_impl()
- Returns
The builder and empty network.
- Return type
(trt.Builder, trt.INetworkDefinition)
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
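For example, the empty network can be populated with the TensorRT network API and then built into an engine. A minimal sketch, assuming polygraphy.func.extend is available for extending loaders; the layer and shapes are illustrative only:
import tensorrt as trt
from polygraphy import func
from polygraphy.backend.trt import CreateConfig, CreateNetwork, EngineFromNetwork

@func.extend(CreateNetwork())
def create_network(builder, network):
    # Define the network manually: a single identity layer over a fixed-shape input.
    inp = network.add_input("input", dtype=trt.float32, shape=(1, 3, 224, 224))
    out = network.add_identity(inp).get_output(0)
    network.mark_output(out)

build_engine = EngineFromNetwork(create_network, config=CreateConfig())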
class NetworkFromOnnxBytes(model_bytes, explicit_precision=None)
Bases: polygraphy.backend.trt.loader.BaseNetworkFromOnnx
Functor that parses an ONNX model to create a trt.INetworkDefinition.
Parses an ONNX model.
- Parameters
model_bytes (Callable() -> bytes) – A loader that can supply a serialized ONNX model.
explicit_precision (bool) – Whether to construct the TensorRT network with explicit precision enabled.
call_impl()
- Returns
A TensorRT network, as well as the builder used to create it, and the parser used to populate it.
- Return type
(trt.IBuilder, trt.INetworkDefinition, trt.OnnxParser)
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
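A minimal usage sketch, assuming polygraphy.backend.common.BytesFromPath to read the model file lazily; the path is a placeholder:
from polygraphy.backend.common import BytesFromPath
from polygraphy.backend.trt import NetworkFromOnnxBytes

# The file is read and the model parsed only when the loader is called.
parse_network = NetworkFromOnnxBytes(BytesFromPath("/path/to/model.onnx"))
builder, network, parser = parse_network()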
class NetworkFromOnnxPath(path, explicit_precision=None)
Bases: polygraphy.backend.trt.loader.BaseNetworkFromOnnx
Functor that parses an ONNX model to create a trt.INetworkDefinition. This loader supports models with weights stored in an external location.
Parses an ONNX model from a file.
- Parameters
path (str) – The path from which to load the model.
call_impl()
- Returns
A TensorRT network, as well as the builder used to create it, and the parser used to populate it.
- Return type
(trt.IBuilder, trt.INetworkDefinition, trt.OnnxParser)
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
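A minimal usage sketch; the path is a placeholder:
from polygraphy.backend.trt import NetworkFromOnnxPath

parse_network = NetworkFromOnnxPath("/path/to/model.onnx")
builder, network, parser = parse_network()  # parsing happens here, not at construction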
class ModifyNetworkOutputs(network, outputs=None, exclude_outputs=None)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that modifies outputs in a TensorRT INetworkDefinition.
Modifies outputs in a TensorRT INetworkDefinition.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. The first and second return values must always be the builder and network respectively. ModifyNetworkOutputs will never take ownership of these.
outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the outputs already marked in the network. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked.
exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
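For example, a sketch that marks all tensors in a parsed network as outputs, excludes one hypothetical tensor name, and chains the result into an engine loader:
from polygraphy import constants
from polygraphy.backend.trt import EngineFromNetwork, ModifyNetworkOutputs, NetworkFromOnnxPath

parse_network = NetworkFromOnnxPath("/path/to/model.onnx")
# "intermediate_0" is a placeholder tensor name.
modify_network = ModifyNetworkOutputs(parse_network, outputs=constants.MARK_ALL, exclude_outputs=["intermediate_0"])
build_engine = EngineFromNetwork(modify_network)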
class CreateConfig(max_workspace_size=None, tf32=None, fp16=None, int8=None, profiles=None, calibrator=None, strict_types=None, load_timing_cache=None, algorithm_selector=None, sparse_weights=None, tactic_sources=None, restricted=None)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that creates a TensorRT IBuilderConfig.
Creates a TensorRT IBuilderConfig that can be used by EngineFromNetwork.
- Parameters
max_workspace_size (int) – The maximum workspace size, in bytes, when building the engine. Defaults to 16 MiB.
tf32 (bool) – Whether to build the engine with TF32 precision enabled. Defaults to False.
fp16 (bool) – Whether to build the engine with FP16 precision enabled. Defaults to False.
int8 (bool) – Whether to build the engine with INT8 precision enabled. Defaults to False.
profiles (List[Profile]) – A list of optimization profiles to add to the configuration. Only needed for networks with dynamic input shapes. If this is omitted for a network with dynamic shapes, a default profile is created, where dynamic dimensions are replaced with Polygraphy’s DEFAULT_SHAPE_VALUE (defined in constants.py). A partially populated profile will be automatically filled using values from Profile.fill_defaults(). See Profile for details.
calibrator (trt.IInt8Calibrator) – An int8 calibrator. Only required in int8 mode when the network does not have explicit precision. For networks with dynamic shapes, the last profile provided (or the default profile if no profiles are provided) is used during calibration.
strict_types (bool) – Whether to enable strict types in the builder. This will constrain the builder from using data types other than those specified in the network. Defaults to False.
load_timing_cache (Union[str, file-like]) – A path or file-like object from which to load a tactic timing cache. Providing a tactic timing cache can speed up the engine building process. Caches can be generated while building an engine with, for example, EngineFromNetwork.
algorithm_selector (trt.IAlgorithmSelector) – An algorithm selector. Allows the user to control how tactics are selected instead of letting TensorRT select them automatically.
sparse_weights (bool) – Whether to enable optimizations for sparse weights. Defaults to False.
tactic_sources (List[trt.TacticSource]) – The tactic sources to enable. This controls which libraries (e.g. cudnn, cublas, etc.) TensorRT is allowed to load tactics from. Use an empty list to disable all tactic sources. Defaults to TensorRT’s default tactic sources.
restricted (bool) – Whether to enable safety scope checking in the builder. This will check if the network and builder configuration are compatible with safety scope. Defaults to False.
call_impl(builder, network)
- Parameters
builder (trt.Builder) – The TensorRT builder to use to create the configuration.
network (trt.INetworkDefinition) – The TensorRT network for which to create the config. The network is used to automatically create a default optimization profile if none are provided.
- Returns
The TensorRT builder configuration.
- Return type
trt.IBuilderConfig
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
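A sketch of a typical configuration enabling FP16 and adding one optimization profile for a dynamically shaped input; the tensor name and shapes are placeholders:
from polygraphy.backend.trt import CreateConfig, Profile

# One profile covering batch sizes 1 through 32 for a hypothetical input named "input".
profile = Profile().add("input", min=(1, 3, 224, 224), opt=(8, 3, 224, 224), max=(32, 3, 224, 224))
create_config = CreateConfig(fp16=True, max_workspace_size=1 << 30, profiles=[profile])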
class EngineBytesFromNetwork(network, config=None, save_timing_cache=None)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that uses a TensorRT INetworkDefinition to build a serialized engine.
Builds and serializes a TensorRT engine.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The returned builder and network are owned by EngineFromNetwork and should not be freed manually. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. EngineFromNetwork will take ownership of the third return value, and, like the network, it should not be freed by the callable. The first and second return values must always be the builder and network respectively. If instead of a loader, the network, builder, and optional parser arguments are provided directly, then EngineFromNetwork will not deallocate them.
config (Callable(trt.Builder, trt.INetworkDefinition) -> trt.IBuilderConfig) – A callable that returns a TensorRT builder configuration. If not supplied, a CreateConfig instance with default parameters is used.
save_timing_cache (Union[str, file-like]) – A path or file-like object at which to save a tactic timing cache. Any existing cache will be overwritten. Note that if the provided config includes a tactic timing cache, the data from that cache will be copied into the new cache.
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
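A sketch that builds a serialized engine and writes it to disk with plain file I/O; the paths are placeholders:
from polygraphy.backend.trt import CreateConfig, EngineBytesFromNetwork, NetworkFromOnnxPath

build_serialized = EngineBytesFromNetwork(NetworkFromOnnxPath("/path/to/model.onnx"), config=CreateConfig())
with open("/path/to/model.engine", "wb") as f:
    f.write(build_serialized())  # the engine is built when the loader is called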
class EngineFromNetwork(network, config=None, save_timing_cache=None)
Bases: polygraphy.backend.trt.loader.EngineBytesFromNetwork
Similar to EngineBytesFromNetwork, but returns an ICudaEngine instance instead of a serialized engine.
Builds and serializes a TensorRT engine.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The returned builder and network are owned by EngineFromNetwork and should not be freed manually. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. EngineFromNetwork will take ownership of the third return value, and, like the network, it should not be freed by the callable. The first and second return values must always be the builder and network respectively. If instead of a loader, the network, builder, and optional parser arguments are provided directly, then EngineFromNetwork will not deallocate them.
config (Callable(trt.Builder, trt.INetworkDefinition) -> trt.IBuilderConfig) – A callable that returns a TensorRT builder configuration. If not supplied, a CreateConfig instance with default parameters is used.
save_timing_cache (Union[str, file-like]) – A path or file-like object at which to save a tactic timing cache. Any existing cache will be overwritten. Note that if the provided config includes a tactic timing cache, the data from that cache will be copied into the new cache.
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
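The typical flow chains an ONNX network loader into EngineFromNetwork and runs inference with TrtRunner. A minimal sketch; the model path and input name are placeholders:
import numpy as np
from polygraphy.backend.trt import CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner

build_engine = EngineFromNetwork(NetworkFromOnnxPath("/path/to/model.onnx"), config=CreateConfig(fp16=True))

with TrtRunner(build_engine) as runner:
    # The engine is built lazily when the runner is activated.
    outputs = runner.infer(feed_dict={"input": np.ones((1, 3, 224, 224), dtype=np.float32)})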
class EngineFromBytes(serialized_engine)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that deserializes an engine from a buffer.
Deserializes an engine from a buffer.
- Parameters
serialized_engine (Callable() -> Union[str, bytes]) – Either a loader that can supply a memory buffer, or a memory buffer itself.
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
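A sketch that lazily loads a previously serialized engine from disk, assuming polygraphy.backend.common.BytesFromPath; the path is a placeholder:
from polygraphy.backend.common import BytesFromPath
from polygraphy.backend.trt import EngineFromBytes, TrtRunner

load_engine = EngineFromBytes(BytesFromPath("/path/to/model.engine"))
with TrtRunner(load_engine) as runner:
    ...  # run inference as usual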
class BytesFromEngine(engine)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that serializes an engine.
Serializes an engine.
- Parameters
engine (Callable() -> trt.ICudaEngine) – Either a loader that can supply an engine, or the engine itself.
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
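A sketch that serializes an engine produced by another loader; the path is a placeholder:
from polygraphy.backend.trt import BytesFromEngine, EngineFromNetwork, NetworkFromOnnxPath

build_engine = EngineFromNetwork(NetworkFromOnnxPath("/path/to/model.onnx"))
serialized = BytesFromEngine(build_engine)()  # bytes suitable for writing to disk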
class SaveEngine(engine, path)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that saves an engine to the provided path.
Saves an engine to the provided path.
- Parameters
engine (Callable() -> trt.ICudaEngine) – A callable that can supply a TensorRT engine.
path (str) – The path at which to save the engine.
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
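SaveEngine is itself a loader, so it can be inserted into an existing loader chain. A sketch with placeholder paths:
from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, SaveEngine, TrtRunner

build_engine = EngineFromNetwork(NetworkFromOnnxPath("/path/to/model.onnx"))
save_engine = SaveEngine(build_engine, path="/path/to/model.engine")

with TrtRunner(save_engine) as runner:
    ...  # the engine is built, saved to disk, and then used for inference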
class OnnxLikeFromNetwork(network)
Bases: polygraphy.backend.base.loader.BaseLoader
Functor that creates an ONNX-like, but not valid ONNX, model based on a TensorRT network.
[HIGHLY EXPERIMENTAL] Creates an ONNX-like, but not valid ONNX, model from a TensorRT network. This uses the ONNX format, but generates nodes that are not valid ONNX operators. Hence, the resulting model is not valid ONNX. This should be used only for visualization or debugging purposes.
The resulting model does not include enough information to faithfully reconstruct the TensorRT network.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. The first and second return values must always be the builder and network respectively. If instead of a loader, the network, builder, and optional parser arguments are provided directly, then OnnxLikeFromNetwork will not deallocate them.
__call__(*args, **kwargs)
Invokes the loader by forwarding arguments to call_impl.
Note: call_impl should not be called directly - use this function instead.
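A sketch that writes out the ONNX-like model for visualization (for example in Netron); onnx.save comes from the onnx package, and the paths are placeholders:
import onnx
from polygraphy.backend.trt import NetworkFromOnnxPath, OnnxLikeFromNetwork

# For visualization/debugging only; the result is not valid ONNX.
onnx_like = OnnxLikeFromNetwork(NetworkFromOnnxPath("/path/to/model.onnx"))()
onnx.save(onnx_like, "/path/to/model_trt_onnx_like.onnx")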
load_plugins(plugins=None, obj=None, *args, **kwargs)
Immediately evaluated functional variant of LoadPlugins.
Loads plugins from the specified paths.
- Parameters
plugins (List[str]) – A list of paths to plugin libraries to load before inference.
obj (BaseLoader) – An object or callable to return or call respectively. If obj is callable, extra parameters will be forwarded to obj. If obj is not callable, it will be returned.
- Returns
The provided obj argument, or its return value if it is callable. Returns None if obj was not set.
- Return type
object
create_network(explicit_precision=None, explicit_batch=None)
Immediately evaluated functional variant of CreateNetwork.
Creates an empty TensorRT network.
- Parameters
explicit_precision (bool) – Whether to create the network with explicit precision enabled. Defaults to False.
explicit_batch (bool) – Whether to create the network with explicit batch mode. Defaults to True.
- Returns
The builder and empty network.
- Return type
(trt.Builder, trt.INetworkDefinition)
network_from_onnx_bytes(model_bytes, explicit_precision=None)
Immediately evaluated functional variant of NetworkFromOnnxBytes.
Parses an ONNX model.
- Parameters
model_bytes (Callable() -> bytes) – A loader that can supply a serialized ONNX model.
explicit_precision (bool) – Whether to construct the TensorRT network with explicit precision enabled.
- Returns
A TensorRT network, as well as the builder used to create it, and the parser used to populate it.
- Return type
(trt.IBuilder, trt.INetworkDefinition, trt.OnnxParser)
network_from_onnx_path(path, explicit_precision=None)
Immediately evaluated functional variant of NetworkFromOnnxPath.
Parses an ONNX model from a file.
- Parameters
path (str) – The path from which to load the model.
- Returns
A TensorRT network, as well as the builder used to create it, and the parser used to populate it.
- Return type
(trt.IBuilder, trt.INetworkDefinition, trt.OnnxParser)
modify_network_outputs(network, outputs=None, exclude_outputs=None)
Immediately evaluated functional variant of ModifyNetworkOutputs.
Modifies outputs in a TensorRT INetworkDefinition.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. The first and second return values must always be the builder and network respectively. ModifyNetworkOutputs will never take ownership of these.
outputs (Sequence[str]) – Names of tensors to mark as outputs. If provided, this will override the outputs already marked in the network. If a value of constants.MARK_ALL is used instead of a list, all tensors in the network are marked.
exclude_outputs (Sequence[str]) – Names of tensors to exclude as outputs. This can be useful in conjunction with outputs=constants.MARK_ALL to omit outputs.
- Returns
The modified network.
- Return type
trt.INetworkDefinition
ModifyNetwork
alias of polygraphy.mod.exporter.deprecate.<locals>.deprecate_impl.<locals>.Deprecated
create_config(builder, network, max_workspace_size=None, tf32=None, fp16=None, int8=None, profiles=None, calibrator=None, strict_types=None, load_timing_cache=None, algorithm_selector=None, sparse_weights=None, tactic_sources=None, restricted=None)
Immediately evaluated functional variant of CreateConfig.
Creates a TensorRT IBuilderConfig that can be used by EngineFromNetwork.
- Parameters
max_workspace_size (int) – The maximum workspace size, in bytes, when building the engine. Defaults to 16 MiB.
tf32 (bool) – Whether to build the engine with TF32 precision enabled. Defaults to False.
fp16 (bool) – Whether to build the engine with FP16 precision enabled. Defaults to False.
int8 (bool) – Whether to build the engine with INT8 precision enabled. Defaults to False.
profiles (List[Profile]) – A list of optimization profiles to add to the configuration. Only needed for networks with dynamic input shapes. If this is omitted for a network with dynamic shapes, a default profile is created, where dynamic dimensions are replaced with Polygraphy’s DEFAULT_SHAPE_VALUE (defined in constants.py). A partially populated profile will be automatically filled using values from Profile.fill_defaults(). See Profile for details.
calibrator (trt.IInt8Calibrator) – An int8 calibrator. Only required in int8 mode when the network does not have explicit precision. For networks with dynamic shapes, the last profile provided (or the default profile if no profiles are provided) is used during calibration.
strict_types (bool) – Whether to enable strict types in the builder. This will constrain the builder from using data types other than those specified in the network. Defaults to False.
load_timing_cache (Union[str, file-like]) – A path or file-like object from which to load a tactic timing cache. Providing a tactic timing cache can speed up the engine building process. Caches can be generated while building an engine with, for example, EngineFromNetwork.
algorithm_selector (trt.IAlgorithmSelector) – An algorithm selector. Allows the user to control how tactics are selected instead of letting TensorRT select them automatically.
sparse_weights (bool) – Whether to enable optimizations for sparse weights. Defaults to False.
tactic_sources (List[trt.TacticSource]) – The tactic sources to enable. This controls which libraries (e.g. cudnn, cublas, etc.) TensorRT is allowed to load tactics from. Use an empty list to disable all tactic sources. Defaults to TensorRT’s default tactic sources.
restricted (bool) – Whether to enable safety scope checking in the builder. This will check if the network and builder configuration are compatible with safety scope. Defaults to False.
builder (trt.Builder) – The TensorRT builder to use to create the configuration.
network (trt.INetworkDefinition) – The TensorRT network for which to create the config. The network is used to automatically create a default optimization profile if none are provided.
- Returns
The TensorRT builder configuration.
- Return type
trt.IBuilderConfig
engine_bytes_from_network(network, config=None, save_timing_cache=None)
Immediately evaluated functional variant of EngineBytesFromNetwork.
Builds and serializes a TensorRT engine.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The returned builder and network are owned by EngineFromNetwork and should not be freed manually. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. EngineFromNetwork will take ownership of the third return value, and, like the network, it should not be freed by the callable. The first and second return values must always be the builder and network respectively. If instead of a loader, the network, builder, and optional parser arguments are provided directly, then EngineFromNetwork will not deallocate them.
config (Callable(trt.Builder, trt.INetworkDefinition) -> trt.IBuilderConfig) – A callable that returns a TensorRT builder configuration. If not supplied, a CreateConfig instance with default parameters is used.
save_timing_cache (Union[str, file-like]) – A path or file-like object at which to save a tactic timing cache. Any existing cache will be overwritten. Note that if the provided config includes a tactic timing cache, the data from that cache will be copied into the new cache.
- Returns
The serialized engine that was created.
- Return type
bytes
engine_from_network(network, config=None, save_timing_cache=None)
Immediately evaluated functional variant of EngineFromNetwork.
Builds and serializes a TensorRT engine.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The returned builder and network are owned by EngineFromNetwork and should not be freed manually. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. EngineFromNetwork will take ownership of the third return value, and, like the network, it should not be freed by the callable. The first and second return values must always be the builder and network respectively. If instead of a loader, the network, builder, and optional parser arguments are provided directly, then EngineFromNetwork will not deallocate them.
config (Callable(trt.Builder, trt.INetworkDefinition) -> trt.IBuilderConfig) – A callable that returns a TensorRT builder configuration. If not supplied, a CreateConfig instance with default parameters is used.
save_timing_cache (Union[str, file-like]) – A path or file-like object at which to save a tactic timing cache. Any existing cache will be overwritten. Note that if the provided config includes a tactic timing cache, the data from that cache will be copied into the new cache.
- Returns
The engine that was created.
- Return type
trt.ICudaEngine
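For contrast with the lazy loaders above, a sketch of the immediately evaluated style using the functional variants; the paths are placeholders, and because the builder, network, and parser are passed directly (rather than as a loader), engine_from_network will not deallocate them:
from polygraphy.backend.trt import create_config, engine_from_network, network_from_onnx_path, save_engine

# Each call executes immediately and returns concrete objects.
builder, network, parser = network_from_onnx_path("/path/to/model.onnx")
config = create_config(builder, network, fp16=True)
engine = engine_from_network((builder, network, parser), config=config)
save_engine(engine, path="/path/to/model.engine")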
engine_from_bytes(serialized_engine)
Immediately evaluated functional variant of EngineFromBytes.
Deserializes an engine from a buffer.
- Parameters
serialized_engine (Callable() -> Union[str, bytes]) – Either a loader that can supply a memory buffer, or a memory buffer itself.
- Returns
The deserialized engine.
- Return type
trt.ICudaEngine
bytes_from_engine(engine)
Immediately evaluated functional variant of BytesFromEngine.
Serializes an engine.
- Parameters
engine (Callable() -> trt.ICudaEngine) – Either a loader that can supply an engine, or the engine itself.
- Returns
The serialized engine.
- Return type
bytes
save_engine(engine, path)
Immediately evaluated functional variant of SaveEngine.
Saves an engine to the provided path.
- Parameters
engine (Callable() -> trt.ICudaEngine) – A callable that can supply a TensorRT engine.
path (str) – The path at which to save the engine.
- Returns
The engine that was saved.
- Return type
trt.ICudaEngine
onnx_like_from_network(network)
Immediately evaluated functional variant of OnnxLikeFromNetwork.
[HIGHLY EXPERIMENTAL] Creates an ONNX-like, but not valid ONNX, model from a TensorRT network. This uses the ONNX format, but generates nodes that are not valid ONNX operators. Hence, the resulting model is not valid ONNX. This should be used only for visualization or debugging purposes.
The resulting model does not include enough information to faithfully reconstruct the TensorRT network.
- Parameters
network (Callable() -> trt.Builder, trt.INetworkDefinition) – A callable capable of returning a TensorRT Builder and INetworkDefinition. The callable may have at most 3 return values if another object needs to be kept alive for the duration of the network, e.g., in the case of a parser. The first and second return values must always be the builder and network respectively. If instead of a loader, the network, builder, and optional parser arguments are provided directly, then OnnxLikeFromNetwork will not deallocate them.
- Returns
The ONNX-like, but not valid ONNX, model.
- Return type
onnx.ModelProto