ICudaEngine
- tensorrt.TensorIOMode
IO tensor modes for TensorRT.
Members:
NONE : Tensor is not an input or output.
INPUT : Tensor is an input to the engine.
OUTPUT : Tensor is an output of the engine.
- class tensorrt.ICudaEngine
An ICudaEngine for executing inference on a built network. The engine can be indexed with []. When indexed with an integer, it returns the corresponding binding name; when indexed with a string, it returns the corresponding binding index.
- Variables
num_io_tensors – int
The number of IO tensors.
has_implicit_batch_dimension – bool
[DEPRECATED] Deprecated in TensorRT 10.0. Always false, since support for implicit batch dimensions has been removed.
num_layers – int
The number of layers in the network. This is not necessarily the number of layers in the original INetworkDefinition, as layers may be combined or eliminated as the ICudaEngine is optimized. This value can be useful when building per-layer tables, such as when aggregating profiling data over a number of executions.
max_workspace_size – int
The amount of workspace the ICudaEngine uses. The workspace size will be no greater than the value provided to the Builder when the ICudaEngine was built, and will typically be smaller. Workspace will be allocated for each IExecutionContext.
device_memory_size – int
The amount of device memory required by an IExecutionContext.
refittable – bool
Whether the engine can be refit.
name – str
The name of the network associated with the engine. The name is set during network creation and is retrieved after building or deserialization.
num_optimization_profiles – int
The number of optimization profiles defined for this engine. This is always at least 1.
error_recorder – IErrorRecorder
Application-implemented error reporting interface for TensorRT objects.
engine_capability – EngineCapability
The engine capability. See EngineCapability for details.
tactic_sources – int
The tactic sources required by this engine.
profiling_verbosity – The profiling verbosity the builder config was set to when the engine was built.
hardware_compatibility_level – The hardware compatibility level of the engine.
num_aux_streams – Read-only. The number of auxiliary streams used by this engine, which will be less than or equal to the maximum number of auxiliary streams allowed via builder_config.max_aux_streams when the engine was built.
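The indexing behavior and num_io_tensors described above can be combined to enumerate all IO tensors. A minimal sketch, assuming TensorRT is installed and that `engine` is an already-deserialized ICudaEngine (the engine file name in the comment is an assumption):

```python
import tensorrt as trt

# Assumes `engine` is an already-deserialized ICudaEngine, e.g.:
#   runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
#   with open("model.engine", "rb") as f:
#       engine = runtime.deserialize_cuda_engine(f.read())

def describe_io(engine):
    """Enumerate every IO tensor by index and report its mode."""
    for i in range(engine.num_io_tensors):
        name = engine[i]                     # integer index -> binding name
        assert engine[name] == i             # string index -> binding index
        mode = engine.get_tensor_mode(name)  # TensorIOMode.INPUT or OUTPUT
        print(f"{i}: {name} ({mode.name})")
```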
- __del__(self: tensorrt.tensorrt.ICudaEngine) None
- __exit__(exc_type, exc_value, traceback)
Context managers are deprecated and have no effect. Objects are automatically freed when the reference count reaches 0.
- __getitem__(self: tensorrt.tensorrt.ICudaEngine, arg0: int) str
- __init__(*args, **kwargs)
- create_engine_inspector(self: tensorrt.tensorrt.ICudaEngine) tensorrt.tensorrt.EngineInspector
Create an IEngineInspector which prints out the layer information of an engine or an execution context.
- Returns
The IEngineInspector.
- create_execution_context(self: tensorrt.tensorrt.ICudaEngine, strategy: tensorrt.tensorrt.ExecutionContextAllocationStrategy = <ExecutionContextAllocationStrategy.STATIC: 0>) tensorrt.tensorrt.IExecutionContext
Create an IExecutionContext and specify the device memory allocation strategy.
- Returns
The newly created IExecutionContext.
- create_execution_context_without_device_memory(self: tensorrt.tensorrt.ICudaEngine) tensorrt.tensorrt.IExecutionContext
Create an IExecutionContext without any device memory allocated. The memory for execution of this device context must be supplied by the application.
- Returns
An IExecutionContext without device memory allocated.
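The two creation paths above differ only in who owns the activation memory. A minimal sketch, assuming `engine` is an already-deserialized ICudaEngine (the pycuda allocation shown in the comment is an assumption, not the only way to supply device memory):

```python
import tensorrt as trt

# Assumes `engine` is an already-deserialized ICudaEngine.
# Default path: TensorRT allocates the activation memory itself.
ctx = engine.create_execution_context()

# Alternative path: the application supplies the device memory, e.g. to share
# one allocation across several contexts that never run concurrently.
ctx2 = engine.create_execution_context_without_device_memory()
# A device buffer of at least engine.device_memory_size bytes must then be
# assigned to the context before inference, for example (an assumption):
#   import pycuda.driver as cuda
#   ctx2.device_memory = cuda.mem_alloc(engine.device_memory_size)
```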
- create_serialization_config(self: tensorrt.tensorrt.ICudaEngine) tensorrt.tensorrt.ISerializationConfig
Create a serialization configuration object.
- get_device_memory_size_for_profile(self: tensorrt.tensorrt.ICudaEngine, profile_index: int) int
Return the device memory size required for a certain profile.
- Parameters
profile_index – The index of the profile.
- get_profile_shape(*args, **kwargs)
Overloaded function.
get_profile_shape(self: tensorrt.tensorrt.ICudaEngine, profile_index: int, binding: int) -> List[tensorrt.tensorrt.Dims]
Get the minimum/optimum/maximum dimensions for a particular binding under an optimization profile.
- arg profile_index
The index of the profile.
- arg binding
The binding index.
- returns
A List[Dims] of length 3, containing the minimum, optimum, and maximum shapes, in that order.
get_profile_shape(self: tensorrt.tensorrt.ICudaEngine, profile_index: int, binding: str) -> List[tensorrt.tensorrt.Dims]
Get the minimum/optimum/maximum dimensions for a particular binding under an optimization profile.
- arg profile_index
The index of the profile.
- arg binding
The binding name.
- returns
A List[Dims] of length 3, containing the minimum, optimum, and maximum shapes, in that order.
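The length-3 return value unpacks naturally into the three shape bounds. A minimal sketch, assuming `engine` is an already-deserialized ICudaEngine with a dynamic-shape binding named "input" (the binding name is an assumption):

```python
import tensorrt as trt

# Assumes `engine` is an already-deserialized ICudaEngine with a
# dynamic-shape binding named "input" (an assumption).
for profile in range(engine.num_optimization_profiles):
    # Returns [min, opt, max] shapes for this binding under this profile.
    min_shape, opt_shape, max_shape = engine.get_profile_shape(profile, "input")
    print(f"profile {profile}: min={min_shape} opt={opt_shape} max={max_shape}")
```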
- get_tensor_bytes_per_component(*args, **kwargs)
Overloaded function.
get_tensor_bytes_per_component(self: tensorrt.tensorrt.ICudaEngine, name: str) -> int
Return the number of bytes per component of an element.
The vector component size is returned if get_tensor_vectorized_dim() != -1.
- arg name
The tensor name.
get_tensor_bytes_per_component(self: tensorrt.tensorrt.ICudaEngine, name: str, profile_index: int) -> int
Return the number of bytes per component of an element.
The vector component size is returned if get_tensor_vectorized_dim() != -1.
- arg name
The tensor name.
- arg profile_index
The index of the profile.
- get_tensor_components_per_element(*args, **kwargs)
Overloaded function.
get_tensor_components_per_element(self: tensorrt.tensorrt.ICudaEngine, name: str) -> int
Return the number of components included in one element.
The number of elements in the vectors is returned if get_tensor_vectorized_dim() != -1.
- arg name
The tensor name.
get_tensor_components_per_element(self: tensorrt.tensorrt.ICudaEngine, name: str, profile_index: int) -> int
Return the number of components included in one element.
The number of elements in the vectors is returned if get_tensor_vectorized_dim() != -1.
- arg name
The tensor name.
- arg profile_index
The index of the profile.
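The queries above combine to give the allocation size of a vectorized buffer: the vectorized dimension is padded up to a whole number of vectors, then multiplied by the byte size of each component. A pure-Python sketch of that arithmetic; the example values mimic an HWC8 FP16 tensor and are assumptions, not results queried from a real engine:

```python
import math

def buffer_size_bytes(shape, bytes_per_component, components_per_element,
                      vectorized_dim):
    """Compute the byte size of a (possibly vectorized) tensor buffer.

    Mirrors the arithmetic implied by get_tensor_bytes_per_component(),
    get_tensor_components_per_element(), and get_tensor_vectorized_dim():
    the vectorized dimension is padded up to a multiple of the vector width.
    """
    dims = list(shape)
    if vectorized_dim != -1:
        # Round the vectorized dimension up to a whole number of vectors.
        dims[vectorized_dim] = math.ceil(
            dims[vectorized_dim] / components_per_element
        ) * components_per_element
    count = 1
    for d in dims:
        count *= d
    return count * bytes_per_component

# HWC8 + FP16: C vectorized in groups of 8, 2 bytes per component.
# Shape (C=3, H=4, W=5): C pads to 8, so 8*4*5 elements * 2 bytes = 320.
print(buffer_size_bytes((3, 4, 5), 2, 8, 0))   # -> 320
# Linear FP32 (not vectorized): 3*4*5 elements * 4 bytes = 240.
print(buffer_size_bytes((3, 4, 5), 4, 1, -1))  # -> 240
```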
- get_tensor_dtype(self: tensorrt.tensorrt.ICudaEngine, name: str) tensorrt.tensorrt.DataType
Return the required data type for a buffer from its tensor name.
- Parameters
name – The tensor name.
- get_tensor_format(*args, **kwargs)
Overloaded function.
get_tensor_format(self: tensorrt.tensorrt.ICudaEngine, name: str) -> tensorrt.tensorrt.TensorFormat
Return the tensor format.
- arg name
The tensor name.
get_tensor_format(self: tensorrt.tensorrt.ICudaEngine, name: str, profile_index: int) -> tensorrt.tensorrt.TensorFormat
Return the tensor format.
- arg name
The tensor name.
- get_tensor_format_desc(*args, **kwargs)
Overloaded function.
get_tensor_format_desc(self: tensorrt.tensorrt.ICudaEngine, name: str) -> str
Return the human readable description of the tensor format.
The description includes the order, vectorization, data type, strides, etc. For example:
Example 1: CHW + FP32: “Row major linear FP32 format”
Example 2: CHW2 + FP16: “Two wide channel vectorized row major FP16 format”
Example 3: HWC8 + FP16 + Line Stride = 32: “Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0”
- arg name
The tensor name.
get_tensor_format_desc(self: tensorrt.tensorrt.ICudaEngine, name: str, profile_index: int) -> str
Return the human readable description of the tensor format.
The description includes the order, vectorization, data type, strides, etc. For example:
Example 1: CHW + FP32: “Row major linear FP32 format”
Example 2: CHW2 + FP16: “Two wide channel vectorized row major FP16 format”
Example 3: HWC8 + FP16 + Line Stride = 32: “Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0”
- arg name
The tensor name.
- get_tensor_location(self: tensorrt.tensorrt.ICudaEngine, name: str) tensorrt.tensorrt.TensorLocation
Determine whether an input or output tensor must be on GPU or CPU.
- Parameters
name – The tensor name.
- get_tensor_mode(self: tensorrt.tensorrt.ICudaEngine, name: str) tensorrt.tensorrt.TensorIOMode
Determine whether a tensor is an input or output tensor.
- Parameters
name – The tensor name.
- get_tensor_name(self: tensorrt.tensorrt.ICudaEngine, index: int) str
Return the name of an input or output tensor.
- Parameters
index – The tensor index.
- get_tensor_profile_shape(self: tensorrt.tensorrt.ICudaEngine, name: str, profile_index: int) List[tensorrt.tensorrt.Dims]
Get the minimum/optimum/maximum dimensions for a particular tensor under an optimization profile.
- Parameters
name – The tensor name.
profile_index – The index of the profile.
- get_tensor_profile_values(self: tensorrt.tensorrt.ICudaEngine, name: str, profile_index: int) List[List[int]]
Get minimum/optimum/maximum values for an input shape binding under an optimization profile. If the specified binding is not an input shape binding, an exception is raised.
- Parameters
name – The tensor name.
profile_index – The index of the profile.
- Returns
A
List[List[int]]
of length 3, containing the minimum, optimum, and maximum values, in that order. If the values have not been set yet, an empty list is returned.
- get_tensor_shape(self: tensorrt.tensorrt.ICudaEngine, name: str) tensorrt.tensorrt.Dims
Return the shape of an input or output tensor.
- Parameters
name – The tensor name.
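get_tensor_shape() combines naturally with get_tensor_dtype() and get_tensor_name() to size host buffers. A minimal sketch, assuming TensorRT and NumPy are installed, that `engine` is an already-deserialized ICudaEngine, and that all IO tensors have static shapes:

```python
import numpy as np
import tensorrt as trt

# Assumes `engine` is an already-deserialized ICudaEngine whose IO tensors
# all have static shapes (dynamic dims would appear as -1 here).
host_buffers = {}
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    shape = engine.get_tensor_shape(name)
    # trt.nptype maps a tensorrt.DataType to the corresponding numpy dtype.
    dtype = trt.nptype(engine.get_tensor_dtype(name))
    host_buffers[name] = np.empty(tuple(shape), dtype=dtype)
```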
- get_tensor_vectorized_dim(*args, **kwargs)
Overloaded function.
get_tensor_vectorized_dim(self: tensorrt.tensorrt.ICudaEngine, name: str) -> int
Return the dimension index along which the buffer is vectorized.
Specifically, -1 is returned if the number of scalars per vector is 1.
- arg name
The tensor name.
get_tensor_vectorized_dim(self: tensorrt.tensorrt.ICudaEngine, name: str, profile_index: int) -> int
Return the dimension index along which the buffer is vectorized.
Specifically, -1 is returned if the number of scalars per vector is 1.
- arg name
The tensor name.
- is_debug_tensor(self: tensorrt.tensorrt.ICudaEngine, name: str) bool
Determine whether the given name corresponds to a debug tensor.
- Parameters
name – The tensor name.
- is_shape_inference_io(self: tensorrt.tensorrt.ICudaEngine, name: str) bool
Determine whether a tensor is read or written by infer_shapes.
- Parameters
name – The tensor name.
- serialize(self: tensorrt.tensorrt.ICudaEngine) tensorrt.tensorrt.IHostMemory
Serialize the engine to a stream.
- Returns
An IHostMemory object containing the serialized ICudaEngine.
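A common round trip is to serialize the engine to disk and deserialize it later with a Runtime. A minimal sketch, assuming `engine` is a built ICudaEngine (the file name is an assumption):

```python
import tensorrt as trt

# Assumes `engine` is a built ICudaEngine; the file name is an assumption.
with open("model.engine", "wb") as f:
    # IHostMemory supports the buffer protocol, so it can be written directly.
    f.write(engine.serialize())

# Later, deserialize it back into an engine:
logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open("model.engine", "rb") as f:
    engine2 = runtime.deserialize_cuda_engine(f.read())
```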
- serialize_with_config(self: tensorrt.tensorrt.ICudaEngine, arg0: tensorrt.tensorrt.ISerializationConfig) tensorrt.tensorrt.IHostMemory
Serialize the engine to a stream using the given serialization configuration.