ICudaEngine
- class tensorrt.ICudaEngine
An ICudaEngine for executing inference on a built network.
The engine can be indexed with []. When indexed with an integer, it returns the corresponding binding name. When indexed with a string, it returns the corresponding binding index.
- Variables
num_bindings – int
The number of binding indices.
max_batch_size – int
The maximum batch size which can be used for inference. For an engine built from an INetworkDefinition without an implicit batch dimension, this will always be 1.
has_implicit_batch_dimension – bool
Whether the engine was built with an implicit batch dimension. This is an engine-wide property. Either all tensors in the engine have an implicit batch dimension or none of them do. This is True if and only if the INetworkDefinition from which this engine was built was created without the NetworkDefinitionCreationFlag.EXPLICIT_BATCH flag.
num_layers – int
The number of layers in the network. This is not necessarily the number in the original INetworkDefinition, as layers may be combined or eliminated as the ICudaEngine is optimized. This value can be useful when building per-layer tables, such as when aggregating profiling data over a number of executions.
max_workspace_size – int
The amount of workspace the ICudaEngine uses. The workspace size will be no greater than the value provided to the Builder when the ICudaEngine was built, and will typically be smaller. Workspace will be allocated for each IExecutionContext.
device_memory_size – int
The amount of device memory required by an IExecutionContext.
refittable – bool
Whether the engine can be refit.
name – str
The name of the network associated with the engine. The name is set during network creation and is retrieved after building or deserialization.
num_optimization_profiles – int
The number of optimization profiles defined for this engine. This is always at least 1.
error_recorder – IErrorRecorder
Application-implemented error reporting interface for TensorRT objects.
engine_capability – EngineCapability
The engine capability. See EngineCapability for details.
tactic_sources – int
The tactic sources required by this engine.
profiling_verbosity – The profiling verbosity the builder config was set to when the engine was built.
- __del__(self: tensorrt.tensorrt.ICudaEngine) -> None
- __exit__(exc_type, exc_value, traceback)
Context managers are deprecated and have no effect. Objects are automatically freed when the reference count reaches 0.
- __getitem__(*args, **kwargs)
Overloaded function.
__getitem__(self: tensorrt.tensorrt.ICudaEngine, arg0: str) -> int
__getitem__(self: tensorrt.tensorrt.ICudaEngine, arg0: int) -> str
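The two overloads can be combined to build a name-to-index table. The following is a minimal sketch: FakeEngine is a hypothetical stand-in (not a TensorRT class) so that the snippet runs without a GPU, and the binding names are made up. With a real engine, binding_table() would be called on the ICudaEngine itself.

```python
class FakeEngine:
    """Hypothetical stand-in that mimics ICudaEngine indexing; not a TensorRT class."""

    def __init__(self, names):
        self._names = list(names)
        self.num_bindings = len(self._names)

    def __getitem__(self, key):
        # str -> binding index, int -> binding name, mirroring the two overloads.
        if isinstance(key, str):
            return self._names.index(key)
        return self._names[key]


def binding_table(engine):
    """Map every binding name to its index using only engine[...] lookups."""
    names = [engine[i] for i in range(engine.num_bindings)]
    return {name: engine[name] for name in names}


engine = FakeEngine(["input", "scores", "boxes"])  # made-up binding names
table = binding_table(engine)
print(table)
```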
- __init__(*args, **kwargs)
- __len__(self: tensorrt.tensorrt.ICudaEngine) -> int
- binding_is_input(*args, **kwargs)
Overloaded function.
binding_is_input(self: tensorrt.tensorrt.ICudaEngine, index: int) -> bool
Determine whether a binding is an input binding.
- index
The binding index.
- returns
True if the index corresponds to an input binding and the index is in range.
binding_is_input(self: tensorrt.tensorrt.ICudaEngine, name: str) -> bool
Determine whether a binding is an input binding.
- name
The name of the tensor corresponding to an engine binding.
- returns
True if the named tensor corresponds to an input binding.
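A common use of binding_is_input() is to separate input tensors from output tensors before allocating buffers. The sketch below assumes only num_bindings, get_binding_name() and binding_is_input(); FakeEngine and its binding names are hypothetical and exist only so the example runs without TensorRT installed.

```python
class FakeEngine:
    """Hypothetical stand-in exposing the few ICudaEngine members used below."""

    def __init__(self, bindings):
        # bindings: list of (name, is_input) pairs; all values are made up.
        self._bindings = list(bindings)
        self.num_bindings = len(self._bindings)

    def get_binding_name(self, index):
        return self._bindings[index][0]

    def binding_is_input(self, index):
        return self._bindings[index][1]


def split_bindings(engine):
    """Return (input_names, output_names) in binding-index order."""
    inputs, outputs = [], []
    for i in range(engine.num_bindings):
        target = inputs if engine.binding_is_input(i) else outputs
        target.append(engine.get_binding_name(i))
    return inputs, outputs


inputs, outputs = split_bindings(FakeEngine([("image", True), ("logits", False)]))
print(inputs, outputs)
```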
- create_engine_inspector(self: tensorrt.tensorrt.ICudaEngine) -> nvinfer1::IEngineInspector
Create an IEngineInspector which prints out the layer information of an engine or an execution context.
- Returns
The IEngineInspector.
- create_execution_context(self: tensorrt.tensorrt.ICudaEngine) -> tensorrt.tensorrt.IExecutionContext
Create an IExecutionContext.
- Returns
The newly created IExecutionContext.
- create_execution_context_without_device_memory(self: tensorrt.tensorrt.ICudaEngine) -> tensorrt.tensorrt.IExecutionContext
Create an IExecutionContext without any device memory allocated. The memory for execution of this device context must be supplied by the application.
- Returns
An IExecutionContext without device memory allocated.
- get_binding_bytes_per_component(self: tensorrt.tensorrt.ICudaEngine, index: int) -> int
Return the number of bytes per component of an element. The vector component size is returned if get_binding_vectorized_dim() != -1.
- Parameters
index – The binding index.
- get_binding_components_per_element(self: tensorrt.tensorrt.ICudaEngine, index: int) -> int
Return the number of components included in one element.
The number of elements in the vectors is returned if get_binding_vectorized_dim() != -1.
- Parameters
index – The binding index.
- get_binding_dtype(*args, **kwargs)
Overloaded function.
get_binding_dtype(self: tensorrt.tensorrt.ICudaEngine, index: int) -> tensorrt.tensorrt.DataType
Determine the required data type for a buffer from its binding index.
- index
The binding index.
- Returns
The type of data in the buffer.
get_binding_dtype(self: tensorrt.tensorrt.ICudaEngine, name: str) -> tensorrt.tensorrt.DataType
Determine the required data type for a buffer from its tensor name.
- name
The name of the tensor corresponding to an engine binding.
- Returns
The type of data in the buffer.
- get_binding_format(self: tensorrt.tensorrt.ICudaEngine, index: int) -> tensorrt.tensorrt.TensorFormat
Return the binding format.
- Parameters
index – The binding index.
- get_binding_format_desc(self: tensorrt.tensorrt.ICudaEngine, index: int) -> str
Return the human readable description of the tensor format.
The description includes the order, vectorization, data type, strides, etc. For example:
Example 1: kCHW + FP32 “Row major linear FP32 format”
Example 2: kCHW2 + FP16 “Two wide channel vectorized row major FP16 format”
Example 3: kHWC8 + FP16 + Line Stride = 32 “Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0”
- Parameters
index – The binding index.
- get_binding_index(self: tensorrt.tensorrt.ICudaEngine, name: str) -> int
Retrieve the binding index for a named tensor.
You can also use the engine’s __getitem__() with engine[name]. When invoked with a str, this will return the corresponding binding index.
IExecutionContext.execute_async() and IExecutionContext.execute() require an array of buffers. Engine bindings map from tensor names to indices in this array. Binding indices are assigned at ICudaEngine build time, and take values in the range [0 … n-1] where n is the total number of inputs and outputs.
- Parameters
name – The tensor name.
- Returns
The binding index for the named tensor, or -1 if the name is not found.
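Because the buffer array passed to execute() or execute_async() is ordered by binding index, a name-keyed set of buffers has to be rearranged first. The sketch below shows one way to do that, treating the documented return value of -1 as an error. FakeEngine, build_bindings(), the tensor names and the addresses are all hypothetical; in practice the addresses would come from whatever device allocator the application uses.

```python
class FakeEngine:
    """Hypothetical stand-in for ICudaEngine; names are made up."""

    def __init__(self, names):
        self._names = list(names)
        self.num_bindings = len(self._names)

    def get_binding_index(self, name):
        # Mirrors the documented behaviour: -1 when the name is not found.
        return self._names.index(name) if name in self._names else -1


def build_bindings(engine, buffers):
    """Order device addresses by binding index.

    buffers: dict mapping tensor name -> integer device address.
    """
    bindings = [0] * engine.num_bindings
    for name, address in buffers.items():
        index = engine.get_binding_index(name)
        if index == -1:
            raise KeyError(f"engine has no binding named {name!r}")
        bindings[index] = int(address)
    return bindings


engine = FakeEngine(["in", "out"])
# Dict order does not matter; the result follows the engine's binding order.
bindings = build_bindings(engine, {"out": 0x2000, "in": 0x1000})
print(bindings)
```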
- get_binding_name(self: tensorrt.tensorrt.ICudaEngine, index: int) -> str
Retrieve the name corresponding to a binding index.
You can also use the engine’s __getitem__() with engine[index]. When invoked with an int, this will return the corresponding binding name.
This is the reverse mapping to that provided by get_binding_index().
- Parameters
index – The binding index.
- Returns
The name corresponding to the binding index.
- get_binding_shape(*args, **kwargs)
Overloaded function.
get_binding_shape(self: tensorrt.tensorrt.ICudaEngine, index: int) -> tensorrt.tensorrt.Dims
Get the shape of a binding.
- index
The binding index.
- Returns
The shape of the binding if the index is in range, otherwise Dims()
get_binding_shape(self: tensorrt.tensorrt.ICudaEngine, name: str) -> tensorrt.tensorrt.Dims
Get the shape of a binding.
- name
The name of the tensor corresponding to an engine binding.
- Returns
The shape of the binding if the tensor is present, otherwise Dims()
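Shape and data type together determine how large a binding's buffer must be. The following sketch computes that size in bytes. It rests on two assumptions that should be checked against the installed TensorRT version: that the returned Dims can be iterated like a sequence of ints, and that the returned DataType exposes an itemsize attribute. FakeEngine and binding_nbytes() are hypothetical, and dimensions reported as -1 (dynamic) are treated as "size unknown".

```python
import math
from types import SimpleNamespace


class FakeEngine:
    """Hypothetical stand-in for ICudaEngine; shapes and dtypes are made up."""

    def __init__(self, shapes, itemsizes):
        self._shapes = list(shapes)
        # Assumption: the real DataType has an `itemsize` attribute like this.
        self._dtypes = [SimpleNamespace(itemsize=s) for s in itemsizes]

    def get_binding_shape(self, index):
        return self._shapes[index]

    def get_binding_dtype(self, index):
        return self._dtypes[index]


def binding_nbytes(engine, index):
    """Return the buffer size in bytes, or None if any dimension is dynamic."""
    shape = tuple(engine.get_binding_shape(index))
    if any(dim < 0 for dim in shape):
        return None  # size is only known once the runtime shape is set
    return math.prod(shape) * engine.get_binding_dtype(index).itemsize


engine = FakeEngine(shapes=[(1, 3, 224, 224), (-1, 1000)], itemsizes=[4, 4])
static_size = binding_nbytes(engine, 0)
dynamic_size = binding_nbytes(engine, 1)
print(static_size, dynamic_size)
```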
- get_binding_vectorized_dim(self: tensorrt.tensorrt.ICudaEngine, index: int) -> int
Return the index of the dimension along which the buffer is vectorized.
Specifically, -1 is returned if the number of scalars per vector is 1.
- Parameters
index – The binding index.
- get_location(*args, **kwargs)
Overloaded function.
get_location(self: tensorrt.tensorrt.ICudaEngine, index: int) -> tensorrt.tensorrt.TensorLocation
Get location of binding. This lets you know whether the binding should be a pointer to device or host memory.
- index
The binding index.
- returns
The location of the bound tensor with the given index.
get_location(self: tensorrt.tensorrt.ICudaEngine, name: str) -> tensorrt.tensorrt.TensorLocation
Get location of binding. This lets you know whether the binding should be a pointer to device or host memory.
- name
The name of the tensor corresponding to an engine binding.
- returns
The location of the bound tensor with the given name.
- get_profile_shape(*args, **kwargs)
Overloaded function.
get_profile_shape(self: tensorrt.tensorrt.ICudaEngine, profile_index: int, binding: int) -> List[tensorrt.tensorrt.Dims]
Get the minimum/optimum/maximum dimensions for a particular binding under an optimization profile.
- arg profile_index
The index of the profile.
- arg binding
The binding index or name.
- returns
A List[Dims] of length 3, containing the minimum, optimum, and maximum shapes, in that order.
get_profile_shape(self: tensorrt.tensorrt.ICudaEngine, profile_index: int, binding: str) -> List[tensorrt.tensorrt.Dims]
Get the minimum/optimum/maximum dimensions for a particular binding under an optimization profile.
- arg profile_index
The index of the profile.
- arg binding
The binding index or name.
- returns
A List[Dims] of length 3, containing the minimum, optimum, and maximum shapes, in that order.
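The three-element return value unpacks naturally into minimum, optimum and maximum shapes. As an illustration, the sketch below checks whether a candidate input shape lies within the range a profile allows. FakeEngine, shape_fits_profile() and the shapes are hypothetical, and the code assumes that each returned Dims can be converted to a tuple of ints.

```python
class FakeEngine:
    """Hypothetical stand-in for ICudaEngine with a single made-up profile."""

    def get_profile_shape(self, profile_index, binding):
        # [min, opt, max], in the documented order.
        return [(1, 3, 224, 224), (8, 3, 224, 224), (32, 3, 224, 224)]


def shape_fits_profile(engine, profile_index, binding, shape):
    """Return True if every dimension of `shape` is within [min, max]."""
    min_shape, _opt_shape, max_shape = engine.get_profile_shape(profile_index, binding)
    low, high = tuple(min_shape), tuple(max_shape)
    if len(shape) != len(low):
        return False
    return all(lo <= dim <= hi for lo, dim, hi in zip(low, shape, high))


engine = FakeEngine()
fits = shape_fits_profile(engine, 0, "input", (16, 3, 224, 224))
too_big = shape_fits_profile(engine, 0, "input", (64, 3, 224, 224))
print(fits, too_big)
```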
- get_profile_shape_input(*args, **kwargs)
Overloaded function.
get_profile_shape_input(self: tensorrt.tensorrt.ICudaEngine, profile_index: int, binding: int) -> List[List[int]]
Get minimum/optimum/maximum values for an input shape binding under an optimization profile. If the specified binding is not an input shape binding, an exception is raised.
- arg profile_index
The index of the profile.
- arg binding
The binding index or name.
- returns
A List[List[int]] of length 3, containing the minimum, optimum, and maximum values, in that order. If the values have not been set yet, an empty list is returned.
get_profile_shape_input(self: tensorrt.tensorrt.ICudaEngine, profile_index: int, binding: str) -> List[List[int]]
Get minimum/optimum/maximum values for an input shape binding under an optimization profile. If the specified binding is not an input shape binding, an exception is raised.
- arg profile_index
The index of the profile.
- arg binding
The binding index or name.
- returns
A List[List[int]] of length 3, containing the minimum, optimum, and maximum values, in that order. If the values have not been set yet, an empty list is returned.
- is_execution_binding(self: tensorrt.tensorrt.ICudaEngine, binding: int) -> bool
Returns True if the tensor is required for the execution phase, False otherwise.
For example, if a network uses an input tensor with binding i ONLY as the reshape dimensions for an IShuffleLayer, then is_execution_binding(i) == False, and a binding of 0 can be supplied for it when calling IExecutionContext.execute() or IExecutionContext.execute_async().
- Parameters
binding – The binding index.
- is_shape_binding(self: tensorrt.tensorrt.ICudaEngine, binding: int) -> bool
Returns True if the tensor is required as input for shape calculations or is an output from them.
TensorRT evaluates a network in two phases:
1. Compute shape information required to determine memory allocation requirements and validate that runtime sizes make sense.
2. Process tensors on the device.
Some tensors are required in phase 1. These tensors are called “shape tensors”, and always have type tensorrt.int32 and no more than one dimension. These tensors are not always shapes themselves, but might be used to calculate tensor shapes for phase 2.
is_shape_binding() returns True if the tensor is a required input or an output computed in phase 1. is_execution_binding() returns True if the tensor is a required input or an output computed in phase 2.
For example, if a network uses an input tensor with binding i as an input to an IElementWiseLayer that computes the reshape dimensions for an IShuffleLayer, then is_shape_binding(i) == True.
It’s possible to have a tensor be required by both phases. For instance, a tensor can be used as a shape in an IShuffleLayer and as the indices for an IGatherLayer collecting floating-point data.
It’s also possible to have a tensor required by neither phase that shows up in the engine’s inputs. For example, if an input tensor is used only as an input to an IShapeLayer, only its shape matters and its values are irrelevant.
- Parameters
binding – The binding index.
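Since a tensor may be needed in phase 1, phase 2, both, or neither, the two predicates give four possible categories per binding. The sketch below labels each binding accordingly. FakeEngine, classify_bindings() and the tensor names are hypothetical and only serve to make the example runnable without TensorRT.

```python
class FakeEngine:
    """Hypothetical stand-in for ICudaEngine; flags below are made up."""

    def __init__(self, bindings):
        # bindings: list of (name, is_shape, is_execution) triples.
        self._bindings = list(bindings)
        self.num_bindings = len(self._bindings)

    def get_binding_name(self, index):
        return self._bindings[index][0]

    def is_shape_binding(self, index):
        return self._bindings[index][1]

    def is_execution_binding(self, index):
        return self._bindings[index][2]


def classify_bindings(engine):
    """Label each binding as 'shape', 'execution', 'both' or 'neither'."""
    labels = {}
    for i in range(engine.num_bindings):
        in_phase_1 = engine.is_shape_binding(i)
        in_phase_2 = engine.is_execution_binding(i)
        if in_phase_1 and in_phase_2:
            kind = "both"
        elif in_phase_1:
            kind = "shape"
        elif in_phase_2:
            kind = "execution"
        else:
            kind = "neither"
        labels[engine.get_binding_name(i)] = kind
    return labels


labels = classify_bindings(FakeEngine([
    ("image", False, True),         # ordinary data tensor
    ("new_dims", True, False),      # used only to compute a reshape
    ("indices", True, True),        # used as a shape and as gather indices
    ("shape_only", False, False),   # only its shape is consumed
]))
print(labels)
```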
- serialize(self: tensorrt.tensorrt.ICudaEngine) -> tensorrt.tensorrt.IHostMemory
Serialize the engine to a stream.
- Returns
An IHostMemory object containing the serialized ICudaEngine.
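A serialized engine is typically written to disk so that it can be loaded later without rebuilding. The sketch below assumes that the returned IHostMemory can be passed directly to a binary file's write() method (that is, that it behaves like a bytes-like object); this should be verified against the installed TensorRT version. FakeEngine, save_engine() and the file name are hypothetical, and the stand-in returns plain bytes.

```python
import os
import tempfile


class FakeEngine:
    """Hypothetical stand-in for ICudaEngine; returns placeholder bytes."""

    def serialize(self):
        # A real engine returns an IHostMemory; bytes are used here for the demo.
        return b"\x00fake-plan"


def save_engine(engine, path):
    """Write the serialized engine to `path` and return the byte count."""
    with open(path, "wb") as f:
        return f.write(engine.serialize())


path = os.path.join(tempfile.mkdtemp(), "model.engine")  # made-up file name
written = save_engine(FakeEngine(), path)
print(written, "bytes written to", path)
```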