IExecutionContext¶

class tensorrt.IExecutionContext¶

Context for executing inference using an ICudaEngine. Multiple IExecutionContext instances may exist for one ICudaEngine instance, allowing the same ICudaEngine to be used for the execution of multiple batches simultaneously.

Variables:

- debug_sync – bool The debug sync flag. If this flag is set to true, the ICudaEngine will log the successful execution for each kernel during execute(). It has no effect when using execute_async().
- profiler – IProfiler The profiler in use by this IExecutionContext.
- engine – ICudaEngine The associated ICudaEngine.
- name – str The name of the IExecutionContext.
- device_memory – capsule The device memory for use by this execution context. The memory must be aligned on a 256-byte boundary, and its size must be at least engine.device_memory_size. If using execute_async() to run the network, the memory is in use from the invocation of execute_async() until network execution is complete. If using execute(), it is in use until execute() returns. Releasing or otherwise using the memory for other purposes during this time will result in undefined behavior.
execute(self: tensorrt.tensorrt.IExecutionContext, batch_size: int, bindings: List[int]) → bool¶

Synchronously execute inference on a batch. This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine.get_binding_index().

Parameters:

- batch_size – The batch size. This is at most the value supplied when the ICudaEngine was built.
- bindings – A list of integers representing input and output buffer addresses for the network.

Returns: True if execution succeeded.
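A minimal synchronous-inference sketch, assuming a serialized engine file and host buffers allocated by the caller; the file path, buffer names, and binding order (input first, output second) are illustrative assumptions, and PyCUDA is used here for device allocations:

```python
# Illustrative sketch; requires TensorRT and PyCUDA at call time.
def run_sync(engine_path, h_input, h_output):
    import tensorrt as trt
    import pycuda.driver as cuda
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())

    context = engine.create_execution_context()

    # Device buffers; the bindings list holds their addresses as ints,
    # ordered by binding index.
    d_input = cuda.mem_alloc(h_input.nbytes)
    d_output = cuda.mem_alloc(h_output.nbytes)
    bindings = [int(d_input), int(d_output)]

    cuda.memcpy_htod(d_input, h_input)
    ok = context.execute(batch_size=1, bindings=bindings)
    cuda.memcpy_dtoh(h_output, d_output)
    return ok
```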
execute_async(self: tensorrt.tensorrt.IExecutionContext, batch_size: int = 1, bindings: List[int], stream_handle: int, input_consumed: capsule = None) → bool¶

Asynchronously execute inference on a batch. This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine.get_binding_index().

Parameters:

- batch_size – The batch size. This is at most the value supplied when the ICudaEngine was built.
- bindings – A list of integers representing input and output buffer addresses for the network.
- stream_handle – A handle for a CUDA stream on which the inference kernels will be executed.
- input_consumed – An optional event which will be signaled when the input buffers can be refilled with new data.

Returns: True if the kernels were executed successfully.
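A minimal asynchronous sketch, assuming a context and bindings list built from a deserialized engine and PyCUDA device allocations (all names are illustrative); passing the PyCUDA event handle for input_consumed is an assumption about how the capsule argument is supplied:

```python
# Illustrative sketch; requires PyCUDA and a live TensorRT context at call time.
def run_async(context, bindings, h_input, h_output, d_input, d_output):
    import pycuda.driver as cuda

    stream = cuda.Stream()
    # Optional event, signaled once the input buffers may be refilled.
    input_consumed = cuda.Event()

    cuda.memcpy_htod_async(d_input, h_input, stream)
    ok = context.execute_async(batch_size=1,
                               bindings=bindings,
                               stream_handle=stream.handle,
                               input_consumed=input_consumed.handle)
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    stream.synchronize()  # block until the kernels and copies finish
    return ok
```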