IExecutionContext

class tensorrt.IExecutionContext

Context for executing inference using an ICudaEngine. Multiple IExecutionContext instances may exist for one ICudaEngine, allowing the same engine to be used for the execution of multiple batches simultaneously.

Variables:
  • debug_sync – bool The debug sync flag. If this flag is set to true, the ICudaEngine will log the successful execution for each kernel during execute(). It has no effect when using execute_async().
  • profiler – IProfiler The profiler in use by this IExecutionContext.
  • engine – ICudaEngine The associated ICudaEngine.
  • name – str The name of the IExecutionContext.
  • device_memory – capsule The device memory for use by this execution context. The memory must be aligned on a 256-byte boundary, and its size must be at least engine.device_memory_size. If execute_async() is used to run the network, the memory is in use from the invocation of execute_async() until network execution is complete. If execute() is used, it is in use until execute() returns. Releasing or otherwise using the memory for other purposes during this time results in undefined behavior.
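A minimal sketch of creating execution contexts, assuming TensorRT's Python bindings together with pycuda; the plan file name "sample.engine" is hypothetical, and the last two lines assume the device_memory setter accepts a raw pointer value:

    import tensorrt as trt
    import pycuda.driver as cuda
    import pycuda.autoinit  # noqa: F401 -- establishes a CUDA context

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
    with open("sample.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())

    # Each context holds its own mutable state, so several contexts can run
    # different batches on the same engine concurrently.
    context = engine.create_execution_context()

    # Alternatively, supply the scratch memory yourself. CUDA allocations are
    # at least 256-byte aligned, which satisfies the alignment requirement.
    context2 = engine.create_execution_context_without_device_memory()
    scratch = cuda.mem_alloc(engine.device_memory_size)
    context2.device_memory = int(scratch)  # assumes the setter takes a raw pointer

Managing the scratch memory yourself can be useful when many contexts exist but do not all run at once, since contexts that execute serially can share a single allocation.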
execute(self: tensorrt.tensorrt.IExecutionContext, batch_size: int, bindings: List[int]) → bool

Synchronously execute inference on a batch. This method requires an array of input and output buffers. The mapping from tensor names to binding indices can be queried using ICudaEngine.get_binding_index().

Parameters:
  • batch_size – The batch size. This is at most the value supplied when the ICudaEngine was built.
  • bindings – A list of integers representing input and output buffer addresses for the network.
Returns:

True if execution succeeded.
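A hedged example of the synchronous path, continuing from the sketch above; it assumes an engine built with an implicit batch dimension whose tensors are named "input" and "output" (both names are hypothetical):

    import numpy as np

    batch_size = 1
    host_bufs, dev_bufs = [], []
    for i in range(engine.num_bindings):
        # Binding shapes exclude the batch dimension in implicit-batch mode.
        shape = (batch_size,) + tuple(engine.get_binding_shape(i))
        dtype = trt.nptype(engine.get_binding_dtype(i))
        host_bufs.append(np.zeros(shape, dtype=dtype))
        dev_bufs.append(cuda.mem_alloc(host_bufs[-1].nbytes))

    inp = engine.get_binding_index("input")   # hypothetical tensor name
    out = engine.get_binding_index("output")  # hypothetical tensor name

    host_bufs[inp].fill(1.0)                  # placeholder input data
    cuda.memcpy_htod(dev_bufs[inp], host_bufs[inp])

    # bindings is a flat list of device addresses ordered by binding index.
    ok = context.execute(batch_size=batch_size,
                         bindings=[int(d) for d in dev_bufs])
    assert ok
    cuda.memcpy_dtoh(host_bufs[out], dev_bufs[out])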

execute_async(self: tensorrt.tensorrt.IExecutionContext, batch_size: int = 1, bindings: List[int], stream_handle: int, input_consumed: capsule = None) → bool

Asynchronously execute inference on a batch. This method requires an array of input and output buffers. The mapping from tensor names to binding indices can be queried using ICudaEngine.get_binding_index().

Parameters:
  • batch_size – The batch size. This is at most the value supplied when the ICudaEngine was built.
  • bindings – A list of integers representing input and output buffer addresses for the network.
  • stream_handle – A handle for a CUDA stream on which the inference kernels will be executed.
  • input_consumed – An optional event which will be signaled when the input buffers can be refilled with new data.
Returns:

True if the kernels were executed successfully.
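A sketch of the asynchronous path, reusing the buffers from the execute() example; the input copy, the inference kernels, and the output copy are all enqueued on one stream, so the host blocks only at the final synchronize:

    stream = cuda.Stream()

    cuda.memcpy_htod_async(dev_bufs[inp], host_bufs[inp], stream)
    ok = context.execute_async(batch_size=batch_size,
                               bindings=[int(d) for d in dev_bufs],
                               stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(host_bufs[out], dev_bufs[out], stream)
    stream.synchronize()  # wait for the kernels and the copy back to finish
    assert ok

If an input_consumed event were supplied, it would be signaled once the input buffers may be refilled, allowing the next batch's host-to-device copy to overlap with the remainder of execution.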