holoscan::GPUResidentExecutor
Inherits from: holoscan::Executor (public)
Construct a new GPUResidentExecutor object.
Parameters
The pointer to the fragment of the executor.
Run the graph.
Parameters
The reference to the graph.
Run the graph asynchronously.
Returns: The future object.
Parameters
The reference to the graph.
Set the context.
Parameters
The context.
Initialize the fragment_ in this Executor.
This method is called by run() to initialize the fragment and the graph of operators in the fragment before execution.
Returns: true if fragment initialization is successful. Otherwise, false.
Initialize the given operator.
This method is called by Operator::initialize() to initialize the operator.
Depending on the type of the operator, this method may be overridden to initialize the operator. For example, the default executor (GXFExecutor) initializes the operator using the GXF API and sets the operator’s ID to the ID of the GXF codelet.
Returns: true if the operator is initialized successfully. Otherwise, false.
Parameters
The pointer to the operator.
Initialize the given scheduler.
This method is called by Scheduler::initialize() to initialize the scheduler.
Depending on the type of the scheduler, this method may be overridden to initialize the scheduler. For example, the default executor (GXFExecutor) initializes the scheduler using the GXF API and sets the scheduler’s ID to the ID of the GXF scheduler.
Returns: true if the scheduler is initialized successfully. Otherwise, false.
Parameters
The pointer to the scheduler.
Initialize the given network context.
This method is called by NetworkContext::initialize() to initialize the network context.
Depending on the type of the network context, this method may be overridden to initialize the network context. For example, the default executor (GXFExecutor) initializes the network context using the GXF API and sets the network context’s ID to the ID of the GXF network context.
Returns: true if the network context is initialized successfully. Otherwise, false.
Parameters
The pointer to the network context.
Initialize the fragment services for the executor.
This method is called during executor initialization to set up any required fragment services.
Depending on the type of executor, this method may be overridden to initialize specific fragment services. For example, the default executor (GXFExecutor) may initialize fragment services using the GXF API.
Returns: true if the fragment services are initialized successfully. Otherwise, false.
Prepare data flow connections for a topologically ordered GPU-resident graph.
This initializes operator specs, assigns per-port unique IDs, and allocates/connects device memory for every supported edge in the graph.
Parameters
The operator graph.
Operators flattened in deterministic topological order.
This function initializes CUDA.
Currently, it sets the device to 0 by default. Setting a different GPU device for GPU-resident graph execution is not yet supported.
This function returns the device memory address of an input or output port corresponding to a given port name.
GPU-resident operators use this function to get the device memory address of the input or output port.
Returns: The device memory address of the input or output port.
Parameters
The operator.
The name of the input or output port.
Verify the graph topology and flatten it in topological order.
GPU-resident execution currently supports acyclic graphs with exactly one source operator. This method only validates and flattens the operator graph itself.
Returns: True if the graph topology is supported by GPU-resident execution, false otherwise.
Parameters
The operator graph.
Output vector populated in deterministic topological order.
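The validation rule above (acyclic, exactly one source, deterministic order) can be illustrated with a small standalone sketch. It is not the executor's implementation: the `Graph` alias and `verify_and_flatten` are hypothetical names, the graph is modeled as a map from operator name to downstream names, and ties are broken alphabetically as one assumed way to make the order deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an operator graph: node name -> downstream names.
using Graph = std::map<std::string, std::vector<std::string>>;

// Returns true and fills `ordered` when the graph is acyclic and has exactly
// one source (a node without incoming edges). Uses Kahn's algorithm; the
// ordered set breaks ties by name so the result is deterministic.
bool verify_and_flatten(const Graph& graph, std::vector<std::string>& ordered) {
  ordered.clear();
  std::map<std::string, std::size_t> indegree;
  for (const auto& entry : graph) {
    indegree.emplace(entry.first, 0);  // no effect if already present
    for (const auto& next : entry.second) { indegree[next] += 1; }
  }

  std::set<std::string> ready;
  for (const auto& entry : indegree) {
    if (entry.second == 0) { ready.insert(entry.first); }
  }
  if (ready.size() != 1) { return false; }  // zero or multiple sources

  while (!ready.empty()) {
    const std::string node = *ready.begin();
    ready.erase(ready.begin());
    ordered.push_back(node);
    const auto it = graph.find(node);
    if (it == graph.end()) { continue; }  // sink that only appears as a target
    for (const auto& next : it->second) {
      assert(indegree[next] > 0);  // each edge is consumed exactly once
      if (--indegree[next] == 0) { ready.insert(next); }
    }
  }

  // Nodes left unvisited are part of a cycle.
  if (ordered.size() != indegree.size()) {
    ordered.clear();
    return false;
  }
  return true;
}
```

A diamond such as `src -> {a, b} -> sink` flattens to `src, a, b, sink`, while a graph with two sources or a cycle is rejected.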
Sends a tear down signal to the GPU-resident CUDA graph.
Indicates whether the result of a single iteration of the GPU-resident CUDA graph is ready or not.
Returns: true if the result is ready, false otherwise.
This function informs the GPU-resident CUDA graph that the data is ready for the main workload.
Indicates whether the GPU-resident CUDA graph has been launched.
Returns: true if the CUDA graph has been launched, false otherwise.
Get the execution context. Currently, this has no meaning for GPU-resident graph execution. If something needs to be stored for the execution context in the future, exec_context_ will hold a pointer to an ExecutionContext object.
Get the CUDA device pointer for the data_ready signal.
Returns: Pointer to the device memory location for data_ready signal.
Get the CUDA device pointer for the result_ready signal.
Returns: Pointer to the device memory location for result_ready signal.
Get the CUDA device pointer for the tear_down signal.
Returns: Pointer to the device memory location for tear_down signal.
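How the data_ready, result_ready, and tear_down signals might interact can be modeled on the host with plain booleans. This is only a sketch under assumptions: the exact ordering and reset rules of the real device-side loop are not stated here, and `Signals`, `submit`, and `poll_once` are hypothetical names, not part of the executor's API.

```cpp
#include <cassert>

// Hypothetical host-side model of the three signals. In the real executor
// these are locations in device memory.
struct Signals {
  bool data_ready = false;
  bool result_ready = false;
  bool tear_down = false;
};

enum class PollResult { kIdle, kProcessed, kStopped };

// Host side: publish one input value and mark it ready.
void submit(Signals& s, int& slot, int input) {
  assert(!s.data_ready && "previous input has not been consumed yet");
  slot = input;
  s.result_ready = false;
  s.data_ready = true;
}

// Models one pass of a persistent processing loop (assumed behavior):
// stop on tear_down, idle when no data is ready, otherwise run the workload
// and flag the result.
PollResult poll_once(Signals& s, int& slot) {
  if (s.tear_down) { return PollResult::kStopped; }
  if (!s.data_ready) { return PollResult::kIdle; }  // a real loop would sleep here
  slot *= 2;  // placeholder for the main workload
  s.data_ready = false;
  s.result_ready = true;
  return PollResult::kProcessed;
}
```

In this model the idle branch is where the configurable on-device sleep interval would apply.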
Register a data ready handler fragment.
This function stores a reference to the fragment that will handle data ready events.
Parameters
The fragment to register as the data ready handler.
Get the registered data ready handler fragment.
Returns: The data ready handler fragment, or nullptr if none is registered.
Set the sleep interval on device when data is not ready.
Parameters
The sleep interval in microseconds. Default is 500 us.
Enable or disable a system-wide fence in the while-end-marker kernel.
Parameters
True to enable, false to disable.
See also: Fragment::GPUResidentAccessor::sync_with_host for the public-facing API and full documentation.
Enable execution time measurement.
Execution time is the time between the start of a streaming data iteration and the end of the same iteration. Execution time is not measured when the data is not marked as ready.
Parameters
The total number of samples to collect. Default is 100.
Get the host pointer to the execution times in microseconds.
Returns: A pair of the host pointer to the execution times in microseconds and the number of samples collected.
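Once the pointer and sample count are available, summarizing them is ordinary host code. The sketch below assumes nothing about the element type (it is a template parameter) and uses hypothetical names `TimingSummary` and `summarize`; the local array stands in for the buffer returned by the executor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

struct TimingSummary {
  double mean_us;
  double max_us;
};

// Computes mean and maximum over `count` samples (in microseconds).
// T is whatever arithmetic type the samples are stored as.
template <typename T>
TimingSummary summarize(const T* samples, std::size_t count) {
  assert(samples != nullptr || count == 0);
  TimingSummary out{0.0, 0.0};
  if (count == 0) { return out; }
  double sum = 0.0;
  for (std::size_t i = 0; i < count; ++i) {
    const double value = static_cast<double>(samples[i]);
    sum += value;
    out.max_us = std::max(out.max_us, value);
  }
  out.mean_us = sum / static_cast<double>(count);
  return out;
}
```

Given a pair `p` of pointer and count, a call would look like `summarize(p.first, p.second)`.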
Interrupt the execution.
Returns: true if the interrupt was successful (graph was running), false if the graph was not running (already stopped or not started).
Wait for the execution to complete.
This method blocks until the graph execution (started by run_async() or interrupted by interrupt()) completes. It should be called after interrupt() to ensure that the scheduler has fully stopped before cleanup operations are performed.
Only call this if interrupt() returned true. Calling wait() when the graph is not running can cause issues with concurrent cleanup.
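The calling rule above (wait only when interrupt() reported success) can be captured in a small guard. `MiniExecutor` below is a hypothetical single-threaded stand-in used only to show the call pattern; it does not reproduce the real executor's threading or cleanup.

```cpp
#include <cassert>

// Hypothetical stand-in that tracks only the running state.
class MiniExecutor {
 public:
  void start() { running_ = true; }

  // Returns true if a running graph was asked to stop, false otherwise.
  bool interrupt() {
    if (!running_) { return false; }
    stop_requested_ = true;
    return true;
  }

  // Models blocking until execution has stopped.
  void wait() {
    assert(stop_requested_ && "wait() is only valid after interrupt() returned true");
    running_ = false;
    stop_requested_ = false;
  }

  bool running() const { return running_; }

 private:
  bool running_ = false;
  bool stop_requested_ = false;
};

// Shutdown helper following the documented rule: call wait() only when
// interrupt() succeeded.
bool shutdown(MiniExecutor& executor) {
  if (!executor.interrupt()) { return false; }
  executor.wait();
  return true;
}
```

Calling `shutdown` on an executor that was never started, or twice in a row, returns false without ever reaching `wait()`.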
Get a pointer to the Fragment object.
Set the pointer to the fragment of the executor.
Parameters
The pointer to the fragment of the executor.
Get whether the context is owned by the executor.
Returns: true if the context is owned by the executor. Otherwise, false.
Get the extension manager.
Returns: The shared pointer of the extension manager.
Set the exception.
This method is called by the framework to store the exception that occurred during the execution of the fragment. If the exception is set, this exception is rethrown by the framework after the execution of the fragment.
Parameters
The exception to store.
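The store-then-rethrow behavior described above matches the standard `std::exception_ptr` pattern. The sketch below shows that pattern in isolation; `ExceptionSlot` and `run_and_report` are hypothetical names, and whether the framework uses this exact mechanism is an assumption.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Holds at most one captured exception for later rethrow.
class ExceptionSlot {
 public:
  void set_exception(std::exception_ptr e) { stored_ = e; }
  bool has_exception() const { return static_cast<bool>(stored_); }

  // Rethrows the stored exception, if any, and clears the slot first.
  void rethrow_if_set() {
    if (!stored_) { return; }
    std::exception_ptr e = stored_;
    stored_ = nullptr;
    std::rethrow_exception(e);
  }

 private:
  std::exception_ptr stored_;
};

// Simulates a failing execution: the error is captured during the run and
// surfaced afterwards. Returns the message of the rethrown exception.
std::string run_and_report(ExceptionSlot& slot) {
  try {
    throw std::runtime_error("operator failed");  // placeholder failure
  } catch (...) {
    slot.set_exception(std::current_exception());
  }
  assert(slot.has_exception());
  try {
    slot.rethrow_if_set();
  } catch (const std::exception& e) {
    return e.what();
  }
  return "";
}
```

Capturing with `std::current_exception()` preserves the dynamic type, so the caller can still catch the original exception class after execution ends.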
Inspect the port specs of a single source_port -> destination_port connection and either allocate a shared device buffer or wire an externally-owned device pointer.
Parameters
The upstream operator (owns the output port).
The downstream operator (owns the input port).
Name of the output port on source_op.
Name of the input port on dest_op.
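The two outcomes for a connection (a newly allocated shared buffer, or an externally owned pointer) can be sketched with a host-side table. This is a simplified model under assumptions: host memory stands in for device memory, `PortTable` and its methods are hypothetical, and fan-out from one output port to several inputs is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class PortTable {
 public:
  using PortKey = std::pair<std::string, std::string>;  // (operator, port)

  // Connects one edge. When `external` is null, a buffer of `bytes` is
  // allocated and owned by the table; otherwise the external pointer is wired.
  // Both ends of the edge resolve to the same address afterwards.
  void connect(const std::string& source_op, const std::string& source_port,
               const std::string& dest_op, const std::string& dest_port,
               std::size_t bytes, void* external = nullptr) {
    void* address = external;
    if (address == nullptr) {
      assert(bytes > 0);
      owned_.emplace_back(new unsigned char[bytes]());
      address = owned_.back().get();
    }
    ports_[PortKey(source_op, source_port)] = address;
    ports_[PortKey(dest_op, dest_port)] = address;
  }

  // Looks up the address for a port; returns nullptr for unknown ports.
  void* address_of(const std::string& op, const std::string& port) const {
    const auto it = ports_.find(PortKey(op, port));
    return it == ports_.end() ? nullptr : it->second;
  }

 private:
  std::map<PortKey, void*> ports_;
  std::vector<std::unique_ptr<unsigned char[]>> owned_;
};
```

Because both ends of an edge share one address, a lookup by operator and port name returns the same pointer for the upstream output and the downstream input.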
This function creates the full GPU-resident CUDA graph.
It also instantiates the CUDA graph to be ready for launch.
This function verifies that the operator names are distinct between the main workload fragment and the data ready handler fragment.
Assumes topologically ordered operators are already created before calling this function.
Returns: True if the operator names are distinct, false otherwise.
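The distinctness check amounts to testing that two name lists do not intersect. A minimal sketch, with the hypothetical function `names_are_distinct` and the assumption that names within each fragment are already unique:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns true when no operator name appears in both fragments.
bool names_are_distinct(const std::vector<std::string>& main_ops,
                        const std::vector<std::string>& handler_ops) {
  const std::set<std::string> seen(main_ops.begin(), main_ops.end());
  // Assumption: names inside a single fragment are unique to begin with.
  assert(seen.size() == main_ops.size());
  for (const auto& name : handler_ops) {
    if (seen.count(name) != 0) { return false; }
  }
  return true;
}
```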
Add the receivers as input ports of the given operator.
This method is called by Fragment::add_flow() to support the case where the destination input port label refers to a parameter of the downstream operator whose type is ‘std::vector<holoscan::IOSpec*>’. It finds the parameter with type ‘std::vector<holoscan::IOSpec*>’ and creates a new input port with an indexed label of the form ‘parameter name:index’ (e.g., ‘receivers:0’).
Returns: true if the receivers are added successfully. Otherwise, false.
Parameters
The reference to the shared pointer of the operator.
The name of the receivers whose parameter type is ‘std::vector<holoscan::IOSpec*>’.
The reference to the vector of input port labels to which the input port labels are added. In the case of multiple receivers, the input port label is updated to ‘parameter name:index’ (e.g. ‘receivers’ => ‘receivers:0’).
The reference to the vector of IOSpec pointers.
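The ‘parameter name:index’ labeling scheme can be illustrated with a short string helper. `add_receiver_label` is a hypothetical function, and deriving the index from the number of labels that already use the same prefix is an assumption made for this sketch, not a statement about how the executor computes it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends "<receivers_name>:<index>" to `input_labels` and returns the new
// label. The index is the count of existing labels with the same prefix.
std::string add_receiver_label(const std::string& receivers_name,
                               std::vector<std::string>& input_labels) {
  assert(!receivers_name.empty());
  const std::string prefix = receivers_name + ":";
  std::size_t index = 0;
  for (const auto& label : input_labels) {
    if (label.compare(0, prefix.size(), prefix) == 0) { ++index; }
  }
  const std::string label = prefix + std::to_string(index);
  input_labels.push_back(label);
  return label;
}
```

Two successive connections to a parameter named `receivers` would then yield `receivers:0` and `receivers:1`.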
Add a control flow between two operators.
This method is called by Fragment::add_flow() to add a control flow between two operators.
Returns: true if the control flow is added successfully. Otherwise, false.
Parameters
The shared pointer to the upstream operator.
The shared pointer to the downstream operator.