Class CudaObjectHandler
- Defined in File gxf_cuda.hpp 
Base Type
- public holoscan::CudaObjectHandler (Class CudaObjectHandler)
class CudaObjectHandler : public holoscan::CudaObjectHandler
- This class handles usage of CUDA streams for operators.
- When using CUDA operations, the default stream ‘0’ synchronizes with all other streams in the same context (see https://docs.nvidia.com/cuda/cuda-runtime-api/stream-sync-behavior.html#stream-sync-behavior), which can reduce performance. The CudaObjectHandler class manages CUDA streams and events across operators and makes sure that CUDA operations are properly chained.
- Usage:
- This class is automatically added as an internal data member of each operator's ExecutionContext. It is configured automatically by ExecutionContext::init_cuda_object_handler(op), called from GXFWrapper::start(), just before Operator::start is called.
- A stream pool for use by CudaObjectHandler can be added to the operator either by explicitly adding a parameter with type std::shared_ptr<CudaStreamPool> and name cuda_stream_pool, or by passing an Arg<std::shared_ptr<CudaStreamPool>> to Fragment::make_operator when creating the operator. A stream pool is not required, but allocating the internal stream, or additional streams via allocate_cuda_stream, is only possible if a stream pool is present.
- This class is not intended for direct use by application authors. It exists to support the public methods available on InputContext, OutputContext and ExecutionContext, as described below.
- When the InputContext::receive method is called for a given port, the operator’s CudaObjectHandler updates its internal mapping of the streams available on the input ports.
- When InputContext::receive_cuda_stream is called, any streams found by the prior receive call for the specified port are synchronized to the operator’s internal stream, and that internal stream is then returned as a standard CUDA Runtime API cudaStream_t. If no CudaStreamPool was configured, the internal stream cannot be created; in that case, the first CUDA stream found on the input is returned and any remaining streams on the input are synchronized to it. If there are no streams on the input port and there is no internal CudaStreamPool, cudaStreamDefault is returned. When a non-default stream is returned, this method calls cudaSetDevice to set the active device to match that stream. It also automatically configures the output ports of the operator to emit that stream, so manually calling OutputContext::set_cuda_stream is not necessary when using this method.
- The InputContext::receive_cuda_streams method is intended for advanced use cases where the user wants to handle all received streams and their synchronization manually. It returns a vector<std::optional<cudaStream_t>> whose size is equal to the number of messages found on the input port. Any message without a stream has a std::nullopt entry in the vector.
- The ExecutionContext::allocate_cuda_stream method can be used if it is necessary to allocate an additional stream for use by the operator. In most cases this is not necessary, and the stream returned by InputContext::receive_cuda_stream can be used.
- The ExecutionContext::device_from_stream method can be used to determine which CUDA device id a given cudaStream_t returned by InputContext::receive_cuda_stream or InputContext::receive_cuda_streams belongs to.
- The OutputContext::set_cuda_stream method can be used to emit specific streams on specific output ports. Any non-default stream received by InputContext::receive_cuda_stream is already output automatically, so this method is mainly useful when managing the streams received via InputContext::receive_cuda_streams manually, or when additional internal streams were allocated via ExecutionContext::allocate_cuda_stream.
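The three cases described above for InputContext::receive_cuda_stream (stream pool present, no pool but streams on the port, neither) can be summarized in a small model. This is only a sketch of the documented rules, not the library's implementation: StreamId, kDefaultStream, Selection and select_stream are hypothetical names, and a plain int stands in for cudaStream_t so that the example builds without CUDA or Holoscan.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for cudaStream_t so the sketch builds without CUDA.
using StreamId = int;
constexpr StreamId kDefaultStream = 0;  // plays the role of cudaStreamDefault

struct Selection {
  StreamId stream;               // stream handed back to the caller
  std::vector<StreamId> synced;  // received streams synchronized to `stream`
};

// Models the selection rules described in the text (assumption: simplified).
// `internal` is the operator's internal stream, which only exists when a
// CudaStreamPool was configured.
Selection select_stream(const std::vector<StreamId>& received,
                        std::optional<StreamId> internal) {
  if (internal.has_value()) {
    // Pool present: all received streams are synchronized to the internal one.
    return Selection{*internal, received};
  }
  if (!received.empty()) {
    // No pool: the first received stream is returned and the remaining
    // streams are synchronized to it.
    return Selection{received.front(),
                     std::vector<StreamId>(received.begin() + 1, received.end())};
  }
  // No pool and no streams on the port: fall back to the default stream.
  return Selection{kDefaultStream, {}};
}
```

In this model, a caller that gets kDefaultStream back knows that neither a pool nor an upstream stream was available, which is the case where work may serialize against other streams in the context.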
- Public Functions
virtual ~CudaObjectHandler() override
- Destroy the CudaObjectHandler object. 
 - 
virtual void init_from_operator(Operator *op) override
- Use a CudaStreamPool from the specified Operator if one is present.
- Parameters
- op – The operator this instance of CudaObjectHandler is attached to. This operator must have already been initialized.
 
 - 
gxf_result_t add_stream(const CudaStreamHandle &stream_handle, const std::string &output_port_name)
- Add stream to output port (must be called before any emit call using that port).
- Parameters
- stream_handle – The stream to add 
- output_port_name – The name of the output port 
 
- Returns
- gxf_result_t 
 
 - 
virtual int add_stream(const cudaStream_t stream, const std::string &output_port_name) override
- Add stream to output port (must be called before any emit call using that port).
- Parameters
- stream – The stream to add 
- output_port_name – The name of the output port 
 
- Returns
- int result code (the gxf_result_t value for the operation)
 
 - 
expected<CudaStreamHandle, RuntimeError> get_cuda_stream_handle(gxf_context_t context, const std::string &input_port_name, bool allocate = true, bool sync_to_default = false)
- Get the CUDA stream handle which should be used for CUDA commands involving data from the specified input port.
- For multi-receivers or input ports with queue size > 1, the first stream found is returned after any remaining streams are synchronized to it. See get_cuda_stream_handles() instead to receive a vector of (optional) CUDA stream handles (one for each message).
- If no message stream is set and the allocate flag is true, a stream will be allocated from the internal CudaStreamPool. An unexpected is returned only if this allocation fails.
- Parameters
- context – The GXF context of the operator. 
- input_port_name – The name of the input port from which to retrieve the stream. 
- allocate – If true, allocate a new stream via a cuda_stream_pool parameter if no stream is found. 
- sync_to_default – If true, synchronize any streams to the default stream. If false, synchronization is done to the internal stream instead. 
 
- Returns
- CudaStreamHandle 
 
 - 
expected<std::vector<std::optional<CudaStreamHandle>>, RuntimeError> get_cuda_stream_handles(gxf_context_t context, const std::string &input_port_name)
- Get the CUDA stream handles which should be used for CUDA commands involving data from the specified input port.
- The size of the vector returned will be equal to the number of messages received on the input port. Any messages which did not contain a stream will result in a std::nullopt in the vector.
- Parameters
- context – The GXF context of the operator. 
- input_port_name – The name of the input port from which to retrieve the stream. 
 
- Returns
- vector<std::optional<CudaStreamHandle>> 
 
 - 
virtual cudaStream_t get_cuda_stream(void *context, const std::string &input_port_name, bool allocate = false, bool sync_to_default = true) override
- Get the CUDA stream which should be used for CUDA commands involving data from the specified input port.
- For multi-receivers or input ports with queue size > 1, see get_cuda_streams() instead to receive a vector of CUDA streams (one for each message).
- If no message stream is set and no stream can be allocated from the internal CudaStreamPool, returns cudaStreamDefault.
- Parameters
- context – The GXF context of the operator. 
- input_port_name – The name of the input port from which to retrieve the stream 
- allocate – If true, allocate a new stream via a cuda_stream_pool parameter if none is found on the input port. Otherwise, cudaStreamDefault will be returned. 
- sync_to_default – If true, synchronize any streams to the default stream. If false, synchronization is done to the first stream found on the port instead. 
 
- Returns
- cudaStream_t 
 
 - 
virtual std::vector<std::optional<cudaStream_t>> get_cuda_streams(void *context, const std::string &input_port_name) override
- Get the CUDA streams which should be used for CUDA commands involving data from the specified input port.
- The size of the vector returned will be equal to the number of messages received on the input port. Any messages which did not contain a stream will result in a std::nullopt in the vector.
- Parameters
- context – The GXF context of the operator. 
- input_port_name – The name of the input port from which to retrieve the stream 
 
- Returns
- vector<std::optional<cudaStream_t>> 
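
Since the returned vector has one entry per message, with std::nullopt for messages that carried no stream, callers typically need to scan it before deciding how to synchronize. The helpers below are a minimal sketch of that scan. StreamId, StreamsPerMessage, first_stream and messages_without_stream are hypothetical names introduced for illustration, and int stands in for cudaStream_t so that the example builds without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for cudaStream_t so the sketch builds without CUDA.
using StreamId = int;
// Same shape as the documented return type: one optional entry per message.
using StreamsPerMessage = std::vector<std::optional<StreamId>>;

// First stream carried by any message, or std::nullopt if none carried one.
std::optional<StreamId> first_stream(const StreamsPerMessage& streams) {
  for (const auto& s : streams) {
    if (s.has_value()) return s;
  }
  return std::nullopt;
}

// Indices of the messages that arrived without a stream.
std::vector<std::size_t> messages_without_stream(const StreamsPerMessage& streams) {
  std::vector<std::size_t> missing;
  for (std::size_t i = 0; i < streams.size(); ++i) {
    if (!streams[i].has_value()) missing.push_back(i);
  }
  return missing;
}
```

With real streams, the entries that do hold a value are the ones a caller would pass on for manual synchronization.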
 
 - 
gxf_result_t synchronize_streams(std::vector<std::optional<CudaStreamHandle>> stream_handles, CudaStreamHandle target_stream_handle, bool sync_to_default_stream = true)
- Sync all streams in stream_handles with target_stream_handle.
- Any streams in stream_handles that are not valid will be ignored.
- Parameters
- stream_handles – The vector of streams to sync. 
- target_stream_handle – The stream to sync to. 
- sync_to_default_stream – If true, also synchronize the target stream to the default stream 
 
- Returns
- gxf_result_t GXF_SUCCESS if all streams were successfully synced. 
 
 - 
virtual int synchronize_streams(std::vector<cudaStream_t> cuda_streams, cudaStream_t target_stream, bool sync_to_default_stream = true) override
- Sync all streams in cuda_streams with target_stream.
- Any streams in cuda_streams that are not valid will be ignored.
- Parameters
- cuda_streams – The vector of streams to sync. 
- target_stream – The stream to sync to. 
- sync_to_default_stream – If true, also synchronize the target stream to the default stream 
 
- Returns
- int 0 if all streams were successfully synced, otherwise an error code 
 
 - 
cudaStream_t stream_from_stream_handle(CudaStreamHandle stream_handle)
- Get the cudaStream_t value corresponding to a CudaStreamHandle.
- Parameters
- stream_handle – The CudaStreamHandle 
- Returns
- The CUDA stream contained within the CudaStream object 
 
 - 
expected<CudaStreamHandle, RuntimeError> stream_handle_from_stream(cudaStream_t stream)
- Get the CudaStreamHandle corresponding to a cudaStream_t.
- Parameters
- stream – The CUDA stream 
- Returns
- GXF Handle to the CudaStream object if found, otherwise an unexpected is returned. 
 
 - 
expected<gxf_uid_t, ErrorCode> get_output_stream_cid(const std::string &output_port_name)
- Get the GXF component ID for any stream to be emitted on the specified output port.
- Parameters
- output_port_name – The name of the output port 
- Returns
- expected<gxf_uid_t> 
 
 - 
gxf_result_t streams_from_message(gxf_context_t context, const nvidia::gxf::Entity &message, const std::string &input_name)
- Get the GXF component IDs for any events to be emitted on the specified output port.
- Parameters
- context – The GXF context 
- message – The GXF message entity 
- input_name – The name of the input port 
 
- Returns
- gxf_result_t
 
 - 
expected<CudaStreamHandle, RuntimeError> allocate_internal_stream(gxf_context_t context, const std::string &stream_name)
- Allocate an internal CUDA stream and store it in the mapping for the given input port.
- Parameters
- context – The GXF context 
- stream_name – The name of the stream 
 
- Returns
- GXF Handle to the allocated CudaStream component 
 
 - 
virtual int release_internal_streams(void *context) override
- Release all internally allocated CUDA streams. 
 - 
virtual void clear_received_streams() override
- Retain the existing unordered_maps and vectors of received streams, but clear the contents.
- This is used to refresh the state of the received streams before each Operator::compute call.
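
One plausible reading of "retain the containers, clear the contents" is sketched below. It assumes a map from input port name to a vector of received streams. The names PortStreams and clear_contents are hypothetical, int stands in for the stream type, and the class's actual internal members are not documented here.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the stream type so the sketch builds without CUDA.
using StreamId = int;
// Assumed layout: input port name -> streams received on that port.
using PortStreams = std::unordered_map<std::string, std::vector<StreamId>>;

// Empties every per-port vector but keeps the map entries themselves, so the
// next compute call can refill them without re-creating the keys.
// std::vector::clear() removes the elements and does not shrink the capacity.
void clear_contents(PortStreams& received) {
  for (auto& entry : received) {
    entry.second.clear();
  }
}
```

Keeping the keys and allocated storage avoids repeated allocation on every compute call, which is presumably why the contents are cleared rather than the containers destroyed.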