VPI - Vision Programming Interface

1.2 Release

CUDA Interoperability

Declaration of functions for CUDA interoperability.

VPIStatus vpiImageCreateCUDAMemWrapper (const VPIImageData *cudaData, uint32_t flags, VPIImage *img)
 Create an image object by wrapping around an existing device (CUDA) memory block.
 
VPIStatus vpiArrayCreateCUDAMemWrapper (const VPIArrayData *arrayData, uint32_t flags, VPIArray *array)
 Create an array object by wrapping an existing device (CUDA) memory block.
 
VPIStatus vpiArraySetWrappedCUDAMem (VPIArray array, const VPIArrayData *arrayData)
 Redefines the wrapped device (CUDA) memory in an existing VPIArray wrapper.
 
VPIStatus vpiStreamCreateCUDAStreamWrapper (CUstream cudaStream, uint32_t flags, VPIStream *stream)
 Wraps an existing cudaStream_t into a VPI stream.
 
VPIStatus vpiImageSetWrappedCUDAMem (VPIImage img, const VPIImageData *hostData)
 Redefines the wrapped device (CUDA) memory in an existing VPIImage wrapper.
 
VPIStatus vpiEventCreateCUDAEventWrapper (CUevent cudaEvent, VPIEvent *event)
 Create an event object by wrapping around an existing CUDA CUevent object.
 

Detailed Description

Declaration of functions for CUDA interoperability.

The provided functions allow wrapping CUDA objects created outside of VPI. The wrapped objects can then be used efficiently in VPI compute pipelines.

Function Documentation

◆ vpiArrayCreateCUDAMemWrapper()

VPIStatus vpiArrayCreateCUDAMemWrapper (const VPIArrayData *arrayData, uint32_t flags, VPIArray *array)

#include <vpi/CUDAInterop.h>

Create an array object by wrapping an existing device (CUDA) memory block.

The stride between elements must be at least as large as the element structure size and must respect the alignment requirements of the element data structure.

The returned handle must be destroyed by calling vpiArrayDestroy when it is no longer needed.

The array object doesn't own the wrapped memory. The user remains responsible for the wrapped memory's lifetime; the memory must stay valid until the array object is destroyed.

Parameters
[in] arrayData  VPIArrayData pointing to the device (CUDA) memory block to be wrapped.
[in] flags  Array flags. The backends in which the array can be used are specified by or-ing together VPIBackend flags. Set flags to 0 to enable the array in all backends supported by the active VPI context.
[out] array  Pointer to memory that will receive the created array handle.
Returns
VPI_SUCCESS on success, otherwise an error code.
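
As an illustration, the sketch below wraps a cudaMalloc-allocated keypoint buffer as a VPIArray. It is a minimal sketch only: the VPIArrayData member names (type, size, capacity, strideBytes, data), the VPIKeypoint element type, and the header names are assumptions based on the VPI 1.x headers and should be checked against vpi/Array.h and vpi/Types.h in the installed release.

#include <string.h>
#include <cuda_runtime.h>
#include <vpi/Array.h>
#include <vpi/CUDAInterop.h>
#include <vpi/Types.h>

#define CAPACITY 128

int wrap_keypoints(VPIArray *array, VPIKeypoint **devPtr)
{
    /* Device (CUDA) buffer that will back the VPI array. */
    if (cudaMalloc((void **)devPtr, CAPACITY * sizeof(VPIKeypoint)) != cudaSuccess)
        return -1;

    VPIArrayData arrayData;
    memset(&arrayData, 0, sizeof(arrayData));
    arrayData.type        = VPI_ARRAY_TYPE_KEYPOINT; /* element type (member name assumed) */
    arrayData.size        = 0;                       /* no valid elements yet */
    arrayData.capacity    = CAPACITY;                /* maximum number of elements */
    arrayData.strideBytes = sizeof(VPIKeypoint);     /* >= element size, suitably aligned */
    arrayData.data        = *devPtr;                 /* the CUDA pointer being wrapped */

    /* flags == 0: enable the array in all backends supported by the active context. */
    if (vpiArrayCreateCUDAMemWrapper(&arrayData, 0, array) != VPI_SUCCESS)
    {
        cudaFree(*devPtr);
        return -1;
    }

    /* Tear-down order: vpiArrayDestroy(*array) first, then cudaFree(*devPtr);
     * the wrapper never owns the CUDA memory. */
    return 0;
}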

◆ vpiArraySetWrappedCUDAMem()

VPIStatus vpiArraySetWrappedCUDAMem (VPIArray array, const VPIArrayData *arrayData)

#include <vpi/CUDAInterop.h>

Redefines the wrapped device (CUDA) memory in an existing VPIArray wrapper.

The old wrapped memory and the new one must have the same capacity and element format, and both must point to device-side memory. The VPIArray must have been created by vpiArrayCreateCUDAMemWrapper.

This operation is efficient and does not allocate memory. The wrapped memory will be accessible to the same backends specified during wrapper creation.

The wrapped memory must not be deallocated while it's still being wrapped.

Parameters
[in] array  Handle to the array created by vpiArrayCreateCUDAMemWrapper.
[in] arrayData  VPIArrayData pointing to the new device (CUDA) memory block to be wrapped.
Returns
VPI_SUCCESS on success, otherwise an error code.
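
The sketch below retargets an existing CUDA-array wrapper at a second device buffer of identical capacity and element type, e.g. for double buffering. As in the previous sketch, the VPIArrayData member names and the VPIKeypoint element type are assumptions taken from the VPI 1.x headers.

#include <string.h>
#include <vpi/Array.h>
#include <vpi/CUDAInterop.h>
#include <vpi/Types.h>

VPIStatus rewrap_keypoints(VPIArray array, VPIKeypoint *newDevPtr, int32_t capacity)
{
    VPIArrayData newData;
    memset(&newData, 0, sizeof(newData));
    newData.type        = VPI_ARRAY_TYPE_KEYPOINT; /* must match the original element type */
    newData.size        = 0;
    newData.capacity    = capacity;                /* must match the original capacity */
    newData.strideBytes = sizeof(VPIKeypoint);
    newData.data        = newDevPtr;               /* new device (CUDA) buffer */

    /* No allocation happens; the wrapper simply starts referring to newDevPtr.
     * The previously wrapped buffer may only be freed once it is no longer wrapped. */
    return vpiArraySetWrappedCUDAMem(array, &newData);
}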

◆ vpiEventCreateCUDAEventWrapper()

VPIStatus vpiEventCreateCUDAEventWrapper (CUevent cudaEvent, VPIEvent *event)

#include <vpi/CUDAInterop.h>

Create an event object by wrapping around an existing CUDA CUevent object.

The created event can be passed to vpiEventSync / vpiStreamWaitEvent to synchronize on a previously recorded CUDA event. Conversely, CUDA synchronization functions can be used to wait on events captured with vpiEventRecord.

Warning
This function is currently not implemented.
Parameters
[in] cudaEvent  CUDA event handle to be wrapped.
[out] event  Pointer to memory that will receive the created event handle.
Returns
Always returns VPI_ERROR_NOT_IMPLEMENTED.

◆ vpiImageCreateCUDAMemWrapper()

VPIStatus vpiImageCreateCUDAMemWrapper (const VPIImageData *cudaData, uint32_t flags, VPIImage *img)

#include <vpi/CUDAInterop.h>

Create an image object by wrapping around an existing device (CUDA) memory block.

Only pitch-linear layout is supported. The created image object does not own or claim the wrapped memory block.

Parameters
[in] cudaData  Pointer to a VPIImageData structure describing the CUDA memory to be wrapped.
[in] flags  Image flags. The backends in which the image can be used are specified by or-ing together VPIBackend flags. Set flags to 0 to enable the image in all backends supported by the active VPI context.
[out] img  Pointer to memory that will receive the created image handle.
Returns
VPI_SUCCESS on success, otherwise an error code.
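
A minimal sketch of wrapping a pitch-linear buffer allocated with cudaMallocPitch as a single-plane 8-bit image. The VPIImageData and VPIImagePlane member names used below are assumptions based on the VPI 1.x headers; verify them against vpi/Image.h in the installed release.

#include <string.h>
#include <cuda_runtime.h>
#include <vpi/Image.h>
#include <vpi/CUDAInterop.h>

int wrap_u8_image(int32_t width, int32_t height, VPIImage *img, void **devPtr, size_t *pitch)
{
    /* Pitch-linear device allocation; *pitch receives the row stride in bytes. */
    if (cudaMallocPitch(devPtr, pitch, (size_t)width, (size_t)height) != cudaSuccess)
        return -1;

    VPIImageData imgData;
    memset(&imgData, 0, sizeof(imgData));
    imgData.format               = VPI_IMAGE_FORMAT_U8; /* single-plane, 8-bit, pitch-linear */
    imgData.numPlanes            = 1;
    imgData.planes[0].width      = width;
    imgData.planes[0].height     = height;
    imgData.planes[0].pitchBytes = (int32_t)*pitch;      /* row stride from cudaMallocPitch */
    imgData.planes[0].data       = *devPtr;              /* device (CUDA) pointer being wrapped */

    /* Restrict the wrapper to the CUDA backend; passing 0 instead would enable it
     * in all backends supported by the active context. */
    if (vpiImageCreateCUDAMemWrapper(&imgData, VPI_BACKEND_CUDA, img) != VPI_SUCCESS)
    {
        cudaFree(*devPtr);
        return -1;
    }

    /* Destroy with vpiImageDestroy before freeing the CUDA buffer. */
    return 0;
}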

◆ vpiImageSetWrappedCUDAMem()

VPIStatus vpiImageSetWrappedCUDAMem (VPIImage img, const VPIImageData *hostData)

#include <vpi/CUDAInterop.h>

Redefines the wrapped device (CUDA) memory in an existing VPIImage wrapper.

The old wrapped memory and the new one must have the same dimensions and format, and both must point to device-side (CUDA-accessible) memory.

The VPIImage must have been created by vpiImageCreateCUDAMemWrapper.

This operation is efficient and does not allocate memory. The wrapped memory will be accessible to the same backends specified during wrapper creation.

The wrapped memory must not be deallocated while it's still being wrapped.

Parameters
[in] img  Handle to the image created by vpiImageCreateCUDAMemWrapper.
[in] hostData  VPIImageData pointing to the new device (CUDA) memory block to be wrapped.
Returns
VPI_SUCCESS on success, otherwise an error code.
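
The sketch below points a CUDA-image wrapper created by vpiImageCreateCUDAMemWrapper at a different device buffer with the same dimensions and format. As before, the VPIImageData / VPIImagePlane member names are assumptions taken from the VPI 1.x headers.

#include <string.h>
#include <vpi/Image.h>
#include <vpi/CUDAInterop.h>

VPIStatus rewrap_u8_image(VPIImage img, void *newDevPtr,
                          int32_t width, int32_t height, int32_t pitchBytes)
{
    VPIImageData newData;
    memset(&newData, 0, sizeof(newData));
    newData.format               = VPI_IMAGE_FORMAT_U8; /* must match the original format */
    newData.numPlanes            = 1;
    newData.planes[0].width      = width;               /* must match the original dimensions */
    newData.planes[0].height     = height;
    newData.planes[0].pitchBytes = pitchBytes;
    newData.planes[0].data       = newDevPtr;            /* new device (CUDA) buffer */

    /* No reallocation; img now refers to newDevPtr, which must stay valid while wrapped. */
    return vpiImageSetWrappedCUDAMem(img, &newData);
}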

◆ vpiStreamCreateCUDAStreamWrapper()

VPIStatus vpiStreamCreateCUDAStreamWrapper (CUstream cudaStream, uint32_t flags, VPIStream *stream)

#include <vpi/CUDAInterop.h>

Wraps an existing cudaStream_t into a VPI stream.

CUDA algorithms are submitted for execution on the wrapped cudaStream_t. This allows VPI-driven processing to be inserted into an existing CUDA pipeline. Algorithms can still be submitted to other backends.

The VPIStream doesn't own the cudaStream_t, which must remain valid for the VPIStream's lifetime.

CUDA kernels can be submitted directly to the cudaStream_t only if it is guaranteed that all tasks submitted to the VPIStream have finished.

Parameters
[in] cudaStream  The CUDA stream handle to be wrapped.
[in] flags  Stream flags. VPI_BACKEND_CUDA is always added, but other backends can be specified as well by or-ing together VPIBackend flags.
[out] stream  Pointer that will receive the newly created VPIStream.
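
The sketch below wraps an application-owned cudaStream_t and submits a CUDA-backend algorithm to it. vpiSubmitGaussianFilter is used purely for illustration; its header (vpi/algo/GaussianFilter.h) and exact parameter list are assumptions and should be checked against the installed release.

#include <cuda_runtime.h>
#include <vpi/Stream.h>
#include <vpi/CUDAInterop.h>
#include <vpi/algo/GaussianFilter.h>

int run_on_cuda_stream(cudaStream_t cudaStream, VPIImage input, VPIImage output)
{
    VPIStream stream = NULL;

    /* CUstream (driver API) and cudaStream_t (runtime API) refer to the same object. */
    if (vpiStreamCreateCUDAStreamWrapper((CUstream)cudaStream, 0, &stream) != VPI_SUCCESS)
        return -1;

    /* CUDA-backend work submitted to 'stream' executes on 'cudaStream',
     * interleaving with whatever the application already runs there. */
    VPIStatus status = vpiSubmitGaussianFilter(stream, VPI_BACKEND_CUDA, input, output,
                                               5, 5, 1.0f, 1.0f, VPI_BORDER_ZERO);

    /* Make sure all VPI tasks have finished before using cudaStream directly again. */
    vpiStreamSync(stream);

    vpiStreamDestroy(stream); /* destroys the wrapper only, not the wrapped cudaStream_t */
    return status == VPI_SUCCESS ? 0 : -1;
}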