The Wrap Raw Pointers sample demonstrates how to wrap externally allocated memory (a CUDA device pointer, a host pointer, or an NvBufSurface) into VPI images. This is particularly useful when integrating VPI with other frameworks or existing pipelines, or when working with pre-allocated memory from hardware accelerators.
Wrapping external memory can avoid unnecessary memory copies when the selected backend can access the wrapped allocation directly. The sample applies a vertical image flip operation to demonstrate processing on the wrapped buffers.
The wrapper itself does not copy the external allocation. Copies can still happen later if an algorithm backend needs a memory representation that cannot be shared with the wrapped allocation.
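To make the no-copy point concrete, here is a minimal sketch in plain C++ that deliberately does not call the VPI API. `PlaneView`, `wrapPlane` and `pixelAt` are hypothetical names introduced only for illustration; the fields mirror the pBase, width, height and pitchBytes members of VPIImagePlanePitchLinear, and a single 8-bit plane is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical non-owning description of one 8-bit plane (not a VPI type).
struct PlaneView
{
    std::uint8_t *pBase;      // first byte of the first row; still owned by the caller
    std::int32_t  width;      // pixels per row
    std::int32_t  height;     // number of rows
    std::int32_t  pitchBytes; // bytes between the starts of consecutive rows
};

// Wrapping only records the pointer and the geometry; no pixel is copied.
inline PlaneView wrapPlane(std::uint8_t *data, std::int32_t width, std::int32_t height,
                           std::int32_t pitchBytes)
{
    assert(data != nullptr && width > 0 && height > 0 && pitchBytes >= width);
    return PlaneView{data, width, height, pitchBytes};
}

// Address of pixel (x, y) inside the wrapped allocation.
inline std::uint8_t *pixelAt(const PlaneView &view, std::int32_t x, std::int32_t y)
{
    assert(x >= 0 && x < view.width && y >= 0 && y < view.height);
    return view.pBase + static_cast<std::ptrdiff_t>(y) * view.pitchBytes + x;
}
```

A write through `pixelAt(view, x, y)` lands directly in the caller's buffer, so the caller has to keep that buffer alive for as long as the view is in use. The same lifetime concern applies to memory wrapped into a VPI image.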
To avoid these copies in CUDA/VPI interop pipelines, prefer the reverse direction: allocate the image with vpiImageCreate, run VPI operations on that VPI-owned image, and, when CUDA code needs to access it, call vpiImageLockData with VPI_IMAGE_BUFFER_CUDA_PITCH_LINEAR to obtain a CUDA view. Use the returned VPIImageBufferPitchLinear data only while the image is locked. For multimedia or NvSci pipelines, the same rule applies: create the image in VPI first, then lock it or extract the interop handle that the external code needs.
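One way to enforce the rule that locked data is used only while the image is locked is a small scope guard. The sketch below is plain C++ and does not call VPI itself; `ScopedUnlock` is a hypothetical helper, not part of the VPI API, and it assumes the unlock step can be expressed as a callable taking no arguments.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical RAII helper (not part of VPI): runs `unlock` exactly once when
// the guard leaves scope, so data obtained under the lock cannot outlive it.
class ScopedUnlock
{
public:
    explicit ScopedUnlock(std::function<void()> unlock)
        : m_unlock(std::move(unlock))
    {
        assert(m_unlock && "unlock callable must not be empty");
    }

    ~ScopedUnlock()
    {
        if (m_unlock)
        {
            m_unlock();
        }
    }

    // Non-copyable: two guards must never release the same lock.
    ScopedUnlock(const ScopedUnlock &)            = delete;
    ScopedUnlock &operator=(const ScopedUnlock &) = delete;

private:
    std::function<void()> m_unlock;
};
```

In a VPI program, the guard would be created right after a successful vpiImageLockData call, with a callable that invokes vpiImageUnlock on the same image. The CUDA code would then touch the locked plane pointers only inside that scope.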
The sample produces a vertically flipped grayscale version of the input image saved as output.png.
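For reference, the effect of a vertical flip on a single 8-bit pitch-linear plane can be sketched in plain C++ as follows. This is an illustrative CPU version, not the VPI implementation; `flipVerticalU8` is a hypothetical name, and separate input and output buffers are assumed, as in the sample.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reference vertical flip for one 8-bit plane: output row y receives input
// row (height - 1 - y). Pitches are in bytes and may exceed `width` when
// rows are padded.
inline void flipVerticalU8(const std::uint8_t *src, std::size_t srcPitchBytes,
                           std::uint8_t *dst, std::size_t dstPitchBytes,
                           std::size_t width, std::size_t height)
{
    assert(src != nullptr && dst != nullptr && src != dst);
    assert(srcPitchBytes >= width && dstPitchBytes >= width);

    for (std::size_t y = 0; y < height; ++y)
    {
        const std::uint8_t *srcRow = src + (height - 1 - y) * srcPitchBytes;
        std::uint8_t       *dstRow = dst + y * dstPitchBytes;
        std::memcpy(dstRow, srcRow, width); // copy visible pixels only, skip row padding
    }
}
```

Comparing the sample's output.png against a reference like this one is a simple way to check that a wrapped buffer was described with the correct width, height and pitch.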
For convenience, here's the code that is also installed in the samples directory.
#include <opencv2/core/version.hpp>
#if CV_MAJOR_VERSION >= 3
#    include <opencv2/imgcodecs.hpp>
#else
#    include <opencv2/contrib/contrib.hpp>
#    include <opencv2/highgui/highgui.hpp>
#endif

#include <cuda_runtime.h>
#include <nvbufsurface.h>
// Throws std::runtime_error with the VPI status name and message if STMT fails.
#define CHECK_STATUS(STMT)                                                                  \
    do                                                                                      \
    {                                                                                       \
        VPIStatus status = (STMT);                                                          \
        if (status != VPI_SUCCESS)                                                          \
        {                                                                                   \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];                                     \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));                                \
            std::ostringstream ss;                                                          \
            ss << "line " << __LINE__ << " " << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());                                             \
        }                                                                                   \
    } while (0)
int main(int argc, char *argv[])
        throw std::runtime_error(std::string("Usage: ") + argv[0] + " " +
                                 "<input_image> <cuda_ptr|cpu_ptr|nvbufsurface>");
    std::string imgName   = argv[1];
    std::string ptrToWrap = argv[2];
    if (ptrToWrap == "cuda_ptr")
    else if (ptrToWrap == "cpu_ptr")
    else if (ptrToWrap == "nvbufsurface")
        throw std::runtime_error("Pointer to wrap must be one of cuda_ptr, cpu_ptr or nvbufsurface");
    cv::Mat image = cv::imread(imgName, cv::IMREAD_GRAYSCALE);
        throw std::runtime_error("Failed to read input image: " + imgName);
    uint32_t imgWidth  = image.cols;
    uint32_t imgHeight = image.rows;
    VPIImage bufIn = nullptr, bufOut = nullptr;
    uint8_t *cudaIn = nullptr, *cudaOut = nullptr;
    NvBufSurface *surfIn = nullptr, *surfOut = nullptr;
    uint8_t *cpuIn = nullptr, *cpuOut = nullptr;
    if (ptrToWrap == "nvbufsurface")
        NvBufSurfaceCreateParams nv_buf_params{};
        nv_buf_params.width       = imgWidth;
        nv_buf_params.height      = imgHeight;
        nv_buf_params.memType     = NVBUF_MEM_SURFACE_ARRAY;
        nv_buf_params.layout      = NVBUF_LAYOUT_PITCH;
        nv_buf_params.colorFormat = NVBUF_COLOR_FORMAT_GRAY8;
        if (0 != NvBufSurfaceCreate(&surfIn, 1, &nv_buf_params))
            throw std::runtime_error("NvBufSurfaceCreate failed");
        if (0 != NvBufSurfaceCreate(&surfOut, 1, &nv_buf_params))
            throw std::runtime_error("NvBufSurfaceCreate failed");
        if (0 != Raw2NvBufSurface(image.data, 0, 0, imgWidth, imgHeight, surfIn))
            throw std::runtime_error("Copying to NvBufSurface failed");
        dataIn.buffer.fd = surfIn->surfaceList[0].bufferDesc;
        dataOut.buffer.fd = surfOut->surfaceList[0].bufferDesc;
    else if (ptrToWrap == "cuda_ptr")
        if (0 != cudaMalloc(&cudaIn, imgWidth * imgHeight * sizeof(uint8_t)))
            throw std::runtime_error("Allocating CUDA pitch failed");
        if (0 != cudaMalloc(&cudaOut, imgWidth * imgHeight * sizeof(uint8_t)))
            throw std::runtime_error("Allocating CUDA pitch failed");
        if (0 != cudaMemcpy2D(cudaIn, imgWidth, image.data, imgWidth, imgWidth, imgHeight, cudaMemcpyHostToDevice))
            throw std::runtime_error("Copying to CUDA pitch failed");
        cpuIn  = static_cast<uint8_t *>(malloc(imgWidth * imgHeight * sizeof(uint8_t)));
        cpuOut = static_cast<uint8_t *>(malloc(imgWidth * imgHeight * sizeof(uint8_t)));
        memcpy(cpuIn, image.data, imgWidth * imgHeight);
    if (ptrToWrap == "nvbufsurface")
        if (0 != NvBufSurface2Raw(surfOut, 0, 0, imgWidth, imgHeight, image.data))
            throw std::runtime_error("Copying from NvBufSurface failed");
    else if (ptrToWrap == "cuda_ptr")
        if (0 != cudaMemcpy2D(image.data, imgWidth, cudaOut, imgWidth, imgWidth, imgHeight, cudaMemcpyDeviceToHost))
            throw std::runtime_error("Copying from CUDA pitch failed");
        memcpy(image.data, cpuOut, imgWidth * imgHeight);
    cv::imwrite("output.png", image);
    if (ptrToWrap == "nvbufsurface")
        NvBufSurfaceDestroy(surfIn);
        NvBufSurfaceDestroy(surfOut);
    else if (ptrToWrap == "cuda_ptr")
    if (ptrToWrap == "cuda_ptr")
    std::_Exit(EXIT_SUCCESS);