DriveWorks SDK Reference
3.0.4260 Release
For Test and Development only

Image
Note
SW Release Applicability: This module is available in both NVIDIA DriveWorks and NVIDIA DRIVE Software releases.

About This Module

The image module is composed of 3 submodules: Image, Image Streamer, and Frame Capture.

Image

The image module contains structures and methods that allow the user to create and set image handles that are compatible with NVIDIA® DriveWorks modules. An image is represented generically as a handle, dwImageHandle_t, which can be passed to a DriveWorks module for processing, or more specifically as a C struct. The content of the struct differs based on the type of image and its properties. All images share a common set of properties.

Image Properties

The image properties are:

  • the type (see next section)
  • the size in pixels (width and height)
  • the format, represented as an enum DW_IMAGE_FORMAT_COLORSPACE_PIXELTYPE(_PIXELORDER), where the COLORSPACE (RGB, YUV, RAW, etc.) describes the appearance of the individual pixel, the PIXELTYPE (UINT8, FLOAT16, etc.) describes the primitive data type of each pixel, and the PIXELORDER describes how the color space is arranged in memory. The PIXELORDER can be INTERLEAVED (the name is omitted), in which case the individual channels of the COLORSPACE are contiguous; PLANAR, in which the channels are separated into different planes; or SEMIPLANAR, in which some channels are contiguous and some are not (see YUV420)
  • the meta, which is the collection of metadata and pointers to datalines. This field is filled by the dwSensorCamera and contains parsed datalines (in the form of sensor-specific info) and uint8_t pointers to the memory location of the raw datalines. This is not supported in GL images and appears only when dealing with RAW images
  • the memoryLayout, which can be DW_IMAGE_MEMORY_TYPE_DEFAULT, DW_IMAGE_MEMORY_TYPE_PITCH or DW_IMAGE_MEMORY_TYPE_BLOCK and represents the arrangement of data in memory. Only CUDA and NvMedia images can handle both pitch and block layouts; CPU is strictly pitch and GL is strictly block. The default memory layout automatically chooses the proper layout once the image is given to a DriveWorks module
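
The naming convention for formats can be illustrated with a small sketch. The enum PixelOrder and the function formatName() are hypothetical helpers written for this illustration; they are not part of the DriveWorks API, which exposes the formats as enum values rather than strings.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum for illustration only; not a DriveWorks type.
enum class PixelOrder { Interleaved, Planar, SemiPlanar };

// Builds a name that follows DW_IMAGE_FORMAT_COLORSPACE_PIXELTYPE(_PIXELORDER).
inline std::string formatName(const std::string& colorSpace,
                              const std::string& pixelType,
                              PixelOrder order)
{
    assert(!colorSpace.empty() && !pixelType.empty());
    std::string name = "DW_IMAGE_FORMAT_" + colorSpace + "_" + pixelType;
    if (order == PixelOrder::Planar) {
        name += "_PLANAR";
    } else if (order == PixelOrder::SemiPlanar) {
        name += "_SEMIPLANAR";
    }
    // Interleaved: the pixel order is omitted from the name.
    return name;
}
```

For example, formatName("YUV420", "UINT8", PixelOrder::SemiPlanar) yields the name of the semiplanar YUV420 format that appears in the conversion table later in this page.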

Image Types

Any image can be created by calling dwImage_create() and should be released with dwImage_destroy() when it is no longer needed. The creation is specific to the type of image, and there are 4 supported types. After the image is created, the handle can be passed to DriveWorks modules that accept the opaque handle; otherwise, a struct specific to the image type can be retrieved. The struct allows direct access to the content of the image, and any modification affects the original image.

CPU Images

A CPU image is stored as a pitch memory buffer represented by an array of pointers, an array of pitches, and properties. Its content can be retrieved from a dwImageHandle_t by calling dwImage_getCPU(), which returns a dwImageCPU containing:

  • dwImageProperties prop: image properties
  • size_t pitch[] : the pitches, one per image plane
  • uint8_t *data[]: the pointers to the actual data, one pointer per image plane
  • dwTime_t timestamp_us : the timestamp of acquisition from a sensor. If the image is created by the user, it is 0

The CPU image is created by specifying the DW_IMAGE_CPU type in the properties and calling one of:

  • dwImage_create(): creates the handle and also allocates memory for data[] based on the properties. Destroying such an image also frees the memory
  • dwImage_createAndBindBuffer(): creates the handle but does not allocate memory; instead, data[] points to the buffers allocated and passed by the user. The function trusts that the user buffers match the specified properties. Destroying such an image only destroys the handle; ownership of the buffers remains with the user
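
Because a CPU image is pitch linear, a pixel is located through the plane pointer and the plane pitch rather than through the image width. The following is a minimal sketch of that addressing; PlaneView and pixelAt() are hypothetical stand-ins for one data[]/pitch[] pair of a dwImageCPU and are not SDK types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one plane of a CPU image (not the SDK struct).
struct PlaneView {
    std::uint8_t* data;  // start of the plane
    std::size_t pitch;   // bytes between the starts of consecutive rows
};

// Returns the address of pixel (x, y) in a pitch-linear plane.
// bytesPerPixel covers all interleaved channels of one pixel.
inline std::uint8_t* pixelAt(const PlaneView& plane, std::size_t x, std::size_t y,
                             std::size_t bytesPerPixel)
{
    assert(plane.data != nullptr);
    return plane.data + y * plane.pitch + x * bytesPerPixel;
}
```

The pitch may be larger than width times bytesPerPixel when rows are padded, which is why the row offset uses the pitch.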

CUDA Images

A CUDA image can take one of 2 forms: a pitch pointer or a CUDA array. The two forms are allocated in, and occupy, different domains of GPU memory: one is a pitch linear pointer, the other is a block memory CUDA array (which can be thought of as a texture). The content can be retrieved by calling dwImage_getCUDA(), which returns a dwImageCUDA struct containing:

  • dwImageProperties prop: image properties
  • size_t pitch[] : the pitches, one per image plane. The pitches are used to access content only for pitch linear pointers
  • void *dptr[]: the pointers to the actual pitch linear data on GPU device, one per image plane. Valid if prop.memoryLayout is DW_IMAGE_MEMORY_TYPE_PITCH (or DEFAULT)
  • cudaArray_t array[]: the block memory cuda arrays, one per image plane. Valid if prop.memoryLayout is DW_IMAGE_MEMORY_TYPE_BLOCK
  • dwTime_t timestamp_us : the timestamp of acquisition from a sensor. If the image is created by the user, it is 0

The CUDA image is created by specifying the DW_IMAGE_CUDA type in the properties and calling one of:

  • dwImage_create(): creates the handle and also allocates memory for dptr[] for pitch layouts and for array[] for block layouts, based on the properties. Destroying such an image also frees the device memory
  • dwImage_createAndBindBuffer(): creates the handle but does not allocate memory; instead, dptr[] points to the buffers allocated by the user with CUDA functions (for example, cudaMallocPitch()). The function trusts that the user buffers match the specified properties. Destroying such an image only destroys the handle; ownership of the buffers remains with the user
  • dwImage_createAndBindCUDAArray(): creates the handle but does not allocate CUDA arrays; instead, array[] points to the cudaArray allocated by the user with cudaMallocArray(). The behavior is analogous to the function above

GL Images

A GL image is stored as a GLuint texture present on the GPU. An invalid texture has an ID of 0; a properly created texture has a positive value. The content can be retrieved by calling dwImage_getGL(), which returns a dwImageGL containing:

  • dwImageProperties prop: image properties
  • GLuint tex: the index of the texture on the GPU
  • GLenum target: the GL texture target. In almost all use cases it is a GL_TEXTURE_2D
  • dwTime_t timestamp_us : the timestamp of acquisition from a sensor. If the image is created by the user, it is 0

The GL image is created by specifying the DW_IMAGE_GL type in the properties and calling one of:

  • dwImage_create(): creates the handle and also generates a GL texture, based on the properties and target. Destroying such an image also destroys the GL texture
  • dwImage_createAndBindGLTexture(): creates the handle and uses the GL texture created by the user. The function trusts that the user texture matches the specified properties. Destroying such an image only destroys the handle; ownership of the texture remains with the user

NvMedia Images

An NvMedia image is stored as a pointer to the low level NvMedia API image struct. For specific information on NvMedia images, see the following information in NVIDIA DRIVE 5.1 PDK:

  • "Image Processing and Management" in "Understanding NvMedia".
  • "NvMedia API for Tegra" in the API Reference. (Click the API tab to access the API Reference.)

The pointer can be accessed by calling dwImage_getNvMedia(), which returns a dwImageNvMedia containing:

  • dwImageProperties prop: image properties
  • NvMediaImage *img: pointer to the low level NvMedia image
  • dwTime_t timestamp_us : the timestamp of acquisition from a sensor. If the image is created by the user, it is 0

The NvMedia image is created by specifying the DW_IMAGE_NVMEDIA type in the properties and calling one of:

  • dwImage_create(): creates the handle and also creates an NvMediaImage using low level NvMedia API calls, based on the properties. Destroying such an image also destroys the NvMediaImage using the low level NvMedia API
  • dwImage_createAndBindNvMedia(): creates the handle and uses the NvMediaImage created by the user. The function trusts that the user NvMediaImage matches the specified properties. Destroying such an image only destroys the handle; ownership of the NvMediaImage remains with the user

Storage in Memory

Images can be stored in memory in various formats. One dimension of this variation is interleaved vs planar storage for multi-channel images. For example, an interleaved RGB image has 1 plane with 3 channels. A YUV420 planar image has 3 planes, with 1 channel each.

Memory layout can be either pitch or block, depending on the type. CPU images are always pitch, GL images are always block, whereas CUDA and NvMedia images can be either.
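
The layout rule above can be written as a small predicate. This is only a sketch: ImageType, MemoryLayout and isLayoutSupported() are hypothetical names used for illustration and do not correspond to SDK enums or functions.

```cpp
#include <cassert>

// Hypothetical enums for illustration only; not DriveWorks types.
enum class ImageType { CPU, GL, CUDA, NvMedia };
enum class MemoryLayout { Pitch, Block };

// Encodes the rule: CPU is always pitch, GL is always block,
// CUDA and NvMedia can be either.
inline bool isLayoutSupported(ImageType type, MemoryLayout layout)
{
    switch (type) {
    case ImageType::CPU:
        return layout == MemoryLayout::Pitch;
    case ImageType::GL:
        return layout == MemoryLayout::Block;
    case ImageType::CUDA:
    case ImageType::NvMedia:
        return true;
    }
    assert(false && "unknown image type");
    return false;
}
```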

Image Formats

The image format describes the data type, color space and arrangement of the pixels.

Interleaved formats

  • DW_IMAGE_FORMAT_R: single channel, grayscale
  • DW_IMAGE_FORMAT_RG: 2 channels, single plane (RGRGRGRG...), representing Red Green (or X Y coordinates)
  • DW_IMAGE_FORMAT_RGB: 3 channels, single plane (RGBRGBRGBRGB)
  • DW_IMAGE_FORMAT_RGBA: 4 channels, single plane (RGBARGBARGBA), Red Green Blue Alpha. The alpha channel is used for color blending
  • DW_IMAGE_FORMAT_RGBX: 4 channels, single plane (RGBXRGBXRGBX), Red Green Blue X-empty, for HW acceleration
  • DW_IMAGE_FORMAT_VUYX: 4 channels, single plane (VUYXVUYXVUYX), V U Y-luminance X-empty, representing YUV444

Planar formats

  • DW_IMAGE_FORMAT_RGB_DATATYPE_PLANAR: 3 planes, 1 channel each, Red Green Blue
  • DW_IMAGE_FORMAT_RCB_DATATYPE_PLANAR: 3 planes, 1 channel each, Red Clear Blue, Clear represents unfiltered color channel, non tonemapped
  • DW_IMAGE_FORMAT_RCC_DATATYPE_PLANAR: 3 planes, 1 channel each, Red Clear Clear, Clear represents unfiltered color channel, non tonemapped
  • DW_IMAGE_FORMAT_YUV420_DATATYPE_PLANAR: 3 planes, 1 channel each, Y-luminance U V, representing YUV420 with U and V at half resolution
  • DW_IMAGE_FORMAT_YUV_DATATYPE_PLANAR: 3 planes, 1 channel each, Y-luminance U V, representing YUV444

Other formats

  • DW_IMAGE_FORMAT_YUV420_DATATYPE_SEMIPLANAR: 2 planes, 1 channel + 2 channels (YYYYYYY... UVUVUVUVUV), representing YUV420 with U and V at half resolution
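
The plane arrangement of the two YUV420 variants can be sketched as follows. PlaneDesc, Yuv420Order and yuv420Planes() are hypothetical names for this illustration, not SDK types. The sketch assumes the usual 4:2:0 meaning of "half resolution" (chroma halved in both width and height) and, for simplicity, even image dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one image plane (not an SDK struct).
struct PlaneDesc {
    std::size_t width;
    std::size_t height;
    std::size_t channels;
};

enum class Yuv420Order { Planar, SemiPlanar };

// Lists the planes of a YUV420 image of the given luma size.
inline std::vector<PlaneDesc> yuv420Planes(std::size_t width, std::size_t height,
                                           Yuv420Order order)
{
    // Simplifying assumption of this sketch: even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);
    const std::size_t chromaWidth = width / 2;
    const std::size_t chromaHeight = height / 2;

    std::vector<PlaneDesc> planes;
    planes.push_back({width, height, 1});  // Y plane, full resolution
    if (order == Yuv420Order::Planar) {
        planes.push_back({chromaWidth, chromaHeight, 1});  // U plane
        planes.push_back({chromaWidth, chromaHeight, 1});  // V plane
    } else {
        planes.push_back({chromaWidth, chromaHeight, 2});  // interleaved UV plane
    }
    return planes;
}
```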

RAW

  • DW_IMAGE_FORMAT_RAW_UINT16: RAW color array of images arriving from the sensor. The color array is described in dwCameraRawFormat
  • DW_IMAGE_FORMAT_RAW_FLOAT16: result of debayering of RAW_UINT16

Image Format Conversion

Images can be converted into a different format, while retaining the same type (for converting type, see Image Streamer). The user must allocate the output image and the conversion will be based on the properties of the input and output images. Only CUDA and NvMedia images support this operation. The converter will not change the size of the image. If all properties are identical, the converter will perform an identical copy.
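
The general preconditions stated above can be checked before attempting a conversion. The following sketch uses hypothetical names (ImageKind, ImageDesc, meetsConverterPreconditions()) that are not part of the SDK, and it deliberately does not encode which format pairs are supported; that is given by the table below.

```cpp
#include <cassert>

// Hypothetical types for illustration only; not DriveWorks types.
enum class ImageKind { CPU, GL, CUDA, NvMedia };

struct ImageDesc {
    ImageKind kind;
    unsigned width;
    unsigned height;
};

// True when input and output satisfy the general rules of the converter:
// same image type, type is CUDA or NvMedia, and identical size.
inline bool meetsConverterPreconditions(const ImageDesc& in, const ImageDesc& out)
{
    assert(in.width > 0 && in.height > 0);
    const bool kindSupported =
        in.kind == ImageKind::CUDA || in.kind == ImageKind::NvMedia;
    return kindSupported && in.kind == out.kind &&
           in.width == out.width && in.height == out.height;
}
```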

Table for supported format conversions

From To
any format and layout same format and layout (simple copy)
any format, layout DW_IMAGE_MEMORY_TYPE_PITCH same format, layout DW_IMAGE_MEMORY_TYPE_BLOCK
any format, layout DW_IMAGE_MEMORY_TYPE_BLOCK same format, layout DW_IMAGE_MEMORY_TYPE_PITCH
DW_IMAGE_FORMAT_RGB_UINT8 DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_RGB_UINT8 DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_RGB_UINT8 DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR
DW_IMAGE_FORMAT_RGB_UINT8_PLANAR DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_RGB_UINT8_PLANAR DW_IMAGE_FORMAT_RGB_FLOAT16
DW_IMAGE_FORMAT_RGB_UINT8_PLANAR DW_IMAGE_FORMAT_RGB_UINT8_PLANAR
DW_IMAGE_FORMAT_RGB_UINT8_PLANAR DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR
DW_IMAGE_FORMAT_RGBA_UINT8 DW_IMAGE_FORMAT_RGB_UINT8
DW_IMAGE_FORMAT_RGBA_UINT8 DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR
DW_IMAGE_FORMAT_RGBA_UINT8 DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR
DW_IMAGE_FORMAT_RGBA_UINT8 DW_IMAGE_FORMAT_R_UINT8
DW_IMAGE_FORMAT_RGBA_UINT8 DW_IMAGE_FORMAT_RGB_UINT8_PLANAR
DW_IMAGE_FORMAT_RGBA_UINT8 DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_RGBA_FLOAT16 DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR DW_IMAGE_FORMAT_RGB_UINT8
DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR
DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR DW_IMAGE_FORMAT_R_UINT8
DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_YUV420_UINT16_SEMIPLANAR DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR DW_IMAGE_FORMAT_RGB_UINT8_PLANAR
DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_YUV420_UINT16_SEMIPLANAR DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR DW_IMAGE_FORMAT_RGB_UINT8
DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR DW_IMAGE_FORMAT_YUV420_UINT8_SEMIPLANAR
DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR DW_IMAGE_FORMAT_R_UINT8
DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR DW_IMAGE_FORMAT_RGB_UINT8_PLANAR
DW_IMAGE_FORMAT_YUV420_UINT8_PLANAR DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_VUYX_UINT8 DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_VUYX_UINT8 DW_IMAGE_FORMAT_RGB_UINT8_PLANAR
DW_IMAGE_FORMAT_VUYX_UINT8 DW_IMAGE_FORMAT_YUV_UINT8_PLANAR
DW_IMAGE_FORMAT_VUYX_UINT16 DW_IMAGE_FORMAT_RGBA_FLOAT16
DW_IMAGE_FORMAT_VUYX_UINT16 DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_VUYX_UINT16 DW_IMAGE_FORMAT_YUV_UINT16_PLANAR
DW_IMAGE_FORMAT_YUV_UINT8_PLANAR DW_IMAGE_FORMAT_RGBA_UINT8
DW_IMAGE_FORMAT_YUV_UINT8_PLANAR DW_IMAGE_FORMAT_RGB_UINT8_PLANAR
DW_IMAGE_FORMAT_YUV_UINT8_PLANAR DW_IMAGE_FORMAT_VUYX_UINT8
DW_IMAGE_FORMAT_YUV_UINT16_PLANAR DW_IMAGE_FORMAT_RGBA_FLOAT16
DW_IMAGE_FORMAT_YUV_UINT16_PLANAR DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR
DW_IMAGE_FORMAT_YUV_UINT16_PLANAR DW_IMAGE_FORMAT_VUYX_UINT16

Image Streamer

An image streamer converts an image from a type X to a type Y, preserving the rest of the properties (see note A). All streamers (see note B) need to be initialized in order to allocate the resources necessary for streaming (for example, an image pool), depending on the type of streamer. At a low level, streamers differ in behavior and performance, so the choice and number of streamers should be planned carefully. Streaming follows a producer and consumer logic:

  1. the producer sends an image through the stream. It is possible to send images in a sequence up to the maximum size of the streamer's internal pool (4). At this point the producer waits for the consumer
  2. the consumer receives the converted image and consumes it
  3. the consumer returns the image to the producer
  4. the producer returns the image, freeing the taken location in the pool. The producer will wait for the consumer based on a timeout
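
The four steps above can be modelled with a small, single-threaded sketch. StreamerModel and its methods are hypothetical names for this illustration and are not the DriveWorks streamer API. Images are represented by integer IDs, the pool size is a constructor parameter chosen by the caller, and a full pool is reported by returning false, whereas a real streamer would wait for the consumer up to a timeout.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical model of the producer/consumer hand-off (not an SDK class).
class StreamerModel {
public:
    explicit StreamerModel(std::size_t poolSize) : m_poolSize(poolSize), m_inFlight(0)
    {
        assert(poolSize > 0);
    }

    // Step 1: the producer sends an image; fails when every pool slot is taken.
    bool producerSend(int imageId)
    {
        if (m_inFlight == m_poolSize) {
            return false;
        }
        ++m_inFlight;
        m_toConsumer.push_back(imageId);
        return true;
    }

    // Step 2: the consumer receives the oldest pending image.
    bool consumerReceive(int& imageId) { return pop(m_toConsumer, imageId); }

    // Step 3: the consumer hands the image back to the producer.
    void consumerReturn(int imageId) { m_toProducer.push_back(imageId); }

    // Step 4: the producer takes the image back, freeing its pool slot.
    bool producerReturn(int& imageId)
    {
        if (!pop(m_toProducer, imageId)) {
            return false;
        }
        --m_inFlight;
        return true;
    }

private:
    static bool pop(std::deque<int>& queue, int& imageId)
    {
        if (queue.empty()) {
            return false;
        }
        imageId = queue.front();
        queue.pop_front();
        return true;
    }

    std::size_t m_poolSize;
    std::size_t m_inFlight;
    std::deque<int> m_toConsumer;
    std::deque<int> m_toProducer;
};
```

In this model a pool slot stays occupied from the send in step 1 until the producer return in step 4, which is why the producer cannot keep sending if the consumer never returns images.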

Supported Image Streamer Inputs and Outputs

The following table describes the possible streaming combinations, given by image type (dwImageType).

From (row) \ To (column) | CPU | GL | CUDA | NvMedia
CPU | - | X* | X* | X
GL | X* | - | X | X
CUDA | X | X* | X | X
NvMedia | X | X | X | X (ideal for cross-processing)
Note
* Supported on iGPU only.

Streamable Image Formats and Types

The following table describes the image formats (dwImageFormat) that can be streamed for each combination of image types (dwImageType).

From (row) \ To (column) | CPU | GL | CUDA | NvMedia
CPU | - | RGBA, R, UINT8 | ALL | RGBA, R, YUV420 p/s, YUV422 p/s, RAW, UINT8, UINT16
GL | RGBA, UINT8 | - | RGBA, UINT8 | RGBA, UINT8
CUDA | ALL | RGBA, UINT8 | ALL | RGBA, YUV420 p/s, YUV422 p/s, UINT8
NvMedia | RGBA, YUV420 p/s, YUV422 p/s, UINT8 | RGBA, UINT8 | RGBA, YUV420 p/s, YUV422 p/s, UINT8 | RGBA, YUV420 p/p, YUV422 p/p, RAW, UINT8, UINT16

Note A: In some cases (CPU->CUDA, CUDA->CPU, NvMedia->CUDA, CUDA->NvMedia) it is possible to stream into an image with a different memory layout.

Note B: The NvMedia->CPU streamer is the only streamer that does not allocate any resources, because it performs a direct mapping between source and destination. For this reason it has some limitations but also provides maximum performance. Due to temporary technical limitations, the CUDA->GL streamer on dGPU on a DrivePX2 platform allocates more resources than strictly needed and performs extra operations during streaming, which leads to performance penalties.

Underlying streaming mechanism for each combination

The following table describes the mechanism for each streaming combination. 'X' indicates the combination is not available.

From (row) \ To (column) | CPU | CUDA Pitch | CUDA Block | GL | NvMedia
CPU | X | cudaMemcpy2DAsync | cudaMemcpy2DToArrayAsync | glBufferData - GL_STATIC_DRAW | NvMediaImagePutBits
CUDA Pitch | cudaMemcpy2DAsync | X | X | cudaMemcpy3DAsync (iGPU, X86) - GL->CPU->CUDA (dGPU) | EGL
CUDA Block | cudaMemcpy2DFromArrayAsync | X | X | cudaMemcpy3DAsync (iGPU, X86) - GL->CPU->CUDA (dGPU) | EGL
GL | glReadPixels | cudaMemcpy3DAsync (iGPU, X86) - X (dGPU) | cudaMemcpy3DAsync (iGPU, X86) - X (dGPU) | X | EGL
NvMedia | direct map (only for pitch linear) | EGL | EGL | EGL | X

Expected performance on NVIDIA DRIVE AGX Developer Kit

The following table gives the streaming performance on NVIDIA DRIVE AGX Developer Kit. Values are given in microseconds and represent the average of 1000 runs; std and spike values are in parentheses.

'D' indicates dGPU performance and 'I' iGPU. If 'D' or 'I' is not specified, then the performance is independent of the GPU.

Streamer | RGBA 8bit | RAW 16bit | YUV 420 SP 8bit
CPU->CUDA | 20 D (4.2, 117); 402 I (38.8, 643) | 20 D (5.1, 160); 364 I (38.0, 804) | 34 D (8.6, 404); 426 I (38.3, 654)
CPU->GL | 11 (7.9, 263) | NA | NA
CPU->NvMedia | 19 (3.5, 56) | 690 (4.1, 711) | NA
CUDA->CPU | 24 D (6.4, 139); 407 I (29.2, 616) | 23 D (4.6, 147); 422 I (35.6, 798) | 41 D (7.1, 168); 449 I (56.1, 632)
CUDA->GL | 175 (73.9, 1436) | NA | NA
CUDA->NvMedia | NA | NA | NA
NvMedia->CPU | 7 (3.9, 71) | 8 (3.1, 35) | 14 (5.3, 138)
NvMedia->CUDA | 52 D (11.6, 2161); 34 I (7.5, 908) | 49 D (9.4, 2020); 37 I (11.8, 724) | 71 D (12.6, 3786); 36 I (16.5, 923)
NvMedia->GL | 38 (13.2, 282) | NA | NA
GL->CPU | 75 (25.1, 784) | NA | NA
GL->CUDA | 1950 (146.4, 2411) | NA | NA
GL->NvMedia | 136 (180.9, 1635) | NA | NA

Note 1: GL-based times were taken on iGPU.

Note 2: Some streamers, especially EGL-based, have spikes for the first few frames, due to hidden optimizations that are performed during the first few iterations. Similar spikes may also occur for CUDA images.

Frame Capture

A frame capture has 2 purposes:

  • capture the content of an onscreen window into a dwImageGL
  • serialize an arbitrary dwImageGL or dwImageCUDA into an h264/h265 stream without needing to use a sensor camera

This module is ideal for recording a video of a DriveWorks-based application being rendered on screen.

Relevant Tutorials

APIs