The image module is composed of 3 submodules
The image module contains structures and methods that allow the user to create and set image handles that are compatible with NVIDIA® DriveWorks modules. An image is represented generically as a handle, dwImageHandle_t, which can be passed to a DriveWorks module for processing, or more specifically as a C struct whose content differs based on the type of the image and its properties. All images share common properties.
Among the image properties is the memory layout, which can be one of DW_IMAGE_MEMORY_TYPE_DEFAULT, DW_IMAGE_MEMORY_TYPE_PITCH, or DW_IMAGE_MEMORY_TYPE_BLOCK, and represents the arrangement of the data in memory. Only CUDA and NvMedia images can handle both layouts; CPU images are strictly pitch and GL images are strictly block. The default memory layout automatically resolves to the proper layout once the image is given to a DriveWorks module.

Any image can be created by calling dwImage_create() and should be matched by a dwImage_destroy() when the image is no longer needed. The creation is specific to the type of image, and there are 4 supported types. After the image is created, the handle can be passed to DriveWorks modules that accept the opaque handle; otherwise, a struct specific to the image type can be retrieved. The struct allows direct access to the content of the image, and any modification affects the original image.
A CPU image is stored as a pitched memory buffer represented by an array of pointers, an array of pitches, and the image properties. Its content can be retrieved from a dwImageHandle_t by calling dwImage_getCPU(), which returns a dwImageCPU that contains:

- dwTime_t timestamp_us: the timestamp of acquisition from a sensor. If the image is created by the user, it is 0.

The CPU image is created by specifying the DW_IMAGE_CPU type in the properties and calling dwImage_create().
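The lifecycle above can be sketched as follows. This is a minimal, non-authoritative example: it assumes an already initialized DriveWorks context handle `ctx`, and the image dimensions and the DW_IMAGE_FORMAT_RGBA_UINT8 format are arbitrary choices for illustration; exact signatures may vary slightly between DriveWorks releases.

```c
#include <dw/image/Image.h>

// Describe an 800x600 interleaved RGBA CPU image.
dwImageProperties props = {};
props.type         = DW_IMAGE_CPU;
props.width        = 800;
props.height       = 600;
props.format       = DW_IMAGE_FORMAT_RGBA_UINT8;
props.memoryLayout = DW_IMAGE_MEMORY_TYPE_PITCH; // CPU images are always pitch

dwImageHandle_t image = DW_NULL_HANDLE;
dwImage_create(&image, props, ctx);

// Retrieve the type-specific struct for direct pixel access.
dwImageCPU* imageCPU = NULL;
dwImage_getCPU(&imageCPU, image);

// Plane 0 of an interleaved RGBA image: 4 bytes per pixel,
// with a row stride of pitch[0] bytes.
uint8_t* row0 = (uint8_t*)imageCPU->data[0];
row0[0] = 255; // writing through the struct modifies the original image

dwImage_destroy(image);
```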
A CUDA image can have 2 forms: a pitch pointer or a CUDA array. The two forms are allocated in different domains of GPU memory, one being a pitch-linear pointer and the other being a block-memory CUDA array (which can be thought of as a texture). Its content can be retrieved by calling dwImage_getCUDA(), which returns a dwImageCUDA that contains:

- dwTime_t timestamp_us: the timestamp of acquisition from a sensor. If the image is created by the user, it is 0.

The CUDA image is created by specifying the DW_IMAGE_CUDA type in the properties and calling dwImage_create().
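A hedged sketch of creating and accessing a pitch-linear CUDA image (again assuming an initialized context `ctx`; the `dptr`/`pitch` field names follow the dwImageCUDA description in the API headers and should be verified against your release):

```c
#include <dw/image/Image.h>

dwImageProperties props = {};
props.type         = DW_IMAGE_CUDA;
props.width        = 1280;
props.height       = 720;
props.format       = DW_IMAGE_FORMAT_RGBA_UINT8;
props.memoryLayout = DW_IMAGE_MEMORY_TYPE_PITCH; // or DW_IMAGE_MEMORY_TYPE_BLOCK for a CUDA array

dwImageHandle_t image = DW_NULL_HANDLE;
dwImage_create(&image, props, ctx);

dwImageCUDA* imageCUDA = NULL;
dwImage_getCUDA(&imageCUDA, image);

// Pitch-linear form: dptr[] holds device pointers and pitch[] the row
// strides, so plane 0 can be handed directly to CUDA kernels or cudaMemcpy2D*.
void*  devPtr   = imageCUDA->dptr[0];
size_t rowBytes = imageCUDA->pitch[0];
(void)devPtr;
(void)rowBytes;

dwImage_destroy(image);
```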
A GL image is stored as a GLuint texture on the GPU. An invalid texture has a texture ID of 0; a properly created texture has a positive value. Its content can be retrieved by calling dwImage_getGL(), which returns a dwImageGL that contains:

- dwTime_t timestamp_us: the timestamp of acquisition from a sensor. If the image is created by the user, it is 0.

The GL image is created by specifying the DW_IMAGE_GL type in the properties and calling dwImage_create().
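Once the struct is retrieved, the texture can be used with ordinary OpenGL calls. This sketch assumes `image` is a previously created DW_IMAGE_GL handle and that the dwImageGL struct exposes the texture name and target as `glId` and `glTarget` (verify the field names against your headers):

```c
#include <dw/image/Image.h>
#include <GLES3/gl3.h> // or the GL header used by your platform

dwImageGL* imageGL = NULL;
dwImage_getGL(&imageGL, image);

// glId holds the GLuint texture name (0 would indicate an invalid texture);
// glTarget is the texture target, typically GL_TEXTURE_2D.
glBindTexture(imageGL->glTarget, imageGL->glId);
// ... draw with the texture ...
glBindTexture(imageGL->glTarget, 0);
```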
An NvMedia image is stored as a pointer to the low level NvMedia API image struct. For specific information on NvMedia images, see the following information in NVIDIA DRIVE 5.1 PDK:
The pointer can be accessed by calling dwImage_getNvMedia(), which returns a dwImageNvMedia that contains:

- dwTime_t timestamp_us: the timestamp of acquisition from a sensor. If the image is created by the user, it is 0.

The NvMedia image is created by specifying the DW_IMAGE_NVMEDIA type in the properties and calling dwImage_create().
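Accessing the underlying NvMedia image is a matter of reading the struct's pointer to the low-level object. In this sketch, `image` is assumed to be a DW_IMAGE_NVMEDIA handle, and the field name `img` follows the dwImageNvMedia description in the API headers:

```c
#include <dw/image/Image.h>

dwImageNvMedia* imageNvMedia = NULL;
dwImage_getNvMedia(&imageNvMedia, image);

// `img` points to the low-level NvMediaImage, which can be passed
// straight to NvMedia API calls.
NvMediaImage* nvmImage = imageNvMedia->img;
(void)nvmImage;
```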
Images can be stored in memory in various formats. One dimension of this variation is interleaved vs planar storage for multi-channel images. For example, an interleaved RGB image has 1 plane with 3 channels. A YUV420 planar image has 3 planes, with 1 channel each.
Memory layout can be either pitch or block, depending on the type. CPU images are always pitch, GL images are always block, whereas CUDA and NvMedia images can be either.
The image format describes the data type, color space, and arrangement of the pixels.
Images can be converted to a different format while retaining the same type (to convert the type, see Image Streamer). The user must allocate the output image; the conversion is driven by the properties of the input and output images. Only CUDA and NvMedia images support this operation. The converter does not change the size of the image. If all properties are identical, the converter performs a plain copy.
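As a sketch of the allocation-then-convert flow, assuming a CUDA input image `input` whose properties are `inputProps`, and assuming the conversion entry point is dwImage_copyConvert() as in recent DriveWorks releases:

```c
#include <dw/image/Image.h>

// Allocate the output with the desired format, keeping size and type.
dwImageProperties outProps = inputProps;
outProps.format = DW_IMAGE_FORMAT_RGBA_UINT8; // change only the format

dwImageHandle_t output = DW_NULL_HANDLE;
dwImage_create(&output, outProps, ctx);

// The conversion is driven by the input and output properties;
// if they are identical, this degenerates to a plain copy.
dwImage_copyConvert(output, input, ctx);

// ... use `output` ...
dwImage_destroy(output);
```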
An image streamer converts an image from a type X to a type Y, preserving the rest of the properties (see Note A). All streamers (see Note B) need to be initialized in order to allocate the resources necessary for streaming (for example, an image pool), depending on the type of streamer. At a low level, streamers differ in behavior and performance, so the choice and number of streamers should be planned carefully. Streaming follows a producer-consumer model: the producer posts an image of type X, and the consumer receives it as type Y and returns it when done.
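The producer-consumer round trip can be sketched as follows for a CUDA-to-GL streamer. This assumes an initialized context `ctx`, a CUDA image `cudaImage` with properties `cudaProps`, and the dwImageStreamer function names used in recent DriveWorks releases; the 33 ms timeout is an arbitrary illustrative value.

```c
#include <dw/image/Image.h>
#include <dw/interop/streamer/ImageStreamer.h>

// Initialize a CUDA->GL streamer from the producer image's properties.
dwImageStreamerHandle_t streamer = DW_NULL_HANDLE;
dwImageStreamer_initialize(&streamer, &cudaProps, DW_IMAGE_GL, ctx);

// Producer side: post the CUDA image.
dwImageStreamer_producerSend(cudaImage, streamer);

// Consumer side: receive the GL image, use it, then give it back.
dwImageHandle_t glImage = DW_NULL_HANDLE;
dwImageStreamer_consumerReceive(&glImage, 33000, streamer); // timeout in us
// ... render with glImage ...
dwImageStreamer_consumerReturn(&glImage, streamer);

// Producer side: wait for the image to be returned.
dwImageStreamer_producerReturn(NULL, 33000, streamer);

dwImageStreamer_release(streamer);
```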
The following table describes the possible streaming combinations, given by image type (dwImageType).
From (row) \ To (column) | CPU | GL | CUDA | NvMedia |
---|---|---|---|---|
CPU | - | X* | X* | X |
GL | X* | - | X | X |
CUDA | X | X* | X | X |
NvMedia | X | X | X | X (ideal for cross-processing) |
The following table describes image format (dwImageFormat) for each combination of image types (dwImageType).
From (row) \ To (column) | CPU | GL | CUDA | NvMedia |
---|---|---|---|---|
CPU | - | RGBA, R, UINT8 | ALL | RGBA, R, YUV420 p/s, YUV422 p/s, RAW, UINT8, UINT16 |
GL | RGBA, UINT8 | - | RGBA, UINT8 | RGBA, UINT8 |
CUDA | ALL | RGBA, UINT8 | ALL | RGBA, YUV420 p/s, YUV422 p/s, UINT8 |
NvMedia | RGBA, YUV420 p/s, YUV422 p/s, UINT8 | RGBA, UINT8 | RGBA, YUV420 p/s, YUV422 p/s, UINT8 | RGBA, YUV420 p/p, YUV422 p/p, RAW, UINT8, UINT16 |
Note A: In some cases (CPU->CUDA, CUDA->CPU, NvMedia->CUDA, CUDA->NvMedia), it is possible to stream into an image with a different memory layout.
Note B: The NvMedia->CPU streamer is the only one that does not allocate any resources, because it performs a direct mapping between source and destination. For this reason it has some limitations but also provides maximum performance. Due to temporary technical limitations, the CUDA->GL streamer on the dGPU of a DRIVE PX 2 platform allocates extra resources beyond those needed and performs extra operations during streaming, leading to performance penalties.
The following table describes the mechanism for each streaming combination. 'X' indicates the combination is not available.
From (row) \ To (column) | CPU | CUDA Pitch | CUDA Block | GL | NvMedia |
---|---|---|---|---|---|
CPU | X | cudaMemcpy2DAsync | cudaMemcpy2DToArrayAsync | glBufferData - GL_STATIC_DRAW | NvMediaImagePutBits |
CUDA Pitch | cudaMemcpy2DAsync | X | X | cudaMemcpy3DAsync (iGPU, X86) - GL->CPU->CUDA (dGPU) | EGL |
CUDA Block | cudaMemcpy2DFromArrayAsync | X | X | cudaMemcpy3DAsync (iGPU, X86) - GL->CPU->CUDA (dGPU) | EGL |
GL | glReadPixels | cudaMemcpy3DAsync (iGPU, X86) - X (dGPU) | cudaMemcpy3DAsync (iGPU, X86) - X (dGPU) | X | EGL |
NvMedia | direct map (only for pitch linear) | EGL | EGL | EGL | X |
The following table gives the streaming performance on the NVIDIA DRIVE AGX Developer Kit. Values are given in microseconds and represent the average of 1000 runs; standard deviation and spike values are in parentheses.
'D' indicates dGPU performance and 'I' iGPU. If 'D' or 'I' is not specified, then the performance is independent of the GPU.
RGBA 8bit | RAW 16bit | YUV 420 SP 8bit | |
---|---|---|---|
CPU->CUDA | 20 D (4.2, 117) 402 I (38.8, 643) | 20 D (5.1, 160) 364 I (38.0, 804) | 34 D (8.6, 404) 426 I (38.3, 654) |
CPU->GL | 11 (7.9, 263) | NA | NA |
CPU->NvMedia | 19 (3.5, 56) | 690 (4.1, 711) | NA |
CUDA->CPU | 24 D (6.4, 139) 407 I (29.2, 616) | 23 D (4.6, 147) 422 I (35.6, 798) | 41 D (7.1, 168) 449 I (56.1, 632) |
CUDA->GL | 175 (73.9, 1436) | NA | NA |
CUDA->NvMedia | NA | NA | NA |
NvMedia->CPU | 7 (3.9, 71) | 8 (3.1, 35) | 14 (5.3, 138) |
NvMedia->CUDA | 52 D (11.6, 2161) 34 I (7.5, 908) | 49 D (9.4, 2020) 37 I (11.8, 724) | 71 D (12.6, 3786) 36 I (16.5, 923) |
NvMedia->GL | 38 (13.2, 282) | NA | NA |
GL->CPU | 75 (25.1, 784) | NA | NA |
GL->CUDA | 1950 (146.4, 2411) | NA | NA |
GL->NvMedia | 136 (180.9, 1635) | NA | NA |
Note 1: GL-based times were taken on the iGPU.
Note 2: Some streamers, especially EGL-based, have spikes for the first few frames, due to hidden optimizations that are performed during the first few iterations. Similar spikes may also occur for CUDA images.
A frame capture has 2 purposes: