DriveWorks SDK Reference
4.0.0 Release
For Test and Development only

Camera

About This Module

The Camera module defines a common interface across the categories of cameras that NVIDIA® DriveWorks supports.

The Camera module supports both GMSL and USB interfaces. It also provides the ability to replay recorded camera data through the DriveWorks stack as a virtual sensor.

A selection of GMSL cameras is supported out of the box; please refer to https://developer.nvidia.com/drive/ecosystem-hw-sw for a list of these sensors. DriveWorks also provides the ability to integrate non-natively supported GMSL cameras. Please refer to Custom Cameras (SIPL) for more information.

Creating and Using Camera

Specify the following when creating a camera using the Sensor Abstraction Layer (SAL):

  • Camera protocol, which can be camera.gmsl, camera.virtual, or camera.usb.
  • A set of parameters specific for each camera protocol.

All camera protocols share a common API. A camera is created via the SAL through a parameter string whose contents depend on the protocol. Once all the camera objects have been successfully created and the application's initialization phase is complete, start the sensors with dwSensor_start().

Note
Starting a sensor triggers asynchronous capture or prefetching, so it should be performed right before you intend to call dwSensorCamera_readFrame(). Camera sensors do not start right away, so check for DW_NOT_READY in a loop until the first DW_SUCCESS or any failure is returned.
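
Below is a minimal sketch of this startup sequence. It assumes a previously initialized dwSALHandle_t (sal); the camera name, port, and timeout values are illustrative, and exact function signatures should be checked against this release's API reference:

#include <dw/sensors/Sensors.h>
#include <dw/sensors/camera/Camera.h>

// Sketch: create a GMSL camera via the SAL and wait for the first frame.
dwSensorParams params{};
params.protocol   = "camera.gmsl";
params.parameters = "camera-name=SF3324,interface=csi-a,link=0,output-format=processed";

dwSensorHandle_t camera = DW_NULL_HANDLE;
dwSAL_createSensor(&camera, params, sal); // sal: previously initialized dwSALHandle_t
dwSensor_start(camera);                   // triggers asynchronous capture/prefetching

// The sensor does not start right away: poll until DW_NOT_READY clears.
dwCameraFrameHandle_t frame  = DW_NULL_HANDLE;
dwStatus              status = DW_NOT_READY;
while (status == DW_NOT_READY) {
    status = dwSensorCamera_readFrame(&frame, 66000 /*us*/, camera);
}
// status is now DW_SUCCESS (frame is valid) or a failure code.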

dwSensorCamera_readFrame() provides a dwCameraFrameHandle_t, an opaque handle mapping to a Frame. A Frame is a container of images and support functions, unique to the protocol in use. The purpose of a Frame is to capture a camera event and make it possible to retrieve an image suitable for the use case at hand. Due to the asynchronous nature of some of the protocols, the call always takes a timeout value in microseconds, to ensure no deadlock occurs. Frame handles are allocated from a pool of size frames-pool, a parameter that can be set by the user and defaults to 16. This means that the user can hold up to frames-pool frames in flight without returning them. If more than that are read, the camera returns DW_BUFFER_FULL.
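
A minimal read/return cycle, assuming a started camera handle as above (the timeout value is illustrative):

// Sketch: per-frame read/return cycle. Every frame taken from the pool
// (size frames-pool, default 16) must eventually be returned.
dwCameraFrameHandle_t frame = DW_NULL_HANDLE;
dwStatus status = dwSensorCamera_readFrame(&frame, 66000 /*timeout in us*/, camera);
if (status == DW_SUCCESS) {
    // ... retrieve images from the frame (see dwSensorCamera_getImage() below) ...
    dwSensorCamera_returnFrame(&frame); // hand the slot back to the pool
} else if (status == DW_BUFFER_FULL) {
    // frames-pool frames are already in flight: return some before reading more.
}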

After a frame is grabbed at the application level, applications use dwSensorCamera_getImage() to get an image of the specified output type. The available images that this operation can return depend on the setup and platform. DriveWorks allows three levels of outputs:

  • Native level: corresponds to the image that comes out of the lower level with as few extra operations as possible. This is the most performant level in terms of memory and speed.
  • Streamed level: corresponds to images that have been streamed to CUDA in order to be unified in dwImageType, guaranteeing a common output regardless of differences in setup or protocol.
  • RGBA level: a conveniently chosen, widely used format, obtained by streaming to CUDA and converting from the native format.

For a detailed table on supported output types, please refer to Supported Output Types.
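
As a sketch, retrieving an RGBA-level CUDA image from a previously read frame could look as follows; the chosen output type must be one supported by the current setup, and dwImage_getCUDA() comes from dw/image/Image.h:

#include <dw/image/Image.h>

// Sketch: retrieve an RGBA-level CUDA image from a frame.
dwImageHandle_t image = DW_NULL_HANDLE;
if (dwSensorCamera_getImage(&image, DW_CAMERA_OUTPUT_CUDA_RGBA_UINT8, frame) == DW_SUCCESS) {
    dwImageCUDA* imageCUDA = nullptr;
    dwImage_getCUDA(&imageCUDA, image); // access the underlying CUDA memory
    // The image remains owned by the sensor and becomes invalid
    // once the frame is returned.
}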

GMSL Cameras (camera.gmsl)

The camera.gmsl protocol describes GMSL cameras. These cameras acquire frames at a frequency based on the camera's frame rate. For more information on image processing and management, please refer to "Understanding NvMedia" in the latest NVIDIA DRIVE OS SDK Development Guide.

This protocol is based on top of the NvSIPL library, which takes care of the driver loading and low level handling of the camera. NvSIPL connects the following components:

  • Image Sensor Control (ISC).
  • Image Sensor Processing (ISP).
  • Control Algorithm that regulates exposure and white balance based on sensor statistics coming from Sensor Control and ISP.

The parameters for camera.gmsl are:

camera-name = [name as specified in the SIPL database]
interface = [csi-a - csi-h | csi-ab - csi-gh | trio-a - trio-h | etc...]
link = [0-3]
output-format = [processed|raw]
isp-mode = [see camera.gmsl]
nito-file = [optional path to .nito file]
slave = [false/true]
CPHY-mode = [0: default DPHY Mode / 1: CPHY]
deserializer = [MAX96712: default, optional name of a deserializer to be explicitly selected]
encoder-instance = [0: default | 1]

Each dwSensorHandle_t corresponds to a camera described above by name, port, and link. The choice of csi-port depends on the sensor, especially if it is a custom one. Refer to the NvSIPL guide for more information.
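
For example, a complete parameter string for a natively supported camera on the first CSI port could look like the following (camera name and port are illustrative):

camera-name=SF3324,interface=csi-a,link=0,output-format=processed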

All cameras are attached to and controlled by a master controller. If one camera fails, it results in a failure for the whole system.

Once a sensor is started or stopped, it sends the corresponding signal to the master controller, which starts or stops all cameras simultaneously. Ensure that you always call dwSensor_start() and dwSensor_stop() for all sensors.
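
A minimal sketch of this convention, assuming a container holding all created camera handles:

// Sketch: start and stop all cameras together, since the master
// controller acts on every link at once.
for (dwSensorHandle_t sensor : sensors) // sensors: all created cameras
    dwSensor_start(sensor);
// ... application runs ...
for (dwSensorHandle_t sensor : sensors)
    dwSensor_stop(sensor);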

The camera-name parameter enables DriveWorks to tell NvSIPL which camera to use, and then load the camera drivers (under /usr/lib/nvsipl_drv). Once the drivers are loaded, the camera setup is read and communicated to DriveWorks, which then allocates the resources needed by NvSIPL. Finally, at runtime, a raw image coming from the sensor's ISC is fed to the ISP layer, which converts it into a processed image.

Note
In general, DriveWorks modules require processed images to function properly.

DriveWorks allocates image buffers that are registered with the NvSIPL library, in equal amounts for raw images and for processed images. As the NvSIPL library hands images to DriveWorks, they are stored in a FIFO and retrieved when the user reads a frame. When the FIFO is empty and the user tries to read, the camera waits for the duration of the user's timeout and then returns DW_TIME_OUT. You can control the FIFO depth via the fifo-size parameter. Once the camera starts, NvSIPL grabs images from the buffers and puts them to use, continuing at a rate based on the camera frame rate (plus some slight processing delay that depends on hardware specifications).
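
For instance, with an illustrative fifo-size of 4, a read against an empty FIFO blocks for the requested timeout and then reports DW_TIME_OUT:

// Sketch: handling an empty FIFO (the fifo-size below is illustrative).
// params.parameters = "camera-name=SF3324,interface=csi-a,link=0,fifo-size=4";
dwCameraFrameHandle_t frame = DW_NULL_HANDLE;
dwStatus status = dwSensorCamera_readFrame(&frame, 100000 /*100 ms*/, camera);
if (status == DW_TIME_OUT) {
    // No frame arrived within the timeout; retry or treat as a capture stall.
}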

This asynchronous mechanism continues until the buffers run out, at which point camera captures have no free buffer available and are dropped; this is known as an ICP drop.

NvSIPL fills the raw image buffers and processed image buffers, and sends them back to DriveWorks. DriveWorks then stores them in a queue, in the order of their capture time.

You also control dwSensorCamera_readFrame(), which temporarily retrieves the front of the frame queue for you. Ensure you eventually return each frame to the sensor (via dwSensorCamera_returnFrame()), so that NvSIPL has buffers to work with again. Generally, frames must be returned at least as fast as the camera produces them, in order to ensure no frame is ever dropped.

The user can read up to fifo-size frames together before returning them, as long as NvSIPL is not starved. A basic locking mechanism ensures that no race condition occurs if reading is done in a multi-threaded fashion.
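
A sketch of this batched pattern, assuming a fifo-size larger than the batch (both values illustrative):

#include <vector>

// Sketch: hold several frames in flight, then return them all.
std::vector<dwCameraFrameHandle_t> inFlight;
for (int i = 0; i < 4; ++i) { // batch of 4, smaller than fifo-size
    dwCameraFrameHandle_t frame = DW_NULL_HANDLE;
    if (dwSensorCamera_readFrame(&frame, 66000, camera) == DW_SUCCESS)
        inFlight.push_back(frame);
}
// ... process the batch ...
for (dwCameraFrameHandle_t& frame : inFlight)
    dwSensorCamera_returnFrame(&frame); // give the buffers back to NvSIPL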

As shown in the table referenced above, the available image outputs depend on the output-format which, by default, is processed+raw. Choosing only processed or raw reduces the number of resources allocated by a single camera. If you do not require raw images at the application level, then including raw in output-format is wasteful.

When the processed pipeline is set, NvSIPL allocates the resources needed to process the selected camera. This includes the NITO file (NvMedia ISP Tuning Object), which contains the tuning data needed to correctly process a raw image. NITO files are stored on the DDPX under /opt/nvidia/nvmedia/nit, and the file corresponding to a given camera is computed based on the camera name.

For example, camera-name=SF3324 will search for /opt/nvidia/nvmedia/nit/SF3324.nito and /opt/nvidia/nvmedia/nit/sf3324.nito. If neither is detected, DriveWorks attempts to use a generic NITO file: /opt/nvidia/nvmedia/nit/default.nito. In most cases, however, this NITO is not suitable and will likely cause an application crash. If this occurs, ensure you provide the right camera name or the path to the camera's own NITO file. If you do not have a NITO file (e.g., when working with a custom camera), you can still retrieve raw images (output-format=raw), but you will not be able to use DriveWorks to process them and will require custom code to do so.

The dwSensorSerializer API allows you to record data. Depending on the output-format selected, camera.gmsl allocates resources to record raw sensor feeds, processed video files, or both. A raw sensor feed is the unprocessed signal coming from the camera itself, and a processed video is an h264/h265/mp4 video of the processed output of the camera. A processed output is unchangeable and does not carry information about the sensor it was recorded with, whereas a raw signal feed can be reprocessed again, just like a live camera feed: it contains all information regarding the sensor, including its embedded metadata.
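
A sketch of the recording flow; the serializer parameter string and output path are illustrative, and dw/sensors/SensorSerializer.h is assumed:

#include <dw/sensors/SensorSerializer.h>

// Sketch: serialize camera frames to disk as they are read.
dwSerializerParams serializerParams{};
serializerParams.parameters = "type=disk,file=/tmp/camera_recording.h264"; // illustrative

dwSensorSerializerHandle_t serializer = DW_NULL_HANDLE;
dwSensorSerializer_initialize(&serializer, &serializerParams, camera);
dwSensorSerializer_start(serializer);

// For every frame read from the camera:
dwSensorSerializer_serializeCameraFrameAsync(frame, serializer);

// On shutdown:
dwSensorSerializer_stop(serializer);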

Relevant Logging

Logging is crucial when working with NvSIPL, providing insight into whether things are working as intended. During initialization, the log should print:

devBlock: 1 Slave = 0 Interface = csi-a Camera_name = SF3324 Link = 0

This helps you double-check that the sensor will be searched for correctly. A devBlock corresponds to a camera port, with a maximum of 4 blocks in total. If the camera is found in the database:

Camera Match Name: SF3324 Description: Sekonix SF3324 module - 120-deg FOV, DVP AR0231-RCCB, MAX96705 linkIndex: 4294967295 serInfo.Name: MAX96705

This means the driver search can proceed. Double-check that this information is correct. Afterwards, when all cameras are detected, the master controller starts and, if the driver was correctly loaded, prints the complete information for each camera detected:

CameraGSMLMaster: starting...

...

Device Block : 0
    csiPort: 0
    i2cDevice: 0
    Deserializer Name: MAX96712
    Deserializer Description: Maxim 96712 Aggregator
    Deserializer i2cAddress: 41
    Simulator Mode: 0
    Slave Mode: 0
    Phy Mode: 0
    Number of camera modules: 1

...

These are confirmation prints, followed by prints coming directly from NvSIPL and the camera driver:

MAX96712: Revision 2 detected
MAX96712: Enable periodic AEQ on Link 0
MAX96705: Pre-emphasis set to 0xaa
MAX96705: Revision 1 detected!
Sensor AR0231 RCCB Rev7 detected!

These prints differ depending on the camera in use and the DRIVE OS version, but they are a good general indicator that things are going well. Then the NITO file is searched for:

CameraClient: no nito found at /opt/nvidia/nvmedia/nit/SF3324.nito
CameraClient: using nito found at /opt/nvidia/nvmedia/nit/sf3324.nito

If the NITO file is found, things can progress as explained in the sections above. Then, when dwSensor_start() is called, if everything goes correctly:

CameraMaster: bootstrap complete

At this point, each camera that starts acquiring frames will print:

CameraClient: Acquisition started

Higher levels of verbosity can be selected by exporting DW_SIPL_VERBOSITY=[0..4].

Virtual Cameras (camera.virtual)

The camera.virtual protocol handles recorded data in raw or processed format. The user specifies video= or file= with the location of the data. It is possible to specify multiple video= or file= parameters, e.g. to reference both a raw and a processed video. The SAL prefers video over file parameters and uses the first valid filepath, i.e. the referenced file needs to exist or it is skipped. This can be useful when some video files are not always available, e.g. because one video format is provided via mounted network paths that may be slow or unreliable.
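
Creating a virtual camera follows the same SAL pattern as the live case. A minimal sketch, with an illustrative path:

// Sketch: replay a recorded processed video as a virtual camera.
dwSensorParams params{};
params.protocol   = "camera.virtual";
params.parameters = "video=/data/recordings/front.h264"; // illustrative path

dwSensorHandle_t camera = DW_NULL_HANDLE;
dwSAL_createSensor(&camera, params, sal);
// From here on, the sensor behaves like a live camera:
// dwSensor_start(), dwSensorCamera_readFrame(), and so on.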

Processed Videos

An h264/h265/mp4 video is a video file that camera.virtual decodes into processed frames, identical to how they were when the live sensor was being processed. This data is unchangeable and can even be completely artificial. Potentially, DriveWorks can replay any h264/h265/mp4 data.

Below is the parameter string format for when working with processed videos:

params: video=<path.h264/h265/mp4>,output-format=[processed only]

Raw Signal Feeds (AKA Raw videos)

A raw video contains the same signals that came from the camera at the moment of capture. This means the data is not interpretable unless the same resources used to process the actual physical data are present at replay time. In this case, the NvSIPL library uses the camera drivers to treat the frame blobs in the raw feed as if they were signals arriving at that moment from a live camera.

This process is indistinguishable from a live camera: the logs, errors, sensor events, and asynchronous behavior all match the live case. For this reason, you can specify the extra parameters available in camera.gmsl, namely isp-mode, fifo-size, and nito-file. DriveWorks reads the file header to recognize the camera and its information; the name of the camera used to recover the driver is stored as a string in the header. However, you can override this by passing the camera-name parameter. As in the live case, the camera requires a NITO file, and the search process is the same as described above.

Below is the parameter string format for when working with Raw videos:

params: video=<path.raw/lraw>,output-format=[processed|raw],isp-mode=<see camera.gmsl>,nito-file=<path.nito>,fifo-size=[size],camera-name=[overridden camera name]
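
For example (path and camera name illustrative):

params: video=/data/recordings/front.raw,output-format=processed,fifo-size=4,camera-name=SF3324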

USB Cameras (camera.usb)

The camera.usb protocol describes USB cameras, which share similar interfaces. When a sensor is started, acquisition is performed by the OpenCV (USB) library, which uses the OS drivers and stores the result in a buffer allocated by the Camera module. The driver automatically detects the frame rate of these USB cameras.
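
A minimal sketch of opening a USB camera; the device parameter selects among connected cameras, and its name and value here are illustrative:

// Sketch: open a USB camera through the SAL.
dwSensorParams params{};
params.protocol   = "camera.usb";
params.parameters = "device=0"; // illustrative: index among connected USB cameras

dwSensorHandle_t camera = DW_NULL_HANDLE;
dwSAL_createSensor(&camera, params, sal);
dwSensor_start(camera); // acquisition runs via the OpenCV/OS drivers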

Relevant Tutorials