About This Module

The Stereo module provides two functionalities: stereo rectification and disparity estimation. A graphical representation of the two functionalities is given in the following image.

The left and right cameras capture a world scene. Because the two image planes are not coplanar, the epipolar lines are slanted. The stereo rectifier makes the epipolar lines parallel and horizontal in both images. Once the images are rectified, disparity can be computed with the disparity estimation module.

The module is agnostic of the device that acquires the images and can be used with NVIDIA® DriveWorks as long as the images are provided as dwImageCUDA objects.

Stereo Rectifier

The Stereo Rectifier rectifies a pair of stereo images acquired from a calibrated stereo camera. Rectification is the transformation that projects the two images onto a common plane parallel to the line joining the two optical centers. After this transformation, all epipolar lines are horizontal and parallel, as shown in step I of the figure above.

The Stereo Rectifier algorithm must be initialized with the intrinsic and extrinsic parameters of the stereo camera pair, which can be extracted from a rig configuration. The algorithm also computes a region of interest (ROI) that can be used to crop the two rectified images to an area of equal size that contains only valid pixels and no interpolated data.
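The idea of cropping both rectified images to one ROI of equal size can be sketched as follows. This is only an illustration in plain Python, not the way the Stereo Rectifier computes its ROI: the `Rect` type and the `common_roi` and `crop` helpers are hypothetical names, and the sketch assumes that a rectangle of valid pixels is already known for each rectified image.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


def common_roi(left_valid: Rect, right_valid: Rect) -> Rect:
    """Intersect the valid-pixel rectangles of the two rectified images."""
    x0 = max(left_valid.x, right_valid.x)
    y0 = max(left_valid.y, right_valid.y)
    x1 = min(left_valid.x + left_valid.width, right_valid.x + right_valid.width)
    y1 = min(left_valid.y + left_valid.height, right_valid.y + right_valid.height)
    if x1 <= x0 or y1 <= y0:
        raise ValueError("valid regions do not overlap")
    return Rect(x0, y0, x1 - x0, y1 - y0)


def crop(image: List[list], roi: Rect) -> List[list]:
    """Crop an image stored as a list of rows to the given ROI."""
    rows = image[roi.y:roi.y + roi.height]
    return [row[roi.x:roi.x + roi.width] for row in rows]
```

Applying `crop` with the same `Rect` to both rectified images yields two images of identical size, so a pixel at (x, y) in one still lies on the same row as its match in the other.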

Disparity Estimation

The Stereo algorithm estimates the disparity between two rectified images. It computes a map in which each element represents the horizontal displacement between a pair of corresponding pixels, as shown in step II of the figure above. That is, each disparity map pixel d(x,y) is such that I_R(x,y) = I_L(x + d(x,y), y), where I_L and I_R denote the left and right image, respectively, so that I_R(x,y) and I_L(x + d(x,y), y) are corresponding pixels. Disparity can further be used to compute the three-dimensional depth of the point.
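To make the notion of disparity concrete, here is a minimal sketch in plain Python. It is not the Stereo algorithm of this module: it uses a simple sum-of-absolute-differences (SAD) search along one rectified row, and it assumes the convention that a pixel at column x in the right image matches column x + d in the left image. The depth conversion uses the standard pinhole relation Z = f * B / d, where the focal length f (in pixels) and the baseline B (in meters) are assumed to come from the calibration. All function and parameter names are hypothetical.

```python
def disparity_for_row(left_row, right_row, max_disparity, half_window=1):
    """Estimate one disparity per right-image pixel along a rectified row.

    For each column x, pick the d in [0, max_disparity] that minimizes the
    SAD between the right window centered at x and the left window at x + d.
    """
    width = len(right_row)
    disparities = []
    for x in range(width):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            cost = 0
            for k in range(-half_window, half_window + 1):
                xr = min(max(x + k, 0), width - 1)      # clamp at the borders
                xl = min(max(x + k + d, 0), width - 1)
                cost += abs(right_row[xr] - left_row[xl])
            if cost < best_cost:                        # ties keep the smaller d
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity in pixels to a depth in meters (Z = f * B / d)."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_length_px * baseline_m / disparity_px


# A textured patch that appears two pixels further right in the left image.
left = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]
right = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
row_disparity = disparity_for_row(left, right, max_disparity=3)
```

In textureless regions (the runs of zeros in this example) the cost is the same for several candidates, so the estimate there is arbitrary. A confidence map is one way to flag such pixels.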

The Stereo algorithm can be executed on the GPU alone or on a combination of multiple hardware engines on NVIDIA DRIVE AGX boards, namely VIC (Video Image Compositor), PVA (Programmable Vision Accelerator), NVENC (Video Encoder), GPU and CPU.

The GPU-based stereo algorithm is built on a smoothed multi-resolution image decomposition and therefore requires two Gaussian pyramids, one per input image. The number of decomposition levels, as well as whether the disparity map is computed for one or both images, can be set at initialization time. The algorithm also returns a confidence map associated with the disparity map, which provides a reliability measure of the estimate.
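The structure of such a multi-resolution decomposition can be sketched as follows. This is an illustrative plain-Python version, not the pyramid implementation used by the module: the binomial kernel [1, 2, 1] / 4, the downsampling factor of 2, and the function names are all assumptions made for the example.

```python
def blur_121(values):
    """Smooth a 1-D sequence with the kernel [1, 2, 1] / 4, replicating the edges."""
    n = len(values)
    return [
        (values[max(i - 1, 0)] + 2 * values[i] + values[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]


def downsample(image):
    """Blur an image (list of rows) separably, then keep every second pixel."""
    rows = [blur_121(row) for row in image]            # horizontal pass
    cols = [blur_121(col) for col in zip(*rows)]       # vertical pass on the transpose
    smoothed = [list(row) for row in zip(*cols)]       # transpose back
    return [row[::2] for row in smoothed[::2]]


def gaussian_pyramid(image, levels):
    """Return `levels` images; level 0 is the input, each next level is half the size."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


# One pyramid is built per input image, with the same number of levels.
left_pyramid = gaussian_pyramid([[4] * 16 for _ in range(8)], levels=3)
level_sizes = [(len(level), len(level[0])) for level in left_pyramid]
```

For the 8 x 16 input above, `level_sizes` is `[(8, 16), (4, 8), (2, 4)]`. A coarse-to-fine matcher would typically start at the smallest level and refine the estimate at each finer one.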

The Stereo algorithm that runs on the other hardware engines requires only a single pair of input images rather than pyramids. The performance and speed of the algorithm can be adjusted at initialization time. Like the GPU-based algorithm, it returns, in addition to the disparity map, an associated confidence map that provides a reliability measure of the estimate.
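One common way to use a confidence map is to discard disparities whose confidence falls below a threshold. The sketch below shows this in plain Python. The value range of the confidence map, the threshold, and the `mask_low_confidence` helper are assumptions made for illustration and are not taken from the module's API.

```python
def mask_low_confidence(disparity, confidence, threshold, invalid=None):
    """Replace disparities whose confidence is below `threshold` with `invalid`.

    `disparity` and `confidence` are lists of rows with identical dimensions.
    """
    return [
        [d if c >= threshold else invalid for d, c in zip(d_row, c_row)]
        for d_row, c_row in zip(disparity, confidence)
    ]


# Hypothetical 2 x 2 maps; confidence is assumed to lie in the range 0..255.
disparity_map = [[4, 7], [3, 9]]
confidence_map = [[200, 10], [90, 255]]
filtered = mask_low_confidence(disparity_map, confidence_map, threshold=100)
```

Here `filtered` keeps the values 4 and 9 and marks the other two pixels as invalid, so that later stages such as depth computation can skip them.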

Relevant Tutorials

APIs

Stereo Interface