Overview
Given a pair of rectified images from a stereo camera, the Stereo Disparity algorithm performs high-quality dense stereo matching to produce a disparity image with the same resolution as the inputs, encoding the horizontal (left-right) disparity of each pixel. This allows the depth of the scene captured by the left and right images to be inferred.
[Figure: left and right input images, and the resulting disparity map]
Implementation
The stereo disparity estimator uses the semi-global matching (SGM) algorithm to compute the disparity. We deviate from the original algorithm by using, as the matching cost function, the Hamming distance between the census transforms of the stereo pair instead of mutual information.
Usage
- Initialization phase
- Include the header that defines the needed functions and structures.
- Define the stream on which the algorithm will be executed, the input stereo pair, composed of two images, and the output disparity image.
- Create the payload that will contain all temporary buffers needed for processing.
- Processing phase
- Define the configuration parameters needed for algorithm execution.
- Submit the payload for execution on the stream associated with it.
- Optionally, wait until the processing is done.
- Cleanup phase
- Free resources held by the payload.
Consult the Stereo Disparity Sample for a complete example.
Limitations and Constraints
Constraints for specific backends supersede the ones specified for all backends.
All Backends
- Left and right input images must have the same type and dimensions.
- Output image dimensions must match the input dimensions.
- The output disparity image must have type VPI_IMAGE_TYPE_U16.
- The maximum disparity parameter passed at algorithm submission must match the value defined during payload creation.
- Input image dimensions must match those defined during payload creation.
CUDA and CPU
- Left and right accepted input image types:
PVA
- Left and right accepted input image types:
- Input and output image dimensions must be 480x270.
- windowSize must be 5.
- maxDisparity must be 64.
Performance
For further information on how performance was benchmarked, see Performance Measurement.
Jetson AGX Xavier
| size | type | max disp. | CPU | CUDA | PVA |
|---------|------|-----------|--------|-----------|-----------|
| 480x270 | u8 | 64 | 362 ms | 5.8191 ms | n/a |
| 480x270 | u8 | 32 | 205 ms | 6.988 ms | n/a |
| 480x270 | u16 | 64 | 360 ms | 5.8635 ms | 14.274 ms |
| 480x270 | u16 | 32 | 205 ms | 7.095 ms | n/a |
Jetson TX2
| size | type | max disp. | CPU | CUDA | PVA |
|---------|------|-----------|----------|----------|-----|
| 480x270 | u8 | 64 | 989 ms | 23.0 ms | n/a |
| 480x270 | u8 | 32 | 506 ms | 25.3 ms | n/a |
| 480x270 | u16 | 64 | 1037 ms | 23.6 ms | n/a |
| 480x270 | u16 | 32 | 508.1 ms | 25.28 ms | n/a |
Jetson Nano
| size | type | max disp. | CPU | CUDA | PVA |
|---------|------|-----------|---------|----------|-----|
| 480x270 | u8 | 64 | 1791 ms | 54.66 ms | n/a |
| 480x270 | u8 | 32 | 860 ms | 51.26 ms | n/a |
| 480x270 | u16 | 64 | 1792 ms | 54.86 ms | n/a |
| 480x270 | u16 | 32 | 863 ms | 51.62 ms | n/a |
References
- Hirschmüller, Heiko (2005). "Accurate and efficient stereo processing by semi-global matching and mutual information". IEEE Conference on Computer Vision and Pattern Recognition. pp. 807–814.
- Zabih, Ramin; Woodfill, John (1994). "Non-parametric local transforms for computing visual correspondence". European Conference on Computer Vision. pp. 151–158.