Overview
Given a pair of rectified images from a stereo camera, the Stereo Disparity algorithm performs high-quality dense stereo matching and produces an output image, at the same resolution as the inputs, containing the left-right disparity between them. This disparity can be used to infer the depth of the scene captured by the left and right images.
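As a reminder of the underlying geometry (standard stereo triangulation, independent of this API): for an ideal rectified pair with focal length f (in pixels) and baseline B, a pixel with disparity d lies at depth

$$ Z = \frac{f \cdot B}{d} $$

so larger disparities correspond to closer scene points.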
Figure: left input image, right input image, and the resulting disparity map.
Implementation
The stereo disparity estimator uses the semi-global matching (SGM) algorithm to compute the disparity. It deviates from the original algorithm by replacing the mutual-information cost with the Hamming distance between census transforms of the stereo pair.
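To make the cost function concrete, here is a minimal, illustrative sketch in C of a 5x5 census transform and the Hamming-distance matching cost. It mirrors the idea described above but is not VPI's internal implementation; the names census5x5 and matchingCost are made up for this example.

```c
#include <stdint.h>

/* Illustrative 5x5 census transform: encodes a pixel as a 24-bit string of
   brightness comparisons against its neighborhood (center pixel skipped).
   The caller must ensure (x, y) is at least 2 pixels away from the border. */
static uint32_t census5x5(const uint8_t *img, int width, int x, int y)
{
    uint32_t descriptor = 0;
    uint8_t center = img[y * width + x];
    for (int dy = -2; dy <= 2; ++dy)
        for (int dx = -2; dx <= 2; ++dx)
        {
            if (dx == 0 && dy == 0)
                continue; /* skip the center pixel */
            descriptor <<= 1;
            descriptor |= img[(y + dy) * width + (x + dx)] < center;
        }
    return descriptor;
}

/* Matching cost between two pixels: the Hamming distance (number of
   differing bits) between their census descriptors. */
static int matchingCost(uint32_t censusLeft, uint32_t censusRight)
{
    uint32_t diff = censusLeft ^ censusRight;
    int bits = 0;
    while (diff)
    {
        bits += diff & 1;
        diff >>= 1;
    }
    return bits;
}
```

Because the census descriptor encodes only the local ordering of intensities, this cost is robust to radiometric differences between the left and right cameras, which is a common reason for preferring it over raw intensity differences.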
Usage
- Initialization phase
- Include the header that defines the needed functions and structures.
- Define the stream on which the algorithm will be executed, the input stereo pair, composed of two images, and the output disparity image.
- Create the payload that will contain all temporary buffers needed for processing.
- Processing phase
- Define the configuration parameters needed for algorithm execution.
- Submit the payload for execution on the stream associated with it.
- Optionally, wait until the processing is done.
- Cleanup phase
- Free resources held by the payload.
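Putting these phases together, the following is a minimal sketch using the VPI C API. Exact signatures (e.g., the header name and whether payload creation takes the stream or a device type) vary between VPI releases, so treat the calls below as illustrative and defer to the API reference and the Stereo Disparity Sample.

```c
#include <vpi/Image.h>
#include <vpi/Stream.h>
#include <vpi/algo/StereoDisparityEstimator.h> /* header name is illustrative */

int main(void)
{
    /* Error checking omitted for brevity. */

    /* Initialization phase: stream, input stereo pair, output disparity. */
    VPIStream stream;
    vpiStreamCreate(VPI_DEVICE_TYPE_CUDA, &stream);

    VPIImage left, right, disparity;
    vpiImageCreate(480, 270, VPI_IMAGE_TYPE_U16, 0, &left);
    vpiImageCreate(480, 270, VPI_IMAGE_TYPE_U16, 0, &right);
    vpiImageCreate(480, 270, VPI_IMAGE_TYPE_U16, 0, &disparity); /* output must be U16 */

    /* Payload holding the temporary buffers; the image dimensions and
       maximum disparity fixed here must match those used at submission. */
    VPIPayload payload;
    vpiCreateStereoDisparityEstimator(stream, 480, 270, VPI_IMAGE_TYPE_U16,
                                      64, &payload);

    /* Processing phase: configure parameters and submit to the stream
       associated with the payload. */
    VPIStereoDisparityEstimatorParams params;
    params.windowSize   = 5;
    params.maxDisparity = 64; /* must equal the value used at creation */

    vpiSubmitStereoDisparityEstimator(payload, left, right, disparity, &params);

    /* Optionally, wait until the processing is done. */
    vpiStreamSync(stream);

    /* Cleanup phase: free resources held by the payload and the rest. */
    vpiPayloadDestroy(payload);
    vpiImageDestroy(left);
    vpiImageDestroy(right);
    vpiImageDestroy(disparity);
    vpiStreamDestroy(stream);
    return 0;
}
```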
Consult the Stereo Disparity Sample for a complete example, and the API reference for more details.
Limitations and Constraints
Constraints for specific backends supersede those specified for all backends.
All Backends
- Left and right input images must have the same type and dimensions.
- Output image dimensions must match the input's.
- The output disparity image must have type VPI_IMAGE_TYPE_U16.
- The maximum disparity parameter passed at algorithm submission must be the same as the one defined during payload creation.
- Input image dimensions must match those defined during payload creation.
CUDA and CPU
- Left and right accepted input image types: VPI_IMAGE_TYPE_U8, VPI_IMAGE_TYPE_U16.
PVA
- Left and right accepted input image types: VPI_IMAGE_TYPE_U16.
- Input and output image dimensions must be 480x270.
- windowSize must be 5.
- maxDisparity must be 64.
Performance
For further information on how performance was benchmarked, see Performance Measurement.
Jetson AGX Xavier
| size | type | max disp. | CPU | CUDA | PVA |
|---------|------|-----------|----------|-----------|-----------|
| 480x270 | u8 | 64 | 365 ms | 5.770 ms | n/a |
| 480x270 | u8 | 32 | 206 ms | 6.926 ms | n/a |
| 480x270 | u16 | 64 | 357 ms | 5.823 ms | 14.336 ms |
| 480x270 | u16 | 32 | 206.9 ms | 7.0159 ms | n/a |
Jetson TX2
| size | type | max disp. | CPU | CUDA | PVA |
|---------|------|-----------|----------|----------|-----|
| 480x270 | u8 | 64 | 1010 ms | 23.06 ms | n/a |
| 480x270 | u8 | 32 | 483.0 ms | 25.2 ms | n/a |
| 480x270 | u16 | 64 | 1002 ms | 23.2 ms | n/a |
| 480x270 | u16 | 32 | 484.5 ms | 25.1 ms | n/a |
Jetson Nano
| size | type | max disp. | CPU | CUDA | PVA |
|---------|------|-----------|---------|----------|-----|
| 480x270 | u8 | 64 | 1791 ms | 54.88 ms | n/a |
| 480x270 | u8 | 32 | 863 ms | 51.3 ms | n/a |
| 480x270 | u16 | 64 | 1795 ms | 54.73 ms | n/a |
| 480x270 | u16 | 32 | 867 ms | 51.3 ms | n/a |
References
- Hirschmüller, Heiko (2005). "Accurate and efficient stereo processing by semi-global matching and mutual information". IEEE Conference on Computer Vision and Pattern Recognition, pp. 807–814.
- Zabih, Ramin; Woodfill, John (1994). "Non-parametric local transforms for computing visual correspondence". European Conference on Computer Vision, pp. 151–158.