Overview

The Stereo Disparity application receives left and right stereo pair images and returns the disparity between them, which is a function of image depth. The result is saved as an image file to disk. If available, it'll also output the corresponding confidence map.

Instructions

The command line parameters are:

where

backend: either cuda, ofa or ofa-pva-vic; it defines the backend that will perform the processing. ofa-pva-vic and cuda allow output of the confidence map in addition to the disparity.
left image: left input image of a rectified stereo pair, it accepts png, jpeg and possibly others.
right image: right input image of a stereo pair.

Here's one example:

C++
./vpi_sample_02_stereo_disparity cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png
Python
python3 main.py cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png

This is using the CUDA backend and the provided sample images. You can try with other stereo pair images, respecting the constraints imposed by the algorithm.

The Python version of this sample also allow for setting various additional parameters, as well as additional input image extensions and to turn on verbose mode. The following command-line arguments can be passed to the Python sample:

Python
python3 main.py <backend> <left image> <right image> --width W --height H --downscale D --window_size WIN

--skip_confidence --conf_threshold T --conf_type absolute/relative -p1 P1 -p2 P2 --p2_alpha P2alpha

--uniqueness U --skip_diagonal --num_passes N --min_disparity MIN --max_disparity MAX --output_mode 0/1/2

-v/--verbose

where the additional optional arguments are:
width: set the width W when passing ".raw" input images
height: set the height H when passing ".raw" input images
downscale: to set the output downscale factor as D
window_size: to set the median filter window size as WIN
skip_confidence: to avoid calculating confidence and applying it as a mask
conf_threshold: set the confidence threshold as T
conf_type: set the confidence type as either absolute or relative
p1: set p1 penalty as P1
p2: set p2 penalty as P2
p2_alpha: set p2Alpha adaptive penalty as P2alpha
uniqueness: set uniqueness as U
skip_diagonal: to avoid using diagonal paths in CUDA or OFA backends
num_passes: to set the number of passes N in OFA backends
min_disparity: to set the minimum disparity MIN in CUDA backend
max_disparity: to set the maximum disparity MAX in backends
output_mode: 0 for colored output, 1 for grayscale and 2 for raw binary
verbose: to turn on verbose mode To understand in detail each of these aditional arguments related to the stereo disparity algorithm, please read the corresponding documentation.

Results

Left input image	Right input image

Stereo disparity	Confidence map

Source Code

For convenience, here's the code that is also installed in the samples directory.

Language: C++ Python

 import sys
 import vpi
 import numpy as np
 from PIL import Image
 from argparse import ArgumentParser
 import cv2
  
  
 def read_raw_file(fpath, resize_to=None, verbose=False):
     try:
         if verbose:
             print(f'I Reading: {fpath}', end=' ', flush=True)
         f = open(fpath, 'rb')
         np_arr = np.fromfile(f, dtype=np.uint16, count=-1)
         f.close()
         if verbose:
             print(f'done!\nI Raw array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
         if resize_to is not None:
             np_arr = np_arr.reshape(resize_to, order='C')
         if verbose:
             print(f'I Reshaped array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
         pil_img = Image.fromarray(np_arr, mode="I;16L")
         return pil_img
     except:
         raise ValueError(f'E Cannot process raw input: {fpath}')
  
  
 def process_arguments():
     parser = ArgumentParser()
  
     parser.add_argument('backend', choices=['cuda','ofa','ofa-pva-vic'],
                         help='Backend to be used for processing')
     parser.add_argument('left', help='Rectified left input image from a stereo pair')
     parser.add_argument('right', help='Rectified right input image from a stereo pair')
     parser.add_argument('--width', default=-1, type=int, help='Input width for raw input files')
     parser.add_argument('--height', default=-1, type=int, help='Input height for raw input files')
     parser.add_argument('--downscale', default=1, type=int, help='Output downscale factor')
     parser.add_argument('--window_size', default=5, type=int, help='Median filter window size')
     parser.add_argument('--skip_confidence', default=False, action='store_true', help='Do not calculate confidence')
     parser.add_argument('--conf_threshold', default=32767, type=int, help='Confidence threshold')
     parser.add_argument('--conf_type', default='best', choices=['best', 'absolute', 'relative', 'inference'],
                         help='Computation type to produce the confidence output. Default will pick best option given backend.')
     parser.add_argument('-p1', default=3, type=int, help='Penalty P1 on small disparities')
     parser.add_argument('-p2', default=48, type=int, help='Penalty P2 on large disparities')
     parser.add_argument('--p2_alpha', default=0, type=int, help='Alpha for adaptive P2 Penalty')
     parser.add_argument('--uniqueness', default=-1, type=float, help='Uniqueness ratio')
     parser.add_argument('--skip_diagonal', default=False, action='store_true', help='Do not use diagonal paths')
     parser.add_argument('--num_passes', default=3, type=int, help='Number of passes')
     parser.add_argument('--min_disparity', default=0, type=int, help='Minimum disparity')
     parser.add_argument('--max_disparity', default=256, type=int, help='Maximum disparity')
     parser.add_argument('--output_mode', default=0, type=int, help='0: color; 1: grayscale; 2: raw binary')
     parser.add_argument('-v', '--verbose', default=False, action='store_true', help='Verbose mode')
  
     return parser.parse_args()
  
  
 def main():
     args = process_arguments()
  
     scale = 1 # pixel value scaling factor when loading input
  
     if args.backend == 'cuda':
         backend = vpi.Backend.CUDA
     elif args.backend == 'ofa':
         backend = vpi.Backend.OFA
     elif args.backend == 'ofa-pva-vic':
         backend = vpi.Backend.OFA|vpi.Backend.PVA|vpi.Backend.VIC
     else:
         raise ValueError(f'E Invalid backend: {args.backend}')
  
     conftype = None
     if args.conf_type == 'best':
         conftype = vpi.ConfidenceType.INFERENCE if args.backend == 'ofa-pva-vic' else vpi.ConfidenceType.ABSOLUTE
     elif args.conf_type == 'absolute':
         conftype = vpi.ConfidenceType.ABSOLUTE
     elif args.conf_type == 'relative':
         conftype = vpi.ConfidenceType.RELATIVE
     elif args.conf_type == 'inference':
         conftype = vpi.ConfidenceType.INFERENCE
     else:
         raise ValueError(f'E Invalid confidence type: {args.conf_type}')
  
     minDisparity = args.min_disparity
     maxDisparity = args.max_disparity
     includeDiagonals = not args.skip_diagonal
     numPasses = args.num_passes
     calcConf = not args.skip_confidence
     downscale = args.downscale
     windowSize = args.window_size
     quality = 6
  
     if args.verbose:
         print(f'I Backend: {backend}\nI Left image: {args.left}\nI Right image: {args.right}\n'
               f'I Disparities (min, max): {(minDisparity, maxDisparity)}\n'
               f'I Input scale factor: {scale}\nI Output downscale factor: {downscale}\n'
               f'I Window size: {windowSize}\nI Quality: {quality}\n'
               f'I Calculate confidence: {calcConf}\nI Confidence threshold: {args.conf_threshold}\n'
               f'I Confidence type: {conftype}\nI Uniqueness ratio: {args.uniqueness}\n'
               f'I Penalty P1: {args.p1}\nI Penalty P2: {args.p2}\nI Adaptive P2 alpha: {args.p2_alpha}\n'
               f'I Include diagonals: {includeDiagonals}\nI Number of passes: {numPasses}\n'
               f'I Output mode: {args.output_mode}\nI Verbose: {args.verbose}\n'
               , end='', flush=True)
  
     if 'raw' in args.left:
         pil_left = read_raw_file(args.left, resize_to=[args.height, args.width], verbose=args.verbose)
         np_left = np.asarray(pil_left)
     else:
         try:
             pil_left = Image.open(args.left)
             if pil_left.mode == 'I':
                 np_left = np.asarray(pil_left).astype(np.int16)
             else:
                 np_left = np.asarray(pil_left)
         except:
             raise ValueError(f'E Cannot open left input image: {args.left}')
  
     if 'raw' in args.right:
         pil_right = read_raw_file(args.right, resize_to=[args.height, args.width], verbose=args.verbose)
         np_right = np.asarray(pil_right)
     else:
         try:
             pil_right = Image.open(args.right)
             if pil_right.mode == 'I':
                 np_right = np.asarray(pil_right).astype(np.int16)
             else:
                 np_right = np.asarray(pil_right)
         except:
             raise ValueError(f'E Cannot open right input image: {args.right}')
  
     # Streams for left and right independent pre-processing
     streamLeft = vpi.Stream()
     streamRight = vpi.Stream()
  
     # Load input into a vpi.Image and convert it to grayscale, 16bpp
     with vpi.Backend.CUDA:
         with streamLeft:
             left = vpi.asimage(np_left).convert(vpi.Format.Y16_ER, scale=scale)
         with streamRight:
             right = vpi.asimage(np_right).convert(vpi.Format.Y16_ER, scale=scale)
  
     # Preprocess input
     # Block linear format is needed for ofa backends
     # We use VIC backend for the format conversion because it is low power
     if args.backend in {'ofa-pva-vic', 'ofa'}:
         if args.verbose:
             print(f'W {args.backend} forces to convert input images to block linear', flush=True)
         with vpi.Backend.VIC:
             with streamLeft:
                 left = left.convert(vpi.Format.Y16_ER_BL)
             with streamRight:
                 right = right.convert(vpi.Format.Y16_ER_BL)
  
     if args.verbose:
         print(f'I Input left image: {left.size} {left.format}\n'
               f'I Input right image: {right.size} {right.format}', flush=True)
  
     confidenceU16 = None
  
     if calcConf:
         if args.backend not in {'cuda', 'ofa-pva-vic'}:
             # Only CUDA and OFA-PVA-VIC support confidence map
             calcConf = False
             if args.verbose:
                 print(f'W {args.backend} does not allow to calculate confidence', flush=True)
  
  
     outWidth = (left.size[0] + downscale - 1) // downscale
     outHeight = (left.size[1] + downscale - 1) // downscale
  
     if calcConf:
         confidenceU16 = vpi.Image((outWidth, outHeight), vpi.Format.U16)
  
     # Use stream left to consolidate actual stereo processing
     streamStereo = streamLeft
  
     if args.backend == 'ofa-pva-vic' and maxDisparity not in {128, 256}:
         maxDisparity = 128 if (maxDisparity // 128) < 1 else 256
         if args.verbose:
             print(f'W {args.backend} only supports 128 or 256 maxDisparity. Overriding to {maxDisparity}', flush=True)
  
     if args.verbose:
         if 'ofa' not in args.backend:
             print('W Ignoring P2 alpha and number of passes since not an OFA backend', flush=True)
         if args.backend != 'cuda':
             print('W Ignoring uniqueness since not a CUDA backend', flush=True)
         print('I Estimating stereo disparity ... ', end='', flush=True)
  
     # Estimate stereo disparity.
     with streamStereo, backend:
         disparityS16 = vpi.stereodisp(left, right, downscale=downscale, out_confmap=confidenceU16,
                                       window=windowSize, maxdisp=maxDisparity, confthreshold=args.conf_threshold,
                                       quality=quality, conftype=conftype, mindisp=minDisparity,
                                       p1=args.p1, p2=args.p2, p2alpha=args.p2_alpha, uniqueness=args.uniqueness,
                                       includediagonals=includeDiagonals, numpasses=numPasses)
  
     if args.verbose:
         print('done!\nI Post-processing ... ', end='', flush=True)
  
     # Postprocess results and save them to disk
     with streamStereo, vpi.Backend.CUDA:
         # Some backends outputs disparities in block-linear format, we must convert them to
         # pitch-linear for consistency with other backends.
         if disparityS16.format == vpi.Format.S16_BL:
             disparityS16 = disparityS16.convert(vpi.Format.S16, backend=vpi.Backend.VIC)
  
         # Scale disparity and confidence map so that values like between 0 and 255.
  
         # Disparities are in Q10.5 format, so to map it to float, it gets
         # divided by 32. Then the resulting disparity range, from 0 to
         # stereo.maxDisparity gets mapped to 0-255 for proper output.
         # Copy disparity values back to the CPU.
         disparityU8 = disparityS16.convert(vpi.Format.U8, scale=255.0/(32*maxDisparity)).cpu()
  
         # Apply JET colormap to turn the disparities into color, reddish hues
         # represent objects closer to the camera, blueish are farther away.
         disparityColor = cv2.applyColorMap(disparityU8, cv2.COLORMAP_JET)
  
         # Converts to RGB for output with PIL.
         disparityColor = cv2.cvtColor(disparityColor, cv2.COLOR_BGR2RGB)
  
         if calcConf:
             confidenceU8 = confidenceU16.convert(vpi.Format.U8, scale=255.0/65535).cpu()
  
             # When pixel confidence is 0, its color in the disparity is black.
             mask = cv2.threshold(confidenceU8, 1, 255, cv2.THRESH_BINARY)[1]
             mask = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
             disparityColor = cv2.bitwise_and(disparityColor, mask)
  
     fext = '.raw' if args.output_mode == 2 else '.png'
  
     disparity_fname = f'disparity_python{sys.version_info[0]}_{args.backend}' + fext
     confidence_fname = f'confidence_python{sys.version_info[0]}_{args.backend}' + fext
  
     if args.verbose:
         print(f'done!\nI Disparity output: {disparity_fname}', flush=True)
         if calcConf:
             print(f'I Confidence output: {confidence_fname}', flush=True)
  
     # Save results to disk.
     try:
         if args.output_mode == 0:
             Image.fromarray(disparityColor).save(disparity_fname)
             if args.verbose:
                 print(f'I Output disparity image: {disparityColor.shape} '
                       f'{disparityColor.dtype}', flush=True)
         elif args.output_mode == 1:
             Image.fromarray(disparityU8).save(disparity_fname)
             if args.verbose:
                 print(f'I Output disparity image: {disparityU8.shape} '
                       f'{disparityU8.dtype}', flush=True)
         elif args.output_mode == 2:
             disparityS16.cpu().tofile(disparity_fname)
             if args.verbose:
                 print(f'I Output disparity image: {disparityS16.size} '
                       f'{disparityS16.format}', flush=True)
  
         if calcConf:
             if args.output_mode == 0 or args.output_mode == 1:
                 Image.fromarray(confidenceU8).save(confidence_fname)
                 if args.verbose:
                     print(f'I Output confidence image: {confidenceU8.shape} '
                           f'{confidenceU8.dtype}', flush=True)
             else:
                 confidenceU16.cpu().tofile(confidence_fname)
                 if args.verbose:
                     print(f'I Output confidence image: {confidenceU16.size} '
                           f'{confidenceU16.format}', flush=True)
  
     except:
         raise ValueError(f'E Cannot write outputs: {disparity_fname}, {confidence_fname}\n'
                          f'E Using output mode: {args.output_mode}')
  
  
 if __name__ == '__main__':
     main()

 #include <opencv2/core/version.hpp>
 #if CV_MAJOR_VERSION >= 3
 #    include <opencv2/imgcodecs.hpp>
 #else
 #    include <opencv2/contrib/contrib.hpp> // for colormap
 #    include <opencv2/highgui/highgui.hpp>
 #endif
  
 #include <opencv2/imgproc/imgproc.hpp>
 #include <vpi/OpenCVInterop.hpp>
  
 #include <vpi/Image.h>
 #include <vpi/Status.h>
 #include <vpi/Stream.h>
 #include <vpi/algo/ConvertImageFormat.h>
 #include <vpi/algo/Rescale.h>
 #include <vpi/algo/StereoDisparity.h>
  
 #include <cstring>
 #include <iostream>
 #include <sstream>
  
 #define CHECK_STATUS(STMT)                                                                  \
     do                                                                                      \
     {                                                                                       \
         VPIStatus status = (STMT);                                                          \
         if (status != VPI_SUCCESS)                                                          \
         {                                                                                   \
             char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];                                     \
             vpiGetLastStatusMessage(buffer, sizeof(buffer));                                \
             std::ostringstream ss;                                                          \
             ss << "line " << __LINE__ << " " << vpiStatusGetName(status) << ": " << buffer; \
             throw std::runtime_error(ss.str());                                             \
         }                                                                                   \
     } while (0);
  
 int main(int argc, char *argv[])
 {
     // OpenCV image that will be wrapped by a VPIImage.
     // Define it here so that it's destroyed *after* wrapper is destroyed
     cv::Mat cvImageLeft, cvImageRight;
  
     // VPI objects that will be used
     VPIImage inLeft        = NULL;
     VPIImage inRight       = NULL;
     VPIImage tmpLeft       = NULL;
     VPIImage tmpRight      = NULL;
     VPIImage stereoLeft    = NULL;
     VPIImage stereoRight   = NULL;
     VPIImage disparity     = NULL;
     VPIImage confidenceMap = NULL;
     VPIStream stream       = NULL;
     VPIPayload stereo      = NULL;
  
     int retval = 0;
  
     try
     {
         // =============================
         // Parse command line parameters
  
         if (argc != 4)
         {
             throw std::runtime_error(std::string("Usage: ") + argv[0] +
                                      " <cuda|ofa|ofa-pva-vic> <left image> <right image>");
         }
  
         std::string strBackend       = argv[1];
         std::string strLeftFileName  = argv[2];
         std::string strRightFileName = argv[3];
  
         uint64_t backends;
  
         if (strBackend == "cuda")
         {
             backends = VPI_BACKEND_CUDA;
         }
         else if (strBackend == "ofa")
         {
             backends = VPI_BACKEND_OFA;
         }
         else if (strBackend == "ofa-pva-vic")
         {
             backends = VPI_BACKEND_OFA | VPI_BACKEND_PVA | VPI_BACKEND_VIC;
         }
         else
         {
             throw std::runtime_error("Backend '" + strBackend +
                                      "' not recognized, it must be either cuda, ofa or ofa-pva-vic.");
         }
  
         // =====================
         // Load the input images
         cvImageLeft = cv::imread(strLeftFileName);
         if (cvImageLeft.empty())
         {
             throw std::runtime_error("Can't open '" + strLeftFileName + "'");
         }
  
         cvImageRight = cv::imread(strRightFileName);
         if (cvImageRight.empty())
         {
             throw std::runtime_error("Can't open '" + strRightFileName + "'");
         }
  
         // =================================
         // Allocate all VPI resources needed
  
         int32_t inputWidth  = cvImageLeft.cols;
         int32_t inputHeight = cvImageLeft.rows;
  
         // Create the stream that will be used for processing.
         CHECK_STATUS(vpiStreamCreate(0, &stream));
  
         // We now wrap the loaded images into a VPIImage object to be used by VPI.
         // VPI won't make a copy of it, so the original image must be in scope at all times.
         CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageLeft, 0, &inLeft));
         CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageRight, 0, &inRight));
  
         // Format conversion parameters needed for input pre-processing
         VPIConvertImageFormatParams convParams;
         CHECK_STATUS(vpiInitConvertImageFormatParams(&convParams));
  
         // Initialize default parameters
         VPIStereoDisparityEstimatorCreationParams createParams;
         CHECK_STATUS(vpiInitStereoDisparityEstimatorCreationParams(&createParams));
  
         // Select max disparity that works well for the chair_stereo_{left,right}_1920.png files
         createParams.maxDisparity = 256;
  
         // Default format and size for input stereo pair (some backends require adjustments, see below)
         VPIImageFormat stereoFormat = VPI_IMAGE_FORMAT_Y8_ER;
  
         int stereoWidth  = inputWidth;
         int stereoHeight = inputHeight;
  
         // Default format and size for output
         VPIImageFormat disparityFormat = VPI_IMAGE_FORMAT_S16;
  
         int outputWidth  = inputWidth;
         int outputHeight = inputHeight;
  
         // Override some backend-dependent parameters
         if (strBackend.find("ofa") != std::string::npos)
         {
             // Implementations using OFA require BL input
             stereoFormat = VPI_IMAGE_FORMAT_Y8_ER_BL;
  
             if (strBackend == "ofa")
             {
                 // when using OFA alone, output must also be BL
                 disparityFormat = VPI_IMAGE_FORMAT_S16_BL;
             }
  
             // Using downscale factor with OFA improves performance
             createParams.downscaleFactor = 2;
             outputWidth  = (inputWidth + createParams.downscaleFactor - 1) / createParams.downscaleFactor;
             outputHeight = (inputHeight + createParams.downscaleFactor - 1) / createParams.downscaleFactor;
  
             // Output width including downscaleFactor must be at least max(64, maxDisparity/downscaleFactor) when the
             // OFA+PVA+VIC backend is used
             if (strBackend.find("pva") != std::string::npos)
             {
                 int minWidth = std::max(createParams.maxDisparity / createParams.downscaleFactor, outputWidth);
                 outputWidth  = std::max(64, minWidth);
                 outputHeight = (inputHeight * outputWidth) / inputWidth;
                 stereoWidth  = outputWidth * createParams.downscaleFactor;
                 stereoHeight = outputHeight * createParams.downscaleFactor;
             }
         }
  
         // Create the payload for Stereo Disparity algorithm.
         // Payload is created before the image objects so that non-supported backends can be trapped with an error.
         CHECK_STATUS(vpiCreateStereoDisparityEstimator(backends, stereoWidth, stereoHeight, stereoFormat, &createParams,
                                                        &stereo));
  
         // Create the output image where the disparity map will be stored.
         CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, disparityFormat, 0, &disparity));
  
         // Create the input stereo images
         CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoLeft));
         CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoRight));
  
         // Create the confidence image if the backend can support it
         if (strBackend == "ofa-pva-vic" || strBackend == "cuda")
         {
             CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, VPI_IMAGE_FORMAT_U16, 0, &confidenceMap));
         }
  
         // If a rescale of the input is required, create temporary images for the initial format conversion.
         bool const isRescaleRequired = (stereoWidth != inputWidth) || (stereoHeight != inputHeight);
         if (isRescaleRequired)
         {
             CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpLeft));
             CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpRight));
         }
  
         // ================
         // Processing stage
  
         // Start with default parameters, and override some values depending on what backend is used.
         VPIStereoDisparityEstimatorParams submitParams;
         CHECK_STATUS(vpiInitStereoDisparityEstimatorParams(&submitParams));
         if (strBackend == "ofa-pva-vic")
         {
             // The INFERENCE confidence type achieves better performance with OFA+PVA+VIC backend. The only tradeoff is
             // that the deep-learning based confidence map is not easily expressed as a function of left and right
             // disparity estimates, in contrast to ABSOLUTE or RELATIVE confidence type.
             submitParams.confidenceType = VPI_STEREO_CONFIDENCE_INFERENCE;
         }
         else if (strBackend == "cuda")
         {
             // The chair_stereo_{left,right}_1920.png inputs benefit from a higher confidence threshold with CUDA
             submitParams.confidenceThreshold = UINT16_MAX - 10000;
         }
  
         // -----------------
         // Pre-process input
         if (isRescaleRequired)
         {
             // We require a conversion with CUDA only because we loaded the images in the default BGR format from OpenCV
             // and the VIC backend does not support 3-channel RGB/BGR image formats.
             // Alternatively, we could load grayscale images and handle the conversion+rescale in one operation on VIC.
  
             // Convert opencv input to grayscale format using CUDA
             CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, tmpLeft, &convParams));
             CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, tmpRight, &convParams));
  
             // Rescale on VIC
             CHECK_STATUS(
                 vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpLeft, stereoLeft, VPI_INTERP_LINEAR, VPI_BORDER_CLAMP, 0));
             CHECK_STATUS(vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpRight, stereoRight, VPI_INTERP_LINEAR,
                                           VPI_BORDER_CLAMP, 0));
         }
         else
         {
             // Convert opencv input to grayscale format using CUDA
             CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, stereoLeft, &convParams));
             CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, stereoRight, &convParams));
         }
  
         // ------------------------------
         // Do stereo disparity estimation
  
         // Submit it with the input and output images
         CHECK_STATUS(vpiSubmitStereoDisparityEstimator(stream, backends, stereo, stereoLeft, stereoRight, disparity,
                                                        confidenceMap, &submitParams));
  
         // Wait until the algorithm finishes processing
         CHECK_STATUS(vpiStreamSync(stream));
  
         // ========================================
         // Output pre-processing and saving to disk
         // Lock output to retrieve its data on cpu memory
         VPIImageData data;
         CHECK_STATUS(vpiImageLockData(disparity, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));
  
         // Make an OpenCV matrix out of this image
         cv::Mat cvDisparity;
         CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvDisparity));
  
         // Scale result and write it to disk. Disparities are in Q10.5 format,
         // so to map it to float, it gets divided by 32. Then the resulting disparity range,
         // from 0 to maxDisparity gets mapped to 0-255 for proper output.
         cvDisparity.convertTo(cvDisparity, CV_8UC1, 255.0 / (32 * createParams.maxDisparity), 0);
  
         // Apply JET colormap to turn the disparities into color.
         // Reddish hues represent objects closer to the camera, blueish are farther away.
         cv::Mat cvDisparityColor;
         applyColorMap(cvDisparity, cvDisparityColor, cv::COLORMAP_JET);
  
         // Done handling output, don't forget to unlock it.
         CHECK_STATUS(vpiImageUnlock(disparity));
  
         // If we have a confidence map, adjust it for display and write it to disk too.
         if (confidenceMap)
         {
             // Lock the image data and export to cv::Mat
             VPIImageData data;
             CHECK_STATUS(vpiImageLockData(confidenceMap, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));
             cv::Mat cvConfidence;
             CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvConfidence));
  
             // Confidence map varies from 0 to 65535, we scale it to [0-255].
             cvConfidence.convertTo(cvConfidence, CV_8UC1, 255.0 / 65535, 0);
             imwrite("confidence_" + strBackend + ".png", cvConfidence);
  
             CHECK_STATUS(vpiImageUnlock(confidenceMap));
  
             // When pixel confidence is 0, we would like its color in the disparity image to be black.
             cv::Mat cvMask;
             threshold(cvConfidence, cvMask, 1, 255, cv::THRESH_BINARY);
             cvtColor(cvMask, cvMask, cv::COLOR_GRAY2BGR);
             bitwise_and(cvDisparityColor, cvMask, cvDisparityColor);
         }
  
         imwrite("disparity_" + strBackend + ".png", cvDisparityColor);
     }
     catch (std::exception &e)
     {
         std::cerr << e.what() << std::endl;
         retval = 1;
     }
  
     // ========
     // Clean up
  
     // Destroying stream first makes sure that all work submitted to
     // it is finished.
     vpiStreamDestroy(stream);
  
     // Only then we can destroy the other objects, as we're sure they
     // aren't being used anymore.
  
     vpiImageDestroy(inLeft);
     vpiImageDestroy(inRight);
     vpiImageDestroy(tmpLeft);
     vpiImageDestroy(tmpRight);
     vpiImageDestroy(stereoLeft);
     vpiImageDestroy(stereoRight);
     vpiImageDestroy(confidenceMap);
     vpiImageDestroy(disparity);
     vpiPayloadDestroy(stereo);
  
     return retval;
 }

VPI - Vision Programming Interface

3.2 Release

Overview

Instructions

Results

Source Code