VPI - Vision Programming Interface

4.0 Release

Stereo Disparity

Overview

The Stereo Disparity application receives the left and right images of a stereo pair and returns the disparity between them, which is a function of image depth. The result is saved to disk as an image file. If the selected backend supports it, the application also outputs the corresponding confidence map.
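
To make the link between disparity and depth concrete: for a rectified pinhole stereo pair, the standard relation is depth = focal length × baseline / disparity. The sketch below illustrates that relation only; it is not part of the VPI sample, and the function name and the calibration values (700 px focal length, 12 cm baseline) are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return depth in meters for a rectified stereo pair.

    Returns None when the disparity is not positive, since such a pixel
    has no finite depth under this model.
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


# Hypothetical calibration values, chosen only for illustration.
FOCAL_LENGTH_PX = 700.0
BASELINE_M = 0.12

for disparity in (8.0, 32.0, 128.0):
    depth = depth_from_disparity(disparity, FOCAL_LENGTH_PX, BASELINE_M)
    print(f'disparity {disparity:6.1f} px -> depth {depth:.3f} m')
```

Larger disparities correspond to closer objects, which is why the colored output of the sample shows nearby surfaces in reddish hues.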

Instructions

The command line parameters are:

<backend> <left image> <right image>

where

  • backend: either cuda, ofa or ofa-pva-vic; it defines the backend that will perform the processing. ofa-pva-vic and cuda allow output of the confidence map in addition to the disparity.
  • left image: left input image of a rectified stereo pair; png, jpeg and possibly other formats are accepted.
  • right image: right input image of a rectified stereo pair.

Here's one example:

  • C++
    ./vpi_sample_02_stereo_disparity cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png
  • Python
    python3 main.py cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png

This is using the CUDA backend and the provided sample images. You can try with other stereo pair images, respecting the constraints imposed by the algorithm.

The Python version of this sample also allows setting various additional parameters, accepts additional input image extensions, and can turn on verbose mode. The following command-line arguments can be passed to the Python sample:

  • Python
    python3 main.py <backend> <left image> <right image> --width W --height H --downscale D --window-size WIN
    --skip-confidence --conf-threshold T --conf-type best/absolute/relative/inference --p1 P1 --p2 P2 --p2-alpha P2alpha
    --uniqueness U --skip-diagonal --num-passes N --min-disparity MIN --max-disparity MAX --output-mode 0/1/2
    -v/--verbose

where the additional optional arguments are:

  • width: set the width W when passing ".raw" input images
  • height: set the height H when passing ".raw" input images
  • downscale: to set the output downscale factor as D
  • window-size: to set the median filter window size as WIN
  • skip-confidence: to avoid calculating confidence and applying it as a mask
  • conf-threshold: set the confidence threshold as T
  • conf-type: set the confidence type as best (the default, which picks the best option for the given backend), absolute, relative or inference
  • p1: set p1 penalty as P1
  • p2: set p2 penalty as P2
  • p2-alpha: set p2Alpha adaptive penalty as P2alpha
  • uniqueness: set uniqueness as U
  • skip-diagonal: to avoid using diagonal paths in CUDA or OFA backends
  • num-passes: to set the number of passes N in OFA backends
  • min-disparity: to set the minimum disparity MIN in CUDA backend
  • max-disparity: to set the maximum disparity MAX in backends
  • output-mode: 0 for colored output, 1 for grayscale and 2 for raw binary
  • verbose: to turn on verbose mode

To understand in detail each of these additional arguments related to the stereo disparity algorithm, please read the corresponding documentation.
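
With output mode 2, the Python sample writes the disparity map as raw signed 16-bit values in Q10.5 fixed point, so each value has to be divided by 32 to obtain a disparity in pixels. The following is a minimal sketch of decoding such data with the standard library only. It assumes tightly packed, row-major, native-endian values and that the width and height passed in match the (possibly downscaled) output size. The function names are hypothetical, and the rounding and clamping in `to_u8` are assumptions that only approximate the 0-255 scaling used by the sample.

```python
from array import array


def decode_raw_disparity(raw_bytes, width, height):
    """Decode raw Q10.5 disparities into rows of floats, in pixels."""
    values = array('h')  # signed 16-bit integers, native byte order
    values.frombytes(raw_bytes)
    if len(values) != width * height:
        raise ValueError(f'expected {width * height} values, got {len(values)}')
    # Q10.5 has 5 fractional bits, hence the division by 2**5 = 32.
    return [[values[row * width + col] / 32.0 for col in range(width)]
            for row in range(height)]


def to_u8(raw_value, max_disparity):
    """Map one raw Q10.5 value from [0, max_disparity] to [0, 255]."""
    scaled = raw_value * 255.0 / (32 * max_disparity)
    return max(0, min(255, round(scaled)))


# In-memory example standing in for the contents of a raw output file.
raw = array('h', [0, 32, 48, 8192]).tobytes()
print(decode_raw_disparity(raw, width=2, height=2))
print([to_u8(v, max_disparity=256) for v in (0, 4096, 8192)])
```

To read an actual file, the bytes could be obtained with `open(path, 'rb').read()` before calling `decode_raw_disparity`.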

Results

[Figures: left input image, right input image, stereo disparity, confidence map]

Source Code

For convenience, here's the code that is also installed in the samples directory.

Language: Python

import sys
import vpi
import numpy as np
from PIL import Image
from argparse import ArgumentParser
import cv2


def read_raw_file(fpath, resize_to=None, verbose=False):
    try:
        if verbose:
            print(f'I Reading: {fpath}', end=' ', flush=True)
        # Raw inputs are read as a flat array of unsigned 16-bit values
        with open(fpath, 'rb') as f:
            np_arr = np.fromfile(f, dtype=np.uint16, count=-1)
        if verbose:
            print(f'done!\nI Raw array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
        if resize_to is not None:
            np_arr = np_arr.reshape(resize_to, order='C')
            if verbose:
                print(f'I Reshaped array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
        pil_img = Image.fromarray(np_arr, mode="I;16L")
        return pil_img
    except Exception as exc:
        raise ValueError(f'E Cannot process raw input: {fpath}') from exc


def process_arguments():
    parser = ArgumentParser()

    parser.add_argument('backend', choices=['cuda','ofa','ofa-pva-vic'],
                        help='Backend to be used for processing')
    parser.add_argument('left', help='Rectified left input image from a stereo pair')
    parser.add_argument('right', help='Rectified right input image from a stereo pair')
    parser.add_argument('--width', default=-1, type=int, help='Input width for raw input files')
    parser.add_argument('--height', default=-1, type=int, help='Input height for raw input files')
    parser.add_argument('--downscale', default=1, type=int, help='Output downscale factor')
    parser.add_argument('--scale-factor', default=1.0, type=float, help='Input scale factor')
    parser.add_argument('--window-size', default=5, type=int, help='Median filter window size')
    parser.add_argument('--skip-confidence', default=False, action='store_true', help='Do not calculate confidence')
    parser.add_argument('--conf-threshold', default=32767, type=int, help='Confidence threshold')
    parser.add_argument('--conf-type', default='best', choices=['best', 'absolute', 'relative', 'inference'],
                        help='Computation type to produce the confidence output. Default will pick best option given backend.')
    parser.add_argument('--p1', default=3, type=int, help='Penalty P1 on small disparities')
    parser.add_argument('--p2', default=48, type=int, help='Penalty P2 on large disparities')
    parser.add_argument('--p2-alpha', default=0, type=int, help='Alpha for adaptive P2 Penalty')
    parser.add_argument('--uniqueness', default=-1, type=float, help='Uniqueness ratio')
    parser.add_argument('--skip-diagonal', default=False, action='store_true', help='Do not use diagonal paths')
    parser.add_argument('--num-passes', default=3, type=int, help='Number of passes')
    parser.add_argument('--min-disparity', default=0, type=int, help='Minimum disparity')
    parser.add_argument('--max-disparity', default=256, type=int, help='Maximum disparity')
    parser.add_argument('-o', '--output-mode', default=0, type=int, help='0: color; 1: grayscale; 2: raw binary')
    parser.add_argument('-v', '--verbose', default=False, action='store_true', help='Verbose mode')

    return parser.parse_args()


def main():
    args = process_arguments()

    if args.backend == 'cuda':
        backend = vpi.Backend.CUDA
    elif args.backend == 'ofa':
        backend = vpi.Backend.OFA
    elif args.backend == 'ofa-pva-vic':
        backend = vpi.Backend.OFA|vpi.Backend.PVA|vpi.Backend.VIC
    else:
        raise ValueError(f'E Invalid backend: {args.backend}')

    conftype = None
    if args.conf_type == 'best':
        conftype = vpi.ConfidenceType.INFERENCE if args.backend == 'ofa-pva-vic' else vpi.ConfidenceType.ABSOLUTE
    elif args.conf_type == 'absolute':
        conftype = vpi.ConfidenceType.ABSOLUTE
    elif args.conf_type == 'relative':
        conftype = vpi.ConfidenceType.RELATIVE
    elif args.conf_type == 'inference':
        conftype = vpi.ConfidenceType.INFERENCE
    else:
        raise ValueError(f'E Invalid confidence type: {args.conf_type}')

    scaleFactor = args.scale_factor
    minDisparity = args.min_disparity
    maxDisparity = args.max_disparity
    includeDiagonals = not args.skip_diagonal
    numPasses = args.num_passes
    calcConf = not args.skip_confidence
    downscale = args.downscale
    windowSize = args.window_size
    quality = 6

    if args.verbose:
        print(f'I Backend: {backend}\nI Left image: {args.left}\nI Right image: {args.right}\n'
              f'I Disparities (min, max): {(minDisparity, maxDisparity)}\n'
              f'I Input size scale factor: {scaleFactor}\nI Output downscale factor: {downscale}\n'
              f'I Window size: {windowSize}\nI Quality: {quality}\n'
              f'I Calculate confidence: {calcConf}\nI Confidence threshold: {args.conf_threshold}\n'
              f'I Confidence type: {conftype}\nI Uniqueness ratio: {args.uniqueness}\n'
              f'I Penalty P1: {args.p1}\nI Penalty P2: {args.p2}\nI Adaptive P2 alpha: {args.p2_alpha}\n'
              f'I Include diagonals: {includeDiagonals}\nI Number of passes: {numPasses}\n'
              f'I Output mode: {args.output_mode}\nI Verbose: {args.verbose}\n'
              , end='', flush=True)

    if 'raw' in args.left:
        pil_left = read_raw_file(args.left, resize_to=[args.height, args.width], verbose=args.verbose)
        np_left = np.asarray(pil_left)
    else:
        try:
            pil_left = Image.open(args.left)
            if pil_left.mode == 'I':
                np_left = np.asarray(pil_left).astype(np.int16)
            else:
                np_left = np.asarray(pil_left)
        except Exception as exc:
            raise ValueError(f'E Cannot open left input image: {args.left}') from exc

    if 'raw' in args.right:
        pil_right = read_raw_file(args.right, resize_to=[args.height, args.width], verbose=args.verbose)
        np_right = np.asarray(pil_right)
    else:
        try:
            pil_right = Image.open(args.right)
            if pil_right.mode == 'I':
                np_right = np.asarray(pil_right).astype(np.int16)
            else:
                np_right = np.asarray(pil_right)
        except Exception as exc:
            raise ValueError(f'E Cannot open right input image: {args.right}') from exc

    # Streams for left and right independent pre-processing
    streamLeft = vpi.Stream()
    streamRight = vpi.Stream()

    # Load input into a vpi.Image and convert it to grayscale, 16bpp
    with vpi.Backend.CUDA:
        left = vpi.asimage(np_left)
        right = vpi.asimage(np_right)
        if scaleFactor != 1:
            with streamLeft:
                left = left.rescale(factor=scaleFactor)
            with streamRight:
                right = right.rescale(factor=scaleFactor)
        # Using scale=1 in convert below as there is no need to scale from 0-255 to 0-65535
        with streamLeft:
            left = left.convert(vpi.Format.Y16_ER, scale=1)
        with streamRight:
            right = right.convert(vpi.Format.Y16_ER, scale=1)

    # Preprocess input
    # Block linear format is needed for ofa backends
    # We use VIC backend for the format conversion because it is low power
    if args.backend in {'ofa-pva-vic', 'ofa'}:
        if args.verbose:
            print(f'W {args.backend} forces to convert input images to block linear', flush=True)
        with vpi.Backend.VIC:
            with streamLeft:
                left = left.convert(vpi.Format.Y16_ER_BL)
            with streamRight:
                right = right.convert(vpi.Format.Y16_ER_BL)

    if args.verbose:
        print(f'I Input left image: {left.size} {left.format}\n'
              f'I Input right image: {right.size} {right.format}', flush=True)

    confidenceU16 = None

    if calcConf:
        if args.backend not in {'cuda', 'ofa-pva-vic'}:
            # Only CUDA and OFA-PVA-VIC support confidence map
            calcConf = False
            if args.verbose:
                print(f'W {args.backend} does not allow to calculate confidence', flush=True)


    outWidth = (left.size[0] + downscale - 1) // downscale
    outHeight = (left.size[1] + downscale - 1) // downscale

    if calcConf:
        confidenceU16 = vpi.Image((outWidth, outHeight), vpi.Format.U16)

    # Use stream left to consolidate actual stereo processing
    streamStereo = streamLeft

    if args.backend == 'ofa-pva-vic' and maxDisparity not in {128, 256}:
        maxDisparity = 128 if (maxDisparity // 128) < 1 else 256
        if args.verbose:
            print(f'W {args.backend} only supports 128 or 256 maxDisparity. Overriding to {maxDisparity}', flush=True)

    if args.verbose:
        if 'ofa' not in args.backend:
            print('W Ignoring P2 alpha and number of passes since not an OFA backend', flush=True)
        if args.backend != 'cuda':
            print('W Ignoring uniqueness since not a CUDA backend', flush=True)
        print('I Estimating stereo disparity ... ', end='', flush=True)

    # Estimate stereo disparity.
    with streamStereo, backend:
        disparityS16 = vpi.stereodisp(left, right, downscale=downscale, out_confmap=confidenceU16,
                                      window=windowSize, maxdisp=maxDisparity, confthreshold=args.conf_threshold,
                                      quality=quality, conftype=conftype, mindisp=minDisparity,
                                      p1=args.p1, p2=args.p2, p2alpha=args.p2_alpha, uniqueness=args.uniqueness,
                                      includediagonals=includeDiagonals, numpasses=numPasses)

    if args.verbose:
        print('done!\nI Post-processing ... ', end='', flush=True)

    # Postprocess results and save them to disk
    with streamStereo, vpi.Backend.CUDA:
        # Some backends output disparities in block-linear format; we must convert them to
        # pitch-linear for consistency with other backends.
        if disparityS16.format == vpi.Format.S16_BL:
            disparityS16 = disparityS16.convert(vpi.Format.S16, backend=vpi.Backend.VIC)

        # Scale disparity and confidence map so that values lie between 0 and 255.

        # Disparities are in Q10.5 format, so to map them to float, they get
        # divided by 32. Then the resulting disparity range, from 0 to
        # maxDisparity, gets mapped to 0-255 for proper output.
        # Copy disparity values back to the CPU.
        disparityU8 = disparityS16.convert(vpi.Format.U8, scale=255.0/(32*maxDisparity)).cpu()

        # Apply JET colormap to turn the disparities into color, reddish hues
        # represent objects closer to the camera, blueish are farther away.
        disparityColor = cv2.applyColorMap(disparityU8, cv2.COLORMAP_JET)

        # Converts to RGB for output with PIL.
        disparityColor = cv2.cvtColor(disparityColor, cv2.COLOR_BGR2RGB)

        if calcConf:
            confidenceU8 = confidenceU16.convert(vpi.Format.U8, scale=255.0/65535).cpu()

            # When pixel confidence is 0, its color in the disparity is black.
            mask = cv2.threshold(confidenceU8, 1, 255, cv2.THRESH_BINARY)[1]
            mask = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
            disparityColor = cv2.bitwise_and(disparityColor, mask)

    fext = '.raw' if args.output_mode == 2 else '.png'

    disparity_fname = f'disparity_python{sys.version_info[0]}_{args.backend}' + fext
    confidence_fname = f'confidence_python{sys.version_info[0]}_{args.backend}' + fext

    if args.verbose:
        print(f'done!\nI Disparity output: {disparity_fname}', flush=True)
        if calcConf:
            print(f'I Confidence output: {confidence_fname}', flush=True)

    # Save results to disk.
    try:
        if args.output_mode == 0:
            Image.fromarray(disparityColor).save(disparity_fname)
            if args.verbose:
                print(f'I Output disparity image: {disparityColor.shape} '
                      f'{disparityColor.dtype}', flush=True)
        elif args.output_mode == 1:
            Image.fromarray(disparityU8).save(disparity_fname)
            if args.verbose:
                print(f'I Output disparity image: {disparityU8.shape} '
                      f'{disparityU8.dtype}', flush=True)
        elif args.output_mode == 2:
            disparityS16.cpu().tofile(disparity_fname)
            if args.verbose:
                print(f'I Output disparity image: {disparityS16.size} '
                      f'{disparityS16.format}', flush=True)

        if calcConf:
            if args.output_mode == 0 or args.output_mode == 1:
                Image.fromarray(confidenceU8).save(confidence_fname)
                if args.verbose:
                    print(f'I Output confidence image: {confidenceU8.shape} '
                          f'{confidenceU8.dtype}', flush=True)
            else:
                confidenceU16.cpu().tofile(confidence_fname)
                if args.verbose:
                    print(f'I Output confidence image: {confidenceU16.size} '
                          f'{confidenceU16.format}', flush=True)

    except Exception as exc:
        raise ValueError(f'E Cannot write outputs: {disparity_fname}, {confidence_fname}\n'
                         f'E Using output mode: {args.output_mode}') from exc


if __name__ == '__main__':
    main()
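
The Python sample overrides the requested maximum disparity when the ofa-pva-vic backend is selected, since that backend only supports 128 or 256. The sketch below isolates that rule so its effect on a few inputs can be checked without VPI installed. The function name `snap_max_disparity` is hypothetical; the expression itself mirrors the one in the sample.

```python
def snap_max_disparity(backend, max_disparity):
    """Return the maximum disparity the sample would actually use.

    For 'ofa-pva-vic', any value other than 128 or 256 is replaced:
    values below 128 become 128, everything else becomes 256.
    Other backends keep the requested value unchanged.
    """
    if backend == 'ofa-pva-vic' and max_disparity not in {128, 256}:
        return 128 if (max_disparity // 128) < 1 else 256
    return max_disparity


for requested in (64, 128, 200, 512):
    used = snap_max_disparity('ofa-pva-vic', requested)
    print(f'requested {requested:3d} -> used {used}')
```

Note that a request above 256 is lowered to 256 rather than rejected, so the verbose output is the only place where the override is reported.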

Language: C++

#include <opencv2/core/version.hpp>
#if CV_MAJOR_VERSION >= 3
#    include <opencv2/imgcodecs.hpp>
#else
#    include <opencv2/contrib/contrib.hpp> // for colormap
#    include <opencv2/highgui/highgui.hpp>
#endif

#include <opencv2/imgproc/imgproc.hpp>
#include <vpi/OpenCVInterop.hpp>

#include <vpi/Image.h>
#include <vpi/Status.h>
#include <vpi/Stream.h>
#include <vpi/algo/ConvertImageFormat.h>
#include <vpi/algo/Rescale.h>
#include <vpi/algo/StereoDisparity.h>

#include <algorithm> // for std::max
#include <cstring>
#include <iostream>
#include <sstream>
#include <stdexcept> // for std::runtime_error
#include <string>

#define CHECK_STATUS(STMT) \
    do \
    { \
        VPIStatus status = (STMT); \
        if (status != VPI_SUCCESS) \
        { \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH]; \
            vpiGetLastStatusMessage(buffer, sizeof(buffer)); \
            std::ostringstream ss; \
            ss << "line " << __LINE__ << " " << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str()); \
        } \
    } while (0);

int main(int argc, char *argv[])
{
    // OpenCV image that will be wrapped by a VPIImage.
    // Define it here so that it's destroyed *after* wrapper is destroyed
    cv::Mat cvImageLeft, cvImageRight;

    // VPI objects that will be used
    VPIImage inLeft        = NULL;
    VPIImage inRight       = NULL;
    VPIImage tmpLeft       = NULL;
    VPIImage tmpRight      = NULL;
    VPIImage stereoLeft    = NULL;
    VPIImage stereoRight   = NULL;
    VPIImage disparity     = NULL;
    VPIImage confidenceMap = NULL;
    VPIStream stream       = NULL;
    VPIPayload stereo      = NULL;

    int retval = 0;

    try
    {
        // =============================
        // Parse command line parameters

        if (argc != 4)
        {
            throw std::runtime_error(std::string("Usage: ") + argv[0] +
                                     " <cuda|ofa|ofa-pva-vic> <left image> <right image>");
        }

        std::string strBackend       = argv[1];
        std::string strLeftFileName  = argv[2];
        std::string strRightFileName = argv[3];

        uint64_t backends;

        if (strBackend == "cuda")
        {
            backends = VPI_BACKEND_CUDA;
        }
        else if (strBackend == "ofa")
        {
            backends = VPI_BACKEND_OFA;
        }
        else if (strBackend == "ofa-pva-vic")
        {
            backends = VPI_BACKEND_OFA | VPI_BACKEND_PVA | VPI_BACKEND_VIC;
        }
        else
        {
            throw std::runtime_error("Backend '" + strBackend +
                                     "' not recognized, it must be either cuda, ofa or ofa-pva-vic.");
        }

        // =====================
        // Load the input images
        cvImageLeft = cv::imread(strLeftFileName);
        if (cvImageLeft.empty())
        {
            throw std::runtime_error("Can't open '" + strLeftFileName + "'");
        }

        cvImageRight = cv::imread(strRightFileName);
        if (cvImageRight.empty())
        {
            throw std::runtime_error("Can't open '" + strRightFileName + "'");
        }

        // =================================
        // Allocate all VPI resources needed

        int32_t inputWidth  = cvImageLeft.cols;
        int32_t inputHeight = cvImageLeft.rows;

        // Create the stream that will be used for processing.
        CHECK_STATUS(vpiStreamCreate(0, &stream));

        // We now wrap the loaded images into a VPIImage object to be used by VPI.
        // VPI won't make a copy of it, so the original image must be in scope at all times.
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageLeft, 0, &inLeft));
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageRight, 0, &inRight));

        // Format conversion parameters needed for input pre-processing
        VPIConvertImageFormatParams convParams;
        CHECK_STATUS(vpiInitConvertImageFormatParams(&convParams));

        // Initialize default parameters
        VPIStereoDisparityEstimatorCreationParams createParams;
        CHECK_STATUS(vpiInitStereoDisparityEstimatorCreationParams(&createParams));

        // Select max disparity that works well for the chair_stereo_{left,right}_1920.png files
        createParams.maxDisparity = 256;

        // Default format and size for input stereo pair (some backends require adjustments, see below)
        VPIImageFormat stereoFormat = VPI_IMAGE_FORMAT_Y8_ER;

        int stereoWidth  = inputWidth;
        int stereoHeight = inputHeight;

        // Default format and size for output
        VPIImageFormat disparityFormat = VPI_IMAGE_FORMAT_S16;

        int outputWidth  = inputWidth;
        int outputHeight = inputHeight;

        // Override some backend-dependent parameters
        if (strBackend.find("ofa") != std::string::npos)
        {
            // Implementations using OFA require BL input
            stereoFormat = VPI_IMAGE_FORMAT_Y8_ER_BL;

            if (strBackend == "ofa")
            {
                // when using OFA alone, output must also be BL
                disparityFormat = VPI_IMAGE_FORMAT_S16_BL;
            }

            // Using downscale factor with OFA improves performance
            createParams.downscaleFactor = 2;
            outputWidth  = (inputWidth + createParams.downscaleFactor - 1) / createParams.downscaleFactor;
            outputHeight = (inputHeight + createParams.downscaleFactor - 1) / createParams.downscaleFactor;

            // Output width including downscaleFactor must be at least max(64, maxDisparity/downscaleFactor) when the
            // OFA+PVA+VIC backend is used
            if (strBackend.find("pva") != std::string::npos)
            {
                int minWidth = std::max(createParams.maxDisparity / createParams.downscaleFactor, outputWidth);
                outputWidth  = std::max(64, minWidth);
                outputHeight = (inputHeight * outputWidth) / inputWidth;
                stereoWidth  = outputWidth * createParams.downscaleFactor;
                stereoHeight = outputHeight * createParams.downscaleFactor;
            }
        }

        // Create the payload for Stereo Disparity algorithm.
        // Payload is created before the image objects so that non-supported backends can be trapped with an error.
        CHECK_STATUS(vpiCreateStereoDisparityEstimator(backends, stereoWidth, stereoHeight, stereoFormat, &createParams,
                                                       &stereo));

        // Create the output image where the disparity map will be stored.
        CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, disparityFormat, 0, &disparity));

        // Create the input stereo images
        CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoLeft));
        CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoRight));

        // Create the confidence image if the backend can support it
        if (strBackend == "ofa-pva-vic" || strBackend == "cuda")
        {
            CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, VPI_IMAGE_FORMAT_U16, 0, &confidenceMap));
        }

        // If a rescale of the input is required, create temporary images for the initial format conversion.
        bool const isRescaleRequired = (stereoWidth != inputWidth) || (stereoHeight != inputHeight);
        if (isRescaleRequired)
        {
            CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpLeft));
            CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpRight));
        }

        // ================
        // Processing stage

        // Start with default parameters, and override some values depending on what backend is used.
        VPIStereoDisparityEstimatorParams submitParams;
        CHECK_STATUS(vpiInitStereoDisparityEstimatorParams(&submitParams));
        if (strBackend == "ofa-pva-vic")
        {
            // The INFERENCE confidence type achieves better performance with OFA+PVA+VIC backend. The only tradeoff is
            // that the deep-learning based confidence map is not easily expressed as a function of left and right
            // disparity estimates, in contrast to ABSOLUTE or RELATIVE confidence type.
            submitParams.confidenceType = VPI_STEREO_CONFIDENCE_INFERENCE;
        }
        else if (strBackend == "cuda")
        {
            // The chair_stereo_{left,right}_1920.png inputs benefit from a higher confidence threshold with CUDA
            submitParams.confidenceThreshold = UINT16_MAX - 10000;
        }

        // -----------------
        // Pre-process input
        if (isRescaleRequired)
        {
            // We require a conversion with CUDA only because we loaded the images in the default BGR format from OpenCV
            // and the VIC backend does not support 3-channel RGB/BGR image formats.
            // Alternatively, we could load grayscale images and handle the conversion+rescale in one operation on VIC.

            // Convert opencv input to grayscale format using CUDA
            CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, tmpLeft, &convParams));
            CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, tmpRight, &convParams));

            // Rescale on VIC
            CHECK_STATUS(
                vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpLeft, stereoLeft, VPI_INTERP_LINEAR, VPI_BORDER_CLAMP, 0));
            CHECK_STATUS(vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpRight, stereoRight, VPI_INTERP_LINEAR,
                                          VPI_BORDER_CLAMP, 0));
        }
        else
        {
            // Convert opencv input to grayscale format using CUDA
            CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, stereoLeft, &convParams));
            CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, stereoRight, &convParams));
        }

        // ------------------------------
        // Do stereo disparity estimation

        // Submit it with the input and output images
        CHECK_STATUS(vpiSubmitStereoDisparityEstimator(stream, backends, stereo, stereoLeft, stereoRight, disparity,
                                                       confidenceMap, &submitParams));

        // Wait until the algorithm finishes processing
        CHECK_STATUS(vpiStreamSync(stream));

        // ========================================
        // Output post-processing and saving to disk
        // Lock output to retrieve its data on cpu memory
        VPIImageData data;
        CHECK_STATUS(vpiImageLockData(disparity, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));

        // Make an OpenCV matrix out of this image
        cv::Mat cvDisparity;
        CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvDisparity));

        // Scale result and write it to disk. Disparities are in Q10.5 format,
        // so to map them to float, they get divided by 32. Then the resulting disparity range,
        // from 0 to maxDisparity, gets mapped to 0-255 for proper output.
        cvDisparity.convertTo(cvDisparity, CV_8UC1, 255.0 / (32 * createParams.maxDisparity), 0);

        // Apply JET colormap to turn the disparities into color.
        // Reddish hues represent objects closer to the camera, blueish are farther away.
        cv::Mat cvDisparityColor;
        applyColorMap(cvDisparity, cvDisparityColor, cv::COLORMAP_JET);

        // Done handling output, don't forget to unlock it.
        CHECK_STATUS(vpiImageUnlock(disparity));

        // If we have a confidence map, adjust it for display and write it to disk too.
        if (confidenceMap)
        {
            // Lock the image data and export to cv::Mat
            VPIImageData data;
            CHECK_STATUS(vpiImageLockData(confidenceMap, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));
            cv::Mat cvConfidence;
            CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvConfidence));

            // Confidence map varies from 0 to 65535, we scale it to [0-255].
            cvConfidence.convertTo(cvConfidence, CV_8UC1, 255.0 / 65535, 0);
            imwrite("confidence_" + strBackend + ".png", cvConfidence);

            CHECK_STATUS(vpiImageUnlock(confidenceMap));

            // When pixel confidence is 0, we would like its color in the disparity image to be black.
            cv::Mat cvMask;
            threshold(cvConfidence, cvMask, 1, 255, cv::THRESH_BINARY);
            cvtColor(cvMask, cvMask, cv::COLOR_GRAY2BGR);
            bitwise_and(cvDisparityColor, cvMask, cvDisparityColor);
        }

        imwrite("disparity_" + strBackend + ".png", cvDisparityColor);
    }
    catch (std::exception &e)
    {
        std::cerr << e.what() << std::endl;
        retval = 1;
    }

    // ========
    // Clean up

    // Destroying stream first makes sure that all work submitted to
    // it is finished.
    vpiStreamDestroy(stream);

    // Only then we can destroy the other objects, as we're sure they
    // aren't being used anymore.

    vpiImageDestroy(inLeft);
    vpiImageDestroy(inRight);
    vpiImageDestroy(tmpLeft);
    vpiImageDestroy(tmpRight);
    vpiImageDestroy(stereoLeft);
    vpiImageDestroy(stereoRight);
    vpiImageDestroy(confidenceMap);
    vpiImageDestroy(disparity);
    vpiPayloadDestroy(stereo);

    return retval;
}
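
The C++ sample derives the stereo input size and the output size from the input image size, the downscale factor and, for the OFA+PVA+VIC backend, a minimum output width of max(64, maxDisparity / downscaleFactor). The sketch below restates that arithmetic in Python so it can be tried on different image sizes without VPI. The function name and the `needs_min_width` flag are hypothetical; the formulas follow the integer arithmetic of the sample.

```python
def stereo_sizes(in_w, in_h, downscale, max_disparity, needs_min_width):
    """Return ((stereo_w, stereo_h), (out_w, out_h)).

    needs_min_width should be True only for the OFA+PVA+VIC backend,
    where the sample enforces a minimum output width.
    """
    # Output size is the input size divided by downscale, rounded up.
    out_w = (in_w + downscale - 1) // downscale
    out_h = (in_h + downscale - 1) // downscale
    stereo_w, stereo_h = in_w, in_h

    if needs_min_width:
        min_w = max(max_disparity // downscale, out_w)
        out_w = max(64, min_w)
        # Keep the aspect ratio of the input when the width is raised.
        out_h = (in_h * out_w) // in_w
        # The stereo input is rescaled to match the adjusted output.
        stereo_w = out_w * downscale
        stereo_h = out_h * downscale

    return (stereo_w, stereo_h), (out_w, out_h)


print(stereo_sizes(1920, 1080, 2, 256, needs_min_width=True))
print(stereo_sizes(100, 50, 2, 256, needs_min_width=True))
```

When the returned stereo size differs from the input size, the sample rescales the inputs on VIC before running the estimator; otherwise only the format conversion is performed.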