VPI - Vision Programming Interface

3.2 Release

Stereo Disparity

Overview

The Stereo Disparity application receives the left and right images of a stereo pair and returns the disparity between them, which is a function of scene depth. The result is saved to disk as an image file. If the selected backend supports it, the application also outputs the corresponding confidence map.
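
The relation between disparity and depth is not spelled out on this page. As a minimal sketch, assuming the standard rectified pinhole stereo model (depth = focal length × baseline / disparity), the conversion could look like the following. The function name and the focal length and baseline values are hypothetical and are not part of the sample.

```python
import math


def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Return depth in meters for a disparity given in pixels.

    Assumes a rectified pinhole stereo rig; a disparity of zero (or less)
    is treated as a point at infinity.
    """
    if disparity_px <= 0:
        return math.inf
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 800 px focal length, 0.25 m baseline.
print(disparity_to_depth(50.0, 800.0, 0.25))  # larger disparity means a closer object
```

Under this model a larger disparity corresponds to a closer object, which matches the color coding used by the sample (reddish hues for closer objects).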

Instructions

The command line parameters are:

<backend> <left image> <right image>

where

  • backend: either cuda, ofa or ofa-pva-vic; it defines the backend that will perform the processing. ofa-pva-vic and cuda allow output of the confidence map in addition to the disparity.
  • left image: left input image of a rectified stereo pair. It accepts png, jpeg and possibly other formats.
  • right image: right input image of the same rectified stereo pair.

Here's one example:

  • C++
    ./vpi_sample_02_stereo_disparity cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png
  • Python
    python3 main.py cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png

This is using the CUDA backend and the provided sample images. You can try with other stereo pair images, respecting the constraints imposed by the algorithm.
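
One such constraint is visible in the Python sample itself: with the ofa-pva-vic backend, only a maximum disparity of 128 or 256 is accepted, and any other value is overridden. A small sketch of that rule, using a hypothetical helper name that does not exist in the sample:

```python
def snap_max_disparity(backend, max_disparity):
    """Mirror the sample's override: ofa-pva-vic only accepts 128 or 256."""
    if backend == 'ofa-pva-vic' and max_disparity not in (128, 256):
        # Values below 128 are raised to 128; everything else becomes 256.
        return 128 if max_disparity < 128 else 256
    return max_disparity


for backend, requested in [('ofa-pva-vic', 64), ('ofa-pva-vic', 200), ('cuda', 64)]:
    print(backend, requested, '->', snap_max_disparity(backend, requested))
```

Other backends keep the requested value unchanged in this sketch; whether they impose limits of their own is not covered here.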

The Python version of this sample also allows setting various additional parameters, accepts additional input image extensions, and can turn on verbose mode. The following command-line arguments can be passed to the Python sample:

  • Python
    python3 main.py <backend> <left image> <right image> --width W --height H --downscale D --window_size WIN
    --skip_confidence --conf_threshold T --conf_type best/absolute/relative/inference -p1 P1 -p2 P2 --p2_alpha P2alpha
    --uniqueness U --skip_diagonal --num_passes N --min_disparity MIN --max_disparity MAX --output_mode 0/1/2
    -v/--verbose

where the additional optional arguments are:

  • width: set the width W when passing ".raw" input images
  • height: set the height H when passing ".raw" input images
  • downscale: set the output downscale factor to D
  • window_size: set the median filter window size to WIN
  • skip_confidence: skip calculating the confidence map and applying it as a mask
  • conf_threshold: set the confidence threshold to T
  • conf_type: set the confidence type to best, absolute, relative or inference; the default, best, picks the most suitable type for the chosen backend
  • p1: set the P1 penalty to P1
  • p2: set the P2 penalty to P2
  • p2_alpha: set the adaptive P2 penalty alpha to P2alpha (OFA backends only)
  • uniqueness: set the uniqueness ratio to U (CUDA backend only)
  • skip_diagonal: do not use diagonal paths in the CUDA or OFA backends
  • num_passes: set the number of passes to N in the OFA backends
  • min_disparity: set the minimum disparity to MIN in the CUDA backend
  • max_disparity: set the maximum disparity to MAX; the ofa-pva-vic backend only supports 128 or 256
  • output_mode: 0 for colored output, 1 for grayscale and 2 for raw binary
  • verbose: turn on verbose mode

To understand each of these additional arguments related to the stereo disparity algorithm in detail, please read the corresponding documentation.
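
With output mode 2 the Python sample dumps the signed 16-bit disparity buffer straight to a .raw file, and its comments state that disparities are stored in Q10.5 fixed point. The sketch below shows one way such a file could be read back. It assumes tightly packed, row-major values in native byte order; the helper name and the demo file are hypothetical.

```python
import array
import os
import tempfile


def read_raw_disparity(path, width, height):
    """Read a raw signed 16-bit disparity dump and return rows of disparities in pixels."""
    values = array.array('h')  # 'h' = signed 16-bit integer, native byte order
    with open(path, 'rb') as f:
        values.fromfile(f, width * height)  # raises EOFError if the file is too short
    # Q10.5 fixed point: the lower 5 bits are fractional, so divide by 32.
    return [[values[y * width + x] / 32.0 for x in range(width)]
            for y in range(height)]


# Round-trip demo on a tiny synthetic 3x2 "disparity map" (not real sample output).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'disparity_demo.raw')
    with open(path, 'wb') as f:
        array.array('h', [0, 32, 64, 96, 128, 160]).tofile(f)
    rows = read_raw_disparity(path, width=3, height=2)

print(rows)
```

The width and height to pass are those of the output image, which the sample computes as (input size + downscale - 1) // downscale in each dimension.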

Results

[Figures: left input image, right input image, stereo disparity, confidence map]

Source Code

For convenience, here's the code that is also installed in the samples directory.

  • Python
    import sys
    import vpi
    import numpy as np
    from PIL import Image
    from argparse import ArgumentParser
    import cv2


    def read_raw_file(fpath, resize_to=None, verbose=False):
        try:
            if verbose:
                print(f'I Reading: {fpath}', end=' ', flush=True)
            f = open(fpath, 'rb')
            np_arr = np.fromfile(f, dtype=np.uint16, count=-1)
            f.close()
            if verbose:
                print(f'done!\nI Raw array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
            if resize_to is not None:
                np_arr = np_arr.reshape(resize_to, order='C')
                if verbose:
                    print(f'I Reshaped array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
            pil_img = Image.fromarray(np_arr, mode="I;16L")
            return pil_img
        except:
            raise ValueError(f'E Cannot process raw input: {fpath}')


    def process_arguments():
        parser = ArgumentParser()

        parser.add_argument('backend', choices=['cuda','ofa','ofa-pva-vic'],
                            help='Backend to be used for processing')
        parser.add_argument('left', help='Rectified left input image from a stereo pair')
        parser.add_argument('right', help='Rectified right input image from a stereo pair')
        parser.add_argument('--width', default=-1, type=int, help='Input width for raw input files')
        parser.add_argument('--height', default=-1, type=int, help='Input height for raw input files')
        parser.add_argument('--downscale', default=1, type=int, help='Output downscale factor')
        parser.add_argument('--window_size', default=5, type=int, help='Median filter window size')
        parser.add_argument('--skip_confidence', default=False, action='store_true', help='Do not calculate confidence')
        parser.add_argument('--conf_threshold', default=32767, type=int, help='Confidence threshold')
        parser.add_argument('--conf_type', default='best', choices=['best', 'absolute', 'relative', 'inference'],
                            help='Computation type to produce the confidence output. Default will pick best option given backend.')
        parser.add_argument('-p1', default=3, type=int, help='Penalty P1 on small disparities')
        parser.add_argument('-p2', default=48, type=int, help='Penalty P2 on large disparities')
        parser.add_argument('--p2_alpha', default=0, type=int, help='Alpha for adaptive P2 Penalty')
        parser.add_argument('--uniqueness', default=-1, type=float, help='Uniqueness ratio')
        parser.add_argument('--skip_diagonal', default=False, action='store_true', help='Do not use diagonal paths')
        parser.add_argument('--num_passes', default=3, type=int, help='Number of passes')
        parser.add_argument('--min_disparity', default=0, type=int, help='Minimum disparity')
        parser.add_argument('--max_disparity', default=256, type=int, help='Maximum disparity')
        parser.add_argument('--output_mode', default=0, type=int, help='0: color; 1: grayscale; 2: raw binary')
        parser.add_argument('-v', '--verbose', default=False, action='store_true', help='Verbose mode')

        return parser.parse_args()


    def main():
        args = process_arguments()

        scale = 1 # pixel value scaling factor when loading input

        if args.backend == 'cuda':
            backend = vpi.Backend.CUDA
        elif args.backend == 'ofa':
            backend = vpi.Backend.OFA
        elif args.backend == 'ofa-pva-vic':
            backend = vpi.Backend.OFA|vpi.Backend.PVA|vpi.Backend.VIC
        else:
            raise ValueError(f'E Invalid backend: {args.backend}')

        conftype = None
        if args.conf_type == 'best':
            conftype = vpi.ConfidenceType.INFERENCE if args.backend == 'ofa-pva-vic' else vpi.ConfidenceType.ABSOLUTE
        elif args.conf_type == 'absolute':
            conftype = vpi.ConfidenceType.ABSOLUTE
        elif args.conf_type == 'relative':
            conftype = vpi.ConfidenceType.RELATIVE
        elif args.conf_type == 'inference':
            conftype = vpi.ConfidenceType.INFERENCE
        else:
            raise ValueError(f'E Invalid confidence type: {args.conf_type}')

        minDisparity = args.min_disparity
        maxDisparity = args.max_disparity
        includeDiagonals = not args.skip_diagonal
        numPasses = args.num_passes
        calcConf = not args.skip_confidence
        downscale = args.downscale
        windowSize = args.window_size
        quality = 6

        if args.verbose:
            print(f'I Backend: {backend}\nI Left image: {args.left}\nI Right image: {args.right}\n'
                  f'I Disparities (min, max): {(minDisparity, maxDisparity)}\n'
                  f'I Input scale factor: {scale}\nI Output downscale factor: {downscale}\n'
                  f'I Window size: {windowSize}\nI Quality: {quality}\n'
                  f'I Calculate confidence: {calcConf}\nI Confidence threshold: {args.conf_threshold}\n'
                  f'I Confidence type: {conftype}\nI Uniqueness ratio: {args.uniqueness}\n'
                  f'I Penalty P1: {args.p1}\nI Penalty P2: {args.p2}\nI Adaptive P2 alpha: {args.p2_alpha}\n'
                  f'I Include diagonals: {includeDiagonals}\nI Number of passes: {numPasses}\n'
                  f'I Output mode: {args.output_mode}\nI Verbose: {args.verbose}\n'
                  , end='', flush=True)

        if 'raw' in args.left:
            pil_left = read_raw_file(args.left, resize_to=[args.height, args.width], verbose=args.verbose)
            np_left = np.asarray(pil_left)
        else:
            try:
                pil_left = Image.open(args.left)
                if pil_left.mode == 'I':
                    np_left = np.asarray(pil_left).astype(np.int16)
                else:
                    np_left = np.asarray(pil_left)
            except:
                raise ValueError(f'E Cannot open left input image: {args.left}')

        if 'raw' in args.right:
            pil_right = read_raw_file(args.right, resize_to=[args.height, args.width], verbose=args.verbose)
            np_right = np.asarray(pil_right)
        else:
            try:
                pil_right = Image.open(args.right)
                if pil_right.mode == 'I':
                    np_right = np.asarray(pil_right).astype(np.int16)
                else:
                    np_right = np.asarray(pil_right)
            except:
                raise ValueError(f'E Cannot open right input image: {args.right}')

        # Streams for left and right independent pre-processing
        streamLeft = vpi.Stream()
        streamRight = vpi.Stream()

        # Load input into a vpi.Image and convert it to grayscale, 16bpp
        with vpi.Backend.CUDA:
            with streamLeft:
                left = vpi.asimage(np_left).convert(vpi.Format.Y16_ER, scale=scale)
            with streamRight:
                right = vpi.asimage(np_right).convert(vpi.Format.Y16_ER, scale=scale)

        # Preprocess input
        # Block linear format is needed for ofa backends
        # We use VIC backend for the format conversion because it is low power
        if args.backend in {'ofa-pva-vic', 'ofa'}:
            if args.verbose:
                print(f'W {args.backend} forces to convert input images to block linear', flush=True)
            with vpi.Backend.VIC:
                with streamLeft:
                    left = left.convert(vpi.Format.Y16_ER_BL)
                with streamRight:
                    right = right.convert(vpi.Format.Y16_ER_BL)

        if args.verbose:
            print(f'I Input left image: {left.size} {left.format}\n'
                  f'I Input right image: {right.size} {right.format}', flush=True)

        confidenceU16 = None

        if calcConf:
            if args.backend not in {'cuda', 'ofa-pva-vic'}:
                # Only CUDA and OFA-PVA-VIC support confidence map
                calcConf = False
                if args.verbose:
                    print(f'W {args.backend} does not allow to calculate confidence', flush=True)


        outWidth = (left.size[0] + downscale - 1) // downscale
        outHeight = (left.size[1] + downscale - 1) // downscale

        if calcConf:
            confidenceU16 = vpi.Image((outWidth, outHeight), vpi.Format.U16)

        # Use stream left to consolidate actual stereo processing
        streamStereo = streamLeft

        if args.backend == 'ofa-pva-vic' and maxDisparity not in {128, 256}:
            maxDisparity = 128 if (maxDisparity // 128) < 1 else 256
            if args.verbose:
                print(f'W {args.backend} only supports 128 or 256 maxDisparity. Overriding to {maxDisparity}', flush=True)

        if args.verbose:
            if 'ofa' not in args.backend:
                print('W Ignoring P2 alpha and number of passes since not an OFA backend', flush=True)
            if args.backend != 'cuda':
                print('W Ignoring uniqueness since not a CUDA backend', flush=True)
            print('I Estimating stereo disparity ... ', end='', flush=True)

        # Estimate stereo disparity.
        with streamStereo, backend:
            disparityS16 = vpi.stereodisp(left, right, downscale=downscale, out_confmap=confidenceU16,
                                          window=windowSize, maxdisp=maxDisparity, confthreshold=args.conf_threshold,
                                          quality=quality, conftype=conftype, mindisp=minDisparity,
                                          p1=args.p1, p2=args.p2, p2alpha=args.p2_alpha, uniqueness=args.uniqueness,
                                          includediagonals=includeDiagonals, numpasses=numPasses)

        if args.verbose:
            print('done!\nI Post-processing ... ', end='', flush=True)

        # Postprocess results and save them to disk
        with streamStereo, vpi.Backend.CUDA:
            # Some backends output disparities in block-linear format; we must convert them to
            # pitch-linear for consistency with other backends.
            if disparityS16.format == vpi.Format.S16_BL:
                disparityS16 = disparityS16.convert(vpi.Format.S16, backend=vpi.Backend.VIC)

            # Scale disparity and confidence map so that values lie between 0 and 255.

            # Disparities are in Q10.5 format, so to map them to float, they get
            # divided by 32. Then the resulting disparity range, from 0 to
            # maxDisparity, gets mapped to 0-255 for proper output.
            # Copy disparity values back to the CPU.
            disparityU8 = disparityS16.convert(vpi.Format.U8, scale=255.0/(32*maxDisparity)).cpu()

            # Apply JET colormap to turn the disparities into color, reddish hues
            # represent objects closer to the camera, blueish are farther away.
            disparityColor = cv2.applyColorMap(disparityU8, cv2.COLORMAP_JET)

            # Converts to RGB for output with PIL.
            disparityColor = cv2.cvtColor(disparityColor, cv2.COLOR_BGR2RGB)

            if calcConf:
                confidenceU8 = confidenceU16.convert(vpi.Format.U8, scale=255.0/65535).cpu()

                # When pixel confidence is 0, its color in the disparity is black.
                mask = cv2.threshold(confidenceU8, 1, 255, cv2.THRESH_BINARY)[1]
                mask = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
                disparityColor = cv2.bitwise_and(disparityColor, mask)

        fext = '.raw' if args.output_mode == 2 else '.png'

        disparity_fname = f'disparity_python{sys.version_info[0]}_{args.backend}' + fext
        confidence_fname = f'confidence_python{sys.version_info[0]}_{args.backend}' + fext

        if args.verbose:
            print(f'done!\nI Disparity output: {disparity_fname}', flush=True)
            if calcConf:
                print(f'I Confidence output: {confidence_fname}', flush=True)

        # Save results to disk.
        try:
            if args.output_mode == 0:
                Image.fromarray(disparityColor).save(disparity_fname)
                if args.verbose:
                    print(f'I Output disparity image: {disparityColor.shape} '
                          f'{disparityColor.dtype}', flush=True)
            elif args.output_mode == 1:
                Image.fromarray(disparityU8).save(disparity_fname)
                if args.verbose:
                    print(f'I Output disparity image: {disparityU8.shape} '
                          f'{disparityU8.dtype}', flush=True)
            elif args.output_mode == 2:
                disparityS16.cpu().tofile(disparity_fname)
                if args.verbose:
                    print(f'I Output disparity image: {disparityS16.size} '
                          f'{disparityS16.format}', flush=True)

            if calcConf:
                if args.output_mode == 0 or args.output_mode == 1:
                    Image.fromarray(confidenceU8).save(confidence_fname)
                    if args.verbose:
                        print(f'I Output confidence image: {confidenceU8.shape} '
                              f'{confidenceU8.dtype}', flush=True)
                else:
                    confidenceU16.cpu().tofile(confidence_fname)
                    if args.verbose:
                        print(f'I Output confidence image: {confidenceU16.size} '
                              f'{confidenceU16.format}', flush=True)

        except:
            raise ValueError(f'E Cannot write outputs: {disparity_fname}, {confidence_fname}\n'
                             f'E Using output mode: {args.output_mode}')


    if __name__ == '__main__':
        main()
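
The post-processing arithmetic in the listing can be checked without VPI or OpenCV. The sketch below reproduces it on plain Python integers: a Q10.5 disparity is scaled by 255 / (32 × maxDisparity) to land in 0-255, and pixels whose 8-bit confidence does not exceed the threshold of 1 are blacked out, as in the cv2.threshold call. The function names are hypothetical, and rounding to the nearest integer with saturation is an assumption about how the real conversion behaves.

```python
def disparity_to_u8(raw_q10_5, max_disparity):
    """Map a Q10.5 disparity to 0-255 using the scale factor from the sample."""
    scale = 255.0 / (32 * max_disparity)
    # Assumption: round to nearest and saturate to the 8-bit range.
    return max(0, min(255, round(raw_q10_5 * scale)))


def mask_low_confidence(disparity_u8, confidence_u8):
    """Zero out disparities whose 8-bit confidence is not above 1."""
    return [d if c > 1 else 0 for d, c in zip(disparity_u8, confidence_u8)]


# 0 px, 64 px and 256 px disparities stored as Q10.5, with maxDisparity = 256.
scaled = [disparity_to_u8(v, 256) for v in (0, 64 * 32, 256 * 32)]
print(scaled)
print(mask_low_confidence([10, 20, 30], [0, 1, 200]))
```

The real sample applies the mask to the three-channel colored image rather than to a flat list, but the per-pixel rule is the same.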
  • C++
    #include <opencv2/core/version.hpp>
    #if CV_MAJOR_VERSION >= 3
    #    include <opencv2/imgcodecs.hpp>
    #else
    #    include <opencv2/contrib/contrib.hpp> // for colormap
    #    include <opencv2/highgui/highgui.hpp>
    #endif

    #include <opencv2/imgproc/imgproc.hpp>
    #include <vpi/OpenCVInterop.hpp>

    #include <vpi/Image.h>
    #include <vpi/Status.h>
    #include <vpi/Stream.h>
    #include <vpi/algo/ConvertImageFormat.h>
    #include <vpi/algo/Rescale.h>
    #include <vpi/algo/StereoDisparity.h>

    #include <algorithm> // std::max
    #include <cstring>
    #include <iostream>
    #include <sstream>
    #include <stdexcept> // std::runtime_error
    #include <string>

    #define CHECK_STATUS(STMT)                                    \
        do                                                        \
        {                                                         \
            VPIStatus status = (STMT);                            \
            if (status != VPI_SUCCESS)                            \
            {                                                     \
                char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
                vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
                std::ostringstream ss;                            \
                ss << "line " << __LINE__ << " " << vpiStatusGetName(status) << ": " << buffer; \
                throw std::runtime_error(ss.str());               \
            }                                                     \
        } while (0);

    int main(int argc, char *argv[])
    {
        // OpenCV image that will be wrapped by a VPIImage.
        // Define it here so that it's destroyed *after* wrapper is destroyed
        cv::Mat cvImageLeft, cvImageRight;

        // VPI objects that will be used
        VPIImage inLeft = NULL;
        VPIImage inRight = NULL;
        VPIImage tmpLeft = NULL;
        VPIImage tmpRight = NULL;
        VPIImage stereoLeft = NULL;
        VPIImage stereoRight = NULL;
        VPIImage disparity = NULL;
        VPIImage confidenceMap = NULL;
        VPIStream stream = NULL;
        VPIPayload stereo = NULL;

        int retval = 0;

        try
        {
            // =============================
            // Parse command line parameters

            if (argc != 4)
            {
                throw std::runtime_error(std::string("Usage: ") + argv[0] +
                                         " <cuda|ofa|ofa-pva-vic> <left image> <right image>");
            }

            std::string strBackend = argv[1];
            std::string strLeftFileName = argv[2];
            std::string strRightFileName = argv[3];

            uint64_t backends;

            if (strBackend == "cuda")
            {
                backends = VPI_BACKEND_CUDA;
            }
            else if (strBackend == "ofa")
            {
                backends = VPI_BACKEND_OFA;
            }
            else if (strBackend == "ofa-pva-vic")
            {
                backends = VPI_BACKEND_OFA | VPI_BACKEND_PVA | VPI_BACKEND_VIC;
            }
            else
            {
                throw std::runtime_error("Backend '" + strBackend +
                                         "' not recognized, it must be either cuda, ofa or ofa-pva-vic.");
            }

            // =====================
            // Load the input images
            cvImageLeft = cv::imread(strLeftFileName);
            if (cvImageLeft.empty())
            {
                throw std::runtime_error("Can't open '" + strLeftFileName + "'");
            }

            cvImageRight = cv::imread(strRightFileName);
            if (cvImageRight.empty())
            {
                throw std::runtime_error("Can't open '" + strRightFileName + "'");
            }

            // =================================
            // Allocate all VPI resources needed

            int32_t inputWidth = cvImageLeft.cols;
            int32_t inputHeight = cvImageLeft.rows;

            // Create the stream that will be used for processing.
            CHECK_STATUS(vpiStreamCreate(0, &stream));

            // We now wrap the loaded images into a VPIImage object to be used by VPI.
            // VPI won't make a copy of it, so the original image must be in scope at all times.
            CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageLeft, 0, &inLeft));
            CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageRight, 0, &inRight));

            // Format conversion parameters needed for input pre-processing
            VPIConvertImageFormatParams convParams;
            CHECK_STATUS(vpiInitConvertImageFormatParams(&convParams));

            // Initialize default parameters
            VPIStereoDisparityEstimatorCreationParams createParams;
            CHECK_STATUS(vpiInitStereoDisparityEstimatorCreationParams(&createParams));

            // Select max disparity that works well for the chair_stereo_{left,right}_1920.png files
            createParams.maxDisparity = 256;

            // Default format and size for input stereo pair (some backends require adjustments, see below)
            VPIImageFormat stereoFormat = VPI_IMAGE_FORMAT_Y8_ER;

            int stereoWidth = inputWidth;
            int stereoHeight = inputHeight;

            // Default format and size for output
            VPIImageFormat disparityFormat = VPI_IMAGE_FORMAT_S16;

            int outputWidth = inputWidth;
            int outputHeight = inputHeight;

            // Override some backend-dependent parameters
            if (strBackend.find("ofa") != std::string::npos)
            {
                // Implementations using OFA require BL input
                stereoFormat = VPI_IMAGE_FORMAT_Y8_ER_BL;

                if (strBackend == "ofa")
                {
                    // when using OFA alone, output must also be BL
                    disparityFormat = VPI_IMAGE_FORMAT_S16_BL;
                }

                // Using downscale factor with OFA improves performance
                createParams.downscaleFactor = 2;
                outputWidth = (inputWidth + createParams.downscaleFactor - 1) / createParams.downscaleFactor;
                outputHeight = (inputHeight + createParams.downscaleFactor - 1) / createParams.downscaleFactor;

                // Output width including downscaleFactor must be at least max(64, maxDisparity/downscaleFactor) when the
                // OFA+PVA+VIC backend is used
                if (strBackend.find("pva") != std::string::npos)
                {
                    int minWidth = std::max(createParams.maxDisparity / createParams.downscaleFactor, outputWidth);
                    outputWidth = std::max(64, minWidth);
                    outputHeight = (inputHeight * outputWidth) / inputWidth;
                    stereoWidth = outputWidth * createParams.downscaleFactor;
                    stereoHeight = outputHeight * createParams.downscaleFactor;
                }
            }

            // Create the payload for Stereo Disparity algorithm.
            // Payload is created before the image objects so that non-supported backends can be trapped with an error.
            CHECK_STATUS(vpiCreateStereoDisparityEstimator(backends, stereoWidth, stereoHeight, stereoFormat, &createParams,
                                                           &stereo));

            // Create the output image where the disparity map will be stored.
            CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, disparityFormat, 0, &disparity));

            // Create the input stereo images
            CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoLeft));
            CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoRight));

            // Create the confidence image if the backend can support it
            if (strBackend == "ofa-pva-vic" || strBackend == "cuda")
            {
                CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, VPI_IMAGE_FORMAT_U16, 0, &confidenceMap));
            }

            // If a rescale of the input is required, create temporary images for the initial format conversion.
            bool const isRescaleRequired = (stereoWidth != inputWidth) || (stereoHeight != inputHeight);
            if (isRescaleRequired)
            {
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpLeft));
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpRight));
            }

            // ================
            // Processing stage

            // Start with default parameters, and override some values depending on what backend is used.
            VPIStereoDisparityEstimatorParams submitParams;
            CHECK_STATUS(vpiInitStereoDisparityEstimatorParams(&submitParams));
            if (strBackend == "ofa-pva-vic")
            {
                // The INFERENCE confidence type achieves better performance with OFA+PVA+VIC backend. The only tradeoff is
                // that the deep-learning based confidence map is not easily expressed as a function of left and right
                // disparity estimates, in contrast to ABSOLUTE or RELATIVE confidence type.
                submitParams.confidenceType = VPI_STEREO_CONFIDENCE_INFERENCE;
            }
            else if (strBackend == "cuda")
            {
                // The chair_stereo_{left,right}_1920.png inputs benefit from a higher confidence threshold with CUDA
                submitParams.confidenceThreshold = UINT16_MAX - 10000;
            }

            // -----------------
            // Pre-process input
            if (isRescaleRequired)
            {
                // We require a conversion with CUDA only because we loaded the images in the default BGR format from OpenCV
                // and the VIC backend does not support 3-channel RGB/BGR image formats.
                // Alternatively, we could load grayscale images and handle the conversion+rescale in one operation on VIC.

                // Convert opencv input to grayscale format using CUDA
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, tmpLeft, &convParams));
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, tmpRight, &convParams));

                // Rescale on VIC
                CHECK_STATUS(
                    vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpLeft, stereoLeft, VPI_INTERP_LINEAR, VPI_BORDER_CLAMP, 0));
                CHECK_STATUS(vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpRight, stereoRight, VPI_INTERP_LINEAR,
                                              VPI_BORDER_CLAMP, 0));
            }
            else
            {
                // Convert opencv input to grayscale format using CUDA
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, stereoLeft, &convParams));
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, stereoRight, &convParams));
            }

            // ------------------------------
            // Do stereo disparity estimation

            // Submit it with the input and output images
            CHECK_STATUS(vpiSubmitStereoDisparityEstimator(stream, backends, stereo, stereoLeft, stereoRight, disparity,
                                                           confidenceMap, &submitParams));

            // Wait until the algorithm finishes processing
            CHECK_STATUS(vpiStreamSync(stream));

            // ========================================
            // Output post-processing and saving to disk
            // Lock output to retrieve its data on cpu memory
            VPIImageData data;
            CHECK_STATUS(vpiImageLockData(disparity, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));

            // Make an OpenCV matrix out of this image
            cv::Mat cvDisparity;
            CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvDisparity));

            // Scale result and write it to disk. Disparities are in Q10.5 format,
            // so to map it to float, it gets divided by 32. Then the resulting disparity range,
            // from 0 to maxDisparity gets mapped to 0-255 for proper output.
            cvDisparity.convertTo(cvDisparity, CV_8UC1, 255.0 / (32 * createParams.maxDisparity), 0);

            // Apply JET colormap to turn the disparities into color.
            // Reddish hues represent objects closer to the camera, blueish are farther away.
            cv::Mat cvDisparityColor;
            applyColorMap(cvDisparity, cvDisparityColor, cv::COLORMAP_JET);

            // Done handling output, don't forget to unlock it.
            CHECK_STATUS(vpiImageUnlock(disparity));

            // If we have a confidence map, adjust it for display and write it to disk too.
            if (confidenceMap)
            {
                // Lock the image data and export to cv::Mat
                VPIImageData data;
                CHECK_STATUS(vpiImageLockData(confidenceMap, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));
                cv::Mat cvConfidence;
                CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvConfidence));

                // Confidence map varies from 0 to 65535, we scale it to [0-255].
                cvConfidence.convertTo(cvConfidence, CV_8UC1, 255.0 / 65535, 0);
                imwrite("confidence_" + strBackend + ".png", cvConfidence);

                CHECK_STATUS(vpiImageUnlock(confidenceMap));

                // When pixel confidence is 0, we would like its color in the disparity image to be black.
                cv::Mat cvMask;
                threshold(cvConfidence, cvMask, 1, 255, cv::THRESH_BINARY);
                cvtColor(cvMask, cvMask, cv::COLOR_GRAY2BGR);
                bitwise_and(cvDisparityColor, cvMask, cvDisparityColor);
            }

            imwrite("disparity_" + strBackend + ".png", cvDisparityColor);
        }
        catch (std::exception &e)
        {
            std::cerr << e.what() << std::endl;
            retval = 1;
        }

        // ========
        // Clean up

        // Destroying stream first makes sure that all work submitted to
        // it is finished.
        vpiStreamDestroy(stream);

        // Only then we can destroy the other objects, as we're sure they
        // aren't being used anymore.

        vpiImageDestroy(inLeft);
        vpiImageDestroy(inRight);
        vpiImageDestroy(tmpLeft);
        vpiImageDestroy(tmpRight);
        vpiImageDestroy(stereoLeft);
        vpiImageDestroy(stereoRight);
        vpiImageDestroy(confidenceMap);
        vpiImageDestroy(disparity);
        vpiPayloadDestroy(stereo);

        return retval;
    }
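
The size adjustments that the C++ sample makes for the OFA backends are easy to get wrong, so here is the same integer arithmetic restated as a small Python sketch. The function name and its return layout are hypothetical; the formulas follow the C++ listing, including the minimum output width of max(64, maxDisparity / downscaleFactor) applied when PVA is part of the backend.

```python
def ofa_sizes(input_w, input_h, max_disparity=256, downscale=2, uses_pva=False):
    """Return (stereo_w, stereo_h, output_w, output_h) following the C++ sample."""
    stereo_w, stereo_h = input_w, input_h
    # Ceiling division: the output is the input shrunk by the downscale factor.
    output_w = (input_w + downscale - 1) // downscale
    output_h = (input_h + downscale - 1) // downscale

    if uses_pva:
        # Enforce the minimum output width, then keep the input aspect ratio.
        min_w = max(max_disparity // downscale, output_w)
        output_w = max(64, min_w)
        output_h = (input_h * output_w) // input_w
        # The stereo input is rescaled so that it downsamples exactly to the output.
        stereo_w = output_w * downscale
        stereo_h = output_h * downscale

    return stereo_w, stereo_h, output_w, output_h


print(ofa_sizes(1920, 1080, uses_pva=True))  # large input: no rescale needed
print(ofa_sizes(100, 50, uses_pva=True))     # small input: upscaled to meet the minimum
```

When the returned stereo size differs from the input size, the C++ sample takes the rescale path (format conversion on CUDA followed by a rescale on VIC).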