VPI - Vision Programming Interface

2.4 Release

Stereo Disparity

Overview

The Stereo Disparity application receives the left and right images of a rectified stereo pair and returns the disparity between them, which is inversely related to scene depth. The result is saved to disk as an image file. If the backend supports it, the corresponding confidence map is written as well.
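
To make the relation between disparity and depth concrete, here is a minimal sketch based on the standard pinhole stereo model. The sample itself only outputs disparity; the function name, the focal length and the baseline below are hypothetical values chosen for illustration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Pinhole stereo model: depth = focal_length * baseline / disparity."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 700 px focal length, 12 cm baseline (not from the sample).
for d in (8.0, 16.0, 32.0):
    print(f"disparity {d:5.1f} px -> depth {depth_from_disparity(d, 700.0, 0.12):.3f} m")
```

Doubling the disparity halves the estimated depth, which is why nearby objects show the largest disparities.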

Instructions

The command line parameters are:

<backend> <left image> <right image>

where

  • backend: one of cpu, cuda, pva, ofa, ofa-pva-vic or pva-nvenc-vic; it defines the backend that will perform the processing. The cuda, ofa-pva-vic and pva-nvenc-vic backends can output a confidence map in addition to the disparity.
  • left image: left input image of a rectified stereo pair; png, jpeg and possibly other formats are accepted.
  • right image: right input image of the same rectified stereo pair.
  • Note: for the pva-nvenc-vic backend, the left image must be 1920x1080.
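
The constraints above can be sketched as a small pre-flight check. This is an illustrative helper only; `check_request` and the two constants are hypothetical names, not part of VPI or of the sample.

```python
BACKENDS = ("cpu", "cuda", "pva", "ofa", "ofa-pva-vic", "pva-nvenc-vic")
CONFIDENCE_BACKENDS = {"cuda", "ofa-pva-vic", "pva-nvenc-vic"}


def check_request(backend, left_size):
    """Validate the backend name and input size.

    Returns True when the chosen backend can also produce a confidence map.
    left_size is a (width, height) tuple.
    """
    if backend not in BACKENDS:
        raise ValueError(f"Unknown backend: {backend}")
    if backend == "pva-nvenc-vic" and tuple(left_size) != (1920, 1080):
        raise ValueError("pva-nvenc-vic requires a 1920x1080 left image")
    return backend in CONFIDENCE_BACKENDS


print(check_request("cuda", (640, 480)))   # confidence map available
print(check_request("pva", (480, 270)))    # disparity only
```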

Here's one example:

  • C++
    ./vpi_sample_02_stereo_disparity cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png
  • Python
    python3 main.py cuda ../assets/chair_stereo_left.png ../assets/chair_stereo_right.png

This uses the CUDA backend and the provided sample images. You can try other stereo pairs, provided they respect the constraints imposed by the algorithm.

The Python version of this sample also allows setting various additional parameters, accepts additional input image extensions, and can run in verbose mode. The following command-line arguments can be passed to the Python sample:

  • Python
    python3 main.py <backend> <left image> <right image> --width W --height H --downscale D --window_size WIN
    --skip_confidence --conf_threshold T --conf_type absolute/relative -p1 P1 -p2 P2 --p2_alpha P2alpha
    --uniqueness U --skip_diagonal --num_passes N --min_disparity MIN --max_disparity MAX --output_mode 0/1/2
    -v/--verbose

where the additional optional arguments are:

  • width: sets the input width W when passing ".raw" input images
  • height: sets the input height H when passing ".raw" input images
  • downscale: sets the output downscale factor to D
  • window_size: sets the median filter window size to WIN
  • skip_confidence: skips calculating the confidence and applying it as a mask
  • conf_threshold: sets the confidence threshold to T
  • conf_type: sets the confidence type to either absolute or relative
  • p1: sets the P1 penalty to P1
  • p2: sets the P2 penalty to P2
  • p2_alpha: sets the adaptive P2 penalty alpha to P2alpha
  • uniqueness: sets the uniqueness ratio to U
  • skip_diagonal: avoids using diagonal paths (CUDA or OFA backends)
  • num_passes: sets the number of passes to N (OFA backends)
  • min_disparity: sets the minimum disparity to MIN (CUDA backend)
  • max_disparity: sets the maximum disparity to MAX; some backends override this value
  • output_mode: 0 for colored output, 1 for grayscale and 2 for raw binary
  • verbose: turns on verbose mode

For a detailed explanation of the additional arguments related to the stereo disparity algorithm, please read the corresponding documentation.
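
Some of these arguments are overridden depending on the backend. The sketch below mirrors the adjustments made by the Python sample listed under Source Code (forced downscale, forced maximum disparity, rounded-up output size). `effective_settings` is a hypothetical helper written for this illustration, not a VPI function.

```python
def effective_settings(backend, width, height, downscale=1, max_disparity=256):
    """Return (downscale, max_disparity, (out_width, out_height)) after overrides."""
    if backend == "pva-nvenc-vic":
        # This pipeline only supports output at 1/4 of the input size.
        downscale = 4
    if backend in ("cpu", "pva"):
        max_disparity = 64
    elif backend == "ofa-pva-vic" and max_disparity not in (128, 256):
        max_disparity = 256
    # Output size is the input size divided by the downscale factor, rounded up.
    out_width = (width + downscale - 1) // downscale
    out_height = (height + downscale - 1) // downscale
    return downscale, max_disparity, (out_width, out_height)


print(effective_settings("pva-nvenc-vic", 1920, 1080))
print(effective_settings("cpu", 641, 480, downscale=2, max_disparity=128))
```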

Results

[Figures: left input image, right input image, stereo disparity, confidence map]
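
The disparity and confidence images are produced by rescaling the raw outputs to 8 bits. As a rough pure-Python sketch of that per-pixel arithmetic: raw disparities are Q10.5 fixed-point values (divide by 32 to get pixels), and confidence values range from 0 to 65535. The helper names are hypothetical, and the rounding mode is an assumption; the actual VPI and OpenCV conversions may round differently.

```python
def disparity_to_u8(raw_q10_5, max_disparity):
    """Map a raw Q10.5 disparity to 0..255 using scale = 255 / (32 * max_disparity)."""
    value = raw_q10_5 * 255.0 / (32 * max_disparity)
    return max(0, min(255, int(round(value))))


def keep_pixel(confidence_u16):
    """Mimic the threshold mask: keep pixels whose 8-bit confidence exceeds 1."""
    confidence_u8 = int(round(confidence_u16 * 255.0 / 65535))
    return confidence_u8 > 1


# A disparity of 16 pixels is stored as 16 * 32 = 512 in Q10.5.
print(disparity_to_u8(16 * 32, 64))
print(keep_pixel(0), keep_pixel(65535))
```

Pixels rejected by `keep_pixel` are the ones that show up black in the colored disparity output.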

Source Code

For convenience, here's the code that is also installed in the samples directory.

  • Python
    import cv2
    import sys
    import vpi
    import numpy as np
    from PIL import Image
    from argparse import ArgumentParser


    def read_raw_file(fpath, resize_to=None, verbose=False):
        try:
            if verbose:
                print(f'I Reading: {fpath}', end=' ', flush=True)
            # Raw files are read as a flat array of 16-bit unsigned values
            with open(fpath, 'rb') as f:
                np_arr = np.fromfile(f, dtype=np.uint16, count=-1)
            if verbose:
                print(f'done!\nI Raw array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
            if resize_to is not None:
                np_arr = np_arr.reshape(resize_to, order='C')
                if verbose:
                    print(f'I Reshaped array: shape: {np_arr.shape} dtype: {np_arr.dtype}')
            pil_img = Image.fromarray(np_arr, mode="I;16L")
            return pil_img
        except Exception as exc:
            raise ValueError(f'E Cannot process raw input: {fpath}') from exc


    def process_arguments():
        parser = ArgumentParser()

        parser.add_argument('backend', choices=['cpu','cuda','pva','ofa','ofa-pva-vic','pva-nvenc-vic'],
                            help='Backend to be used for processing')
        parser.add_argument('left', help='Rectified left input image from a stereo pair')
        parser.add_argument('right', help='Rectified right input image from a stereo pair')
        parser.add_argument('--width', default=-1, type=int, help='Input width for raw input files')
        parser.add_argument('--height', default=-1, type=int, help='Input height for raw input files')
        parser.add_argument('--downscale', default=1, type=int, help='Output downscale factor')
        parser.add_argument('--window_size', default=5, type=int, help='Median filter window size')
        parser.add_argument('--skip_confidence', default=False, action='store_true', help='Do not calculate confidence')
        parser.add_argument('--conf_threshold', default=32767, type=int, help='Confidence threshold')
        parser.add_argument('--conf_type', default='absolute', choices=['absolute', 'relative'],
                            help='Computation type to produce the confidence output')
        parser.add_argument('-p1', default=3, type=int, help='Penalty P1 on small disparities')
        parser.add_argument('-p2', default=48, type=int, help='Penalty P2 on large disparities')
        parser.add_argument('--p2_alpha', default=0, type=int, help='Alpha for adaptive P2 Penalty')
        parser.add_argument('--uniqueness', default=-1, type=float, help='Uniqueness ratio')
        parser.add_argument('--skip_diagonal', default=False, action='store_true', help='Do not use diagonal paths')
        parser.add_argument('--num_passes', default=3, type=int, help='Number of passes')
        parser.add_argument('--min_disparity', default=0, type=int, help='Minimum disparity')
        parser.add_argument('--max_disparity', default=256, type=int, help='Maximum disparity')
        parser.add_argument('--output_mode', default=0, type=int, help='0: color; 1: grayscale; 2: raw binary')
        parser.add_argument('-v', '--verbose', default=False, action='store_true', help='Verbose mode')

        return parser.parse_args()


    def main():
        args = process_arguments()

        scale = 1 # pixel value scaling factor when loading input

        if args.backend == 'cpu':
            backend = vpi.Backend.CPU
        elif args.backend == 'cuda':
            backend = vpi.Backend.CUDA
        elif args.backend == 'pva':
            backend = vpi.Backend.PVA
        elif args.backend == 'ofa':
            backend = vpi.Backend.OFA
        elif args.backend == 'ofa-pva-vic':
            backend = vpi.Backend.OFA|vpi.Backend.PVA|vpi.Backend.VIC
        elif args.backend == 'pva-nvenc-vic':
            backend = vpi.Backend.PVA|vpi.Backend.NVENC|vpi.Backend.VIC
            # For PVA+NVENC+VIC mode, 16bpp input must be MSB-aligned, which
            # is equivalent to saying that it is Q8.8 (fixed-point, 8 decimals).
            scale = 256
        else:
            raise ValueError(f'E Invalid backend: {args.backend}')

        conftype = None
        if args.conf_type == 'absolute':
            conftype = vpi.ConfidenceType.ABSOLUTE
        elif args.conf_type == 'relative':
            conftype = vpi.ConfidenceType.RELATIVE
        else:
            raise ValueError(f'E Invalid confidence type: {args.conf_type}')

        minDisparity = args.min_disparity
        maxDisparity = args.max_disparity
        includeDiagonals = not args.skip_diagonal
        numPasses = args.num_passes
        calcConf = not args.skip_confidence
        downscale = args.downscale
        windowSize = args.window_size
        quality = 6

        if args.verbose:
            print(f'I Backend: {backend}\nI Left image: {args.left}\nI Right image: {args.right}\n'
                  f'I Disparities (min, max): {(minDisparity, maxDisparity)}\n'
                  f'I Input scale factor: {scale}\nI Output downscale factor: {downscale}\n'
                  f'I Window size: {windowSize}\nI Quality: {quality}\n'
                  f'I Calculate confidence: {calcConf}\nI Confidence threshold: {args.conf_threshold}\n'
                  f'I Confidence type: {conftype}\nI Uniqueness ratio: {args.uniqueness}\n'
                  f'I Penalty P1: {args.p1}\nI Penalty P2: {args.p2}\nI Adaptive P2 alpha: {args.p2_alpha}\n'
                  f'I Include diagonals: {includeDiagonals}\nI Number of passes: {numPasses}\n'
                  f'I Output mode: {args.output_mode}\nI Verbose: {args.verbose}\n',
                  end='', flush=True)

        if 'raw' in args.left:
            pil_left = read_raw_file(args.left, resize_to=[args.height, args.width], verbose=args.verbose)
        else:
            try:
                pil_left = Image.open(args.left)
            except Exception as exc:
                raise ValueError(f'E Cannot open left input image: {args.left}') from exc

        if 'raw' in args.right:
            pil_right = read_raw_file(args.right, resize_to=[args.height, args.width], verbose=args.verbose)
        else:
            try:
                pil_right = Image.open(args.right)
            except Exception as exc:
                raise ValueError(f'E Cannot open right input image: {args.right}') from exc

        # Streams for left and right independent pre-processing
        streamLeft = vpi.Stream()
        streamRight = vpi.Stream()

        # Load input into a vpi.Image and convert it to grayscale, 16bpp
        with vpi.Backend.CUDA:
            with streamLeft:
                left = vpi.asimage(np.asarray(pil_left)).convert(vpi.Format.Y16_ER, scale=scale)
            with streamRight:
                right = vpi.asimage(np.asarray(pil_right)).convert(vpi.Format.Y16_ER, scale=scale)

        # Preprocess input
        # Block linear format is needed for pva-nvenc-vic pipeline and ofa backends
        # Currently we can only convert to block-linear using VIC backend.
        # The input also must be 1080p for pva-nvenc-vic backend.
        if args.backend in {'pva-nvenc-vic', 'ofa-pva-vic', 'ofa'}:
            if args.verbose:
                print(f'W {args.backend} forces to convert input images to block linear', flush=True)
            with vpi.Backend.VIC:
                with streamLeft:
                    left = left.convert(vpi.Format.Y16_ER_BL)
                with streamRight:
                    right = right.convert(vpi.Format.Y16_ER_BL)
            if args.backend == 'pva-nvenc-vic':
                if left.size[0] != 1920 or left.size[1] != 1080:
                    raise ValueError(f'E {args.backend} requires input to be 1920x1080')

        if args.verbose:
            print(f'I Input left image: {left.size} {left.format}\n'
                  f'I Input right image: {right.size} {right.format}', flush=True)

        confidenceU16 = None

        if args.backend == 'pva-nvenc-vic':
            if args.verbose:
                print(f'W {args.backend} forces to calculate confidence', flush=True)
            calcConf = True

        if calcConf:
            if args.backend not in {'cuda', 'ofa-pva-vic', 'pva-nvenc-vic'}:
                # Only CUDA, OFA-PVA-VIC and PVA-NVENC-VIC have confidence map
                calcConf = False
                if args.verbose:
                    print(f'W {args.backend} does not allow to calculate confidence', flush=True)

        if calcConf:
            if args.backend == 'pva-nvenc-vic':
                # PVA-NVENC-VIC only supports 1/4 of the input size
                downscale = 4
                if args.verbose:
                    print(f'W {args.backend} forces downscale to {downscale}', flush=True)

        outWidth = (left.size[0] + downscale - 1) // downscale
        outHeight = (left.size[1] + downscale - 1) // downscale

        if calcConf:
            confidenceU16 = vpi.Image((outWidth, outHeight), vpi.Format.U16)

        # Use stream left to consolidate actual stereo processing
        streamStereo = streamLeft

        if args.backend in {'pva', 'cpu'}:
            maxDisparity = 64
            if args.verbose:
                print(f'W {args.backend} forces maxDisparity to {maxDisparity}', flush=True)
        elif args.backend == 'ofa-pva-vic' and maxDisparity not in {128, 256}:
            maxDisparity = 256
            if args.verbose:
                print(f'W {args.backend} forces maxDisparity to {maxDisparity}', flush=True)

        if args.verbose:
            if 'ofa' not in args.backend:
                print('W Ignoring P2 alpha and number of passes since not an OFA backend', flush=True)
            if args.backend != 'cuda':
                print('W Ignoring uniqueness since not a CUDA backend', flush=True)
            print('I Estimating stereo disparity ... ', end='', flush=True)

        # Estimate stereo disparity.
        with streamStereo, backend:
            disparityS16 = vpi.stereodisp(left, right, downscale=downscale, out_confmap=confidenceU16,
                                          window=windowSize, maxdisp=maxDisparity, confthreshold=args.conf_threshold,
                                          quality=quality, conftype=conftype, mindisp=minDisparity,
                                          p1=args.p1, p2=args.p2, p2alpha=args.p2_alpha, uniqueness=args.uniqueness,
                                          includediagonals=includeDiagonals, numpasses=numPasses)

        if args.verbose:
            print('done!\nI Post-processing ... ', end='', flush=True)

        # Postprocess results and save them to disk
        with streamStereo, vpi.Backend.CUDA:
            # Some backends output disparities in block-linear format, we must convert them to
            # pitch-linear for consistency with other backends.
            if disparityS16.format == vpi.Format.S16_BL:
                disparityS16 = disparityS16.convert(vpi.Format.S16, backend=vpi.Backend.VIC)

            # Scale disparity and confidence map so that values lie between 0 and 255.

            # Disparities are in Q10.5 format, so to map it to float, it gets
            # divided by 32. Then the resulting disparity range, from 0 to
            # stereo.maxDisparity gets mapped to 0-255 for proper output.
            # Copy disparity values back to the CPU.
            disparityU8 = disparityS16.convert(vpi.Format.U8, scale=255.0/(32*maxDisparity)).cpu()

            # Apply JET colormap to turn the disparities into color, reddish hues
            # represent objects closer to the camera, blueish are farther away.
            disparityColor = cv2.applyColorMap(disparityU8, cv2.COLORMAP_JET)

            # Converts to RGB for output with PIL.
            disparityColor = cv2.cvtColor(disparityColor, cv2.COLOR_BGR2RGB)

            if calcConf:
                confidenceU8 = confidenceU16.convert(vpi.Format.U8, scale=255.0/65535).cpu()

                # When pixel confidence is 0, its color in the disparity is black.
                mask = cv2.threshold(confidenceU8, 1, 255, cv2.THRESH_BINARY)[1]
                mask = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
                disparityColor = cv2.bitwise_and(disparityColor, mask)

        fext = '.raw' if args.output_mode == 2 else '.png'

        disparity_fname = f'disparity_python{sys.version_info[0]}_{args.backend}' + fext
        confidence_fname = f'confidence_python{sys.version_info[0]}_{args.backend}' + fext

        if args.verbose:
            print(f'done!\nI Disparity output: {disparity_fname}', flush=True)
            if calcConf:
                print(f'I Confidence output: {confidence_fname}', flush=True)

        # Save results to disk.
        try:
            if args.output_mode == 0:
                Image.fromarray(disparityColor).save(disparity_fname)
                if args.verbose:
                    print(f'I Output disparity image: {disparityColor.shape} '
                          f'{disparityColor.dtype}', flush=True)
            elif args.output_mode == 1:
                Image.fromarray(disparityU8).save(disparity_fname)
                if args.verbose:
                    print(f'I Output disparity image: {disparityU8.shape} '
                          f'{disparityU8.dtype}', flush=True)
            elif args.output_mode == 2:
                disparityS16.cpu().tofile(disparity_fname)
                if args.verbose:
                    print(f'I Output disparity image: {disparityS16.size} '
                          f'{disparityS16.format}', flush=True)

            if calcConf:
                if args.output_mode == 0 or args.output_mode == 1:
                    Image.fromarray(confidenceU8).save(confidence_fname)
                    if args.verbose:
                        print(f'I Output confidence image: {confidenceU8.shape} '
                              f'{confidenceU8.dtype}', flush=True)
                else:
                    confidenceU16.cpu().tofile(confidence_fname)
                    if args.verbose:
                        print(f'I Output confidence image: {confidenceU16.size} '
                              f'{confidenceU16.format}', flush=True)

        except Exception as exc:
            raise ValueError(f'E Cannot write outputs: {disparity_fname}, {confidence_fname}\n'
                             f'E Using output mode: {args.output_mode}') from exc


    if __name__ == '__main__':
        main()
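
The Python sample uses a scale factor of 256 for the pva-nvenc-vic backend because its 16bpp input must be MSB-aligned. As a small illustration of what that means for a single pixel, assuming the source pixel is an 8-bit value (the helper names are hypothetical):

```python
def msb_align_8_to_16(value_u8):
    """Place an 8-bit value in the upper byte of a 16-bit word (multiply by 256)."""
    return value_u8 * 256


def q8_8_to_float(value_u16):
    """Interpret a 16-bit word as Q8.8 fixed point: 8 integer bits, 8 fractional bits."""
    return value_u16 / 256.0


aligned = msb_align_8_to_16(200)
print(hex(aligned), q8_8_to_float(aligned))
```

Multiplying by 256 is the same as shifting left by 8 bits, so the original 8-bit intensity becomes the integer part of the Q8.8 value.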
  • C++
    #include <opencv2/core/version.hpp>
    #if CV_MAJOR_VERSION >= 3
    #    include <opencv2/imgcodecs.hpp>
    #else
    #    include <opencv2/contrib/contrib.hpp> // for colormap
    #    include <opencv2/highgui/highgui.hpp>
    #endif

    #include <opencv2/imgproc/imgproc.hpp>
    #include <vpi/OpenCVInterop.hpp>

    #include <vpi/Image.h>
    #include <vpi/Status.h>
    #include <vpi/Stream.h>
    #include <vpi/algo/ConvertImageFormat.h>
    #include <vpi/algo/Rescale.h>
    #include <vpi/algo/StereoDisparity.h>

    #include <algorithm> // for std::max
    #include <cstring>   // for memset
    #include <iostream>
    #include <sstream>
    #include <stdexcept> // for std::runtime_error
    #include <string>

    #define CHECK_STATUS(STMT)                                    \
        do                                                        \
        {                                                         \
            VPIStatus status = (STMT);                            \
            if (status != VPI_SUCCESS)                            \
            {                                                     \
                char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
                vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
                std::ostringstream ss;                            \
                ss << vpiStatusGetName(status) << ": " << buffer; \
                throw std::runtime_error(ss.str());               \
            }                                                     \
        } while (0);

    int main(int argc, char *argv[])
    {
        // OpenCV image that will be wrapped by a VPIImage.
        // Define it here so that it's destroyed *after* wrapper is destroyed
        cv::Mat cvImageLeft, cvImageRight;

        // VPI objects that will be used
        VPIImage inLeft        = NULL;
        VPIImage inRight       = NULL;
        VPIImage tmpLeft       = NULL;
        VPIImage tmpRight      = NULL;
        VPIImage stereoLeft    = NULL;
        VPIImage stereoRight   = NULL;
        VPIImage disparity     = NULL;
        VPIImage confidenceMap = NULL;
        VPIStream stream       = NULL;
        VPIPayload stereo      = NULL;

        int retval = 0;

        try
        {
            // =============================
            // Parse command line parameters

            if (argc != 4)
            {
                throw std::runtime_error(std::string("Usage: ") + argv[0] +
                                         " <cpu|pva|cuda|pva-nvenc-vic|ofa|ofa-pva-vic> <left image> <right image>\nNote: "
                                         "For pva-nvenc-vic backend the left_image's size must be 1920x1080");
            }

            std::string strBackend       = argv[1];
            std::string strLeftFileName  = argv[2];
            std::string strRightFileName = argv[3];

            uint64_t backends;

            if (strBackend == "cpu")
            {
                backends = VPI_BACKEND_CPU;
            }
            else if (strBackend == "cuda")
            {
                backends = VPI_BACKEND_CUDA;
            }
            else if (strBackend == "pva")
            {
                backends = VPI_BACKEND_PVA;
            }
            else if (strBackend == "pva-nvenc-vic")
            {
                backends = VPI_BACKEND_PVA | VPI_BACKEND_NVENC | VPI_BACKEND_VIC;
            }
            else if (strBackend == "ofa")
            {
                backends = VPI_BACKEND_OFA;
            }
            else if (strBackend == "ofa-pva-vic")
            {
                backends = VPI_BACKEND_OFA | VPI_BACKEND_PVA | VPI_BACKEND_VIC;
            }
            else
            {
                throw std::runtime_error(
                    "Backend '" + strBackend +
                    "' not recognized, it must be either cpu, cuda, pva, ofa, ofa-pva-vic or pva-nvenc-vic.");
            }

            // =====================
            // Load the input images
            cvImageLeft = cv::imread(strLeftFileName);
            if (cvImageLeft.empty())
            {
                throw std::runtime_error("Can't open '" + strLeftFileName + "'");
            }

            cvImageRight = cv::imread(strRightFileName);
            if (cvImageRight.empty())
            {
                throw std::runtime_error("Can't open '" + strRightFileName + "'");
            }

            // =================================
            // Allocate all VPI resources needed

            int32_t inputWidth  = cvImageLeft.cols;
            int32_t inputHeight = cvImageLeft.rows;

            // Create the stream that will be used for processing.
            CHECK_STATUS(vpiStreamCreate(0, &stream));

            // We now wrap the loaded images into a VPIImage object to be used by VPI.
            // VPI won't make a copy of it, so the original image must be in scope at all times.
            CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageLeft, 0, &inLeft));
            CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImageRight, 0, &inRight));

            // Format conversion parameters needed for input pre-processing
            VPIConvertImageFormatParams convParams;
            CHECK_STATUS(vpiInitConvertImageFormatParams(&convParams));

            // Set algorithm parameters to be used. Only values that differ from the defaults will be overwritten.
            VPIStereoDisparityEstimatorCreationParams stereoParams;
            CHECK_STATUS(vpiInitStereoDisparityEstimatorCreationParams(&stereoParams));

            // Default format and size for inputs and outputs
            VPIImageFormat stereoFormat    = VPI_IMAGE_FORMAT_Y16_ER;
            VPIImageFormat disparityFormat = VPI_IMAGE_FORMAT_S16;

            int stereoWidth  = inputWidth;
            int stereoHeight = inputHeight;
            int outputWidth  = inputWidth;
            int outputHeight = inputHeight;

            // Override some backend-dependent parameters
            if (strBackend == "pva-nvenc-vic")
            {
                // Input and output width and height has to be 1920x1080 in block-linear format for pva-nvenc-vic pipeline
                stereoFormat = VPI_IMAGE_FORMAT_Y16_ER_BL;
                stereoWidth  = 1920;
                stereoHeight = 1080;

                // For PVA+NVENC+VIC mode, 16bpp input must be MSB-aligned, which
                // is equivalent to saying that it is Q8.8 (fixed-point, 8 decimals).
                convParams.scale = 256;

                // Maximum disparity is fixed to 256.
                stereoParams.maxDisparity = 256;

                // pva-nvenc-vic pipeline only supports downscaleFactor = 4
                stereoParams.downscaleFactor = 4;
                outputWidth                  = stereoWidth / stereoParams.downscaleFactor;
                outputHeight                 = stereoHeight / stereoParams.downscaleFactor;
            }
            else if (strBackend.find("ofa") != std::string::npos)
            {
                // Implementations using OFA require BL input
                stereoFormat = VPI_IMAGE_FORMAT_Y16_ER_BL;

                if (strBackend == "ofa")
                {
                    disparityFormat = VPI_IMAGE_FORMAT_S16_BL;
                }

                // Output width including downscaleFactor must be at least max(64, maxDisparity/downscaleFactor) when OFA+PVA+VIC are used
                if (strBackend.find("pva") != std::string::npos)
                {
                    int downscaledWidth = (inputWidth + stereoParams.downscaleFactor - 1) / stereoParams.downscaleFactor;
                    int minWidth = std::max(stereoParams.maxDisparity / stereoParams.downscaleFactor, downscaledWidth);
                    outputWidth  = std::max(64, minWidth);
                    outputHeight = (inputHeight * stereoWidth) / inputWidth;
                    stereoWidth  = outputWidth * stereoParams.downscaleFactor;
                    stereoHeight = outputHeight * stereoParams.downscaleFactor;
                }

                // Maximum disparity can be either 128 or 256
                stereoParams.maxDisparity = 128;
            }
            else if (strBackend == "pva")
            {
                // PVA requires that input and output resolution is 480x270
                stereoWidth = outputWidth = 480;
                stereoHeight = outputHeight = 270;

                // maxDisparity must be 64
                stereoParams.maxDisparity = 64;
            }

            // Create the payload for Stereo Disparity algorithm.
            // Payload is created before the image objects so that non-supported backends can be trapped with an error.
            CHECK_STATUS(vpiCreateStereoDisparityEstimator(backends, stereoWidth, stereoHeight, stereoFormat, &stereoParams,
                                                           &stereo));

            // Create the image where the disparity map will be stored.
            CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, disparityFormat, 0, &disparity));

            // Create the input stereo images
            CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoLeft));
            CHECK_STATUS(vpiImageCreate(stereoWidth, stereoHeight, stereoFormat, 0, &stereoRight));

            // Create some temporary images, and the confidence image if the backend can support it
            if (strBackend == "pva-nvenc-vic")
            {
                // Need a temporary image to convert BGR8 input from OpenCV into pitch-linear 16bpp grayscale.
                // We can't convert it directly to block-linear since CUDA backend doesn't support it, and
                // VIC backend doesn't support BGR8 inputs.
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, VPI_IMAGE_FORMAT_Y16_ER, 0, &tmpLeft));
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, VPI_IMAGE_FORMAT_Y16_ER, 0, &tmpRight));

                // confidence map is needed for pva-nvenc-vic pipeline
                CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, VPI_IMAGE_FORMAT_U16, 0, &confidenceMap));
            }
            else if (strBackend.find("ofa") != std::string::npos)
            {
                // OFA also needs a temporary buffer for format conversion
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, VPI_IMAGE_FORMAT_Y16_ER, 0, &tmpLeft));
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, VPI_IMAGE_FORMAT_Y16_ER, 0, &tmpRight));

                if (strBackend.find("pva") != std::string::npos)
                {
                    // confidence map is supported by OFA+PVA
                    CHECK_STATUS(vpiImageCreate(outputWidth, outputHeight, VPI_IMAGE_FORMAT_U16, 0, &confidenceMap));
                }
            }
            else if (strBackend == "pva")
            {
                // PVA also needs a temporary buffer for format conversion and rescaling
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpLeft));
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, stereoFormat, 0, &tmpRight));
            }
            else if (strBackend == "cuda")
            {
                CHECK_STATUS(vpiImageCreate(inputWidth, inputHeight, VPI_IMAGE_FORMAT_U16, 0, &confidenceMap));
            }

            // ================
            // Processing stage

            // -----------------
            // Pre-process input
            if (strBackend == "pva-nvenc-vic" || strBackend == "pva" || strBackend == "ofa" || strBackend == "ofa-pva-vic")
            {
                // Convert opencv input to temporary grayscale format using CUDA
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, tmpLeft, &convParams));
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, tmpRight, &convParams));

                // Do both scale and final image format conversion on VIC.
                CHECK_STATUS(
                    vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpLeft, stereoLeft, VPI_INTERP_LINEAR, VPI_BORDER_CLAMP, 0));
                CHECK_STATUS(vpiSubmitRescale(stream, VPI_BACKEND_VIC, tmpRight, stereoRight, VPI_INTERP_LINEAR,
                                              VPI_BORDER_CLAMP, 0));
            }
            else
            {
                // Convert opencv input to grayscale format using CUDA
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inLeft, stereoLeft, &convParams));
                CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, inRight, stereoRight, &convParams));
            }

            // ------------------------------
            // Do stereo disparity estimation

            // Submit it with the input and output images
            CHECK_STATUS(vpiSubmitStereoDisparityEstimator(stream, backends, stereo, stereoLeft, stereoRight, disparity,
                                                           confidenceMap, NULL));

            // Wait until the algorithm finishes processing
            CHECK_STATUS(vpiStreamSync(stream));

            // ========================================
            // Output post-processing and saving to disk
            // Lock output to retrieve its data on cpu memory
            VPIImageData data;
            CHECK_STATUS(vpiImageLockData(disparity, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));

            // Make an OpenCV matrix out of this image
            cv::Mat cvDisparity;
            CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvDisparity));

            // Scale result and write it to disk. Disparities are in Q10.5 format,
            // so to map it to float, it gets divided by 32. Then the resulting disparity range,
            // from 0 to stereo.maxDisparity gets mapped to 0-255 for proper output.
            cvDisparity.convertTo(cvDisparity, CV_8UC1, 255.0 / (32 * stereoParams.maxDisparity), 0);

            // Apply JET colormap to turn the disparities into color, reddish hues
            // represent objects closer to the camera, blueish are farther away.
            cv::Mat cvDisparityColor;
            applyColorMap(cvDisparity, cvDisparityColor, cv::COLORMAP_JET);

            // Done handling output, don't forget to unlock it.
            CHECK_STATUS(vpiImageUnlock(disparity));

            // If we have a confidence map,
            if (confidenceMap)
            {
                // Write it to disk too.
                VPIImageData data;
                CHECK_STATUS(vpiImageLockData(confidenceMap, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &data));

                cv::Mat cvConfidence;
                CHECK_STATUS(vpiImageDataExportOpenCVMat(data, &cvConfidence));

                // Confidence map varies from 0 to 65535, we scale it to
                // [0-255].
                cvConfidence.convertTo(cvConfidence, CV_8UC1, 255.0 / 65535, 0);
                imwrite("confidence_" + strBackend + ".png", cvConfidence);

                CHECK_STATUS(vpiImageUnlock(confidenceMap));

                // When pixel confidence is 0, its color in the disparity
                // output is black.
                cv::Mat cvMask;
                threshold(cvConfidence, cvMask, 1, 255, cv::THRESH_BINARY);
                cvtColor(cvMask, cvMask, cv::COLOR_GRAY2BGR);
                bitwise_and(cvDisparityColor, cvMask, cvDisparityColor);
            }

            imwrite("disparity_" + strBackend + ".png", cvDisparityColor);
        }
        catch (std::exception &e)
        {
            std::cerr << e.what() << std::endl;
            retval = 1;
        }

        // ========
        // Clean up

        // Destroying stream first makes sure that all work submitted to
        // it is finished.
        vpiStreamDestroy(stream);

        // Only then we can destroy the other objects, as we're sure they
        // aren't being used anymore.

        vpiImageDestroy(inLeft);
        vpiImageDestroy(inRight);
        vpiImageDestroy(tmpLeft);
        vpiImageDestroy(tmpRight);
        vpiImageDestroy(stereoLeft);
        vpiImageDestroy(stereoRight);
        vpiImageDestroy(confidenceMap);
        vpiImageDestroy(disparity);
        vpiPayloadDestroy(stereo);

        return retval;
    }