VPI - Vision Programming Interface

2.0 Release

Release Notes v2.0

VPI-2.0.14

VPI-2.0.14 is the first production release of the VPI-2.0 branch. It contains several bug fixes and optimizations.

Release Highlights

  • Python bindings
    • Added Python API reference documentation.
    • Improved Python's help command: when called on a VPI entity, it now displays the respective documentation.
  • Optimization of several algorithms:
  • Added performance benchmark tables to all algorithms.
    • Performance is measured on both Jetson AGX Orin and Jetson AGX Xavier.

Selected Bug Fixes

  • Fixed Stereo Disparity in C++ using PVA and PVA+NVENC+VIC backends.
  • Fixed a bad interaction between VPI's internal EGLDisplay and the global EGLDisplay that the user's application might be using. The internal EGLDisplay object is now independent from the global one.
  • Fixed intermittent output corruption in Stereo Disparity Estimator in PVA+NVENC+VIC and PVA+OFA+VIC backends.
  • Fixed intermittent errors in Harris Corner Detector on PVA when running on Jetson AGX Orin.
  • PyTorch/CUDA interoperability now properly configures PyTorch to use CUDA.
  • Fixed small output error on Erode and Dilate when using kernels with even-sized dimensions.
  • Fixed a rare segfault when switching from an algorithm running on a Tegra backend to one running on another backend, and vice versa.
  • Fixed demo_stereo; it now works with all available backends. Added some missing backends.
  • Fixed a segfault in Stereo Disparity Estimator on the PVA+OFA+VIC backend when the user passes NULL as the confidence output parameter.
  • Fixed the script in Clock Frequency and Power Settings; it now works properly on both Jetson AGX Orin and Jetson AGX Xavier.

Other

  • Internal optimization of VPIStream processing, reducing the number of memory mapping operations needed when executing algorithms on different backends in sequence.
  • Added Performance Comparison against OpenCV-4.5.4.
  • Removed GUI demo links from desktop. Please see Demo Applications for instructions on how to run the GUI demos.
  • VPI_BACKEND_NVENC is now available even if the underlying NVENC hardware doesn't support optical flow and stereo disparity processing, e.g. Jetson AGX Orin.

VPI-2.0.9

Initial release of the VPI-2.0 branch.

VPI-2.0.9 Developer Preview (DP) is the first release of the 2.x series. Being a preview, it's not intended for use in production systems, as performance might have some regressions compared to previous versions.

As the major version bump implies, there's no guarantee of backwards compatibility with VPI-1.2 and older versions. Some breaking changes are documented below, but you can refer to Porting from VPI-1.2 to VPI-2.0 for more details.

This version isn't compatible with JetPack 4.6.1 and older on Tegra.

Release Highlights

  • New algorithms:
    • Image Flip on CPU, CUDA and VIC backends, allowing flipping along the horizontal axis, the vertical axis, or both (equivalent to a 180° rotation).
  • Updated algorithms:
  • Platform updates:
    • Added support for Jetson AGX Orin.
    • Dropped support for Jetson Nano, TX2 and TX2 NX.
    • Supports Ubuntu 18.04 (x64 only) and Ubuntu 20.04 (Tegra and x64).
  • Python updates:
  • Image handling
    • Added support for image views in both C/C++ and Python APIs. A view is an image that refers to a part of an existing image. Non-overlapping views of the same image can then be processed independently and in parallel. Refer to the sample Image View for a usage example.
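
The note that flipping both axes with Image Flip amounts to a 180° rotation can be checked with a small pure-Python sketch. It does not call VPI: images are modeled as lists of rows, and the helper names (flip_horizontal, flip_vertical, flip_both, rotate_180) are hypothetical and chosen only for this illustration.

```python
def flip_horizontal(img):
    """Mirror left-right: reverse the pixels within each row."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror top-bottom: reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def flip_both(img):
    """Apply both flips; the order does not matter."""
    return flip_vertical(flip_horizontal(img))


def rotate_180(img):
    """Reference rotation: pixel (y, x) moves to (h-1-y, w-1-x)."""
    h, w = len(img), len(img[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[h - 1 - y][w - 1 - x] = img[y][x]
    return out


image = [[1, 2, 3],
         [4, 5, 6]]

# Both paths produce the same pixel layout.
print(flip_both(image) == rotate_180(image))
```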

C API Updates

  • Image wrapping
    • Unified vpiImageCreateWrapper and vpiImageSetWrapper, which handle all supported image types (EGLImage, host pitch-linear, CUDA pitch-linear, NvBuffer, etc.). They replace the following functions, which were removed:
      • vpiImageCreateWrapperCUDAPitchLinear
      • vpiImageCreateWrapperEGLImage
      • vpiImageCreateWrapperHost
      • vpiImageCreateWrapperNvBuffer
      • vpiImageSetWrappedCUDAPitchLinear
      • vpiImageSetWrappedEGLImage
      • vpiImageSetWrappedHost
      • vpiImageSetWrappedNvBuffer
    • VPIImageData was extended to support multiple image types.
    • The new VPIImageWrapperParams struct replaces the following structs, which were removed:
      • VPIWrapEGLImageParams
      • VPIWrapNvBufferParams
    • The new vpiInitImageWrapperParams replaces the following functions, which were removed:
      • vpiInitWrapNvBufferParams
      • vpiInitWrapEGLImageParams
  • Image and Pyramid locking
    • vpiImageLock and vpiPyramidLock now only lock the image/pyramid. They do not return their contents anymore. To access the contents, use vpiImageLockData and vpiPyramidLockData respectively.
    • It's now possible to lock an image or pyramid and return its contents in a CUDA-accessible buffer in pitch-linear layout.
  • Array wrapping
    • Unified vpiArrayCreateWrapper and vpiArraySetWrapper, which handle all supported external array types (host and CUDA). They replace the following functions, which were removed:
      • vpiArrayCreateWrapperHost
      • vpiArrayCreateWrapperCUDA
      • vpiArraySetWrapperHost
      • vpiArraySetWrapperCUDA
    • VPIArrayData was updated to support different external array types.
  • Array locking
  • Several argument check errors are now returned by the respective functions, instead of being returned asynchronously. Several algorithm submit functions were updated.
  • vpiSubmitPerspectiveWarp no longer requires a payload. As an optimization for the VIC backend, a VPIWarpGrid can optionally be passed to define how sparse the processing grid over the output is.
  • Default gradientSize and blockSize parameters in VPIHarrisCornerDetectorParams set by vpiInitHarrisCornerDetectorParams are now both 3x3, instead of 5x5.
  • All flags passed to algorithms, flag getters and object creation functions (vpiImageGetFlags, vpiImageCreate, etc.) are now uint64_t.
  • Removed functions that weren't implemented:
    • vpiEventCreateCUDAEventWrapper
    • vpiEventCreateEGLSyncWrapper
    • vpiEventExportEGLSync
  • The following functions were renamed for consistency:
  • New functions:
  • Removed the vpiImageInvalidate and vpiArrayInvalidate functions. When a wrapped image or array must be updated outside VPI, it must first be locked for write or read/write. VPI's internal representation is updated after vpiImageUnlock / vpiArrayUnlock is called.
  • VPIImageFormat and VPIPixelType are now represented as a uint64_t instead of an enum, for better standards conformance.
  • Added the VPI_BACKEND_OFA backend, available on Jetson AGX Orin only and used in Stereo Disparity Estimator.
  • Stereo Disparity Estimator on PVA now requires that downscaleFactor is 4.
  • Added VPI_BACKEND_MASK, representing all bits reserved for backends in the 64-bit object flags (images, contexts, arrays, streams, ...).
  • Added object flag VPI_RESTRICT_MEM_USAGE. When used for images, arrays and pyramids, it instructs VPI to be more conservative in the amount of memory it allocates. Some functionality might be affected, such as the possibility of locking and retrieving the object's contents, depending on their format.
  • Removed VPI_DISABLE_BL_HOST_LOCK. Use VPI_RESTRICT_MEM_USAGE instead.
  • Removed VPI_FLAG_ALL.
  • Added object flag VPI_REQUIRE_BACKENDS. When passed to a creation function, the function returns an error if the requested backend can't be instantiated.
  • Added VPI_BORDER_LIMITED border extension, to be used in vpiSubmitErode and vpiSubmitDilate instead of VPI_BORDER_INVALID.
  • Several algorithms are now enforcing some parameter range restrictions where it makes sense. Please consult the reference documentation for details on the new restrictions.
  • vpiSubmitStereoDisparityEstimator now outputs disparity with format VPI_IMAGE_FORMAT_S16.
  • vpiSubmitBilateralFilter now supports inputs with format VPI_IMAGE_FORMAT_F32.
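
The move to 64-bit flags and the addition of VPI_BACKEND_MASK come down to ordinary bitmask handling: the mask isolates the bits reserved for backends from the rest of the flag word. The sketch below illustrates the idea in Python for brevity. It does not use VPI, and every constant in it (bit positions, the mask width, the extra flag) is a made-up placeholder, not a value taken from VPI's headers.

```python
# Placeholder bit assignments for illustration only; NOT VPI's real values.
BACKEND_CPU = 1 << 0
BACKEND_CUDA = 1 << 1
BACKEND_PVA = 1 << 2
BACKEND_MASK = 0xFF        # assumption: low 8 bits reserved for backends
OTHER_FLAG = 1 << 40       # assumption: some non-backend flag above bit 31
UINT64_MAX = (1 << 64) - 1


def backend_bits(flags):
    """Keep only the bits reserved for backends."""
    return flags & BACKEND_MASK


def non_backend_bits(flags):
    """Keep everything except the backend bits, within 64 bits."""
    return flags & ~BACKEND_MASK & UINT64_MAX


flags = BACKEND_CPU | BACKEND_CUDA | OTHER_FLAG

print(hex(backend_bits(flags)))      # backend selection only
print(hex(non_backend_bits(flags)))  # remaining object flags
```

A flag such as the placeholder OTHER_FLAG does not fit in 32 bits, which is the kind of situation a 64-bit flag word accommodates.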

Python API Updates

  • Locking images, arrays and pyramids for host access is now done with the .lock_cpu() method. It also returns the numpy array with the memory contents.
  • .lock_cuda() and .cuda() were added to return a buffer that follows the CUDA array interface. They work similarly to .lock_cpu() and .cpu() respectively. Please refer to PyTorch/CUDA interoperability for an example of how to use them.
  • Added the .view(rect) member function to images; it returns an image view that refers to a rectangular sub-region of the given image. See Image View for usage examples.
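
What it means for a view to refer to a sub-region, rather than copy it, can be shown with a toy model. The sketch below is plain Python and does not use the vpi module. The ToyImage and ToyView classes, the fill method and the (x, y, width, height) rectangle layout are all assumptions made for this illustration, not VPI's actual classes or signatures.

```python
class ToyImage:
    """Toy single-channel image: row-major buffer, one byte per pixel."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.pixels = bytearray(width * height)

    def view(self, rect):
        """Return a view on rect = (x, y, w, h); no pixel data is copied."""
        x, y, w, h = rect
        if x < 0 or y < 0 or w <= 0 or h <= 0:
            raise ValueError("invalid rectangle")
        if x + w > self.width or y + h > self.height:
            raise ValueError("rectangle outside image bounds")
        return ToyView(self, rect)


class ToyView:
    """Refers to a rectangular region of a parent ToyImage."""

    def __init__(self, parent, rect):
        self.parent = parent
        self.x, self.y, self.width, self.height = rect

    def fill(self, value):
        """Write through to the parent's buffer, one row slice at a time."""
        stride = self.parent.width
        for row in range(self.y, self.y + self.height):
            start = row * stride + self.x
            self.parent.pixels[start:start + self.width] = bytes([value]) * self.width


img = ToyImage(4, 2)
left = img.view((0, 0, 2, 2))    # left half
right = img.view((2, 0, 2, 2))   # right half, does not overlap the left
left.fill(1)
right.fill(9)

print(list(img.pixels))
```

Because the two views cover disjoint pixels of the same buffer, work on one never touches the other, which is what makes independent processing of the parts possible.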

Other

  • Rewrote the API reference documentation for most functions and structures, correctly stating valid parameter ranges, expected error statuses, etc.

Selected Bug Fixes

  • Fixed segfault in vpiSubmitTemporalNoiseReduction when the previous frame isn't the output of the previous iteration. Now it just resets the internal state, starting a new noise reduction sequence.
  • vpiSubmitTemporalNoiseReduction now copies the input image into the output when the strength parameter is 0. Previously it returned a green frame.
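
The two behaviors described above (restart the sequence when the supplied previous frame is not the last output, and pass the input through when strength is 0) can be modeled with a toy recursive filter. This is only a sketch: VPI's actual noise reduction algorithm is not described in these notes, so a simple linear blend stands in for it, and the ToyTemporalDenoiser class and its submit method are hypothetical.

```python
class ToyTemporalDenoiser:
    """Toy stand-in: blends the current frame with the previous output."""

    def __init__(self):
        self._last_output = None

    def submit(self, prev_frame, cur_frame, strength):
        # If the caller's previous frame is not the frame produced by the
        # previous call, drop the history and start a new sequence.
        if prev_frame is None or prev_frame is not self._last_output:
            out = list(cur_frame)
        else:
            # Linear blend (an assumption, not VPI's algorithm).
            # With strength == 0 this reduces to a copy of the input.
            out = [(1.0 - strength) * c + strength * p
                   for c, p in zip(cur_frame, prev_frame)]
        self._last_output = out
        return out


tnr = ToyTemporalDenoiser()
out0 = tnr.submit(None, [10.0, 20.0], 0.5)          # first frame: copied
out1 = tnr.submit(out0, [20.0, 40.0], 0.5)          # blended with out0
out2 = tnr.submit(out1, [7.0, 9.0], 0.0)            # strength 0: input copied
out3 = tnr.submit([0.0, 0.0], [1.0, 2.0], 0.5)      # unrelated prev: reset
```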

Known Issues

  • KLT Feature Tracker on PVA had some problems with tracking robustness and was removed from this release. It'll be added in the future when these issues are solved. Meanwhile, use the CUDA or CPU backends.
  • Host images wrapped into VPIImages using vpiImageCreateWrapper might incur a performance hit when used with algorithms running on the CUDA backend. Users should avoid wrappers in this case and prefer VPIImages allocated with vpiImageCreate.
  • CUDA images wrapped into VPIImages using vpiImageCreateWrapper might incur a performance hit when used with algorithms running on PVA, VIC and/or NVENC. Users should avoid wrappers in this case and prefer VPIImages allocated with vpiImageCreate.
  • Harris Corner Detector result scores/positions might differ among backends.
  • PyTorch/CUDA interoperability might not work on Tegra due to some issues with PyTorch's CUDA support on this platform and how it behaves when the VPI module is loaded.
  • Stereo Disparity Estimator
    • Output differs significantly between the CPU backend and the new CUDA backend implementation.
    • On the CPU backend and the old CUDA backend (VPI-1.0 ABI), the maximum disparity limit is not checked. It's recommended to set the maximum disparity to at most 64. Using a higher value leads to undefined behavior: too much memory is allocated, which may cause the system to run out of memory.
    • The confidence map generated by OFA+PVA+VIC backend might have some negligible differences with respect to other backends.
  • Remap and Stereo demo applications
    • On Tegra platforms, the demos might fail to connect to the device's camera and segfault.
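
Since the CPU and old CUDA backends do not check the maximum disparity themselves, an application can guard the value before handing it to the estimator. A minimal sketch of such a guard follows. The function name checked_max_disparity and the constant name are hypothetical, and the sketch does not call VPI; only the limit of 64 comes from the recommendation above.

```python
RECOMMENDED_MAX_DISPARITY = 64  # limit recommended in the known issue above


def checked_max_disparity(requested):
    """Return `requested` if it is a usable value, else raise ValueError."""
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise ValueError("maximum disparity must be an integer")
    if requested <= 0:
        raise ValueError("maximum disparity must be positive")
    if requested > RECOMMENDED_MAX_DISPARITY:
        raise ValueError(
            f"maximum disparity {requested} exceeds the recommended "
            f"limit of {RECOMMENDED_MAX_DISPARITY}"
        )
    return requested


max_disparity = checked_max_disparity(64)  # accepted as-is
```

Rejecting the value outright, rather than silently clamping it, keeps the caller aware that the requested search range was not honored.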

Notices

Disclaimer

ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, "MATERIALS") ARE BEING PROVIDED "AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

Trademarks

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.

Copyright

© 2019-2022 NVIDIA Corporation. All rights reserved.