VPI - Vision Programming Interface

0.3.7 Release

Release Notes v0.3

This is the public release of VPI v0.3. As a developer preview, it is intended to let users experiment with the library, access PVA hardware where available, and do integration testing in existing systems.

Until VPI-1.0 is released, API and ABI backward compatibility cannot be fully guaranteed for new versions, although breakage is not expected.

As with any developer preview release, use of VPI-0.3 in critical systems isn't recommended.

Changes Since v0.2.0

New Features

Optimization

  • Histogram of Oriented Gradients (experimental)
    • CUDA backend
      • Jetson AGX Xavier: 17.55x faster
      • Jetson TX2: 59% faster
      • Jetson Nano: 21% faster
    • CPU backend
      • Jetson AGX Xavier: 10% faster
      • Jetson TX2: 33% faster
      • Jetson Nano: 29x faster
  • Bilateral Image Filter
    • CUDA backend
      • Jetson AGX Xavier: 55% faster
      • Jetson TX2: 58% faster
      • Jetson Nano: 52% faster
  • Gaussian Image Filter
    • CUDA backend
      • Jetson TX2: 41% faster

API Updates

  • Removed the NV_VPI_LOCAL_CHANGES macro from Version.h.

Non-Breaking Changes

  • vpiImageWrapEglImage is exported in the x86 variant of VPI but returns VPI_ERROR_NOT_IMPLEMENTED, as it is currently only available on Tegra platforms (see the sketch after this list for one way to handle this).
  • The default CUDA stream is no longer used for memory copies, zero-filling memory, or memory mapping. It is also avoided in some contexts when setting and getting a VPI array size.
  • Kernels launched by the user on a VPI CUDA stream that wraps a user-provided cudaStream_t are no longer synchronized when the VPI CUDA stream is destroyed. Synchronization only covers work up to the last VPI algorithm payload submitted to the device; all remaining kernels on the stream continue executing after the VPI CUDA stream is destroyed.
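
Given the first item above, portable code should be prepared for VPI_ERROR_NOT_IMPLEMENTED when wrapping EGLImages on x86. The following is a minimal sketch of that check; the header paths, the exact parameter list of vpiImageWrapEglImage (EGLImage handle, flags, output image), and the CopyEglImageIntoVpiImage fallback helper are assumptions for illustration, not part of VPI.

    /* Minimal sketch: tolerate vpiImageWrapEglImage returning
     * VPI_ERROR_NOT_IMPLEMENTED on x86, where the symbol is exported but the
     * feature is Tegra-only. Header paths and the exact signature are
     * assumed; consult the VPI headers for the real ones. */
    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <vpi/Image.h>
    #include <vpi/Status.h>

    /* Hypothetical application helper, not a VPI function. */
    VPIImage CopyEglImageIntoVpiImage(EGLImageKHR eglImage);

    VPIImage WrapOrFallback(EGLImageKHR eglImage)
    {
        VPIImage img = NULL;
        VPIStatus status = vpiImageWrapEglImage(eglImage, 0, &img);

        if (status == VPI_ERROR_NOT_IMPLEMENTED)
        {
            /* Not available outside Tegra: fall back to copying the EGLImage
             * contents into a VPI-allocated image. */
            img = CopyEglImageIntoVpiImage(eglImage);
        }
        else if (status != VPI_SUCCESS)
        {
            img = NULL;
        }
        return img;
    }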

Breaking Changes

none

Bug Fixes

  • Fixed an issue with CUDA-to-PVA shared mapping for NV12 images; it is now enabled.
  • The maximum score in the Harris Corners Detector sample is now rendered as white instead of black.
  • Memory mapping now happens serially with respect to the algorithm. Previously, memory might have been unmapped while the algorithm was still executing, leading to VPI_ERROR_BUFFER_LOCKED errors or invalid output.
  • Fixed bugs in vpiSubmitUserFunction on PVA streams where callbacks might not execute in the proper order or the internal stream state might become inconsistent, leading to failures.
  • Calling vpiStreamWaitFor on a VPI CUDA stream no longer blocks indefinitely.
  • Creating an image that is too big on Tegra devices now returns VPI_ERROR_OUT_OF_MEMORY instead of segfaulting (see the status-checking sketch after this list).
  • Calling the Gaussian Pyramid Generator when the input is a wrapped level of the output pyramid now works.
  • The Harris Keypoint Detector on CUDA no longer segfaults when minNMSdistance is less than 1; it returns an error instead.
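
Several of these fixes replace crashes with error codes (for example, VPI_ERROR_OUT_OF_MEMORY for oversized images and an error for minNMSdistance below 1), so checking every VPIStatus return pays off. The sketch below shows one such pattern; the vpiImageCreate call and the VPI_IMAGE_TYPE_U8 format name are assumptions about the 0.3 API, so check the headers for the exact names.

    /* Minimal sketch of defensive status checking around VPI calls.
     * vpiImageCreate, vpiImageDestroy and VPI_IMAGE_TYPE_U8 are assumed
     * names; the point is the pattern of checking every VPIStatus instead
     * of relying on a crash. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <vpi/Image.h>
    #include <vpi/Status.h>

    #define CHECK_STATUS(stmt)                                    \
        do                                                        \
        {                                                         \
            VPIStatus status__ = (stmt);                          \
            if (status__ != VPI_SUCCESS)                          \
            {                                                     \
                fprintf(stderr, "%s failed with status %d\n",     \
                        #stmt, (int)status__);                    \
                exit(EXIT_FAILURE);                               \
            }                                                     \
        } while (0)

    int main(void)
    {
        VPIImage img = NULL;

        /* With this release, if the requested size were too large for a
         * Tegra device, this call would report VPI_ERROR_OUT_OF_MEMORY
         * instead of segfaulting. */
        CHECK_STATUS(vpiImageCreate(1920, 1080, VPI_IMAGE_TYPE_U8, 0, &img));

        vpiImageDestroy(img);
        return 0;
    }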

Documentation Fixes

Known Issues

  • If there's a backend mismatch between a memory buffer and the streams that operate on it, the stream will issue more memory mapping operations than strictly needed. To mitigate the performance hit that might arise, make sure that the memories used on the stream already reside in the stream's backend. For instance, in a stream for the CUDA backend, memories created with the VPI_BACKEND_ONLY_CUDA flag will perform better because no memory mapping is needed; the memory is allocated on the GPU itself (see the sketch after this list).
  • Some stream operations might block the default CUDA stream, affecting CUDA processing outside VPI.
  • The PVA backend implementation of the KLT Bounding Box Tracker doesn't match the output of the CUDA and CPU backends.
  • The PVA backend implementation of vpiSubmitImageConvolver currently doesn't work with 3264x2448 inputs; it returns an error instead.
  • Some algorithms, notably the Image Convolver, might segfault on Jetson Nano if the input image is too big, such as 4064x2704 on the CPU backend.
  • The Harris Keypoint Detector on PVA may return spurious keypoints when the input image is larger than 1088x1088.
  • A small memory leak could occur if the same image wrapping a user-provided EGLImage or CUDA memory is used simultaneously as the input image in multiple PVA streams.
  • In some rare instances, a moderately complex processing pipeline might erroneously return VPI_ERROR_BUFFER_LOCKED when performing memory mapping.
  • CPU to CUDA image shared mapping of wrapped non-CUDA-managed CPU memory had to be disabled due to some rare segfaults. In this case, memory mapping is now done via memory copies.
  • vpiStreamWaitFor on a CUDA stream that is wrapping a user-provided cudaStream_t might block the calling thread until the event is signaled.
  • vpiEventRecord invalidates previously recorded stream state. Existing streams waiting for this event via vpiStreamWaitFor will either unblock or wait for the newly recorded stream state to be emptied. This behavior doesn't follow the CUDA SDK event semantics, but it should.
  • The Image Resampling sample might segfault in error situations, for instance when trying to run it using the PVA backend (not implemented yet).
  • The Stereo Disparity Estimator output on the CPU backend might differ slightly from the PVA and CUDA backends.
  • Harris Keypoint Detector result scores/positions might differ among backends.
  • Sample applications that use OpenCV won't compile on Ubuntu 16.04. For a workaround, consult the samples' build instructions.
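
As mentioned in the first known issue above, keeping a buffer in the backend of the stream that uses it avoids extra mapping work. The sketch below allocates such a CUDA-only image; the VPI_BACKEND_ONLY_CUDA flag comes from that note, while the vpiImageCreate signature and the VPI_IMAGE_TYPE_U8 format name are assumptions about the 0.3 API.

    /* Minimal sketch: allocate an image restricted to the CUDA backend so a
     * CUDA stream can operate on it without additional memory mapping.
     * vpiImageCreate and VPI_IMAGE_TYPE_U8 are assumed names;
     * VPI_BACKEND_ONLY_CUDA is the flag referred to in the known issue. */
    #include <vpi/Image.h>
    #include <vpi/Status.h>

    VPIStatus CreateCudaOnlyImage(int width, int height, VPIImage *img)
    {
        /* Restricting the image to the CUDA backend keeps it resident on
         * the GPU, so streams running on the CUDA backend don't need to
         * remap it. */
        return vpiImageCreate(width, height, VPI_IMAGE_TYPE_U8,
                              VPI_BACKEND_ONLY_CUDA, img);
    }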


Notices

Disclaimer

ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, "MATERIALS") ARE BEING PROVIDED "AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

Trademarks

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.

Copyright

© 2019-2020 NVIDIA Corporation. All rights reserved.