Comparison with VPI-0.1.0
VPI-0.2.0 brings performance improvements to most of its algorithms, across all backends, relative to VPI-0.1.0.
The charts below show the performance increase per backend and per device. The benchmarking procedure used is described here.
Algorithm Parameters
Here is a list of the parameters used while benchmarking the algorithms above. It helps put the measured values into context.
- Gaussian Pyramid Generator
  - CPU/CUDA: 8-bit 1920x1080 input, 4 levels, 0.5x scale (each level has 4x fewer pixels than the previous one).
  - PVA: 16-bit 3264x2448 input, 5 levels, 0.5x scale.
- Gaussian Image Filter
  - CPU/CUDA: 8-bit 1920x1080 input, 3x3 kernel support size.
  - PVA: 8-bit 3264x2448 input, 3x3 kernel support size.
- Image Convolver
  - CPU/CUDA: 8-bit 1920x1080 input, 5x5 kernel support size.
  - PVA: 8-bit 3264x2448 input, 5x5 kernel support size.
- Box Image Filter
  - CPU/CUDA: 8-bit 1920x1080 input, 3x3 kernel support size, clamp boundary condition.
  - PVA: 8-bit 3264x2448 input, 3x3 kernel support size, zero boundary condition.
- Image Resampler
  - CPU/CUDA: 8-bit 480x270 input upscaled to 1920x1080 with linear interpolation.
- Stereo Disparity Estimator
  - CPU/CUDA/PVA: 16-bit 480x270 input.
- Bilateral Image Filter
  - CPU/CUDA: 8-bit 3264x2448 input, 5x5 spatial kernel support size.
- Separable Image Convolver
  - CPU/CUDA: 16-bit 1920x1080 input, 5x5 kernel support size.
  - PVA: 16-bit 3264x2448 input, 5x5 kernel support size.
- Harris Keypoint Detector
  - CPU/CUDA: 16-bit 1920x1080 input, 3x3 gradient kernel size, 3x3 block size.
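
To make the Gaussian Pyramid Generator parameters listed above more concrete, the following sketch computes the dimensions of each pyramid level for both benchmark configurations. It is not VPI code: the helper `pyramid_level_sizes` is hypothetical, and it assumes that level 0 is the input image itself and that each dimension is rounded down after scaling.

```python
def pyramid_level_sizes(width, height, levels, scale=0.5):
    """Return (width, height) for each pyramid level.

    Assumptions (not taken from the VPI documentation):
    level 0 is the input image, and dimensions are rounded down.
    """
    sizes = []
    w, h = width, height
    for _ in range(levels):
        sizes.append((w, h))
        # Scale each dimension; with scale=0.5 the pixel count drops by 4x.
        w = max(1, int(w * scale))
        h = max(1, int(h * scale))
    return sizes


# CPU/CUDA benchmark configuration: 1920x1080 input, 4 levels.
cpu_cuda_sizes = pyramid_level_sizes(1920, 1080, 4)

# PVA benchmark configuration: 3264x2448 input, 5 levels.
pva_sizes = pyramid_level_sizes(3264, 2448, 5)

for name, sizes in (("CPU/CUDA", cpu_cuda_sizes), ("PVA", pva_sizes)):
    for level, (w, h) in enumerate(sizes):
        print(f"{name} level {level}: {w}x{h} ({w * h} pixels)")
```

Under these assumptions, halving both width and height leaves each level with one quarter of the pixels of the previous level, which is what the "4x smaller" remark in the list refers to.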