VPI - Vision Programming Interface

0.3.7 Release

Separable Image Convolver

Overview

The Separable Image Convolver algorithm performs a 2D convolution, but takes advantage of the fact that the 2D kernel is separable: instead of a full 2D kernel, the user passes one horizontal and one vertical 1D kernel. This usually leads to better performance, especially for kernels larger than 5x5. For smaller kernels, it's preferable to use the Image Convolver algorithm with a 2D kernel directly.
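
As a rough, back-of-the-envelope illustration of where the saving comes from (an operation count only, not a measured figure), compare the multiply-accumulate operations needed per output pixel by a direct 2D convolution and by two 1D passes, for a kernel of width \(k_w\) and height \(k_h\):

\begin{eqnarray*} cost_{2D} &=& k_w \times k_h \\ cost_{separable} &=& k_w + k_h \end{eqnarray*}

For a 7x7 kernel this is 49 versus 14 operations per pixel, while for a 3x3 kernel it is 9 versus 6. The separable form also needs an intermediate image between the two passes, which is presumably why the gain shrinks or disappears for small kernels.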

[Figure: Input | Sobel kernel | Output]

The Sobel kernel used in the figure, as separable column and row kernels:

\begin{eqnarray*} k_{col} &=& \frac{1}{64} \begin{bmatrix} 1 \\ 6 \\ 15 \\ 20 \\ 15 \\ 6 \\ 1 \end{bmatrix} \\ k_{row} &=& \begin{bmatrix} -1 & -5 & -6 & 0 & 6 & 5 & 1 \end{bmatrix} \end{eqnarray*}
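
To make the relation between the two 1D kernels and the equivalent 2D kernel concrete, the following sketch builds the dense 2D kernel as the outer product of the column and row kernels. The function name `outer_product_kernel` and the row-major layout are assumptions made for this illustration; they are not part of VPI.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (not a VPI function): builds the dense 2D kernel
 * equivalent to a separable pair of 1D kernels.
 * kernel2d is row-major with col_size rows and row_size columns:
 *   kernel2d[r * row_size + c] = k_col[r] * k_row[c]               */
void outer_product_kernel(const float *k_col, size_t col_size,
                          const float *k_row, size_t row_size,
                          float *kernel2d)
{
    assert(k_col && k_row && kernel2d);
    for (size_t r = 0; r < col_size; ++r) {
        for (size_t c = 0; c < row_size; ++c) {
            kernel2d[r * row_size + c] = k_col[r] * k_row[c];
        }
    }
}
```

With the Sobel kernels above, the middle row of the resulting 7x7 kernel is the row kernel scaled by 20/64, and the middle column is all zeros.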

Implementation

The separable 2D convolution is implemented as two 1D passes, using the following discrete functions:

\begin{eqnarray*} I'[x,y] &=& \sum_{m=0}^{k_w-1} K_{row}[m] \times I[x,y-(m - \lfloor k_w/2 \rfloor)] \\ I''[x,y] &=& \sum_{m=0}^{k_h-1} K_{col}[m] \times I'[x-(m - \lfloor k_h/2 \rfloor),y] \end{eqnarray*}

Where:

  • \(I\) is the input image.
  • \(I'\) is the temporary image with convolution along the rows.
  • \(I''\) is the final result.
  • \(K_{row}\) is the row convolution kernel.
  • \(K_{col}\) is the column convolution kernel.
  • \(k_w,k_h\) are the kernel's width and height, respectively.

Note
Most computer vision libraries expect the kernel to be reversed before calling their convolution functions. Not so with VPI: it implements an actual convolution, not a cross-correlation. Naturally, this is irrelevant if the kernel is symmetric.
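
The two passes and the convolution-versus-correlation distinction can be sketched with a small CPU reference. This is an illustration under stated assumptions, not VPI's implementation: the image is a row-major `float` buffer, the row kernel slides along the horizontal direction and the column kernel along the vertical one, pixels outside the image are treated as zero, and the names `sample_zero` and `separable_convolve_ref` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Reads a pixel from a row-major float image; returns 0 outside the
 * image bounds (zero boundary condition). */
static float sample_zero(const float *img, int w, int h, int x, int y)
{
    return (x < 0 || x >= w || y < 0 || y >= h) ? 0.0f : img[y * w + x];
}

/* Illustrative two-pass separable convolution (not a VPI function).
 * Returns 0 on success, -1 if the temporary image cannot be allocated. */
int separable_convolve_ref(const float *in, float *out, int w, int h,
                           const float *k_row, int kw,
                           const float *k_col, int kh)
{
    assert(in && out && k_row && k_col);
    assert(w > 0 && h > 0 && kw > 0 && kh > 0);

    float *tmp = (float *)malloc((size_t)w * (size_t)h * sizeof *tmp);
    if (!tmp) {
        return -1;
    }

    /* Pass 1: convolve along each row with k_row. The source index moves
     * opposite to m, which makes this a true convolution. */
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int m = 0; m < kw; ++m) {
                acc += k_row[m] * sample_zero(in, w, h, x - (m - kw / 2), y);
            }
            tmp[y * w + x] = acc;
        }
    }

    /* Pass 2: convolve along each column of the temporary image with k_col. */
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int m = 0; m < kh; ++m) {
                acc += k_col[m] * sample_zero(tmp, w, h, x, y - (m - kh / 2));
            }
            out[y * w + x] = acc;
        }
    }

    free(tmp);
    return 0;
}
```

Feeding this sketch an image that contains a single 1 and a row kernel {1, 2, 3} produces 1, 2, 3 from left to right around that pixel. A cross-correlation would produce 3, 2, 1 instead, which is why other libraries ask for the kernel to be reversed.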

Usage

  1. Initialization phase
    1. Include the header that defines the needed functions and structures (SeparableImageConvolver.h).
    2. Define the stream on which the algorithm will be executed and the input image.
      VPIStream stream = /*...*/;
      VPIImage input = /*...*/;
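    3. Create the output image with the same dimensions and type as the input.
      /* Query the input's size and type so the output matches it. */
      uint32_t w, h;
      vpiImageGetSize(input, &w, &h);
      VPIImageType type;
      vpiImageGetType(input, &type);
      VPIImage output;
      vpiImageCreate(w, h, type, 0, &output);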
  2. Processing phase
    1. Define the kernel to be used. In this case, a 7x7 Sobel edge detector, given as its separable row and column kernels.
      float sobel_row[7] = {-1, -5, -6, 0, +6, +5, +1};
      float sobel_col[7] = {1 / 64.f, 6 / 64.f, 15 / 64.f, 20 / 64.f, 15 / 64.f, 6 / 64.f, 1 / 64.f};
    2. Submit the algorithm to the stream, passing the input and output images, the row and column kernels with their sizes, and the boundary condition.
      VPI_CHECK_STATUS(
          vpiSubmitSeparableImageConvolver(stream, input, output, sobel_row, 7, sobel_col, 7, VPI_BOUNDARY_COND_ZERO));
    3. Optionally, wait until the processing is done.
      vpiStreamSync(stream);

Consult the Image Convolution sample for a complete example.

For more details, consult the API reference.

Limitations and Constraints

Constraints for specific backends supersede the ones specified for all backends.

All Backends

PVA

  • Input and output dimensions must be between 160x92 and 3264x2448.
  • Minimum 1D convolution kernel size is 2, maximum is 11.
  • Horizontal and vertical kernel sizes must be equal, i.e., only square kernels can be used.
  • Kernel weights are restricted to \(|weight| < 1\).
  • The following image types are accepted:
  • The following boundary conditions are accepted.
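
The numeric PVA constraints above can be checked before submitting work. The sketch below is a hypothetical helper, not part of VPI; it assumes the dimension and kernel size limits are inclusive, and it does not cover image types or boundary conditions, since those lists are not given here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pre-check (not a VPI function) for the numeric PVA
 * constraints listed above. Limits are assumed to be inclusive. */
bool pva_params_ok(uint32_t width, uint32_t height,
                   const float *k_row, uint32_t row_size,
                   const float *k_col, uint32_t col_size)
{
    assert(k_row && k_col);

    /* Dimensions must be between 160x92 and 3264x2448. */
    if (width < 160 || width > 3264 || height < 92 || height > 2448) {
        return false;
    }

    /* Only square kernels, with a 1D size from 2 to 11. */
    if (row_size != col_size || row_size < 2 || row_size > 11) {
        return false;
    }

    /* Every weight must satisfy |weight| < 1. */
    for (uint32_t i = 0; i < row_size; ++i) {
        if (!(k_row[i] > -1.0f && k_row[i] < 1.0f)) {
            return false;
        }
        if (!(k_col[i] > -1.0f && k_col[i] < 1.0f)) {
            return false;
        }
    }
    return true;
}
```

Note that the Sobel row kernel from the Usage section has weights with magnitude up to 6, so it would fail this check as written; under the \(|weight| < 1\) restriction it would need to be rescaled before being used with the PVA backend.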

Performance

For further information on how performance was benchmarked, see Performance Measurement.

Jetson AGX Xavier

size       type  kernel  CPU       CUDA       PVA
1920x1080  u8    3x3     0.362 ms  0.0652 ms  n/a
1920x1080  u8    5x5     0.442 ms  0.0689 ms  n/a
1920x1080  u8    7x7     0.98 ms   0.0875 ms  n/a
1920x1080  u8    11x11   1.283 ms  0.0988 ms  n/a
1920x1080  s16   3x3     0.456 ms  0.1062 ms  3.267 ms
1920x1080  s16   5x5     0.59 ms   0.1154 ms  3.915 ms
1920x1080  s16   7x7     1.12 ms   0.1342 ms  3.864 ms
1920x1080  s16   11x11   1.30 ms   0.1589 ms  4.791 ms

Jetson TX2

size       type  kernel  CPU      CUDA      PVA
1920x1080  u8    3x3     1.44 ms  0.260 ms  n/a
1920x1080  u8    5x5     1.74 ms  0.289 ms  n/a
1920x1080  u8    7x7     2.17 ms  0.396 ms  n/a
1920x1080  u8    11x11   3.00 ms  0.472 ms  n/a
1920x1080  s16   3x3     2.0 ms   0.392 ms  n/a
1920x1080  s16   5x5     2.09 ms  0.429 ms  n/a
1920x1080  s16   7x7     2.61 ms  0.594 ms  n/a
1920x1080  s16   11x11   3.42 ms  0.692 ms  n/a

Jetson Nano

size       type  kernel  CPU       CUDA       PVA
1920x1080  u8    3x3     3.06 ms   0.671 ms   n/a
1920x1080  u8    5x5     3.83 ms   0.7470 ms  n/a
1920x1080  u8    7x7     4.70 ms   1.027 ms   n/a
1920x1080  u8    11x11   6.85 ms   1.239 ms   n/a
1920x1080  s16   3x3     3.60 ms   1.001 ms   n/a
1920x1080  s16   5x5     4.242 ms  1.051 ms   n/a
1920x1080  s16   7x7     5.51 ms   1.426 ms   n/a
1920x1080  s16   11x11   7.36 ms   1.692 ms   n/a