Overview

This algorithm implements the Harris keypoint detection operator, which is commonly used to detect keypoints (corners) and infer features of an image.

The standard Harris detector algorithm as described in [1] is applied first. After that, a non-max suppression pruning process is applied to the result to remove multiple or spurious keypoints.

Example parameters used to produce the output keypoints from the input image:
\begin{align} \mathit{gradientSize} &= 5 \\ \mathit{blockSize} &= 5 \\ \mathit{strengthThresh} &= 250 \\ \mathit{sensitivity} &= 0.24 \\ \mathit{minNMSDistance} &= 8 \end{align}

Implementation

Compute the spatial gradient of the input using one of the following filters, depending on the value of VPIHarrisKeypointDetectorParams::gradientSize:
- For gradientSize = 3:
  \begin{align*} \mathit{sobel}_x &= \frac{1}{4} \cdot \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} -1 & 0 & 1 \end{bmatrix} \\ \mathit{sobel}_y &= (\mathit{sobel}_x)^\intercal \end{align*}
- For gradientSize = 5:
  \begin{align*} \mathit{sobel}_x &= \frac{1}{16} \cdot \begin{bmatrix} 1 \\ 4 \\ 6 \\ 4 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} -1 & -2 & 0 & 2 & 1 \end{bmatrix} \\ \mathit{sobel}_y &= (\mathit{sobel}_x)^\intercal \end{align*}
- For gradientSize = 7:
  \begin{align*} \mathit{sobel}_x &= \frac{1}{64} \cdot \begin{bmatrix} 1 \\ 6 \\ 15 \\ 20 \\ 15 \\ 6 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} -1 & -4 & -5 & 0 & 5 & 4 & 1 \end{bmatrix} \\ \mathit{sobel}_y &= (\mathit{sobel}_x)^\intercal \end{align*}
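To illustrate how the separable kernels above are formed, the following C sketch builds the gradientSize = 3 kernel as the outer product of the smoothing column and the derivative row, and obtains sobel_y by transposition. The function names are hypothetical helpers written for this example; they are not part of the VPI API.

```c
#include <stddef.h>

/* Build an n x n kernel as the outer product
 * smooth (column vector) * deriv (row vector), scaled by `scale`.
 * `out` is row-major and must hold n * n floats. */
void sobel_outer_product(const float *smooth, const float *deriv,
                         size_t n, float scale, float *out)
{
    for (size_t r = 0; r < n; ++r) {
        for (size_t c = 0; c < n; ++c) {
            out[r * n + c] = scale * smooth[r] * deriv[c];
        }
    }
}

/* Fill `out` (3x3, row-major) with the gradientSize = 3 sobel_x kernel. */
void sobel_x_3x3(float out[9])
{
    static const float smooth[3] = {1.0f, 2.0f, 1.0f};
    static const float deriv[3]  = {-1.0f, 0.0f, 1.0f};
    sobel_outer_product(smooth, deriv, 3, 1.0f / 4.0f, out);
}

/* sobel_y is the transpose of sobel_x; `in` and `out` must not overlap. */
void transpose_square(const float *in, size_t n, float *out)
{
    for (size_t r = 0; r < n; ++r) {
        for (size_t c = 0; c < n; ++c) {
            out[r * n + c] = in[c * n + r];
        }
    }
}
```

The 5x5 and 7x7 kernels follow the same pattern with the vectors and scale factors listed above.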
Compute a gradient covariance matrix (structure tensor) for each pixel within a block window, as described by:
\[ M = \sum_{p \in B}\begin{bmatrix}I_x^2(p) & I_x(p) I_y(p) \\ I_x(p) I_y(p) & I_y^2(p) \end{bmatrix} \]

where:
- p is a pixel coordinate within B, a block window of size 3x3, 5x5 or 7x7.
- \(I(p)\) is the input image
- \( I_x(p) = I(p) * \mathit{sobel}_x \)
- \( I_y(p) = I(p) * \mathit{sobel}_y \)
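The summation over the block window can be sketched in plain C as follows. This is an illustrative reference only, not the VPI implementation: the StructureTensor type and structure_tensor_at function are hypothetical, the gradient images are assumed to be precomputed, and skipping out-of-image pixels is an assumed boundary policy.

```c
#include <stddef.h>

/* The three distinct entries of the symmetric 2x2 matrix M. */
typedef struct {
    double xx; /* sum of I_x^2      */
    double xy; /* sum of I_x * I_y  */
    double yy; /* sum of I_y^2      */
} StructureTensor;

/* Accumulate gradient products over a block x block window centred
 * at (cx, cy). ix and iy are row-major gradient images of size
 * width x height. Pixels outside the image are skipped (assumption). */
StructureTensor structure_tensor_at(const float *ix, const float *iy,
                                    int width, int height,
                                    int cx, int cy, int block)
{
    StructureTensor m = {0.0, 0.0, 0.0};
    int half = block / 2;

    for (int y = cy - half; y <= cy + half; ++y) {
        if (y < 0 || y >= height) {
            continue;
        }
        for (int x = cx - half; x <= cx + half; ++x) {
            if (x < 0 || x >= width) {
                continue;
            }
            size_t idx = (size_t)y * (size_t)width + (size_t)x;
            double gx = ix[idx];
            double gy = iy[idx];
            m.xx += gx * gx;
            m.xy += gx * gy;
            m.yy += gy * gy;
        }
    }
    return m;
}
```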
Compute a Harris response score using a sensitivity factor:
\[ R = \mathit{det}(M) - k \cdot \mathit{trace}^2(M ) \]

where k is the sensitivity factor, VPIHarrisKeypointDetectorParams::sensitivity.
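For a 2x2 symmetric matrix the response formula reduces to a few multiplications. The helper below is a minimal sketch with a hypothetical name (harris_response), taking the three distinct entries of M and the sensitivity factor k.

```c
/* Harris response R = det(M) - k * trace(M)^2 for the symmetric matrix
 *     M = | mxx  mxy |
 *         | mxy  myy |                                              */
double harris_response(double mxx, double mxy, double myy, double k)
{
    double det   = mxx * myy - mxy * mxy;
    double trace = mxx + myy;
    return det - k * trace * trace;
}
```

With this formula, a matrix dominated by a single gradient direction (an edge, for example mxx = 9, mxy = myy = 0) has a zero determinant and therefore a negative response for any positive k, while a matrix with two comparable diagonal entries can produce a positive response when k is small enough.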
Apply a threshold-strength criterion, pruning keypoints whose response is below VPIHarrisKeypointDetectorParams::strengthThresh.
Apply a non-max suppression pruning process.

This process splits the input image into a 2D cell grid. Within each cell, it selects the single corner with the highest response score. If several corners in a cell share the highest score, it selects the bottom-right one.
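The thresholding and grid-based pruning described above can be sketched as follows. This is an illustrative reference, not the VPI implementation. The Corner type, the grid_nms function and the cell parameter are hypothetical; how the cell size is derived from minNMSDistance is not stated here, so it is passed in directly. "Bottom-right" is interpreted as the last tied pixel in raster order, which is also an assumption.

```c
#include <stddef.h>

typedef struct {
    int   x;
    int   y;
    float score;
} Corner;

/* Keep at most one corner per cell x cell grid cell.
 * response: row-major Harris scores of size width x height.
 * Scores below strength_thresh are pruned first. On ties the last
 * pixel in raster order wins (assumed meaning of "bottom-right").
 * Returns the number of corners written to out (at most max_out). */
size_t grid_nms(const float *response, int width, int height,
                int cell, float strength_thresh,
                Corner *out, size_t max_out)
{
    size_t count = 0;

    if (cell < 1) {
        return 0; /* a cell must cover at least one pixel */
    }
    for (int cy0 = 0; cy0 < height; cy0 += cell) {
        for (int cx0 = 0; cx0 < width; cx0 += cell) {
            int y_end = (cy0 + cell < height) ? cy0 + cell : height;
            int x_end = (cx0 + cell < width) ? cx0 + cell : width;
            int found = 0;
            Corner best = {0, 0, 0.0f};

            for (int y = cy0; y < y_end; ++y) {
                for (int x = cx0; x < x_end; ++x) {
                    float r = response[(size_t)y * (size_t)width + (size_t)x];
                    if (r < strength_thresh) {
                        continue; /* threshold-strength criterion */
                    }
                    if (!found || r >= best.score) {
                        best.x = x;
                        best.y = y;
                        best.score = r;
                        found = 1;
                    }
                }
            }
            if (found && count < max_out) {
                out[count++] = best;
            }
        }
    }
    return count;
}
```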

Usage

Initialization phase
1. Include the header that defines the Harris keypoint detector function.
  #include <vpi/algo/HarrisKeypointDetector.h>
2. Define the stream on which the algorithm will be executed and the input image.
  VPIStream stream = /*...*/;
  
  VPIImage input = /*...*/;
3. Create the output arrays that will store the keypoints and their scores.
  VPIArray keypoints;
  
  vpiArrayCreate(8192, VPI_ARRAY_TYPE_KEYPOINT, 0, &keypoints);
  
  VPIArray scores;
  
  vpiArrayCreate(8192, VPI_ARRAY_TYPE_U32, 0, &scores);
4. Since this algorithm needs temporary memory buffers, create the payload for it.
  uint32_t w,h;
  
  vpiImageGetSize(input, &w, &h);
  
  VPIPayload harris;
  
  vpiCreateHarrisKeypointDetector(stream, w, h, &harris);
Processing phase
1. Fill the configuration structure with parameters for the current algorithm invocation.
  VPIHarrisKeypointDetectorParams params;
  
  params.gradientSize = 5;
  
  params.blockSize = 5;
  
  params.strengthThresh = 250;
  
  params.sensitivity = 0.24;
  
  params.minNMSDistance = 8;
2. Submit the algorithm and its parameters to the stream.
  vpiSubmitHarrisKeypointDetector(harris, input, keypoints, scores, &params);
3. Optionally, wait until the processing is done.
  vpiStreamSync(stream);
Cleanup phase
1. Free resources held by the payload.
  vpiPayloadDestroy(harris);

Limitations and Constraints

Constraints for specific backends supersede the ones specified for all backends.

All Backends

Input image must have same dimensions as the ones specified during payload creation.
Only supports Sobel gradient kernels of sizes 3x3, 5x5 and 7x7.
Output scores and keypoints arrays must have the same capacity.
Must satisfy \(\mathit{minNMSDistance} \geq 1\).
The following image types are accepted:

PVA

Not implemented

References

[1] C. Harris, M. Stephens (1988), "A Combined Corner and Edge Detector", Proceedings of the Alvey Vision Conference, pp. 147-151.

VPI - Vision Programming Interface

0.1.0 Release