VPI - Vision Programming Interface

2.2 Release

ORB feature detector

Overview

Oriented FAST and Rotated BRIEF (ORB) [1] is a feature detection and description algorithm. It detects features across an input pyramid and computes a descriptor for each one, returning the coordinates of each feature together with its associated bitstring descriptor. The advantage of ORB over other detection/description algorithms such as SIFT [2] is its relative simplicity and computational efficiency, which matters in real-time video processing and machine learning pipelines. The main disadvantage of ORB is that it is less robust when describing keypoints in images that change in angle and scale. ORB is designed to handle such changes, but it is not as effective as SIFT or SURF [3] in this regard. Note that the patent on SIFT has expired, but SURF is still a patented algorithm/method [4].

This example shows an input image on the left with its corresponding feature locations on the right.

Input Output

Implementation

The ORB algorithm extracts features from each level of the input pyramid. A Gaussian pyramid is used as input so that features are generated at multiple scales of the image.
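To make the multi-scale idea concrete, the following minimal sketch computes the dimensions of each pyramid level for a given per-level scale factor (the usage example further down uses 0.5). The helper name `pyramid_level_size` and the round-to-nearest rule are assumptions made for illustration; the rounding VPI applies internally is not stated here.

```c
#include <stdint.h>

/* Hypothetical helper, not part of VPI: size of pyramid level `level`
 * (0 = base image) when every level is `scale` times the previous one.
 * Rounding to the nearest integer is an assumption of this sketch. */
void pyramid_level_size(int32_t base_w, int32_t base_h, float scale,
                        int level, int32_t *w, int32_t *h)
{
    double fw = base_w;
    double fh = base_h;

    /* Apply the scale factor once per level. */
    for (int i = 0; i < level; ++i) {
        fw *= scale;
        fh *= scale;
    }

    *w = (int32_t)(fw + 0.5);
    *h = (int32_t)(fh + 0.5);

    /* Never report an empty level. */
    if (*w < 1) *w = 1;
    if (*h < 1) *h = 1;
}
```

For a 640x480 base image and a scale of 0.5, level 2 comes out as 160x120, so a corner detected there covers a correspondingly larger region of the original image.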

At each level, the algorithm first uses the VPI implementation of FAST Corner Detection to detect a parameterized number of FAST corners in the image. It then assigns a cornerness score to each detected corner. If the score parameter is set to HARRIS_SCORE, the algorithm calculates the Harris cornerness strength of each detected corner and uses it as the score. If not, the algorithm does not assign a score. The corners are then sorted by their cornerness scores, and the top N corners are kept, where N is the algorithm parameter VPIORBParams::maxFeatures divided by the number of pyramid levels.
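The per-level selection described above can be sketched as a sort followed by a truncation. This is only an illustration: the `Corner` type and `keep_top_corners` function are hypothetical names, not VPI types, and the use of integer division for the per-level budget is an assumption.

```c
#include <stdlib.h>

/* Hypothetical corner record for this sketch; not a VPI type. */
typedef struct {
    float x, y;   /* corner position within its pyramid level */
    float score;  /* cornerness score, e.g. Harris strength   */
} Corner;

/* qsort comparator that orders corners by descending score. */
static int by_score_desc(const void *a, const void *b)
{
    float sa = ((const Corner *)a)->score;
    float sb = ((const Corner *)b)->score;
    return (sa < sb) - (sa > sb);
}

/* Sorts `corners` in place (best first) and returns how many of them to
 * keep for one level: min(count, max_features / num_levels).
 * Integer division for the budget is an assumption of this sketch. */
size_t keep_top_corners(Corner *corners, size_t count,
                        int max_features, int num_levels)
{
    if (max_features <= 0 || num_levels <= 0)
        return 0;

    size_t budget = (size_t)(max_features / num_levels);
    qsort(corners, count, sizeof(Corner), by_score_desc);
    return count < budget ? count : budget;
}
```

After the call, the first returned-count entries of `corners` are the ones that survive for that level.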

After the corners are extracted from each level of the pyramid, the algorithm calculates an rBRIEF descriptor for each keypoint. The first step in calculating this descriptor is finding the local orientation of the keypoint. This is done by finding the angle between the keypoint and the intensity centroid of a patch surrounding the keypoint. The intensity centroid of a patch is defined as \((m_{10}/m_{00},\ m_{01}/m_{00})\), where the moment \(m_{pq}\) is the sum of \(x^p y^q I(x, y)\) over every point \((x, y)\) in the patch, that is, \(m_{pq} = \sum_{x,y} x^p y^q I(x, y)\). The orientation angle is then \(\mathrm{atan2}(m_{01}, m_{10})\).
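As an illustration of the moment computation, the sketch below accumulates the three moments needed for the intensity centroid over a patch of an 8-bit image. The names `PatchMoments` and `patch_moments` are hypothetical, and a square patch with a caller-chosen radius is assumed for simplicity; the patch shape and size used by VPI are not specified here.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch; not a VPI type. */
typedef struct {
    int64_t m00;  /* sum of I(x, y)     */
    int64_t m10;  /* sum of x * I(x, y) */
    int64_t m01;  /* sum of y * I(x, y) */
} PatchMoments;

/* Accumulates the moments over a square patch of the given radius centered
 * on (cx, cy). x and y are measured relative to the patch center.
 * The caller must ensure the whole patch lies inside the image. */
PatchMoments patch_moments(const uint8_t *img, int stride,
                           int cx, int cy, int radius)
{
    PatchMoments m = {0, 0, 0};

    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            int64_t v = img[(cy + dy) * stride + (cx + dx)];
            m.m00 += v;
            m.m10 += dx * v;
            m.m01 += dy * v;
        }
    }
    return m;
}
```

Given the result `m`, the orientation angle follows as `atan2((double)m.m01, (double)m.m10)` using the standard `<math.h>` function. A patch whose only bright pixel lies directly to the right of the center yields `m01 == 0` and `m10 > 0`, which corresponds to an angle of zero.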

After the orientation of each keypoint is determined, the descriptor is generated by performing 256 binary tests on a patch surrounding the keypoint and combining their results into a 256-bit bitstring. A binary test compares the intensities of two pixels in the image patch: if the first has a greater intensity than the second, the test returns 1; otherwise it returns 0. The pixel locations used by the tests are predetermined by a pattern that minimizes correlation and increases variance. The location pattern is rotated by the orientation angle before the tests are performed, which makes the descriptors rotationally invariant. In the current VPI CUDA implementation, keypoints close to the border do not receive descriptor values; all of their bits are set to zero. These points are not filtered out automatically and, if desired, must be removed by the caller.
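The binary tests and the rotation of the sampling pattern can be sketched as follows. This is a simplified illustration, not the VPI implementation: the `TestPair` type, the function names, the bit ordering within the descriptor, and the round-to-nearest sampling are all assumptions. The sampling pattern is supplied by the caller (a real ORB implementation uses a fixed, pre-learned pattern), and the rotation is passed as the cosine and sine of the orientation angle.

```c
#include <stdint.h>
#include <string.h>

#define DESC_BITS  256
#define DESC_BYTES (DESC_BITS / 8)

/* Hypothetical type: one binary test, given as two sample offsets
 * relative to the keypoint. */
typedef struct {
    int8_t x1, y1, x2, y2;
} TestPair;

/* Rotates offset (x, y) by the angle with cosine c and sine s and rounds
 * the result to the nearest pixel. */
static void rotate_offset(int x, int y, float c, float s, int *rx, int *ry)
{
    float fx = c * x - s * y;
    float fy = s * x + c * y;
    *rx = (int)(fx >= 0 ? fx + 0.5f : fx - 0.5f);
    *ry = (int)(fy >= 0 ? fy + 0.5f : fy - 0.5f);
}

/* Fills desc with one bit per test: 1 if the first rotated sample is
 * brighter than the second, 0 otherwise. The caller must ensure that all
 * rotated samples fall inside the image. */
void compute_descriptor(const uint8_t *img, int stride, int kx, int ky,
                        float c, float s, const TestPair pattern[DESC_BITS],
                        uint8_t desc[DESC_BYTES])
{
    memset(desc, 0, DESC_BYTES);

    for (int i = 0; i < DESC_BITS; ++i) {
        int ax, ay, bx, by;
        rotate_offset(pattern[i].x1, pattern[i].y1, c, s, &ax, &ay);
        rotate_offset(pattern[i].x2, pattern[i].y2, c, s, &bx, &by);

        uint8_t a = img[(ky + ay) * stride + (kx + ax)];
        uint8_t b = img[(ky + by) * stride + (kx + bx)];

        if (a > b)
            desc[i / 8] |= (uint8_t)(1u << (i % 8));
    }
}
```

Because the pattern is rotated by the keypoint orientation before sampling, the same physical pixel pairs are compared when the image itself is rotated, which is what gives the descriptor its rotational invariance.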

C API functions

For the list of limitations, constraints and backends that implement the algorithm, consult the reference documentation of the following functions:

Function                     Description
vpiInitORBParams             Initializes VPIORBParams with default values.
vpiCreateORBFeatureDetector  Creates an ORB payload.
vpiSubmitORBFeatureDetector  Submits an ORB operation to the stream.

Usage

  1. Initialization phase
    1. Include the header that defines the ORB algorithm functions and parameter structure.
      #include <vpi/algo/ORB.h>
    2. Create the ORB parameter object and initialize it with the default parameters.
      VPIORBParams params;
      vpiInitORBParams(&params);
    3. Create the ORB payload. The input width/height/scale should match that of the input pyramid (the width and height corresponding to the first layer of the pyramid). The capacity is the maximum number of FAST corners initially detected at each level of the image pyramid. These are then filtered down to the top N best corners. A higher capacity typically corresponds to larger runtimes but more accurate keypoints.
      VPIPayload payload;
      /* capacity: caller-chosen maximum number of FAST corners detected per pyramid level */
      vpiCreateORBFeatureDetector(VPI_BACKEND_CPU, capacity, &payload);
    4. Create the stream where the algorithm will be submitted for execution.
      VPIStream stream;
      vpiStreamCreate(0, &stream);
    5. Define the input pyramid object (see Gaussian Pyramid Generator doc for details)
      VPIImage input;
      LoadImage(sIn, VPI_IMAGE_FORMAT_U8, &input);
      int width, height;
      vpiImageGetSize(input, &width, &height);
      VPIPyramid inputPyr;
      VPI_CHECK_STATUS(
          vpiPyramidCreate(width, height, VPI_IMAGE_FORMAT_U8, params.pyramidLevels, 0.5, VPI_BACKEND_CPU, &inputPyr));
      vpiSubmitGaussianPyramidGenerator(stream, VPI_BACKEND_CPU, input, inputPyr, VPI_BORDER_ZERO);
      vpiStreamSync(stream);
    6. Create the output array that will store the keypoints of the ORB corners. The output array capacity controls the maximum number of corners to be found. In this example, the output array capacity is set to \( 1000 \).
      VPIArray corners;
      vpiArrayCreate(1000, VPI_ARRAY_TYPE_KEYPOINT_F32, 0, &corners);
    7. Create the output array that will store the descriptors of the ORB corners. In this example, the output array capacity is set to \( 1000 \).
      VPIArray descriptors;
      vpiArrayCreate(1000, VPI_ARRAY_TYPE_BRIEF_DESCRIPTOR, 0, &descriptors);
  2. Processing phase
    1. Submit the algorithm and its parameters to the stream. It will be executed by the CPU backend. In this example, VPI_BORDER_ZERO is used so that pixels outside the image are treated as having an intensity of 0.
      vpiSubmitORBFeatureDetector(stream, VPI_BACKEND_CPU, payload, inputPyr, corners, descriptors, &params, VPI_BORDER_ZERO);
    2. Optionally, wait until the processing is done.
      vpiStreamSync(stream);
  3. Cleanup phase
    1. Free resources held by the stream, the input image and the output array.
      vpiArrayDestroy(corners);
      vpiImageDestroy(input);
      vpiStreamDestroy(stream);
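As noted in the implementation section, keypoints close to the border can come back with an all-zero descriptor and are not removed automatically. The following minimal sketch shows one way a caller might filter them out once the results are available in plain host memory. It deliberately does not use any VPI calls: the `Keypoint` type, the 32-byte descriptor layout and the function names are hypothetical stand-ins, and copying the data out of the VPI arrays is assumed to have been done beforehand.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DESC_BYTES 32  /* 256-bit descriptor */

/* Hypothetical stand-in for a keypoint; not a VPI type. */
typedef struct {
    float x, y;
} Keypoint;

/* Returns 1 if every byte of the descriptor is zero, 0 otherwise. */
static int is_zero_descriptor(const uint8_t *d)
{
    for (int i = 0; i < DESC_BYTES; ++i)
        if (d[i] != 0)
            return 0;
    return 1;
}

/* Compacts both arrays in place, dropping every entry whose descriptor is
 * all zero while preserving the order of the rest. Returns the new count. */
size_t drop_zero_descriptors(Keypoint *kps, uint8_t (*descs)[DESC_BYTES],
                             size_t count)
{
    size_t out = 0;

    for (size_t i = 0; i < count; ++i) {
        if (is_zero_descriptor(descs[i]))
            continue;
        if (out != i) {
            kps[out] = kps[i];
            memcpy(descs[out], descs[i], DESC_BYTES);
        }
        ++out;
    }
    return out;
}
```

Keeping keypoints and descriptors in lockstep during the compaction matters, because later stages such as matching rely on index `i` referring to the same feature in both arrays.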

For more information, see ORB feature in the "C API Reference" section of VPI - Vision Programming Interface.

Performance

Performance benchmarks will be added at a later time.

References

  1. Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011, November). ORB: An efficient alternative to SIFT or SURF. In 2011 International Conference on Computer Vision (pp. 2564-2571). IEEE.
  2. Lowe, D. G. (1999, September). Object recognition from local scale-invariant features. In Proceedings of the Seventh IEEE International Conference on Computer Vision (Vol. 2, pp. 1150-1157). IEEE.
  3. Bay, H., Tuytelaars, T., & Van Gool, L. (2006). SURF: Speeded up robust features. In European Conference on Computer Vision (pp. 404-417). Springer, Berlin, Heidelberg.
  4. Funayama, R., Yanagihara, H., Van Gool, L., Tuytelaars, T., & Bay, H. (2012). U.S. Patent No. 8,165,401. Washington, DC: U.S. Patent and Trademark Office.