Overview

Oriented FAST and rBRIEF (ORB) [1] is a feature detection and descriptor extraction algorithm. It detects features (corners, also known as keypoints) across an input pyramid and extracts a descriptor for each one. For every feature it returns the coordinates, including the octave (i.e. the pyramid level) where the feature was found, together with the associated bitstring descriptor. The advantage of ORB over other feature detectors and descriptor extractors, such as SIFT [2], is its relative simplicity and computational efficiency, which matters in real-time video processing and machine learning pipelines. Its main disadvantage is reduced robustness when describing features in images that change in angle and scale. Although ORB is designed to handle such cases, it is not as effective in this respect as other algorithms such as SIFT or SURF [3].

The example below shows an input image on the left with its corresponding feature locations on the right (features with octave greater than zero are rescaled back to octave zero or the base image in the pyramid).

[Figure: input image (left) and detected feature locations (right)]

Implementation

The ORB algorithm detects features (corners, also known as keypoints) in each level (or octave) of the input pyramid using the FAST algorithm. The input is normally a Gaussian pyramid, which allows corners to be detected at multiple scales of the base image.

For each level of the input pyramid, the ORB algorithm runs FAST to detect a potentially large number of corners. Afterwards, ORB assigns a cornerness score to each FAST corner detected. One way to assign scores is via the Harris algorithm, using the Harris response score with a 3x3 block window and a sensitivity factor equal to 1. The cornerness score is then used to sort all FAST corners from highest to lowest score and keep only the top N, where N is potentially much smaller than the total number of corners found by FAST. Another way to assign scores is via FAST itself, effectively skipping the Harris response score assignment and sorting, which trades the quality of the corners detected by ORB for performance.
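
The score-and-filter step can be sketched in a few lines of Python. This is a minimal illustration, not the VPI implementation: the function name select_top_corners and the (x, y, score) tuple layout are assumptions made for the example, and the scores are made-up values standing in for Harris responses.

```python
def select_top_corners(corners, max_features):
    """Keep the max_features corners with the highest cornerness score.

    corners is a list of (x, y, score) tuples (hypothetical layout).
    """
    # Sort from highest to lowest score, then keep the first N entries.
    ranked = sorted(corners, key=lambda corner: corner[2], reverse=True)
    return ranked[:max_features]


# Made-up FAST corners with made-up scores, for illustration only.
fast_corners = [(10, 12, 0.8), (40, 7, 2.5), (22, 31, 1.1), (5, 5, 0.2)]
top_corners = select_top_corners(fast_corners, max_features=2)
print(top_corners)
```

Here N is 2, so only the two strongest of the four candidate corners survive.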

The corners detected by ORB on each level of the input pyramid are gathered in a single output array. Each corner stores its (x, y, octave) position in the input pyramid, where octave is the pyramid level where the corner was found and (x, y) is the position inside the image at that pyramid level. ORB only considers one layer per octave, thus ORB keypoints always store layer=0 in the keypoint structure. Corners in the final, lowest-resolution levels may be discarded if the maximum capacity of the output array is reached.
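
As a rough sketch of how the per-level results fit together, the Python snippet below gathers corners level by level into a capacity-limited list and maps a keypoint back to base-image coordinates. The helper names gather_corners and to_base_level are hypothetical, the scale of 0.5 between levels matches the pyramid created in the Usage section, and the simple multiplication ignores any sub-pixel offset a real pyramid may introduce.

```python
def gather_corners(corners_per_level, capacity):
    """Collect (x, y, octave) tuples level by level until capacity is reached.

    corners_per_level[k] is the list of (x, y) corners found at level k.
    Corners from the last (lowest-resolution) levels are the first to be dropped.
    """
    gathered = []
    for octave, level_corners in enumerate(corners_per_level):
        for x, y in level_corners:
            if len(gathered) >= capacity:
                return gathered
            gathered.append((x, y, octave))
    return gathered


def to_base_level(x, y, octave, scale=0.5):
    """Map a position at the given octave back to octave-zero coordinates."""
    factor = (1.0 / scale) ** octave
    return x * factor, y * factor


levels = [[(1, 2), (3, 4)], [(5, 6)], [(7, 8)]]
print(gather_corners(levels, capacity=3))
print(to_base_level(30.0, 20.0, octave=2))
```

With a capacity of 3, the corner found at the last level is discarded, which mirrors the behavior described above.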

ORB calculates a descriptor called rBRIEF for each of the corners detected. This is done in the highest, base level of the input pyramid. The first step to calculate the rBRIEF descriptor of a corner is to compute its local orientation. This is done by calculating the angle between the corner and the intensity centroid of a patch surrounding the corner. The intensity centroid of a patch is defined as \(C = (m_{10}/m_{00},~m_{01}/m_{00})\), where the moment \(m_{pq}\) is defined as \(m_{pq} = \sum_{x,y} x^p y^q I(x, y)\), summed over every point \((x,~y)\) in the patch, with coordinates measured relative to the corner. Considering this, we can define the orientation angle as \(\theta = \operatorname{atan2}(m_{01},~m_{10})\).
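
The orientation computation can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the VPI code: the function name patch_orientation is hypothetical, the patch is a small square list of rows centered on the corner (the ORB paper uses a circular patch), and the y axis points down as in image coordinates.

```python
import math


def patch_orientation(patch):
    """Return atan2(m01, m10) for a square patch centered on the corner.

    patch is a list of rows of pixel intensities with odd side length.
    """
    radius = len(patch) // 2
    m10 = 0.0
    m01 = 0.0
    for row_index, row in enumerate(patch):
        for col_index, intensity in enumerate(row):
            x = col_index - radius  # coordinates relative to the corner
            y = row_index - radius
            m10 += x * intensity
            m01 += y * intensity
    return math.atan2(m01, m10)


# All the intensity sits to the right of the center, so the centroid
# lies along the positive x axis.
bright_right = [[0, 0, 0],
                [0, 0, 255],
                [0, 0, 0]]
print(patch_orientation(bright_right))
```

Note that \(m_{00}\) cancels out in the angle, which is why only \(m_{10}\) and \(m_{01}\) are accumulated.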

After the orientation of each corner is determined, the descriptor is generated by performing 256 binary tests on a patch surrounding the corner and combining their results into a 256-bit string. Each binary test compares the intensities of two pixels in the patch: if the first has greater intensity than the second, the bit is set to 1; otherwise it is set to 0. The pixel intensities are gathered at the pyramid level where the corner was found. The pixel locations for these tests are determined by a pattern that minimizes correlation and increases variance. The location pattern is rotated by the orientation angle before the tests are done, which makes the descriptors rotationally invariant.
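
The binary tests can be sketched as follows. This is an illustrative approximation only: ORB uses a fixed, learned test pattern, whereas make_pattern below generates a hypothetical random one, and the names rotate, describe and pack_bits, the patch radius of 6 and the synthetic image are all assumptions made for the example.

```python
import math
import random


def make_pattern(num_tests=256, radius=6, seed=0):
    """Hypothetical random pattern of point pairs (ORB uses a learned one)."""
    rng = random.Random(seed)

    def point():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(point(), point()) for _ in range(num_tests)]


def rotate(point, angle):
    """Rotate an (x, y) offset by angle and round to the nearest pixel."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return round(x * c - y * s), round(x * s + y * c)


def describe(image, cx, cy, angle, pattern):
    """Run one binary intensity test per point pair of the rotated pattern."""
    bits = []
    for first, second in pattern:
        x1, y1 = rotate(first, angle)
        x2, y2 = rotate(second, angle)
        bits.append(1 if image[cy + y1][cx + x1] > image[cy + y2][cx + x2] else 0)
    return bits


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 tests per byte."""
    packed = bytearray(len(bits) // 8)
    for index, bit in enumerate(bits):
        packed[index // 8] |= bit << (index % 8)
    return bytes(packed)


SIZE, CENTER = 33, 16
# Synthetic image and a copy of it rotated by 90 degrees around the center.
image = [[(7 * x + 13 * y) % 256 for x in range(SIZE)] for y in range(SIZE)]
rotated = [[image[SIZE - 1 - i][j] for i in range(SIZE)] for j in range(SIZE)]

pattern = make_pattern()
desc_original = describe(image, CENTER, CENTER, 0.0, pattern)
desc_rotated = describe(rotated, CENTER, CENTER, math.pi / 2, pattern)
print(len(pack_bits(desc_original)), desc_original == desc_rotated)
```

In this exact 90-degree case, rotating the pattern by the orientation angle makes both images sample the same pixels, so the two descriptors coincide. For arbitrary angles the rounding to pixel positions makes the match approximate rather than exact.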

C API functions

For a list of limitations, constraints and backends that implement the algorithm, consult the reference documentation of the following functions:

Function	Description
vpiInitORBParams	Initializes VPIORBParams with default values.
vpiCreateORBFeatureDetector	Creates an ORB feature detector payload.
vpiSubmitORBFeatureDetector	Submits an ORB feature detector operation to the stream.
vpiCreateORBDescriptorExtractor	Creates an ORB descriptor extractor payload.
vpiSubmitORBDescriptorExtractor	Submits an ORB descriptor extractor operation to the stream.

Usage

Python

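Import the VPI module, along with NumPy and Pillow, which the following steps use to load the image.
import vpi
import numpy as np
from PIL import Image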
Read an input image, convert it to grayscale and construct a Gaussian pyramid from it, using the CPU backend. Here args.input is expected to hold the path of the input file.
input = vpi.asimage(np.asarray(Image.open(args.input))) \
    .convert(vpi.Format.U8, backend=vpi.Backend.CPU) \
    .gaussian_pyramid(4, backend=vpi.Backend.CPU)
Run the ORB corner detector algorithm on the input pyramid using the CPU backend. The FAST intensity threshold is set to 142 to be more selective in the corners found in this example. The maximum number of features per level is set to 88 and the maximum number of input pyramid levels to use is set to 3.
with vpi.Backend.CPU:
    corners, descriptors = input.orb(intensity_threshold=142, max_features_per_level=88, max_pyr_levels=3)
Optionally, retrieve all corner positions found and their descriptors as NumPy arrays in CPU memory.
a_corners = corners.cpu()
a_descriptors = descriptors.cpu()

C/C++

Initialization phase
1. Include the header that defines the ORB algorithm functions and parameter structure.
  #include <vpi/algo/ORB.h>
  
2. Create the ORB parameter object, initially setting it to the default parameters. The FAST intensity threshold is set to 142 to be more selective in the corners found in this example. The maximum number of features per level is set to 88 and the maximum number of input pyramid levels to use is set to 3.
  VPIORBParams params;
  vpiInitORBParams(&params);
  params.fastParams.intensityThreshold = 142;
  params.maxFeaturesPerLevel = 88;
  params.maxPyramidLevels = 3;
  
3. Create the ORB payload. The capacity is the maximum number of FAST corners that can be detected at each level of the input pyramid. A higher capacity typically corresponds to slower run times but provides ORB with more corners to filter. This internal buffer capacity is set to 20 times the maximum number of features per level.
  VPIPayload payload;
  int bufCapacity = params.maxFeaturesPerLevel * 20;
  vpiCreateORBFeatureDetector(VPI_BACKEND_CPU, bufCapacity, &payload);
  
4. Create the stream where the algorithm will be submitted for execution.
  VPIStream stream;
  vpiStreamCreate(0, &stream);
  
5. Load the input image and define the input pyramid object. See Gaussian pyramid for details. LoadImage and sIn are not defined in this snippet; they stand for an application-provided helper and the input file path.
  VPIImage input;
  LoadImage(sIn, VPI_IMAGE_FORMAT_U8, &input);
  int width, height;
  vpiImageGetSize(input, &width, &height);
  VPIPyramid inputPyr;
  vpiPyramidCreate(width, height, VPI_IMAGE_FORMAT_U8, 4, 0.5, VPI_BACKEND_CPU, &inputPyr);
  vpiSubmitGaussianPyramidGenerator(stream, VPI_BACKEND_CPU, input, inputPyr, VPI_BORDER_CLAMP);
  
6. Create the output array that will store the ORB corners. The output array capacity controls the maximum number of corners that ORB can return across all levels. This output capacity is set to the maximum number of features per level times the maximum number of pyramid levels to use.
  VPIArray corners;
  int outCapacity = params.maxFeaturesPerLevel * params.maxPyramidLevels;
  vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_KEYPOINT_F32, 0, &corners);
  
7. Create the output array that will store the ORB descriptors. There is one descriptor for each ORB corner detected, so its capacity is the same as the output corner array capacity.
  VPIArray descriptors;
  vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_BRIEF_DESCRIPTOR, 0, &descriptors);
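
The capacities chosen in steps 3, 6 and 7 follow from simple arithmetic on the parameters. The Python lines below only reproduce that arithmetic as a sanity check. The variable names are illustrative, and the 32-byte descriptor size is derived from the 256-bit descriptor described in the Implementation section rather than taken from the VPI headers.

```python
max_features_per_level = 88
max_pyramid_levels = 3

# Internal FAST buffer: 20 times the number of features kept per level.
buf_capacity = max_features_per_level * 20

# Output arrays: one slot per feature kept, over all levels used.
out_capacity = max_features_per_level * max_pyramid_levels

# Descriptor payload only (assumption: 256 bits = 32 bytes per descriptor).
descriptor_payload_bytes = out_capacity * (256 // 8)

print(buf_capacity, out_capacity, descriptor_payload_bytes)
```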
  
Processing phase
1. Submit the algorithm and its parameters to the stream. It will be executed by the CPU backend. The VPI_BORDER_LIMITED border extension is used to ignore pixels near the image boundary.
  VPI_CHECK_STATUS(vpiSubmitORBFeatureDetector(stream, VPI_BACKEND_CPU, payload, inputPyr, corners, descriptors,
                                               &params, VPI_BORDER_LIMITED));
  
2. Optionally, wait until the processing is done.
  vpiStreamSync(stream);
  
Cleanup phase
1. Free resources held by the stream, the input image and pyramid, the output arrays and the payload.
  vpiStreamDestroy(stream);
  vpiImageDestroy(input);
  vpiPyramidDestroy(inputPyr);
  vpiArrayDestroy(corners);
  vpiArrayDestroy(descriptors);
  vpiPayloadDestroy(payload);
  

For more information, please see ORB features in the "C API Reference" section of VPI - Vision Programming Interface.

Performance

For information on how to use the performance table below, see Algorithm Performance Tables.
Before comparing measurements, consult Comparing Algorithm Elapsed Times.
For further information on how performance was benchmarked, see Performance Benchmark.

References

[1] Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011, November). ORB: An efficient alternative to SIFT or SURF. In 2011 International Conference on Computer Vision (pp. 2564-2571). IEEE.
[2] Lowe, D. G. (1999, September). Object recognition from local scale-invariant features. In Proceedings of the Seventh IEEE International Conference on Computer Vision (Vol. 2, pp. 1150-1157). IEEE.
[3] Bay, H., Tuytelaars, T., & Van Gool, L. (2006). SURF: Speeded up robust features. In European Conference on Computer Vision (pp. 404-417). Springer, Berlin, Heidelberg.