Oriented FAST and rBRIEF (ORB) [1] is a feature detector and descriptor extractor algorithm. It detects features or corners, also known as keypoints, across an input pyramid and extracts a descriptor for each feature, returning its coordinates, including the octave (i.e. the pyramid level) where the feature was found, as well as its associated bit-string descriptor. The advantage of ORB over other feature detectors and descriptor extractors, such as SIFT [2], is its relative simplicity and computational efficiency. This advantage is important in real-time video processing and machine learning pipelines. The main disadvantage of ORB is its reduced robustness when describing features in images that undergo large changes in rotation and scale. Although ORB is designed to handle such cases, it is not as effective in this respect as other algorithms, such as SIFT or SURF [3].
The example below shows an input image on the left with its corresponding feature locations on the right (features with octave greater than zero are rescaled back to octave zero, i.e. the base image of the pyramid).
Input
Output
Implementation
The ORB algorithm detects features or corners, also known as keypoints, in each level (or octave) of the input pyramid using the FAST algorithm. The input is normally a Gaussian pyramid, which allows corners to be detected at multiple scales of the base image.
For each level of the input pyramid, the ORB algorithm runs FAST, potentially detecting a large number of corners. Afterwards, ORB assigns a cornerness score to each detected FAST corner. One way to assign scores is via the Harris algorithm, using the Harris response score with a 3x3 block window and a sensitivity factor equal to 1. The cornerness score is then used to sort all FAST corners from highest to lowest score and keep only the top N corners detected by ORB, where N is potentially much smaller than the total number of corners found by FAST. Another way to assign scores is via FAST itself, effectively skipping the Harris response computation and sorting, trading the quality of the corners detected by ORB for performance.
The corners detected by ORB on each level of the input pyramid are gathered into a single output array. Corners store the (x, y, octave) position on the input pyramid, where octave is the pyramid level where the corner was found and (x, y) is the position inside the image at that pyramid level. ORB only considers one layer per octave, thus ORB keypoints always store layer=0 in the keypoint structure. Corners in the final, lowest-resolution levels may be discarded if the maximum capacity of the output array is reached.
ORB calculates a descriptor called rBRIEF for each of the corners detected. This is done in the highest, base level of the input pyramid. The first step to calculate the rBRIEF descriptor of a corner is to compute its local orientation. This is done by calculating the angle between the corner and the intensity centroid of a patch surrounding the corner. The intensity centroid of a patch is defined as \((m_{10}/m_{00},~m_{01}/m_{00})\), where the moment \(m_{pq}\) is defined as \(m_{pq} = \sum_{x,y} x^p\, y^q\, I(x,~y)\) over each point \((x,~y)\) in the patch. Considering this, we can define the orientation angle as \(\theta = \mathrm{atan2}(m_{01},~m_{10})\).
After the orientation of each corner is determined, the descriptor must be generated. The descriptor is generated by performing 256 binary tests on a patch surrounding the corner and combining their results into a 256-bit string. Each binary test is defined as follows: two pixels in the patch are compared by intensity; if the first has greater intensity than the second, the bit is set to 1, otherwise it is set to 0. The pixel intensities are gathered at the pyramid level where the corner was found. The pixel locations for these tests are determined by a pattern that minimizes correlation and increases variance. The location pattern is rotated by the orientation angle before the tests are done. This ensures the descriptors are rotationally invariant.
C API functions
For a list of limitations, constraints, and backends that implement the algorithm, consult the reference documentation of the following functions:
Run the ORB corner detector algorithm on the input pyramid using the CPU backend. The FAST intensity threshold is set to 142 to be more selective about the corners found in this example. Also, the maximum number of features per level is set to 88 and the maximum number of input pyramid levels to use is set to 3.
Declares functions that implement support for ORB.
Create the ORB parameter object, initially setting it to the default parameters. The FAST intensity threshold is set to 142 to be more selective about the corners found in this example. Also, the maximum number of features per level is set to 88 and the maximum number of input pyramid levels to use is set to 3.
Create the ORB payload. The capacity is the maximum number of FAST corners that can be detected at each level of the input pyramid. A higher capacity typically corresponds to slower run times but provides ORB with more corners to filter. This internal buffer capacity is set to 20 times the maximum number of features per level.
Create the output array that will store the ORB corners. The output array capacity controls the maximum number of corners to be detected by ORB in all levels. This output capacity is set to the maximum number of features per level times maximum pyramid levels to use.
Create the output array that will store the ORB descriptors. It is one descriptor for each ORB corner detected. Its capacity is the same as the output corner array capacity.
Submit the algorithm and its parameters to the stream. It will be executed by the CPU backend. The limited border is used to ignore pixels near the image boundary.
Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011, November). ORB: An efficient alternative to SIFT or SURF. In 2011 International Conference on Computer Vision (pp. 2564-2571). IEEE.
Lowe, D. G. (1999, September). Object recognition from local scale-invariant features. In Proceedings of the Seventh IEEE International Conference on Computer Vision (Vol. 2, pp. 1150-1157). IEEE.
Bay, H., Tuytelaars, T., & Van Gool, L. (2006). SURF: Speeded Up Robust Features. In European Conference on Computer Vision (pp. 404-417). Springer, Berlin, Heidelberg.
Generated by NVIDIA | Fri May 3 2024 16:28:29 | 63154aecb3404bace355cb192ed3564f98e1d80a