VPI - Vision Programming Interface

4.0 Release

ORB feature detector

Overview

Oriented FAST and Rotated BRIEF (ORB) is a feature detection and description algorithm. It detects features across the levels of an input pyramid and computes a descriptor for each one, returning the coordinates of each feature together with its associated bitstring descriptor. This sample application: (1) reads an input image; (2) creates a Gaussian pyramid from the image; (3) runs ORB on the pyramid; (4) draws the ORB features as key points, colored according to their descriptors; and (5) writes an output image with the colored key points. Each feature is drawn on top of the input image as a circle whose color, on a scale from blue to red, encodes its descriptor. The color is determined by the Hamming distance between the feature's descriptor and the first descriptor: the key point of the first descriptor is blue, and shades of yellow to red mark descriptors that are progressively more distant.
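The color mapping described above can be sketched in plain Python, without VPI. The descriptors below are made-up 4-byte values used only for illustration (real descriptors come from VPI), and the helper names `hamming_distance` and `colormap_indices` are hypothetical. The sketch measures each descriptor against the first one and scales the result to a 0 to 255 color-map index.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length bitstrings."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def colormap_indices(descriptors):
    """Map each descriptor to a 0..255 index by its distance to the first one."""
    first = descriptors[0]
    distances = [hamming_distance(d, first) for d in descriptors]
    max_dist = max(distances)
    if max_dist == 0:
        # All descriptors are identical: everything maps to the first color.
        return [0] * len(distances)
    return [int(round(d / max_dist * 255)) for d in distances]


# Made-up 4-byte descriptors, for illustration only.
descs = [
    bytes([0x00, 0x00, 0x00, 0x00]),
    bytes([0x0F, 0x00, 0x00, 0x00]),  # 4 bits differ from the first
    bytes([0xFF, 0xFF, 0x00, 0x00]),  # 16 bits differ from the first
]
indices = colormap_indices(descs)
print(indices)  # [0, 64, 255]
```

Index 0 corresponds to the blue end of the color map and index 255 to the red end, so the first descriptor always lands on blue.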

Instructions

The command line parameters are:

<backend> <input image>

where

  • backend: either cpu or cuda; it defines the backend that will perform the processing.
  • input image: file name of the image to be used as source; it accepts png, jpeg and others.

Here's one example:

  • C++
    ./vpi_sample_18_orb_feature_detector cuda ../assets/kodim08.png
  • Python
    python3 main.py cuda ../assets/kodim08.png

This example uses the CUDA backend and one of the provided sample images. You can try other images, as long as they respect the constraints imposed by the algorithm.

Results

[Figures: input image; matched features between input image and flipped image]

Source Code

For convenience, here's the code that is also installed in the samples directory.

Python

import sys
import vpi
import numpy as np
from PIL import Image, ImageOps
from argparse import ArgumentParser
import cv2


# Parse command line arguments
parser = ArgumentParser()
parser.add_argument('backend', choices=['cpu','cuda'],
                    help='Backend to be used for processing')

parser.add_argument('s', metavar='filename',
                    help='Image to be used as source')

args = parser.parse_args()

if args.backend == 'cpu':
    backend = vpi.Backend.CPU
elif args.backend == 'cuda':
    backend = vpi.Backend.CUDA
else:
    sys.exit("Un-supported backend")

# Load input image into a vpi.Image
try:
    srcData = np.asarray(ImageOps.grayscale(Image.open(args.s)))
except IOError:
    sys.exit("Source file not found")
except Exception:
    sys.exit("Error with source file")

src = vpi.asimage(srcData)

# Using the chosen backend to build the input pyramid and run ORB
with backend:
    pyr = src.gaussian_pyramid(3)
    corners, descriptors = pyr.orb(intensity_threshold=142, max_features_per_level=88, max_pyr_levels=3)

# Draw the keypoints in the output image

out = src.convert(vpi.Format.BGR8, backend=vpi.Backend.CUDA)

if corners.size > 0:
    # Hamming distance of every descriptor to the first descriptor
    distances = []
    with descriptors.rlock_cpu() as descriptors_data:
        first_desc = descriptors_data[0][0]
        for i in range(descriptors.size):
            curr_desc = descriptors_data[i][0]
            hamm_dist = sum([bin(c ^ f).count('1') for c, f in zip(curr_desc, first_desc)])
            distances.append(hamm_dist)

    # Avoid a division by zero when every descriptor equals the first one
    max_dist = max(max(distances), 1)

    # Map each distance to one of the 256 colors of the JET color map
    cmap = cv2.applyColorMap(np.arange(0, 256, dtype=np.uint8), cv2.COLORMAP_JET)
    cmap_idx = lambda i: int(round((distances[i] / max_dist) * 255))

    with out.lock_cpu() as out_data, corners.rlock_cpu() as corners_data:
        for i in range(corners.size):
            color = tuple([int(x) for x in cmap[cmap_idx(i), 0]])
            kpt = tuple(corners_data[i].astype(np.int16))
            # Scale the coordinates from the pyramid level back to the base image
            x = kpt[0] * (2 ** kpt[2])
            y = kpt[1] * (2 ** kpt[2])
            cv2.circle(out_data, (x, y), 3, color, -1)

# Save the output image to disk
cv2.imwrite('orb_feature_python'+str(sys.version_info[0])+'_'+args.backend+'.png', out.cpu())

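The drawing loop in the Python sample scales each key point back to the base image before drawing it, since the coordinates are multiplied by a power of two that depends on the pyramid level of the feature. A minimal sketch of that step, assuming each level halves the resolution of the previous one (the C++ sample creates its pyramid with a scale of 0.5); `to_base_coords` is a hypothetical helper, not part of VPI:

```python
def to_base_coords(x: float, y: float, octave: int, scale: float = 0.5):
    """Map level-relative coordinates to base-image (level 0) pixel coordinates."""
    factor = (1.0 / scale) ** octave  # equals 2 ** octave when scale is 0.5
    return int(round(x * factor)), int(round(y * factor))


print(to_base_coords(10.0, 20.0, 0))  # (10, 20): level 0 is the base image
print(to_base_coords(10.0, 20.0, 2))  # (40, 80): level 2 is 4x smaller
```

Keeping the scale as a parameter makes the assumption explicit, so the helper would still apply if a pyramid were built with a different scale factor.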
C++

#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <vpi/OpenCVInterop.hpp>

#include <vpi/Array.h>
#include <vpi/Image.h>
#include <vpi/Pyramid.h>
#include <vpi/Status.h>
#include <vpi/Stream.h>
#include <vpi/algo/ConvertImageFormat.h>
#include <vpi/algo/GaussianPyramid.h>
#include <vpi/algo/ImageFlip.h>
#include <vpi/algo/ORB.h>

#include <bitset>
#include <cmath>
#include <cstdio>
#include <cstring>
#include <iostream>
#include <numeric>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Throws a std::runtime_error carrying the VPI status name and message
// whenever a VPI call fails.
#define CHECK_STATUS(STMT)                                    \
    do                                                        \
    {                                                         \
        VPIStatus status = (STMT);                            \
        if (status != VPI_SUCCESS)                            \
        {                                                     \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
            std::ostringstream ss;                            \
            ss << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());               \
        }                                                     \
    } while (0)

static cv::Mat DrawKeypoints(cv::Mat img, VPIPyramidalKeypointF32 *kpts, VPIBriefDescriptor *descs, int numKeypoints)
{
    cv::Mat out;
    img.convertTo(out, CV_8UC1);
    cvtColor(out, out, cv::COLOR_GRAY2BGR);

    if (numKeypoints == 0)
    {
        return out;
    }

    // Hamming distance of every descriptor to the first descriptor
    std::vector<int> distances(numKeypoints, 0);
    float maxDist = 0.f;

    for (int i = 0; i < numKeypoints; i++)
    {
        for (int j = 0; j < VPI_BRIEF_DESCRIPTOR_ARRAY_LENGTH; j++)
        {
            distances[i] += std::bitset<8 * sizeof(uint8_t)>(descs[i].data[j] ^ descs[0].data[j]).count();
        }
        if (distances[i] > maxDist)
        {
            maxDist = distances[i];
        }
    }

    // Avoid a division by zero when every descriptor equals the first one
    if (maxDist == 0.f)
    {
        maxDist = 1.f;
    }

    // Build a 256-entry lookup table with the JET color map
    uint8_t ids[256];
    std::iota(&ids[0], &ids[0] + 256, 0);
    cv::Mat idsMat(256, 1, CV_8UC1, ids);

    cv::Mat cmap;
    applyColorMap(idsMat, cmap, cv::COLORMAP_JET);

    for (int i = 0; i < numKeypoints; i++)
    {
        int cmapIdx = static_cast<int>(std::round((distances[i] / maxDist) * 255));

        // Scale the coordinates from the pyramid level back to the base image
        float rescale = std::pow(2, kpts[i].octave);
        float x       = kpts[i].x * rescale;
        float y       = kpts[i].y * rescale;

        circle(out, cv::Point(x, y), 3, cmap.at<cv::Vec3b>(cmapIdx, 0), -1);
    }

    return out;
}

int main(int argc, char *argv[])
{
    // OpenCV image that will be wrapped by a VPIImage.
    // Define it here so that it's destroyed *after* wrapper is destroyed
    cv::Mat cvImage;

    // VPI objects that will be used
    VPIImage imgInput     = NULL;
    VPIImage imgGrayScale = NULL;

    VPIPyramid pyrInput  = NULL;
    VPIArray keypoints   = NULL;
    VPIArray descriptors = NULL;
    VPIPayload orbPayload = NULL;
    VPIStream stream      = NULL;

    int retval = 0;

    try
    {
        // =============================
        // Parse command line parameters

        if (argc != 3)
        {
            throw std::runtime_error(std::string("Usage: ") + argv[0] + " <cpu|cuda|pva> <input image>");
        }

        std::string strBackend       = argv[1];
        std::string strInputFileName = argv[2];

        // Now parse the backend
        VPIBackend backend;

        if (strBackend == "cpu")
        {
            backend = VPI_BACKEND_CPU;
        }
        else if (strBackend == "cuda")
        {
            backend = VPI_BACKEND_CUDA;
        }
        else if (strBackend == "pva")
        {
            backend = VPI_BACKEND_PVA;
        }
        else
        {
            throw std::runtime_error("Backend '" + strBackend +
                                     "' not recognized, it must be either cpu, cuda or pva.");
        }

        // Use the selected backend with CPU to be able to read data back from CUDA to CPU for example.
        const VPIBackend backendWithCPU = static_cast<VPIBackend>(backend | VPI_BACKEND_CPU);

        // =====================
        // Load the input image

        cvImage = cv::imread(strInputFileName);
        if (cvImage.empty())
        {
            throw std::runtime_error("Can't open first image: '" + strInputFileName + "'");
        }

        // =================================
        // Allocate all VPI resources needed

        // Create the stream where processing will happen
        CHECK_STATUS(vpiStreamCreate(0, &stream));

        // Define the algorithm parameters.
        VPIORBParams orbParams;
        CHECK_STATUS(vpiInitORBParams(&orbParams));

        orbParams.fastParams.intensityThreshold = 142;
        orbParams.maxFeaturesPerLevel           = 88;
        orbParams.maxPyramidLevels              = 3;

        // We now wrap the loaded image into a VPIImage object to be used by VPI.
        // VPI won't make a copy of it, so the original image must be in scope at all times.
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImage, 0, &imgInput));
        CHECK_STATUS(vpiImageCreate(cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_U8, 0, &imgGrayScale));

        // For the output arrays capacity we can use the maximum number of features per level multiplied by the
        // maximum number of pyramid levels, this will be the de facto maximum for all levels of the input.
        int outCapacity = orbParams.maxFeaturesPerLevel * orbParams.maxPyramidLevels;

        // Create the output keypoint array.
        CHECK_STATUS(vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_PYRAMIDAL_KEYPOINT_F32, backendWithCPU, &keypoints));

        // Create the output descriptors array. To output corners only use NULL instead.
        CHECK_STATUS(vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_BRIEF_DESCRIPTOR, backendWithCPU, &descriptors));

        // For the internal buffers capacity we can use the maximum number of features per level multiplied by 20.
        // This will make FAST find a large number of corners so then ORB can select the top N corners in
        // accordance to Harris score of each corner, where N = maximum number of features per level.
        int bufCapacity = orbParams.maxFeaturesPerLevel * 20;

        // Create the payload for ORB Feature Detector algorithm
        CHECK_STATUS(vpiCreateORBFeatureDetector(backend, bufCapacity, &orbPayload));

        // ================
        // Processing stage

        // First convert input to grayscale
        CHECK_STATUS(vpiSubmitConvertImageFormat(stream, backend, imgInput, imgGrayScale, NULL));

        // Then, create the Gaussian Pyramid for the image
        CHECK_STATUS(vpiPyramidCreate(cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_U8, orbParams.maxPyramidLevels, 0.5,
                                      backend, &pyrInput));
        CHECK_STATUS(vpiSubmitGaussianPyramidGenerator(stream, backend, imgGrayScale, pyrInput, VPI_BORDER_CLAMP));

        // Then get ORB features and wait for the execution to finish
        CHECK_STATUS(vpiSubmitORBFeatureDetector(stream, backend, orbPayload, pyrInput, keypoints, descriptors,
                                                 &orbParams, VPI_BORDER_LIMITED));

        CHECK_STATUS(vpiStreamSync(stream));

        // =======================================
        // Output processing and saving it to disk

        // Lock output keypoints and descriptors to retrieve their data on cpu memory
        VPIArrayData outKeypointsData;
        VPIArrayData outDescriptorsData;
        VPIImageData imgData;
        CHECK_STATUS(vpiArrayLockData(keypoints, VPI_LOCK_READ, VPI_ARRAY_BUFFER_HOST_AOS, &outKeypointsData));
        CHECK_STATUS(vpiArrayLockData(descriptors, VPI_LOCK_READ, VPI_ARRAY_BUFFER_HOST_AOS, &outDescriptorsData));
        CHECK_STATUS(vpiImageLockData(imgGrayScale, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &imgData));

        VPIPyramidalKeypointF32 *outKeypoints = (VPIPyramidalKeypointF32 *)outKeypointsData.buffer.aos.data;
        VPIBriefDescriptor *outDescriptors    = (VPIBriefDescriptor *)outDescriptorsData.buffer.aos.data;

        cv::Mat img;
        CHECK_STATUS(vpiImageDataExportOpenCVMat(imgData, &img));

        // Draw the keypoints in the output image
        cv::Mat outImage = DrawKeypoints(img, outKeypoints, outDescriptors, *outKeypointsData.buffer.aos.sizePointer);

        // Save the output image to disk
        imwrite("orb_feature_detector_" + strBackend + ".png", outImage);

        // Done handling outputs, don't forget to unlock them.
        CHECK_STATUS(vpiImageUnlock(imgGrayScale));
        CHECK_STATUS(vpiArrayUnlock(keypoints));
        CHECK_STATUS(vpiArrayUnlock(descriptors));
    }
    catch (std::exception &e)
    {
        std::cerr << e.what() << std::endl;
        retval = 1;
    }

    // ========
    // Clean up

    // Make sure stream is synchronized before destroying the objects
    // that might still be in use.
    vpiStreamSync(stream);

    vpiImageDestroy(imgInput);
    vpiImageDestroy(imgGrayScale);
    vpiPyramidDestroy(pyrInput);
    vpiArrayDestroy(keypoints);
    vpiArrayDestroy(descriptors);
    vpiPayloadDestroy(orbPayload);
    vpiStreamDestroy(stream);

    return retval;
}
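
The comments in the C++ sample size two buffers from the ORB parameters: the output arrays and the internal buffer of the payload. With the values used in this sample, the arithmetic works out as in the sketch below. The variable names are illustrative Python stand-ins for the C++ fields, and the factor of 20 is simply the one the sample uses.

```python
max_features_per_level = 88  # orbParams.maxFeaturesPerLevel in the C++ sample
max_pyramid_levels = 3       # orbParams.maxPyramidLevels in the C++ sample

# Output arrays: at most N features can come out of each pyramid level.
out_capacity = max_features_per_level * max_pyramid_levels

# Internal buffer: room for many FAST corner candidates, from which ORB
# keeps the top N per level according to their Harris score.
buf_capacity = max_features_per_level * 20

print(out_capacity, buf_capacity)  # 264 1760
```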