VPI - Vision Programming Interface

3.0 Release

ORB feature detector

Overview

Oriented FAST and Rotated BRIEF (ORB) is a feature detection and description algorithm. It detects features across the levels of an input pyramid and computes a descriptor for each one, returning each feature's coordinates together with its associated bitstring descriptor. This sample application: (1) reads an input image; (2) creates a Gaussian pyramid from the image; (3) runs ORB on the pyramid; (4) draws the ORB features as key points, with colors representing the ORB descriptors; and (5) writes an output image with the colored key points. Each feature is drawn on top of the input image as a circle whose color, ranging from blue to red, encodes the descriptor associated with the feature. The color is chosen from the Hamming distance between each descriptor and the first descriptor, so the key point of the first descriptor is blue, and shades of yellow to red mark descriptors that are progressively more distant from it.
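
The coloring rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the sample's own code: the helper names `hamming_to_first` and `colormap_indices` are hypothetical, and the descriptors are assumed to arrive as an (N, B) array of packed `uint8` bytes (two bytes per descriptor here, only to keep the example small).

```python
import numpy as np


def hamming_to_first(descriptors: np.ndarray) -> np.ndarray:
    """Hamming distance from every descriptor to descriptor 0.

    descriptors: (N, B) uint8 array, one packed bitstring per row (assumed layout).
    """
    # XOR leaves a 1 bit wherever two bitstrings differ.
    xor = np.bitwise_xor(descriptors, descriptors[0])
    # Unpack each byte into 8 bits and count the set bits per row.
    return np.unpackbits(xor, axis=1).sum(axis=1)


def colormap_indices(distances: np.ndarray) -> np.ndarray:
    """Scale distances to 0..255 so they can index a 256-entry colormap."""
    max_dist = distances.max()
    if max_dist == 0:
        # All descriptors are identical: everything maps to the first color.
        return np.zeros(len(distances), dtype=np.uint8)
    return np.rint(distances / max_dist * 255).astype(np.uint8)


descs = np.array([[0b00000000, 0b00000000],
                  [0b00000001, 0b00000000],
                  [0b11111111, 0b00001111]], dtype=np.uint8)
dist = hamming_to_first(descs)
idx = colormap_indices(dist)
print(dist.tolist())  # [0, 1, 12]
print(idx.tolist())   # [0, 21, 255]
```

Index 0 corresponds to the blue end of the colormap and index 255 to the red end, so the first descriptor always lands on blue and the most distant one on red.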

Instructions

The command line parameters are:

<backend> <input image>

where

  • backend: either cpu or cuda; it defines the backend that will perform the processing.
  • input image: name of the image file to be used as the source image; png, jpeg and other formats are accepted.

Here's one example:

  • C++
    ./vpi_sample_18_orb_feature_detector cuda ../assets/kodim08.png
  • Python
    python3 main.py cuda ../assets/kodim08.png

This uses the CUDA backend and one of the provided sample images. You can try other images, as long as they respect the constraints imposed by the algorithm.

Results

[Figures: input image; matched features between input image and flipped image]

Source Code

For convenience, here's the code that is also installed in the samples directory.

Python:

import sys
import vpi
import numpy as np
from PIL import Image, ImageOps
from argparse import ArgumentParser
import cv2


# Parse command line arguments
parser = ArgumentParser()
parser.add_argument('backend', choices=['cpu','cuda'],
                    help='Backend to be used for processing')

parser.add_argument('s', metavar='filename',
                    help='Image to be used as source')

args = parser.parse_args()

if args.backend == 'cpu':
    backend = vpi.Backend.CPU
elif args.backend == 'cuda':
    backend = vpi.Backend.CUDA
else:
    sys.exit("Un-supported backend")

# Load input image into a vpi.Image
try:
    srcData = np.asarray(ImageOps.grayscale(Image.open(args.s)))
except IOError:
    sys.exit("Source file not found")
except Exception:
    sys.exit("Error with source file")

src = vpi.asimage(srcData)

# Using the chosen backend to build the input pyramid and run ORB
with backend:
    pyr = src.gaussian_pyramid(3)
    corners, descriptors = pyr.orb(intensity_threshold=142, max_features_per_level=88, max_pyr_levels=3)

# Draw the keypoints in the output image

out = src.convert(vpi.Format.BGR8, backend=vpi.Backend.CUDA)

if corners.size > 0:
    # Hamming distance of every descriptor to the first descriptor
    distances = []
    with descriptors.rlock_cpu() as descriptors_data:
        first_desc = descriptors_data[0][0]
        for i in range(descriptors.size):
            curr_desc = descriptors_data[i][0]
            hamm_dist = sum([bin(c ^ f).count('1') for c, f in zip(curr_desc, first_desc)])
            distances.append(hamm_dist)

    # Avoid a division by zero when all descriptors are identical
    max_dist = max(max(distances), 1)

    # Map each distance to one of the 256 colors of the JET colormap
    cmap = cv2.applyColorMap(np.arange(0, 256, dtype=np.uint8), cv2.COLORMAP_JET)
    cmap_idx = lambda i: int(round((distances[i] / max_dist) * 255))

    with out.lock_cpu() as out_data, corners.rlock_cpu() as corners_data:
        for i in range(corners.size):
            color = tuple([int(x) for x in cmap[cmap_idx(i), 0]])
            kpt = tuple(corners_data[i].astype(np.int16))
            cv2.circle(out_data, kpt, 3, color, -1)

# Save the output image to disk
cv2.imwrite('orb_feature_python'+str(sys.version_info[0])+'_'+args.backend+'.png', out.cpu())

C++:

#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <vpi/OpenCVInterop.hpp>

#include <vpi/Array.h>
#include <vpi/Image.h>
#include <vpi/Pyramid.h>
#include <vpi/Status.h>
#include <vpi/Stream.h>
#include <vpi/algo/ConvertImageFormat.h>
#include <vpi/algo/GaussianPyramid.h>
#include <vpi/algo/ImageFlip.h>
#include <vpi/algo/ORB.h>

#include <bitset>
#include <cmath>
#include <cstdio>
#include <cstring>
#include <iostream>
#include <numeric>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

#define CHECK_STATUS(STMT)                                    \
    do                                                        \
    {                                                         \
        VPIStatus status = (STMT);                            \
        if (status != VPI_SUCCESS)                            \
        {                                                     \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
            std::ostringstream ss;                            \
            ss << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());               \
        }                                                     \
    } while (0);

static cv::Mat DrawKeypoints(cv::Mat img, VPIKeypointF32 *kpts, VPIBriefDescriptor *descs, int numKeypoints)
{
    cv::Mat out;
    img.convertTo(out, CV_8UC1);
    cvtColor(out, out, cv::COLOR_GRAY2BGR);

    if (numKeypoints == 0)
    {
        return out;
    }

    // Hamming distance of every descriptor to the first descriptor
    std::vector<int> distances(numKeypoints, 0);
    float maxDist = 0.f;

    for (int i = 0; i < numKeypoints; i++)
    {
        for (int j = 0; j < VPI_BRIEF_DESCRIPTOR_ARRAY_LENGTH; j++)
        {
            distances[i] += std::bitset<8 * sizeof(uint8_t)>(descs[i].data[j] ^ descs[0].data[j]).count();
        }
        if (distances[i] > maxDist)
        {
            maxDist = distances[i];
        }
    }

    // Avoid a division by zero when all descriptors are identical
    if (maxDist == 0.f)
    {
        maxDist = 1.f;
    }

    // Build a 256-entry JET colormap lookup table
    uint8_t ids[256];
    std::iota(&ids[0], &ids[0] + 256, 0);
    cv::Mat idsMat(256, 1, CV_8UC1, ids);

    cv::Mat cmap;
    applyColorMap(idsMat, cmap, cv::COLORMAP_JET);

    for (int i = 0; i < numKeypoints; i++)
    {
        int cmapIdx = static_cast<int>(std::round((distances[i] / maxDist) * 255));

        circle(out, cv::Point(kpts[i].x, kpts[i].y), 3, cmap.at<cv::Vec3b>(cmapIdx, 0), -1);
    }

    return out;
}

int main(int argc, char *argv[])
{
    // OpenCV image that will be wrapped by a VPIImage.
    // Define it here so that it's destroyed *after* wrapper is destroyed
    cv::Mat cvImage;

    // VPI objects that will be used
    VPIImage imgInput     = NULL;
    VPIImage imgGrayScale = NULL;

    VPIPyramid pyrInput   = NULL;
    VPIArray keypoints    = NULL;
    VPIArray descriptors  = NULL;
    VPIPayload orbPayload = NULL;
    VPIStream stream      = NULL;

    int retval = 0;

    try
    {
        // =============================
        // Parse command line parameters

        if (argc != 3)
        {
            throw std::runtime_error(std::string("Usage: ") + argv[0] + " <cpu|cuda> <input image>");
        }

        std::string strBackend       = argv[1];
        std::string strInputFileName = argv[2];

        // Now parse the backend
        VPIBackend backend;

        if (strBackend == "cpu")
        {
            backend = VPI_BACKEND_CPU;
        }
        else if (strBackend == "cuda")
        {
            backend = VPI_BACKEND_CUDA;
        }
        else
        {
            throw std::runtime_error("Backend '" + strBackend + "' not recognized, it must be either cpu or cuda.");
        }

        // Use the selected backend with CPU to be able to read data back from CUDA to CPU for example.
        const VPIBackend backendWithCPU = static_cast<VPIBackend>(backend | VPI_BACKEND_CPU);

        // =====================
        // Load the input image

        cvImage = cv::imread(strInputFileName);
        if (cvImage.empty())
        {
            throw std::runtime_error("Can't open first image: '" + strInputFileName + "'");
        }

        // =================================
        // Allocate all VPI resources needed

        // Create the stream where processing will happen
        CHECK_STATUS(vpiStreamCreate(0, &stream));

        // Define the algorithm parameters.
        VPIORBParams orbParams;
        CHECK_STATUS(vpiInitORBParams(&orbParams));

        orbParams.fastParams.intensityThreshold = 142;
        orbParams.maxFeaturesPerLevel           = 88;
        orbParams.maxPyramidLevels              = 3;

        // We now wrap the loaded image into a VPIImage object to be used by VPI.
        // VPI won't make a copy of it, so the original image must be in scope at all times.
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImage, 0, &imgInput));
        CHECK_STATUS(vpiImageCreate(cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_U8, 0, &imgGrayScale));

        // For the output arrays capacity we can use the maximum number of features per level multiplied by the
        // maximum number of pyramid levels, this will be the de facto maximum for all levels of the input.
        int outCapacity = orbParams.maxFeaturesPerLevel * orbParams.maxPyramidLevels;

        // Create the output keypoint array.
        CHECK_STATUS(vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_KEYPOINT_F32, backendWithCPU, &keypoints));

        // Create the output descriptors array. To output corners only use NULL instead.
        CHECK_STATUS(vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_BRIEF_DESCRIPTOR, backendWithCPU, &descriptors));

        // For the internal buffers capacity we can use the maximum number of features per level multiplied by 20.
        // This will make FAST find a large number of corners so then ORB can select the top N corners in
        // accordance to Harris score of each corner, where N = maximum number of features per level.
        int bufCapacity = orbParams.maxFeaturesPerLevel * 20;

        // Create the payload for ORB Feature Detector algorithm
        CHECK_STATUS(vpiCreateORBFeatureDetector(backend, bufCapacity, &orbPayload));

        // ================
        // Processing stage

        // First convert input to grayscale
        CHECK_STATUS(vpiSubmitConvertImageFormat(stream, backend, imgInput, imgGrayScale, NULL));

        // Then, create the Gaussian Pyramid for the image
        CHECK_STATUS(vpiPyramidCreate(cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_U8, orbParams.maxPyramidLevels, 0.5,
                                      backend, &pyrInput));
        CHECK_STATUS(vpiSubmitGaussianPyramidGenerator(stream, backend, imgGrayScale, pyrInput, VPI_BORDER_CLAMP));

        // Then get ORB features and wait for the execution to finish
        CHECK_STATUS(vpiSubmitORBFeatureDetector(stream, backend, orbPayload, pyrInput, keypoints, descriptors,
                                                 &orbParams, VPI_BORDER_LIMITED));

        CHECK_STATUS(vpiStreamSync(stream));

        // =======================================
        // Output processing and saving it to disk

        // Lock output keypoints and descriptors to retrieve their data on cpu memory
        VPIArrayData outKeypointsData;
        VPIArrayData outDescriptorsData;
        VPIImageData imgData;
        CHECK_STATUS(vpiArrayLockData(keypoints, VPI_LOCK_READ, VPI_ARRAY_BUFFER_HOST_AOS, &outKeypointsData));
        CHECK_STATUS(vpiArrayLockData(descriptors, VPI_LOCK_READ, VPI_ARRAY_BUFFER_HOST_AOS, &outDescriptorsData));
        CHECK_STATUS(vpiImageLockData(imgGrayScale, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &imgData));

        VPIKeypointF32 *outKeypoints       = (VPIKeypointF32 *)outKeypointsData.buffer.aos.data;
        VPIBriefDescriptor *outDescriptors = (VPIBriefDescriptor *)outDescriptorsData.buffer.aos.data;

        cv::Mat img;
        CHECK_STATUS(vpiImageDataExportOpenCVMat(imgData, &img));

        // Draw the keypoints in the output image
        cv::Mat outImage = DrawKeypoints(img, outKeypoints, outDescriptors, *outKeypointsData.buffer.aos.sizePointer);

        // Save the output image to disk
        imwrite("orb_feature_detector_" + strBackend + ".png", outImage);

        // Done handling outputs, don't forget to unlock them.
        CHECK_STATUS(vpiImageUnlock(imgGrayScale));
        CHECK_STATUS(vpiArrayUnlock(keypoints));
        CHECK_STATUS(vpiArrayUnlock(descriptors));
    }
    catch (std::exception &e)
    {
        std::cerr << e.what() << std::endl;
        retval = 1;
    }

    // ========
    // Clean up

    // Make sure stream is synchronized before destroying the objects
    // that might still be in use.
    vpiStreamSync(stream);

    vpiImageDestroy(imgInput);
    vpiImageDestroy(imgGrayScale);
    vpiPyramidDestroy(pyrInput);
    vpiArrayDestroy(keypoints);
    vpiArrayDestroy(descriptors);
    vpiPayloadDestroy(orbPayload);
    vpiStreamDestroy(stream);

    return retval;
}
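
The capacity comments in the C++ listing above reduce to two multiplications. The following is a minimal Python sketch of that arithmetic using the sample's values; `orb_capacities` and `buffer_factor` are hypothetical names introduced only for illustration and are not part of the VPI API.

```python
def orb_capacities(max_features_per_level: int,
                   max_pyramid_levels: int,
                   buffer_factor: int = 20) -> tuple:
    """Return (output array capacity, internal buffer capacity).

    buffer_factor=20 mirrors the multiplier used in the C++ sample; it is a
    choice made by that sample, not a fixed requirement assumed here.
    """
    # Upper bound on the features returned across all pyramid levels.
    out_capacity = max_features_per_level * max_pyramid_levels
    # Room for FAST to collect many candidate corners before ORB keeps
    # the top N per level by Harris score.
    buf_capacity = max_features_per_level * buffer_factor
    return out_capacity, buf_capacity


out_cap, buf_cap = orb_capacities(max_features_per_level=88, max_pyramid_levels=3)
print(out_cap, buf_cap)  # 264 1760
```

With the sample's settings, the output keypoint and descriptor arrays each hold up to 264 entries, while the payload's internal buffer holds 1760 candidate corners.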