VPI - Vision Programming Interface

3.1 Release

ORB feature detector

Overview

Oriented FAST and Rotated BRIEF (ORB) is a feature detection and description algorithm. It detects features across the levels of an input pyramid and computes a descriptor for each one, returning the coordinates of each feature together with its associated bitstring descriptor.

This sample application:

  1. reads an input image;
  2. creates a Gaussian pyramid from the image;
  3. runs ORB on the pyramid;
  4. draws the ORB features as key points, with colors representing the ORB descriptor;
  5. writes an output image with the colored key points.

Each feature is drawn on top of the input image as a circle whose color, on a map from blue to red, encodes the descriptor associated with the feature. The map uses the Hamming distance from each descriptor to the first descriptor, so the first key point is drawn in blue and shades of yellow to red mark descriptors that are progressively more distant.
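The color mapping described above can be sketched without VPI or OpenCV. This is a minimal illustration only: the helper names hamming_distance and colormap_indices are hypothetical, the descriptors are toy data, and the 32-byte descriptor length is an assumption made for the example.

```python
import numpy as np


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count the bits that differ between two uint8 descriptor arrays."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


def colormap_indices(descriptors: np.ndarray) -> list:
    """Map each descriptor's distance to descriptor 0 onto the range 0..255."""
    distances = [hamming_distance(d, descriptors[0]) for d in descriptors]
    max_dist = max(distances) or 1  # guard: all descriptors identical
    return [int(round(d / max_dist * 255)) for d in distances]


# Three toy descriptors of 32 bytes each (hypothetical data, not ORB output).
descs = np.zeros((3, 32), dtype=np.uint8)
descs[1, 0] = 0b00001111  # 4 bits differ from descriptor 0
descs[2, :] = 0xFF        # all 256 bits differ from descriptor 0

print(colormap_indices(descs))  # [0, 4, 255]
```

Index 0 corresponds to the blue end of the color map and index 255 to the red end, so the first descriptor always lands on blue.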

Instructions

The command line parameters are:

<backend> <input image>

where

  • backend: either cpu or cuda; it defines the backend that will perform the processing.
  • input image: file name of the image to be used as source; png, jpeg and other formats are accepted.

Here's one example:

  • C++
    ./vpi_sample_18_orb_feature_detector cuda ../assets/kodim08.png
  • Python
    python3 main.py cuda ../assets/kodim08.png

This example uses the CUDA backend and one of the provided sample images. You can try other images, as long as they respect the constraints imposed by the algorithm.

Results

[Figures: Input image; Matched features between input image and flipped image]

Source Code

For convenience, here's the code that is also installed in the samples directory.

Python

import sys
import vpi
import numpy as np
from PIL import Image, ImageOps
from argparse import ArgumentParser
import cv2


# Parse command line arguments
parser = ArgumentParser()
parser.add_argument('backend', choices=['cpu','cuda'],
                    help='Backend to be used for processing')

parser.add_argument('s', metavar='filename',
                    help='Image to be used as source')

args = parser.parse_args()

if args.backend == 'cpu':
    backend = vpi.Backend.CPU
elif args.backend == 'cuda':
    backend = vpi.Backend.CUDA
else:
    sys.exit("Un-supported backend")

# Load input image into a vpi.Image
try:
    srcData = np.asarray(ImageOps.grayscale(Image.open(args.s)))
except IOError:
    sys.exit("Source file not found")
except Exception:
    sys.exit("Error with source file")

src = vpi.asimage(srcData)

# Using the chosen backend to build the input pyramid and run ORB
with backend:
    pyr = src.gaussian_pyramid(3)
    corners, descriptors = pyr.orb(intensity_threshold=142, max_features_per_level=88, max_pyr_levels=3)

# Draw the keypoints in the output image

out = src.convert(vpi.Format.BGR8, backend=vpi.Backend.CUDA)

if corners.size > 0:
    # Hamming distance of every descriptor to the first one
    distances = []
    with descriptors.rlock_cpu() as descriptors_data:
        first_desc = descriptors_data[0][0]
        for i in range(descriptors.size):
            curr_desc = descriptors_data[i][0]
            hamm_dist = sum([bin(c ^ f).count('1') for c, f in zip(curr_desc, first_desc)])
            distances.append(hamm_dist)

    # Avoid a division by zero when every descriptor equals the first one
    max_dist = max(distances) or 1

    cmap = cv2.applyColorMap(np.arange(0, 256, dtype=np.uint8), cv2.COLORMAP_JET)
    cmap_idx = lambda i: int(round((distances[i] / max_dist) * 255))

    with out.lock_cpu() as out_data, corners.rlock_cpu() as corners_data:
        for i in range(corners.size):
            color = tuple([int(x) for x in cmap[cmap_idx(i), 0]])
            kpt = tuple(corners_data[i].astype(np.int16))
            # Rescale from the keypoint's pyramid level (octave) to the base image
            x = int(kpt[0] * (2 ** kpt[2]))
            y = int(kpt[1] * (2 ** kpt[2]))
            cv2.circle(out_data, (x, y), 3, color, -1)

# Save the output image to disk
cv2.imwrite('orb_feature_python'+str(sys.version_info[0])+'_'+args.backend+'.png', out.cpu())
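The drawing step multiplies each keypoint coordinate by 2 raised to its pyramid level, which maps a position found on a downscaled level back onto the base image. The sketch below isolates that arithmetic; to_base_coords is a hypothetical helper, and the per-level scale of 0.5 is an assumption matching the factor of 2 used in the sample.

```python
def to_base_coords(x: float, y: float, octave: int, scale: float = 0.5):
    """Map a keypoint from pyramid level `octave` to base-image coordinates.

    `scale` is the size ratio between consecutive pyramid levels
    (assumed to be 0.5, i.e. each level halves width and height).
    """
    factor = (1.0 / scale) ** octave
    return x * factor, y * factor


# A keypoint at (10, 20) on level 2 lies at (40, 80) in the base image.
print(to_base_coords(10, 20, 2))  # (40.0, 80.0)
```

Keypoints on level 0 are returned unchanged, since the factor is 1 there.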
C++

#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <vpi/OpenCVInterop.hpp>

#include <vpi/Array.h>
#include <vpi/Image.h>
#include <vpi/Pyramid.h>
#include <vpi/Status.h>
#include <vpi/Stream.h>
#include <vpi/algo/ConvertImageFormat.h>
#include <vpi/algo/GaussianPyramid.h>
#include <vpi/algo/ImageFlip.h>
#include <vpi/algo/ORB.h>

#include <bitset>
#include <cmath>
#include <cstdio>
#include <cstring>
#include <iostream>
#include <numeric>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

#define CHECK_STATUS(STMT)                                    \
    do                                                        \
    {                                                         \
        VPIStatus status = (STMT);                            \
        if (status != VPI_SUCCESS)                            \
        {                                                     \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
            std::ostringstream ss;                            \
            ss << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());               \
        }                                                     \
    } while (0);

static cv::Mat DrawKeypoints(cv::Mat img, VPIPyramidalKeypointF32 *kpts, VPIBriefDescriptor *descs, int numKeypoints)
{
    cv::Mat out;
    img.convertTo(out, CV_8UC1);
    cvtColor(out, out, cv::COLOR_GRAY2BGR);

    if (numKeypoints == 0)
    {
        return out;
    }

    // Hamming distance of every descriptor to the first one
    std::vector<int> distances(numKeypoints, 0);
    float maxDist = 0.f;

    for (int i = 0; i < numKeypoints; i++)
    {
        for (int j = 0; j < VPI_BRIEF_DESCRIPTOR_ARRAY_LENGTH; j++)
        {
            distances[i] += std::bitset<8 * sizeof(uint8_t)>(descs[i].data[j] ^ descs[0].data[j]).count();
        }
        if (distances[i] > maxDist)
        {
            maxDist = distances[i];
        }
    }

    // Build a 256-entry color lookup table
    uint8_t ids[256];
    std::iota(&ids[0], &ids[0] + 256, 0);
    cv::Mat idsMat(256, 1, CV_8UC1, ids);

    cv::Mat cmap;
    applyColorMap(idsMat, cmap, cv::COLORMAP_JET);

    for (int i = 0; i < numKeypoints; i++)
    {
        // Guard against a division by zero when every descriptor equals the first one
        int cmapIdx = maxDist > 0.f ? static_cast<int>(std::round((distances[i] / maxDist) * 255)) : 0;

        // Rescale from the keypoint's pyramid level (octave) to the base image
        float rescale = std::pow(2, kpts[i].octave);
        float x       = kpts[i].x * rescale;
        float y       = kpts[i].y * rescale;

        circle(out, cv::Point(x, y), 3, cmap.at<cv::Vec3b>(cmapIdx, 0), -1);
    }

    return out;
}

int main(int argc, char *argv[])
{
    // OpenCV image that will be wrapped by a VPIImage.
    // Define it here so that it's destroyed *after* wrapper is destroyed
    cv::Mat cvImage;

    // VPI objects that will be used
    VPIImage imgInput     = NULL;
    VPIImage imgGrayScale = NULL;

    VPIPyramid pyrInput   = NULL;
    VPIArray keypoints    = NULL;
    VPIArray descriptors  = NULL;
    VPIPayload orbPayload = NULL;
    VPIStream stream      = NULL;

    int retval = 0;

    try
    {
        // =============================
        // Parse command line parameters

        if (argc != 3)
        {
            throw std::runtime_error(std::string("Usage: ") + argv[0] + " <cpu|cuda> <input image>");
        }

        std::string strBackend       = argv[1];
        std::string strInputFileName = argv[2];

        // Now parse the backend
        VPIBackend backend;

        if (strBackend == "cpu")
        {
            backend = VPI_BACKEND_CPU;
        }
        else if (strBackend == "cuda")
        {
            backend = VPI_BACKEND_CUDA;
        }
        else
        {
            throw std::runtime_error("Backend '" + strBackend + "' not recognized, it must be either cpu or cuda.");
        }

        // Use the selected backend with CPU to be able to read data back from CUDA to CPU for example.
        const VPIBackend backendWithCPU = static_cast<VPIBackend>(backend | VPI_BACKEND_CPU);

        // =====================
        // Load the input image

        cvImage = cv::imread(strInputFileName);
        if (cvImage.empty())
        {
            throw std::runtime_error("Can't open first image: '" + strInputFileName + "'");
        }

        // =================================
        // Allocate all VPI resources needed

        // Create the stream where processing will happen
        CHECK_STATUS(vpiStreamCreate(0, &stream));

        // Define the algorithm parameters.
        VPIORBParams orbParams;
        CHECK_STATUS(vpiInitORBParams(&orbParams));

        orbParams.fastParams.intensityThreshold = 142;
        orbParams.maxFeaturesPerLevel           = 88;
        orbParams.maxPyramidLevels              = 3;

        // We now wrap the loaded image into a VPIImage object to be used by VPI.
        // VPI won't make a copy of it, so the original image must be in scope at all times.
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImage, 0, &imgInput));
        CHECK_STATUS(vpiImageCreate(cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_U8, 0, &imgGrayScale));

        // For the output arrays capacity we can use the maximum number of features per level multiplied by the
        // maximum number of pyramid levels, this will be the de facto maximum for all levels of the input.
        int outCapacity = orbParams.maxFeaturesPerLevel * orbParams.maxPyramidLevels;

        // Create the output keypoint array.
        CHECK_STATUS(vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_PYRAMIDAL_KEYPOINT_F32, backendWithCPU, &keypoints));

        // Create the output descriptors array. To output corners only use NULL instead.
        CHECK_STATUS(vpiArrayCreate(outCapacity, VPI_ARRAY_TYPE_BRIEF_DESCRIPTOR, backendWithCPU, &descriptors));

        // For the internal buffers capacity we can use the maximum number of features per level multiplied by 20.
        // This will make FAST find a large number of corners so then ORB can select the top N corners in
        // accordance to Harris score of each corner, where N = maximum number of features per level.
        int bufCapacity = orbParams.maxFeaturesPerLevel * 20;

        // Create the payload for ORB Feature Detector algorithm
        CHECK_STATUS(vpiCreateORBFeatureDetector(backend, bufCapacity, &orbPayload));

        // ================
        // Processing stage

        // First convert input to grayscale
        CHECK_STATUS(vpiSubmitConvertImageFormat(stream, backend, imgInput, imgGrayScale, NULL));

        // Then, create the Gaussian pyramid for the image
        CHECK_STATUS(vpiPyramidCreate(cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_U8, orbParams.maxPyramidLevels, 0.5,
                                      backend, &pyrInput));
        CHECK_STATUS(vpiSubmitGaussianPyramidGenerator(stream, backend, imgGrayScale, pyrInput, VPI_BORDER_CLAMP));

        // Then get ORB features and wait for the execution to finish
        CHECK_STATUS(vpiSubmitORBFeatureDetector(stream, backend, orbPayload, pyrInput, keypoints, descriptors,
                                                 &orbParams, VPI_BORDER_LIMITED));

        CHECK_STATUS(vpiStreamSync(stream));

        // =======================================
        // Output processing and saving it to disk

        // Lock output keypoints and descriptors to retrieve their data in CPU memory
        VPIArrayData outKeypointsData;
        VPIArrayData outDescriptorsData;
        VPIImageData imgData;
        CHECK_STATUS(vpiArrayLockData(keypoints, VPI_LOCK_READ, VPI_ARRAY_BUFFER_HOST_AOS, &outKeypointsData));
        CHECK_STATUS(vpiArrayLockData(descriptors, VPI_LOCK_READ, VPI_ARRAY_BUFFER_HOST_AOS, &outDescriptorsData));
        CHECK_STATUS(vpiImageLockData(imgGrayScale, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &imgData));

        VPIPyramidalKeypointF32 *outKeypoints = (VPIPyramidalKeypointF32 *)outKeypointsData.buffer.aos.data;
        VPIBriefDescriptor *outDescriptors    = (VPIBriefDescriptor *)outDescriptorsData.buffer.aos.data;

        cv::Mat img;
        CHECK_STATUS(vpiImageDataExportOpenCVMat(imgData, &img));

        // Draw the keypoints in the output image
        cv::Mat outImage = DrawKeypoints(img, outKeypoints, outDescriptors, *outKeypointsData.buffer.aos.sizePointer);

        // Save the output image to disk
        imwrite("orb_feature_detector_" + strBackend + ".png", outImage);

        // Done handling outputs, don't forget to unlock them.
        CHECK_STATUS(vpiImageUnlock(imgGrayScale));
        CHECK_STATUS(vpiArrayUnlock(keypoints));
        CHECK_STATUS(vpiArrayUnlock(descriptors));
    }
    catch (std::exception &e)
    {
        std::cerr << e.what() << std::endl;
        retval = 1;
    }

    // ========
    // Clean up

    // Make sure stream is synchronized before destroying the objects
    // that might still be in use.
    vpiStreamSync(stream);

    vpiImageDestroy(imgInput);
    vpiImageDestroy(imgGrayScale);
    vpiPyramidDestroy(pyrInput);
    vpiArrayDestroy(keypoints);
    vpiArrayDestroy(descriptors);
    vpiPayloadDestroy(orbPayload);
    vpiStreamDestroy(stream);

    return retval;
}
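The two capacities computed in the C++ listing follow directly from the ORB parameters. As a rough worked example, the sketch below reproduces that arithmetic in Python; orb_capacities is a hypothetical helper, and the multiplier of 20 is simply the value the sample uses, not a required constant.

```python
def orb_capacities(max_features_per_level: int, max_pyr_levels: int, fast_multiplier: int = 20):
    """Return (output array capacity, internal buffer capacity) as sized in the sample."""
    # Output arrays must hold up to N features for every pyramid level.
    out_capacity = max_features_per_level * max_pyr_levels
    # Internal buffers are oversized so FAST can find many candidate corners
    # before the top N per level are selected by Harris score.
    buf_capacity = max_features_per_level * fast_multiplier
    return out_capacity, buf_capacity


# Values used by the sample: 88 features per level, 3 pyramid levels.
print(orb_capacities(88, 3))  # (264, 1760)
```

With the sample's settings, the keypoint and descriptor arrays hold at most 264 entries each, while the payload is created with room for 1760 candidate corners.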