VPI - Vision Programming Interface
0.4.4 Release
Overview
This application tracks bounding boxes on an input video, draws the tracked boxes on each frame, and saves the frames to disk. The user can choose which backend is used for processing.
Note: The output is in grayscale because the algorithm currently supports only grayscale images.
This sample shows the following:
- Creating and destroying a VPI stream.
- Wrapping an image hosted on CPU (input frame) to be used by VPI.
- Wrapping an array hosted on CPU (input bounding boxes) to be used by VPI.
- Creating a VPI-managed 2D image to which the output will be written.
- Using the multi-frame KLT Bounding Box Tracker algorithm.
- Simple stream synchronization.
- Array locking to access its contents from CPU side.
- Error handling.
- Environment cleanup using a user-defined context.
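The last two items, error handling and environment cleanup, can be illustrated without VPI. The sketch below turns a failed status code into an exception and uses a scope guard to make sure cleanup still runs. Everything in it (`Status`, `statusName`, `checkStatus`, `ScopeGuard`) is a hypothetical stand-in, not part of the VPI API. The sample itself checks VPI calls with a macro and cleans up by destroying its context, so the scope guard is a swapped-in technique rather than the sample's method.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical status codes standing in for a C API's return values.
enum class Status { Success, InvalidArgument };

// Hypothetical helper that maps a status code to a readable name.
inline const char *statusName(Status s)
{
    switch (s)
    {
    case Status::Success: return "SUCCESS";
    case Status::InvalidArgument: return "INVALID_ARGUMENT";
    }
    return "UNKNOWN";
}

// Throws std::runtime_error when a call reports anything but success.
inline void checkStatus(Status s, const std::string &what)
{
    if (s != Status::Success)
    {
        std::ostringstream ss;
        ss << statusName(s) << ": " << what;
        throw std::runtime_error(ss.str());
    }
}

// Runs a cleanup action when it goes out of scope, including during
// stack unwinding after an exception.
class ScopeGuard
{
public:
    explicit ScopeGuard(std::function<void()> fn) : fn_(std::move(fn)) {}
    ~ScopeGuard() { if (fn_) fn_(); }
    ScopeGuard(const ScopeGuard &) = delete;
    ScopeGuard &operator=(const ScopeGuard &) = delete;

private:
    std::function<void()> fn_;
};

// Returns true if the cleanup action ran even though a step failed.
inline bool cleanupRunsOnFailure()
{
    bool cleaned = false;
    try
    {
        ScopeGuard guard([&cleaned] { cleaned = true; });
        checkStatus(Status::InvalidArgument, "bad input");
    }
    catch (const std::runtime_error &)
    {
        // The guard's destructor has already run by the time we get here.
    }
    return cleaned;
}
```

With this pattern every fallible call is wrapped in one place, and resources are released on both the success and the failure path.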
Instructions
The usage is:
./vpi_sample_06_klt_tracker <backend> <input video> <input bboxes> <output frames>
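where
- backend: either cpu, cuda or pva; it defines the backend that will perform the processing.
- input video: name of the input video file.
- input bboxes: text file with the bounding boxes to track, each given as "frame x y w h".
- output frames: file name used for the output frames; the frame number is inserted before the extension.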
Here's one example:
./vpi_sample_06_klt_tracker cuda ../assets/dashcam.mp4 ../assets/dashcam_bboxes.txt frame.png
This is using the CUDA backend and one of the provided sample videos and bounding boxes.
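The listing further down reads the bounding box file as whitespace-separated `frame x y w h` records, limits the total to 64 boxes, and records how many boxes are known at each frame. The sketch below reproduces that bookkeeping with standard C++ only. `Box`, `BoxTable`, `parseBoxes`, and `activeBoxes` are hypothetical names, and the plain `Box` record stands in for the VPI structures the sample fills in. Unlike the listing, the lookup returns 0 when no box has been introduced yet, instead of decrementing the iterator unconditionally.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical plain box record; the sample stores VPI structures instead.
struct Box
{
    int x, y, w, h;
};

struct BoxTable
{
    std::vector<Box> boxes;                  // all boxes, in file order
    std::map<int, std::size_t> countAtFrame; // frame -> boxes known so far
};

// Reads whitespace-separated "frame x y w h" records from a stream.
inline BoxTable parseBoxes(std::istream &in, std::size_t maxBoxes = 64)
{
    BoxTable table;
    int frame, x, y, w, h;
    while (in >> frame >> x >> y >> w >> h)
    {
        if (table.boxes.size() == maxBoxes)
        {
            throw std::runtime_error("Too many bounding boxes");
        }
        table.boxes.push_back({x, y, w, h});
        table.countAtFrame[frame] = table.boxes.size();
    }
    // The loop ends on failure; anything other than end of input is an error.
    if (!in && !in.eof())
    {
        throw std::runtime_error("Can't parse bounding boxes, stopped at bbox #" +
                                 std::to_string(table.boxes.size()));
    }
    return table;
}

// Number of boxes active at `frame`: the count stored for the last frame
// that is <= `frame`, or 0 if no box has been introduced yet.
inline std::size_t activeBoxes(const BoxTable &table, int frame)
{
    auto it = table.countAtFrame.upper_bound(frame);
    if (it == table.countAtFrame.begin())
    {
        return 0;
    }
    return std::prev(it)->second;
}
```

For example, with two boxes declared at frame 0 and one at frame 5, `activeBoxes` reports 2 for frames 0 to 4 and 3 from frame 5 onward. As in the listing, this only works if the records are sorted by frame.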
Results
[Images: output frame 0445 and output frame 0465]
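Each output frame is saved under a name derived from the `<output frames>` argument: the listing formats it with `"%s_%04d%s"`, placing a zero-padded frame number before the extension. A minimal sketch of that naming scheme follows. `frameFileName` is a hypothetical helper, and treating a name without a `.` as having no extension is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Inserts a zero-padded frame number before the file extension,
// so "frame.png" with frame 445 becomes "frame_0445.png".
inline std::string frameFileName(const std::string &pattern, int frame)
{
    const std::string::size_type dot = pattern.rfind('.');

    // Assumption: a name without '.' is treated as having no extension.
    const std::string stem = pattern.substr(0, dot);
    const std::string ext  = (dot == std::string::npos) ? "" : pattern.substr(dot);

    char number[16];
    std::snprintf(number, sizeof(number), "_%04d", frame);
    return stem + number + ext;
}
```

Note that, like the listing, this looks for the last `.` in the whole string, so a dot in a directory name with no dot in the file name itself is not handled.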
Source code
For convenience, here's the code that is also installed in the samples directory.
#include <opencv2/core/version.hpp>
#if CV_MAJOR_VERSION >= 3
#    include <opencv2/imgcodecs.hpp>
#    include <opencv2/videoio.hpp>
#else
#    include <opencv2/highgui/highgui.hpp>
#endif
#include <opencv2/imgproc/imgproc.hpp>
#define CHECK_STATUS(STMT)                                    \
    do                                                        \
    {                                                         \
        VPIStatus status = (STMT);                            \
        if (status != VPI_SUCCESS)                            \
        {                                                     \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
            std::ostringstream ss;                            \
            ss << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());               \
        }                                                     \
    } while (0);
memset(&imgData, 0, sizeof(imgData));
throw std::runtime_error("Frame type not supported");
switch (imgdata.type)
throw std::runtime_error("Image type not supported");
if (cvimg.type() == CV_16U)
cvimg.convertTo(out, CV_8U);
cvtColor(cvimg, out, cv::COLOR_GRAY2BGR);
for (size_t i = 0; i < boxdata.size; ++i)
if (pboxes[i].trackingStatus == 1)
x = pboxes[i].bbox.xform.mat3[0][2] + ppreds[i].mat3[0][2];
y = pboxes[i].bbox.xform.mat3[1][2] + ppreds[i].mat3[1][2];
w = pboxes[i].bbox.width * pboxes[i].bbox.xform.mat3[0][0] * ppreds[i].mat3[0][0];
h = pboxes[i].bbox.height * pboxes[i].bbox.xform.mat3[1][1] * ppreds[i].mat3[1][1];
rectangle(out, cv::Rect(x, y, w, h), cv::Scalar(rand() % 256, rand() % 256, rand() % 256), 2);
std::string fname = filename;
int ext = fname.rfind('.');
char buffer[512] = {};
snprintf(buffer, sizeof(buffer) - 1, "%s_%04d%s", fname.substr(0, ext).c_str(), frame, fname.substr(ext).c_str());
if (!imwrite(buffer, out, {cv::IMWRITE_JPEG_QUALITY, 70}))
throw std::runtime_error("Can't write to " + std::string(buffer));
int main(int argc, char *argv[])
throw std::runtime_error(std::string("Usage: ") + argv[0] + " <cpu|pva|cuda> <input_video> <bbox descr> <output>");
std::string strBackend = argv[1];
std::string strInputVideo = argv[2];
std::string strInputBBoxes = argv[3];
std::string strOutputFiles = argv[4];
cv::VideoCapture invid;
if (!invid.open(strInputVideo))
throw std::runtime_error("Can't open '" + strInputVideo + "'");
VPIArray inputBoxList, inputPredList;
std::vector<VPIKLTTrackedBoundingBox> bboxes;
std::vector<VPIHomographyTransform2D> preds;
std::map<int, size_t> bboxes_size_at_frame;
std::ifstream in(strInputBBoxes);
throw std::runtime_error("Can't open '" + strInputBBoxes + "'");
int frame, x, y, w, h;
while (in >> frame >> x >> y >> w >> h)
if (bboxes.size() == 64)
throw std::runtime_error("Too many bounding boxes");
bboxes.push_back(track);
xform.mat3[0][0] = 1;
xform.mat3[1][1] = 1;
xform.mat3[2][2] = 1;
preds.push_back(xform);
bboxes_size_at_frame[frame] = bboxes.size();
if (!in && !in.eof())
throw std::runtime_error("Can't parse bounding boxes, stopped at bbox #" + std::to_string(bboxes.size()));
data.data = &bboxes[0];
data.data = &preds[0];
if (strBackend == "cpu")
else if (strBackend == "cuda")
else if (strBackend == "pva")
throw std::runtime_error("Backend '" + strBackend + "' not recognized, it must be either cpu, cuda or pva.");
auto fetchFrame = [&invid, &nextFrame, backendType]() {
if (!invid.read(frame))
if (frame.channels() == 3)
cvtColor(frame, frame, cv::COLOR_BGR2GRAY);
frame.convertTo(aux, CV_16U);
assert(frame.type() == CV_8U);
cv::Mat cvTemplate = fetchFrame(), cvReference;
VPIImage imgTemplate = ToVPIImage(nullptr, cvTemplate);
size_t curNumBoxes = 0;
size_t curFrame = nextFrame - 1;
auto tmp = --bboxes_size_at_frame.upper_bound(curFrame);
size_t bbox_count = tmp->second;
assert(bbox_count >= curNumBoxes && "input bounding boxes must be sorted by frame");
if (curNumBoxes != bbox_count)
for (size_t i = 0; i < bbox_count - curNumBoxes; ++i)
std::cout << curFrame << " -> new " << curNumBoxes + i << std::endl;
assert(bbox_count <= bboxes.capacity());
assert(bbox_count <= preds.capacity());
curNumBoxes = bbox_count;
SaveKLTBoxes(imgTemplate, inputBoxList, inputPredList, strOutputFiles, curFrame);
cvReference = fetchFrame();
if (cvReference.data == nullptr)
imgReference = ToVPIImage(imgReference, cvReference);
outputBoxList, outputEstimList, &params));
for (size_t b = 0; b < curNumBoxes; ++b)
if (updated_bbox[b].trackingStatus)
if (bboxes[b].trackingStatus == 0)
std::cout << curFrame << " -> dropped " << b << std::endl;
bboxes[b].trackingStatus = 1;
if (updated_bbox[b].templateStatus)
std::cout << curFrame << " -> update " << b << std::endl;
bboxes[b] = updated_bbox[b];
bboxes[b].templateStatus = 1;
preds[b].mat3[0][0] = 1;
preds[b].mat3[1][1] = 1;
preds[b].mat3[2][2] = 1;
bboxes[b].templateStatus = 0;
std::swap(imgTemplate, imgReference);
std::swap(cvTemplate, cvReference);
catch (std::exception &e)
std::cerr << e.what() << std::endl;
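When drawing, the listing derives each rectangle from the tracked box and its prediction: the position adds the translation terms `mat3[0][2]` and `mat3[1][2]` of both transforms, and the size multiplies the stored width and height by both diagonal scale terms. The sketch below isolates that arithmetic. `Transform2D`, `TrackedBox`, `Rect`, `identity`, and `toRect` are hypothetical stand-ins for the VPI and OpenCV types, so it shows the computation under those assumptions rather than the library's own API.

```cpp
#include <cassert>

// Hypothetical stand-ins for the VPI structures used in the listing.
struct Transform2D
{
    float mat3[3][3];
};

struct TrackedBox
{
    Transform2D xform; // top-left corner (translation) and scale of the box
    float width;
    float height;
};

struct Rect
{
    float x, y, w, h;
};

// Identity transform: unit scale, no translation.
inline Transform2D identity()
{
    return Transform2D{{{1.0f, 0.0f, 0.0f}, {0.0f, 1.0f, 0.0f}, {0.0f, 0.0f, 1.0f}}};
}

// Combines a tracked box with its predicted transform, mirroring the
// x/y/w/h expressions in the listing.
inline Rect toRect(const TrackedBox &box, const Transform2D &pred)
{
    Rect r;
    r.x = box.xform.mat3[0][2] + pred.mat3[0][2];
    r.y = box.xform.mat3[1][2] + pred.mat3[1][2];
    r.w = box.width * box.xform.mat3[0][0] * pred.mat3[0][0];
    r.h = box.height * box.xform.mat3[1][1] * pred.mat3[1][1];
    return r;
}
```

With an identity prediction the rectangle is the box itself; a prediction with `mat3[0][0] = 2` doubles the drawn width, and one with `mat3[0][2] = 3` shifts the box three pixels to the right.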