VPI - Vision Programming Interface
0.4.4 Release
 
Overview
This application tracks bounding boxes on an input video, draws the tracked boxes on each frame and saves the resulting frames to disk. The user can choose which backend is used for processing.
- Note
 - The output will be in grayscale, as the algorithm currently supports only grayscale images.
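 
Drawing a tracked box means turning its stored transform into a pixel rectangle. In the source code further down, the top-left corner is the sum of the translation terms of the box transform and of the predicted transform, and the size is the box width and height multiplied by both scale terms. The sketch below restates that arithmetic with stand-in types; Transform2D, TrackedBox, PixelRect and ToPixelRect are hypothetical names used only for illustration, not VPI types.

```cpp
#include <cassert>

// Stand-in types for illustration only. They mirror the fields the sample
// reads (a 3x3 matrix plus a box width and height) but are NOT the VPI types.
struct Transform2D
{
    float mat3[3][3];
};

struct TrackedBox
{
    Transform2D xform; // translation in [0][2], [1][2]; scale in [0][0], [1][1]
    float width;
    float height;
};

struct PixelRect
{
    float x, y, w, h;
};

// Combine a box's own transform with its predicted transform:
// translations are added, scale factors are multiplied.
PixelRect ToPixelRect(const TrackedBox &box, const Transform2D &pred)
{
    PixelRect r;
    r.x = box.xform.mat3[0][2] + pred.mat3[0][2];
    r.y = box.xform.mat3[1][2] + pred.mat3[1][2];
    r.w = box.width * box.xform.mat3[0][0] * pred.mat3[0][0];
    r.h = box.height * box.xform.mat3[1][1] * pred.mat3[1][1];
    return r;
}
```

With an identity prediction (ones on the diagonal, zeros elsewhere) the rectangle is simply the box's own position and size, which is how the sample initializes its predictions.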
 
This sample shows the following:
- Creating and destroying a VPI stream.
 
- Wrapping an image hosted on CPU (input frame) to be used by VPI.
 
- Wrapping an array hosted on CPU (input bounding boxes) to be used by VPI.
 
- Creating a VPI-managed 2D image where output will be written to.
 
- Using the multi-frame KLT Bounding Box Tracker algorithm.
 
- Simple stream synchronization.
 
- Array locking to access its contents from CPU side.
 
- Error handling.
 
- Environment clean up using user-defined context.
 
Instructions
The usage is:
./vpi_sample_06_klt_tracker <backend> <input video> <input bboxes> <output frames>
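where
- backend: either cpu, cuda or pva; it defines the backend that will perform the processing.
- input video: the input video file name, opened through OpenCV.
- input bboxes: a text file with the input bounding boxes and the frame in which each one appears, given as whitespace-separated integers <frame> <x> <y> <width> <height>. Entries must be sorted by frame, and at most 64 boxes are accepted.
- output frames: the file name used for the output frames. The frame number is inserted before the extension, so frame.png produces names such as frame_0445.png.
 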
Here's one example: 
./vpi_sample_06_klt_tracker cuda ../assets/dashcam.mp4 ../assets/dashcam_bboxes.txt frame.png
This uses the CUDA backend with one of the provided sample videos and its bounding boxes.
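 
The bounding box file passed as the third argument is read by the source code below as whitespace-separated integers in the order frame, x, y, w, h. The sample rejects more than 64 boxes and records, per frame, how many boxes have been read so far. The following is a minimal sketch of that parsing using only the standard library; BoxEntry, ParsedBoxes and ParseBoxes are hypothetical names, and the sample itself stores VPI structures rather than this plain record.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical plain record; the sample fills VPI structures instead.
struct BoxEntry
{
    int frame, x, y, w, h;
};

struct ParsedBoxes
{
    std::vector<BoxEntry> boxes;
    std::map<int, std::size_t> countAtFrame; // frame -> boxes read so far
};

// Reads "frame x y w h" records until the input ends.
ParsedBoxes ParseBoxes(std::istream &in, std::size_t maxBoxes = 64)
{
    ParsedBoxes out;
    int frame, x, y, w, h;
    while (in >> frame >> x >> y >> w >> h)
    {
        if (out.boxes.size() == maxBoxes)
        {
            throw std::runtime_error("Too many bounding boxes");
        }
        out.boxes.push_back({frame, x, y, w, h});
        out.countAtFrame[frame] = out.boxes.size();
    }

    // Stopping for any reason other than end of input means a malformed record.
    if (!in && !in.eof())
    {
        throw std::runtime_error("Can't parse bounding boxes, stopped at bbox #" +
                                 std::to_string(out.boxes.size()));
    }
    return out;
}
```

Passing a std::ifstream opened on the bounding box file works the same way as the std::istringstream that is convenient for testing.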
Results
[Figure: output frames 0445 and 0465]
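 
The four-digit frame indices shown here match the pattern the source code uses for output file names: the name is split at its last '.', and an underscore plus the zero-padded frame number is inserted before the extension. A small sketch of that naming follows. MakeFrameName is a hypothetical helper, and treating a name without a '.' as having no extension is an assumption of this sketch rather than something the sample states.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: ("frame.png", 445) -> "frame_0445.png".
std::string MakeFrameName(const std::string &filename, int frame)
{
    // Split at the last '.'; substr(0, npos) yields the whole string,
    // so a name without a '.' keeps an empty suffix.
    std::string::size_type ext = filename.rfind('.');
    std::string stem           = filename.substr(0, ext);
    std::string suffix         = (ext == std::string::npos) ? "" : filename.substr(ext);

    char buffer[512] = {};
    std::snprintf(buffer, sizeof(buffer), "%s_%04d%s", stem.c_str(), frame, suffix.c_str());
    return std::string(buffer);
}
```

For the example command line above, MakeFrameName("frame.png", 445) yields "frame_0445.png".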
Source code
For convenience, here's the code that is also installed in the samples directory.
#include <opencv2/core/version.hpp>
#if CV_MAJOR_VERSION >= 3
#    include <opencv2/imgcodecs.hpp>
#    include <opencv2/videoio.hpp>
#else
#    include <opencv2/highgui/highgui.hpp>
#endif

#include <opencv2/imgproc/imgproc.hpp>

// ...

// Evaluates a VPI call and throws std::runtime_error with the status name
// and the last status message if the call did not succeed.
#define CHECK_STATUS(STMT)                                    \
    do                                                        \
    {                                                         \
        VPIStatus status = (STMT);                            \
        if (status != VPI_SUCCESS)                            \
        {                                                     \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
            std::ostringstream ss;                            \
            ss << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());               \
        }                                                     \
    } while (0)

// ...
    memset(&imgData, 0, sizeof(imgData));
        // ...
        throw std::runtime_error("Frame type not supported");
        // ...
        switch (imgdata.type)
            // ...
            throw std::runtime_error("Image type not supported");
        // ...
        if (cvimg.type() == CV_16U)
            cvimg.convertTo(out, CV_8U);
        // ...
        cvtColor(cvimg, out, cv::COLOR_GRAY2BGR);
    // ...
    for (size_t i = 0; i < boxdata.size; ++i)
        if (pboxes[i].trackingStatus == 1)
        // ...
        x = pboxes[i].bbox.xform.mat3[0][2] + ppreds[i].mat3[0][2];
        y = pboxes[i].bbox.xform.mat3[1][2] + ppreds[i].mat3[1][2];
        w = pboxes[i].bbox.width * pboxes[i].bbox.xform.mat3[0][0] * ppreds[i].mat3[0][0];
        h = pboxes[i].bbox.height * pboxes[i].bbox.xform.mat3[1][1] * ppreds[i].mat3[1][1];
        rectangle(out, cv::Rect(x, y, w, h), cv::Scalar(rand() % 256, rand() % 256, rand() % 256), 2);
    // ...
    std::string fname = filename;
    int ext           = fname.rfind('.');
    char buffer[512] = {};
    snprintf(buffer, sizeof(buffer) - 1, "%s_%04d%s", fname.substr(0, ext).c_str(), frame, fname.substr(ext).c_str());
    // ...
    if (!imwrite(buffer, out, {cv::IMWRITE_JPEG_QUALITY, 70}))
        throw std::runtime_error("Can't write to " + std::string(buffer));
// ...
int main(int argc, char *argv[])
            // ...
            throw std::runtime_error(std::string("Usage: ") + argv[0] +
                                     " <cpu|pva|cuda> <input_video> <bbox descr> <output>");
        // ...
        std::string strBackend     = argv[1];
        std::string strInputVideo  = argv[2];
        std::string strInputBBoxes = argv[3];
        std::string strOutputFiles = argv[4];
        // ...
        cv::VideoCapture invid;
        if (!invid.open(strInputVideo))
            throw std::runtime_error("Can't open '" + strInputVideo + "'");
        // ...
        VPIArray inputBoxList, inputPredList;
        // ...
        std::vector<VPIKLTTrackedBoundingBox> bboxes;
        std::vector<VPIHomographyTransform2D> preds;
        // ...
        std::map<int, size_t> bboxes_size_at_frame;
            // ...
            std::ifstream in(strInputBBoxes);
                // ...
                throw std::runtime_error("Can't open '" + strInputBBoxes + "'");
            // ...
            int frame, x, y, w, h;
            while (in >> frame >> x >> y >> w >> h)
                if (bboxes.size() == 64)
                    throw std::runtime_error("Too many bounding boxes");
                // ...
                bboxes.push_back(track);
                // ...
                xform.mat3[0][0]               = 1;
                xform.mat3[1][1]               = 1;
                xform.mat3[2][2]               = 1;
                preds.push_back(xform);
                bboxes_size_at_frame[frame] = bboxes.size();
            // ...
            if (!in && !in.eof())
                throw std::runtime_error("Can't parse bounding boxes, stopped at bbox #" +
                                         std::to_string(bboxes.size()));
            // ...
            data.data         = &bboxes[0];
            // ...
            data.data = &preds[0];
        // ...
        if (strBackend == "cpu")
        // ...
        else if (strBackend == "cuda")
        // ...
        else if (strBackend == "pva")
            // ...
            throw std::runtime_error("Backend '" + strBackend +
                                     "' not recognized, it must be either cpu, cuda or pva.");
        // ...
        auto fetchFrame = [&invid, &nextFrame, backendType]() {
            if (!invid.read(frame))
            // ...
            if (frame.channels() == 3)
                cvtColor(frame, frame, cv::COLOR_BGR2GRAY);
                // ...
                frame.convertTo(aux, CV_16U);
                // ...
                assert(frame.type() == CV_8U);
        // ...
        cv::Mat cvTemplate   = fetchFrame(), cvReference;
        VPIImage imgTemplate = ToVPIImage(nullptr, cvTemplate);
        // ...
        size_t curNumBoxes = 0;
            // ...
            size_t curFrame = nextFrame - 1;
            // ...
            auto tmp          = --bboxes_size_at_frame.upper_bound(curFrame);
            size_t bbox_count = tmp->second;
            assert(bbox_count >= curNumBoxes && "input bounding boxes must be sorted by frame");
            // ...
            if (curNumBoxes != bbox_count)
                // ...
                for (size_t i = 0; i < bbox_count - curNumBoxes; ++i)
                    std::cout << curFrame << " -> new " << curNumBoxes + i << std::endl;
                assert(bbox_count <= bboxes.capacity());
                assert(bbox_count <= preds.capacity());
                curNumBoxes = bbox_count;
            // ...
            SaveKLTBoxes(imgTemplate, inputBoxList, inputPredList, strOutputFiles, curFrame);
            // ...
            cvReference = fetchFrame();
            // ...
            if (cvReference.data == nullptr)
            // ...
            imgReference = ToVPIImage(imgReference, cvReference);
                                                    // ...
                                                    outputBoxList, outputEstimList, &params));
            // ...
            for (size_t b = 0; b < curNumBoxes; ++b)
                // ...
                if (updated_bbox[b].trackingStatus)
                    // ...
                    if (bboxes[b].trackingStatus == 0)
                        std::cout << curFrame << " -> dropped " << b << std::endl;
                        bboxes[b].trackingStatus = 1;
                // ...
                if (updated_bbox[b].templateStatus)
                    std::cout << curFrame << " -> update " << b << std::endl;
                    // ...
                    bboxes[b] = updated_bbox[b];
                    // ...
                    bboxes[b].templateStatus = 1;
                    // ...
                    preds[b].mat3[0][0] = 1;
                    preds[b].mat3[1][1] = 1;
                    preds[b].mat3[2][2] = 1;
                    // ...
                    bboxes[b].templateStatus = 0;
            // ...
            std::swap(imgTemplate, imgReference);
            std::swap(cvTemplate, cvReference);
    // ...
    catch (std::exception &e)
        std::cerr << e.what() << std::endl;
    // ...
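 
In the main loop above, the number of boxes active at the current frame is found by taking upper_bound(curFrame) on the frame-to-count map and stepping back one entry. That gives the latest frame at or before the current one that introduced boxes. The sketch below isolates that lookup. BoxCountAt is a hypothetical helper, and its guard for the case where no entry exists at or before the frame is an addition of this sketch; whether the sample handles that case elsewhere is not visible in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Number of boxes that should be active at curFrame, given a map from frame
// index to the cumulative box count once that frame's boxes have been added.
std::size_t BoxCountAt(const std::map<int, std::size_t> &countAtFrame, int curFrame)
{
    // upper_bound returns the first entry whose key is greater than curFrame;
    // the entry just before it is the latest frame <= curFrame.
    auto it = countAtFrame.upper_bound(curFrame);
    if (it == countAtFrame.begin())
    {
        return 0; // nothing introduced yet (guard added by this sketch)
    }
    return std::prev(it)->second;
}
```

Because the counts are cumulative, the difference between the returned value and the previous count is the number of boxes that become active at that frame, which is what the loop prints as "new".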