VPI - Vision Programming Interface

0.3.7 Release

KLT Bounding Box Tracker

Overview

The Kanade-Lucas-Tomasi (KLT) Tracker algorithm estimates the 2D translation and scale changes of an image template between original template coordinates and a given reference image using the Inverse Compositional algorithm. For more information, see [1].

Inputs are an array of template bounding boxes, an array of predicted translation and scale changes, and a reference image. A template image is also supplied; it is used to update the template patches (see details below).

Outputs are an array of estimated translation and scale changes from the input bounding box coordinates to the reference image coordinates, and an array of template bounding box coordinates in the reference image.
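
To make the idea of a "translation and scale change" concrete, the sketch below applies such a transform to a single point, for example a bounding box corner. It is not VPI code: Xform is a hypothetical stand-in for a 3x3 homogeneous matrix, and the layout used here (scale on the diagonal, translation in the last column, applied to the column vector (x, y, 1)) is an assumption made for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a 3x3 homogeneous transform (not a VPI type). */
typedef struct
{
    float mat3[3][3];
} Xform;

/* Build a transform that scales by `scale` and then translates by (tx, ty).
   Assumed layout: scale on the diagonal, translation in the last column. */
static void make_translation_scale(Xform *xf, float scale, float tx, float ty)
{
    assert(xf != NULL);
    memset(xf, 0, sizeof(*xf));
    xf->mat3[0][0] = scale;
    xf->mat3[1][1] = scale;
    xf->mat3[0][2] = tx;
    xf->mat3[1][2] = ty;
    xf->mat3[2][2] = 1.0f;
}

/* Map (x, y) through the transform, treating it as the column vector (x, y, 1). */
static void apply_xform(const Xform *xf, float x, float y, float *outX, float *outY)
{
    assert(xf != NULL && outX != NULL && outY != NULL);
    float w = xf->mat3[2][0] * x + xf->mat3[2][1] * y + xf->mat3[2][2];
    *outX = (xf->mat3[0][0] * x + xf->mat3[0][1] * y + xf->mat3[0][2]) / w;
    *outY = (xf->mat3[1][0] * x + xf->mat3[1][1] * y + xf->mat3[1][2]) / w;
}
```

With scale 1 and zero translation this reduces to the identity transform, which is what the usage steps below use as the initial prediction.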

[Figure: tracking results at Frame #1 and Frame #10]

Implementation

Each template bounding box defines a template image patch stored internally with the function descriptor. These template patches are tracked in reference images based on predicted translation and scale changes. For each patch, an estimated translation and scale change from the original bounding box coordinates to the reference image coordinates is computed. Each estimation carries a tracking validity flag (tracking success or failure) and a flag indicating whether a template update is required; both are derived from user-defined threshold parameters.
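
The exact decision rule is internal to the tracker and is not spelled out here. As a rough mental model only, the sketch below assumes that each match produces a normalized cross-correlation score that is compared against the nccThresholdKill and nccThresholdUpdate parameters used later in this page. BoxStatus and classify_match are hypothetical names, not part of VPI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-box result; it mirrors the 0/1 convention this page uses
   for trackingStatus and templateStatus. */
typedef struct
{
    uint8_t trackingStatus; /* 0: tracking valid, 1: tracking lost */
    uint8_t templateStatus; /* 0: keep template, 1: template must be updated */
} BoxStatus;

/* Assumed rule: a score below the kill threshold means the track is lost;
   a score between the two thresholds keeps the track but asks for a
   template update; anything higher keeps both. */
static BoxStatus classify_match(float ncc, float nccThresholdKill, float nccThresholdUpdate)
{
    assert(nccThresholdKill <= nccThresholdUpdate);
    BoxStatus st = {0, 0};
    if (ncc < nccThresholdKill)
    {
        st.trackingStatus = 1;
    }
    else if (ncc < nccThresholdUpdate)
    {
        st.templateStatus = 1;
    }
    return st;
}
```

Under this assumption, with the thresholds 0.6 and 0.8 from the usage example, a score of 0.7 would keep the track alive but request a template update.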

Usage

Note
Due to PVA restrictions, the created VPI arrays' capacity must be 128.
  1. Initialization phase
    1. Include the header that defines the needed functions and structures.
      #include <string.h> // memset
      #include <vpi/algo/KLTBoundingBoxTracker.h>
    2. Define the stream on which the algorithm will be executed, the input frames and the input bounding boxes. Refer to the VPIBoundingBox documentation for instructions on how to fill each bounding box from an axis-aligned bounding box.
      VPIStream stream = /* ... */;
      size_t frame_count = /*... */;
      VPIImage *frames = /* ... */;
      size_t bbox_count = /* ... */;
      VPIBoundingBox *bboxes = /* ... */;
    3. Create the bounding box array with tracking information. For new bounding boxes, trackingStatus must be 0, indicating that bounding box tracking is valid. templateStatus must be 1, indicating that the template corresponding to this bounding box must be updated.
      VPIKLTTrackedBoundingBox tracked_bboxes[128];
      for (size_t b = 0; b < bbox_count; ++b)
      {
          tracked_bboxes[b].bbox = bboxes[b];
          tracked_bboxes[b].trackingStatus = 0; // valid tracking
          tracked_bboxes[b].templateStatus = 1; // must update
      }
    4. Wrap the tracked bounding boxes into a VPIArray. The array type must be VPI_ARRAY_TYPE_KLT_TRACKED_BOUNDING_BOX.
      VPIArrayData data_bboxes;
      memset(&data_bboxes, 0, sizeof(data_bboxes));
      data_bboxes.type = VPI_ARRAY_TYPE_KLT_TRACKED_BOUNDING_BOX;
      data_bboxes.capacity = 128;
      data_bboxes.size = bbox_count;
      data_bboxes.data = tracked_bboxes;
      VPIArray inputBoxList;
      vpiArrayWrapHostMem(&data_bboxes, 0, &inputBoxList);
    5. Create the bounding box transformation prediction array, initially filled with identity transforms, since the template exactly matches the bounding box contents in the template image.
      VPIHomographyTransform2D preds[128];
      for (size_t i = 0; i < bbox_count; ++i)
      {
          VPIHomographyTransform2D *xform = preds + i;
          // Identity transform.
          memset(xform, 0, sizeof(*xform));
          xform->mat3[0][0] = 1;
          xform->mat3[1][1] = 1;
          xform->mat3[2][2] = 1;
      }
    6. Wrap this array into a VPIArray. The array type must be VPI_ARRAY_TYPE_HOMOGRAPHY_TRANSFORM_2D.
      VPIArrayData data_preds;
      memset(&data_preds, 0, sizeof(data_preds));
      data_preds.type = VPI_ARRAY_TYPE_HOMOGRAPHY_TRANSFORM_2D;
      data_preds.capacity = 128;
      data_preds.size = bbox_count;
      data_preds.data = preds;
      VPIArray inputPredList;
      vpiArrayWrapHostMem(&data_preds, 0, &inputPredList);
    7. Create the payload that will contain all temporary buffers needed for processing. All input frames are assumed to have the same size, so the dimensions and type of the first frame are used to create the payload.
      VPIImageType imgType;
      vpiImageGetType(frames[0], &imgType);
      uint32_t width, height;
      vpiImageGetSize(frames[0], &width, &height);
      VPIPayload klt;
      vpiCreateKLTBoundingBoxTracker(stream, width, height, imgType, &klt);
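    8. Define the configuration parameters that guide the KLT tracking process.
      VPIKLTBoundingBoxTrackerParams params;
      params.numberOfIterationsScaling = 20; // PVA maximum, see Limitations and Constraints
      params.nccThresholdUpdate = 0.8f;
      params.nccThresholdKill = 0.6f;
      params.nccThresholdStop = 1.0f;
      params.maxScaleChange = 0.2f;
      params.maxTranslationChange = 1.5f;
      params.trackingType = VPI_KLT_INVERSE_COMPOSITIONAL;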
    9. Create the output tracked bounding box array. It will contain the estimated bounding boxes for the current frame, based on the previous frame and the template information gathered so far. It also contains the current tracking status of each bounding box.
      VPIArray outputBoxList;
      vpiArrayCreate(128, VPI_ARRAY_TYPE_KLT_TRACKED_BOUNDING_BOX, 0, &outputBoxList);
    10. Create the output estimated transforms array. It will contain the transforms that make each bounding box template match the corresponding bounding box in the current (reference) frame.
      VPIArray outputEstimList;
      vpiArrayCreate(128, VPI_ARRAY_TYPE_HOMOGRAPHY_TRANSFORM_2D, 0, &outputEstimList);
  2. Processing phase
    1. Start the processing loop at the second frame. The previous frame is where the algorithm fetches the tracked templates from; the current frame is where these templates are matched.
      for (size_t idframe = 1; idframe < frame_count; ++idframe)
      {
          VPIImage imgTemplate = frames[idframe - 1];
          VPIImage imgReference = frames[idframe];
    2. Submit the algorithm. The first time it's run, it will go through all input bounding boxes, crop them from the template frame and store them in the payload. Subsequent runs will either repeat the cropping and storing process for new bounding boxes added (doesn't happen in this example, but happens in the sample application), or perform the template matching on the reference frame.
          VPI_CHECK_STATUS(vpiSubmitKLTBoundingBoxTracker(klt, imgTemplate, inputBoxList, inputPredList, imgReference,
                                                          outputBoxList, outputEstimList, &params));
    3. Wait until the processing is done.
          vpiStreamSync(stream);
    4. Lock the output arrays to retrieve the updated bounding boxes and the estimated transforms.
          VPIArrayData updatedBBoxData;
          vpiArrayLock(outputBoxList, VPI_LOCK_READ, &updatedBBoxData);
          VPIArrayData estimData;
          vpiArrayLock(outputEstimList, VPI_LOCK_READ, &estimData);
          VPIKLTTrackedBoundingBox *updated_bbox = (VPIKLTTrackedBoundingBox *)updatedBBoxData.data;
          VPIHomographyTransform2D *estim = (VPIHomographyTransform2D *)estimData.data;
    5. Loop through all bounding boxes.
          for (size_t b = 0; b < bbox_count; ++b)
          {
    6. Update bounding box statuses. If tracking was lost (trackingStatus==1), the input bounding box must also be marked as such, so subsequent KLT iterations ignore it. If the template needs to be updated (templateStatus==1), the next iteration will do the updating, or else it will perform the template matching.
              tracked_bboxes[b].trackingStatus = updated_bbox[b].trackingStatus;
              tracked_bboxes[b].templateStatus = updated_bbox[b].templateStatus;
    7. Skip bounding boxes that aren't being tracked.
              if (updated_bbox[b].trackingStatus)
              {
                  continue;
              }
    8. If the template for this bounding box must be updated in the next KLT iteration, the user must redefine the bounding box. There are several ways to do this: use a feature detector, such as the Harris keypoint detector, to help fetch a brand-new bounding box; take updated_bbox[b] and refine it through other means to avoid accumulating tracking errors; or simply use updated_bbox[b] as-is, which is less robust but still yields decent results. This example takes the last, simplest approach.
              if (updated_bbox[b].templateStatus)
              {
                  tracked_bboxes[b] = updated_bbox[b];
    9. Also reset the corresponding input predicted transform to identity, as the input bounding box is now assumed to match the tracked object exactly.
                  memset(&preds[b], 0, sizeof(preds[b]));
                  preds[b].mat3[0][0] = 1;
                  preds[b].mat3[1][1] = 1;
                  preds[b].mat3[2][2] = 1;
              }
    10. If the template doesn't need to be updated, set the input predicted transform to the one estimated by this KLT iteration.
              else
              {
                  preds[b] = estim[b];
              }
          }
    11. Once all bounding boxes are updated, unlock the output arrays as they aren't needed by this iteration anymore.
          vpiArrayUnlock(outputBoxList);
          vpiArrayUnlock(outputEstimList);
    12. Since the contents of the input arrays have been modified externally, invalidate them so that VPI discards any copies it might have made internally.
          vpiArrayInvalidate(inputBoxList);
          vpiArrayInvalidate(inputPredList);
      }
  3. Cleanup phase
    1. Free all VPI resources at once by destroying the context with vpiContextDestroy.

For more details, consult the API reference.
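
The per-box bookkeeping in steps 6 to 10 of the processing phase can be condensed into one helper. The sketch below is a standalone illustration that compiles without VPI: Xform and TrackedBox are hypothetical stand-ins reduced to the fields used here, and prepare_next_inputs is an illustrative name, not a VPI function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the VPI types, reduced to the fields used here. */
typedef struct
{
    float mat3[3][3];
} Xform;

typedef struct
{
    Xform bbox;             /* placeholder for the bounding box contents */
    uint8_t trackingStatus; /* 0: valid, 1: lost */
    uint8_t templateStatus; /* 0: keep template, 1: must update */
} TrackedBox;

static void set_identity(Xform *xf)
{
    memset(xf, 0, sizeof(*xf));
    xf->mat3[0][0] = 1.0f;
    xf->mat3[1][1] = 1.0f;
    xf->mat3[2][2] = 1.0f;
}

/* Prepare the inputs of the next iteration from the outputs of this one. */
static void prepare_next_inputs(TrackedBox *inBoxes, Xform *preds,
                                const TrackedBox *outBoxes, const Xform *estim,
                                size_t count)
{
    assert(inBoxes != NULL && preds != NULL && outBoxes != NULL && estim != NULL);
    for (size_t b = 0; b < count; ++b)
    {
        /* Always propagate both status flags to the input array. */
        inBoxes[b].trackingStatus = outBoxes[b].trackingStatus;
        inBoxes[b].templateStatus = outBoxes[b].templateStatus;

        if (outBoxes[b].trackingStatus)
        {
            continue; /* lost: leave the box and its prediction untouched */
        }

        if (outBoxes[b].templateStatus)
        {
            /* Use the tracked box as-is as the new template and restart
               from an identity prediction. */
            inBoxes[b] = outBoxes[b];
            set_identity(&preds[b]);
        }
        else
        {
            /* Template still good: feed this estimate back as the prediction. */
            preds[b] = estim[b];
        }
    }
}
```

Keeping this logic in one function makes it easier to swap the "use as-is" template update for a more robust refinement later, since only the templateStatus branch would change.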

Limitations and Constraints

Constraints for specific backends supersede the ones specified for all backends.

All Backends

PVA

  • Input images' dimensions must be between 65x65 and 3264x2448.
  • Maximum scale change is 0.2.
  • Minimum input and output array capacity is 128.
  • Maximum number of bounding boxes is 64.
  • Maximum numberOfIterationsScaling is 20.
  • Only accepts VPI_IMAGE_TYPE_U16 inputs whose pixel values lie between 0 and 255.
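
The numeric PVA constraints above can be checked up front, before any VPI object is created. The following is a minimal sketch under stated assumptions: KltPvaSetup and klt_pva_setup_ok are hypothetical names, the size limit is read as width in [65, 3264] and height in [65, 2448], and the image type and pixel range constraint is not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bundle of the values the PVA constraints apply to. */
typedef struct
{
    uint32_t width;
    uint32_t height;
    uint32_t arrayCapacity;             /* capacity of input and output arrays */
    uint32_t boxCount;                  /* number of bounding boxes submitted */
    uint32_t numberOfIterationsScaling;
    float maxScaleChange;
} KltPvaSetup;

/* Returns true when every numeric PVA limit listed above is satisfied. */
static bool klt_pva_setup_ok(const KltPvaSetup *s)
{
    assert(s != NULL);
    if (s->width < 65 || s->width > 3264) return false;
    if (s->height < 65 || s->height > 2448) return false;
    if (s->maxScaleChange > 0.2f) return false;
    if (s->arrayCapacity < 128) return false;
    if (s->boxCount > 64) return false;
    if (s->numberOfIterationsScaling > 20) return false;
    return true;
}
```

A check like this reports a configuration problem at setup time instead of leaving it to surface as an error status from the submit call.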

References

  1. Simon Baker, Iain Matthews, "Lucas-Kanade 20 Years On: A Unified Framework".
    International Journal of Computer Vision, February 2004, Volume 56, issue 3, pp 221-255.