VPI - Vision Programming Interface

1.0 Release

Pyramidal LK Optical Flow

Overview

The Pyramidal Lucas-Kanade (LK) Optical Flow algorithm estimates the 2D translation of sparse feature points from a previous frame to the next. Image pyramids are used to improve the performance and robustness of tracking over larger translations. For more information, see [1] and [2].

Inputs are the previous image pyramid, the next image pyramid, and the feature points on the previous image.

Outputs are the feature points on the next image and the tracking status of each feature point.
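
To make the role of the pyramid concrete, the following sketch computes the size of each pyramid level and shows how a displacement measured in the base image shrinks at coarser levels. It is plain C and does not call VPI; `PyramidLevel`, `pyramid_level_size`, and `displacement_at_level` are hypothetical names, and rounding to the nearest pixel is an assumption made for illustration, not a statement about how VPI sizes its levels.

```c
#include <assert.h>

/* Hypothetical helper type: dimensions of one pyramid level. */
typedef struct {
    int width;
    int height;
} PyramidLevel;

/* Size of level `level` (0 = base image) when each level is `scale`
   times the size of the previous one. Rounding to nearest is an assumption. */
PyramidLevel pyramid_level_size(int width, int height, float scale, int level)
{
    assert(width > 0 && height > 0 && level >= 0);
    assert(scale > 0.0f && scale <= 1.0f);

    float factor = 1.0f;
    for (int i = 0; i < level; ++i) {
        factor *= scale;
    }

    PyramidLevel out;
    out.width  = (int)((float)width * factor + 0.5f);
    out.height = (int)((float)height * factor + 0.5f);
    if (out.width < 1)  out.width = 1;   /* never collapse to zero pixels */
    if (out.height < 1) out.height = 1;
    return out;
}

/* Length, in pixels of level `level`, of a displacement measured in the base image. */
float displacement_at_level(float baseDisplacement, float scale, int level)
{
    assert(level >= 0);
    float d = baseDisplacement;
    for (int i = 0; i < level; ++i) {
        d *= scale;
    }
    return d;
}
```

Under these assumptions, a 640x480 image with scale 0.5 has a level 3 of 80x60 pixels, and a 24-pixel motion in the base image appears as a 3-pixel motion at that level, which is why a small tracking window can still follow large translations.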

Implementation

Each feature point defines its location in the image with x and y coordinates. These points are then tracked in the next image. The tracking status indicates whether each feature point is being tracked successfully.
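
As an illustration of how the tracking status can be consumed, the sketch below compacts an array of points so that only the tracked ones remain. It is plain C with no VPI calls: `Keypoint` is a hypothetical stand-in for a keypoint element, `keep_tracked` is a hypothetical helper, and the convention that a status of 0 means "tracked" is an assumption that should be checked against the API reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a keypoint: an (x, y) position in pixels. */
typedef struct {
    float x;
    float y;
} Keypoint;

/* Compacts `pts` in place, keeping only the points whose status is 0
   (assumed here to mean "tracked"). Returns the number of points kept. */
size_t keep_tracked(Keypoint *pts, const uint8_t *status, size_t count)
{
    assert(pts != NULL && status != NULL);

    size_t kept = 0;
    for (size_t i = 0; i < count; ++i) {
        if (status[i] == 0) {
            pts[kept++] = pts[i];
        }
    }
    return kept;
}
```

An application could run a filter like this after each frame and re-detect features once the number of surviving points drops below a threshold of its choosing.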

Usage

  1. Initialization phase
    1. Include the headers that define the needed functions and structures: one declares the functions that handle Gaussian pyramids, one declares the functions that implement the Harris Corner Detector algorithm, and one declares the functions that implement the Pyramidal LK Optical Flow algorithm.
    2. Create the stream where the algorithm will be submitted for execution.
      VPIStream stream;
      vpiStreamCreate(0, &stream);
    3. Fetch previous image
      VPIImage prevImage = NULL;
      FetchFrame(vid, &prevImage, VPI_IMAGE_FORMAT_U8);
    4. Define the previous pyramid
      VPIPyramid pyrPrevFrame;
      vpiPyramidCreate(width, height, imgFormat, levels, 0.5, 0, &pyrPrevFrame);
    5. Define the current pyramid
      VPIPyramid pyrCurFrame;
      vpiPyramidCreate(width, height, imgFormat, levels, 0.5, 0, &pyrCurFrame);
    6. Define the feature points in the previous image
      VPIArray arrPrevPts;
      vpiArrayCreate(MAX_HARRIS_CORNERS, VPI_ARRAY_TYPE_KEYPOINT, 0, &arrPrevPts);
    7. Create feature points for next image
      VPIArray arrCurPts;
      vpiArrayCreate(MAX_HARRIS_CORNERS, VPI_ARRAY_TYPE_KEYPOINT, 0, &arrCurPts);
    8. Create the tracking status array for the feature points
      VPIArray arrStatus;
      vpiArrayCreate(MAX_HARRIS_CORNERS, VPI_ARRAY_TYPE_U8, 0, &arrStatus);
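    9. Fill in the previous feature points by running the Harris Corner Detector on the previous image
      VPIArray scores;
      vpiArrayCreate(MAX_HARRIS_CORNERS, VPI_ARRAY_TYPE_U32, 0, &scores);
      VPIPayload harris;
      vpiCreateHarrisCornerDetector(VPI_BACKEND_CUDA, width, height, &harris);
      VPIHarrisCornerDetectorParams harrisParams;
      memset(&harrisParams, 0, sizeof(harrisParams));
      harrisParams.gradientSize = 5;     // gradient window size
      harrisParams.blockSize = 5;        // block window size used to compute the Harris Corner score
      harrisParams.sensitivity = 0.01;   // sensitivity threshold from the Harris-Stephens equation
      harrisParams.minNMSDistance = 8;   // non-maximum suppression radius, 0 disables it
      VPI_CHECK_STATUS(
          vpiSubmitHarrisCornerDetector(stream, VPI_BACKEND_CUDA, harris, prevImage, arrPrevPts, scores, &harrisParams));
      vpiStreamSync(stream);
      SortKeypoints(arrPrevPts, scores, MAX_KEYPOINTS);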
    10. Create the payload that will contain all temporary buffers needed for processing.
      VPIPayload optflow;
      vpiCreateOpticalFlowPyrLK(VPI_BACKEND_CUDA, width, height, imgFormat, levels, 0.5, &optflow);
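    11. Define the configuration parameters that guide the LK tracking process.
      VPIOpticalFlowPyrLKParams lkParams;
      memset(&lkParams, 0, sizeof(lkParams));
      lkParams.useInitialFlow = false;          // prevPts is copied to curPts as the initial estimate
      lkParams.termination = VPI_TERMINATION_CRITERIA_ITERATIONS | VPI_TERMINATION_CRITERIA_EPSILON;
      lkParams.epsilonType = VPI_LK_ERROR_L1;   // L1 distance between the previous and the next feature
      lkParams.epsilon = 0.0f;
      lkParams.windowDimension = 15;
      lkParams.numIterations = 6;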
    12. Generate the image pyramid for the previous image
      vpiSubmitGaussianPyramidGenerator(stream, VPI_BACKEND_CUDA, prevImage, pyrPrevFrame);
  2. Processing phase
    1. Start the processing loop from the second frame. The previous frame is where the algorithm fetches the feature points from; the current frame is where these feature points are estimated.
      for (int idframe = 1; idframe < frame_count; ++idframe)
      {
    2. Fetch current image
      FetchFrame(vid, &curImage, VPI_IMAGE_FORMAT_U8);
    3. Generate image pyramid for the current image
      vpiSubmitGaussianPyramidGenerator(stream, VPI_BACKEND_CUDA, curImage, pyrCurFrame);
    4. Submit the algorithm. It goes through all input feature points and finds the estimated points and their tracking status in the next image. The user decides whether to keep using the tracked feature points or to generate a new set. In this example, the tracked feature points are reused as input for the next frame.
      VPI_CHECK_STATUS(vpiSubmitOpticalFlowPyrLK(stream, VPI_BACKEND_CUDA, optflow, pyrPrevFrame, pyrCurFrame,
      arrPrevPts, arrCurPts, arrStatus, &lkParams));
    5. Wait until the processing is done.
      vpiStreamSync(stream);
    6. Prepare for the next iteration by swapping the previous and current images, pyramids, and feature point arrays
      VPIImage tmpImg = prevImage;
      prevImage = curImage;
      curImage = tmpImg;
      VPIPyramid tmpPyr = pyrPrevFrame;
      pyrPrevFrame = pyrCurFrame;
      pyrCurFrame = tmpPyr;
      VPIArray tmpArray = arrPrevPts;
      arrPrevPts = arrCurPts;
      arrCurPts = tmpArray;
      }
  3. Cleanup phase
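    1. Free resources held by the stream, the payloads, the pyramids, and the input and output arrays.
      vpiStreamDestroy(stream);
      vpiPayloadDestroy(harris);
      vpiPayloadDestroy(optflow);
      vpiPyramidDestroy(pyrPrevFrame);
      vpiPyramidDestroy(pyrCurFrame);
      vpiArrayDestroy(arrPrevPts);
      vpiArrayDestroy(arrCurPts);
      vpiArrayDestroy(arrStatus);
      vpiArrayDestroy(scores);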

For more information, see Pyramidal LK Optical Flow in the "API Reference" section of VPI - Vision Programming Interface.
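
To show what the tracking parameters in the usage steps control, here is a minimal sketch of a translational Lucas-Kanade update for a single point on a single pyramid level. It is plain C written for illustration and is not VPI's implementation: `Image`, `bilinear`, `gradient`, and `lk_track_point` are hypothetical names, the return convention (0 for tracked, 1 for failure) is an assumption, and a pyramidal tracker would repeat this step from the coarsest level to the finest, scaling the estimate between levels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-channel float image stored row by row. */
typedef struct {
    const float *data;
    int width;
    int height;
} Image;

/* Bilinear sample; coordinates are clamped to the image borders. */
static float bilinear(const Image *img, float x, float y)
{
    float maxX = (float)(img->width - 1);
    float maxY = (float)(img->height - 1);
    if (x < 0.0f) x = 0.0f;
    if (y < 0.0f) y = 0.0f;
    if (x > maxX) x = maxX;
    if (y > maxY) y = maxY;
    int x0 = (int)x, y0 = (int)y;
    int x1 = (x0 + 1 < img->width) ? x0 + 1 : x0;
    int y1 = (y0 + 1 < img->height) ? y0 + 1 : y0;
    float fx = x - (float)x0, fy = y - (float)y0;
    const float *d = img->data;
    int w = img->width;
    float top = (1.0f - fx) * d[y0 * w + x0] + fx * d[y0 * w + x1];
    float bot = (1.0f - fx) * d[y1 * w + x0] + fx * d[y1 * w + x1];
    return (1.0f - fy) * top + fy * bot;
}

/* Central-difference gradient of img at (x, y). */
static void gradient(const Image *img, float x, float y, float *ix, float *iy)
{
    *ix = 0.5f * (bilinear(img, x + 1.0f, y) - bilinear(img, x - 1.0f, y));
    *iy = 0.5f * (bilinear(img, x, y + 1.0f) - bilinear(img, x, y - 1.0f));
}

/* Tracks the point (px, py) from prev to cur.
   Returns 0 on success and 1 if the window has too little texture. */
int lk_track_point(const Image *prev, const Image *cur, float px, float py,
                   int windowDimension, int numIterations, float epsilon,
                   float *outX, float *outY)
{
    assert(prev != NULL && cur != NULL && outX != NULL && outY != NULL);
    assert(windowDimension > 0 && numIterations > 0);
    int r = windowDimension / 2;

    /* Spatial gradient matrix G, accumulated over the window in prev. */
    float gxx = 0.0f, gxy = 0.0f, gyy = 0.0f;
    for (int dy = -r; dy <= r; ++dy) {
        for (int dx = -r; dx <= r; ++dx) {
            float ix, iy;
            gradient(prev, px + (float)dx, py + (float)dy, &ix, &iy);
            gxx += ix * ix;
            gxy += ix * iy;
            gyy += iy * iy;
        }
    }
    float det = gxx * gyy - gxy * gxy;
    if (det < 1e-6f) {
        return 1; /* G is (nearly) singular: flat or edge-only window */
    }

    float vx = 0.0f, vy = 0.0f; /* current translation estimate */
    for (int it = 0; it < numIterations; ++it) {
        /* Mismatch vector b = sum of (prev - shifted cur) * gradient. */
        float bx = 0.0f, by = 0.0f;
        for (int dy = -r; dy <= r; ++dy) {
            for (int dx = -r; dx <= r; ++dx) {
                float x = px + (float)dx, y = py + (float)dy;
                float ix, iy;
                gradient(prev, x, y, &ix, &iy);
                float diff = bilinear(prev, x, y) - bilinear(cur, x + vx, y + vy);
                bx += diff * ix;
                by += diff * iy;
            }
        }
        /* Solve the 2x2 system G * u = b and refine the estimate. */
        float ux = (gyy * bx - gxy * by) / det;
        float uy = (gxx * by - gxy * bx) / det;
        vx += ux;
        vy += uy;
        if (ux * ux + uy * uy <= epsilon * epsilon) {
            break; /* update smaller than epsilon: converged */
        }
    }
    *outX = px + vx;
    *outY = py + vy;
    return 0;
}
```

In this sketch, `windowDimension` sets how many pixels contribute to each estimate, `numIterations` bounds the refinement loop, and `epsilon` stops it early once the update becomes small, which mirrors the roles the corresponding fields play in the parameter structure.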

Limitations and Constraints

Constraints for specific backends supersede the ones specified for all backends.

All Backends

PVA

  • Not implemented.

VIC

  • Not implemented.

References

  1. B. D. Lucas and T. Kanade (1981), "An iterative image registration technique with an application to stereo vision."
    Proceedings of the Image Understanding Workshop, pages 121–130
  2. J. Y. Bouguet (2000), "Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm."
    Intel Corporation, Microprocessor Research Labs