VPI - Vision Programming Interface

1.2 Release

Pyramidal LK Optical Flow

Overview

The Pyramidal Lucas-Kanade (LK) Optical Flow algorithm estimates the 2D translation of sparse feature points from a previous frame to the next. Image pyramids are used to improve the performance and robustness of tracking over larger translations.

Inputs are the previous image pyramid, the next image pyramid, and the feature points on the previous image.

Outputs are the feature points on the next image and the tracking status of each feature point.
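To make the role of the pyramid concrete, the sketch below computes the dimensions of each pyramid level and shows how a large displacement at full resolution becomes a small one at the coarsest level. The helper names `pyramid_levels` and `displacement_at_level`, the 0.5 scale factor, and the rounding policy are illustrative assumptions and are not taken from VPI.

```python
def pyramid_levels(width, height, num_levels, scale=0.5):
    """Return (width, height, factor) for each level; level 0 is full resolution."""
    levels = []
    for level in range(num_levels):
        factor = scale ** level
        # Rounding policy is an assumption; a real implementation may differ.
        w = max(1, int(round(width * factor)))
        h = max(1, int(round(height * factor)))
        levels.append((w, h, factor))
    return levels


def displacement_at_level(dx, dy, level, scale=0.5):
    """Express a full-resolution displacement in the pixel units of a given level."""
    factor = scale ** level
    return dx * factor, dy * factor


levels = pyramid_levels(640, 480, 4)
# A 40-pixel horizontal motion at level 0, seen from the coarsest level (3).
coarse_motion = displacement_at_level(40.0, -24.0, 3)

for index, (w, h, factor) in enumerate(levels):
    print(f"level {index}: {w}x{h} (factor {factor})")
print("motion at level 3:", coarse_motion)
```

Tracking starts at the coarsest level, where the motion spans only a few pixels, and the estimate is refined level by level down to full resolution.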


Implementation

Each feature point defines its location in the image with x, y coordinates. These points are then tracked in the next image. The tracking status reports whether each feature point is being tracked successfully. For more information, see [1] and [2].
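As a rough illustration of the tracking step described in [1] and [2], the following sketch runs the iterative Lucas-Kanade update for one point at a single pyramid level. It is not VPI's implementation: the synthetic images are analytic functions, and `prev_image`, `next_image`, `TRUE_SHIFT`, and `lk_track_point` are hypothetical names chosen for this example.

```python
import math

# Ground-truth translation between the two synthetic frames (demo assumption).
TRUE_SHIFT = (0.8, -0.5)


def prev_image(x, y):
    # Smooth synthetic intensity pattern standing in for the previous frame.
    return math.sin(0.2 * x) + math.cos(0.15 * y)


def next_image(x, y):
    # The next frame is the previous frame translated by TRUE_SHIFT.
    return prev_image(x - TRUE_SHIFT[0], y - TRUE_SHIFT[1])


def lk_track_point(px, py, half_window=7, iterations=10):
    """Estimate the translation of point (px, py); returns None if untrackable."""
    samples = []
    gxx = gxy = gyy = 0.0
    for wy in range(py - half_window, py + half_window + 1):
        for wx in range(px - half_window, px + half_window + 1):
            # Central-difference gradients of the previous frame.
            ix = (prev_image(wx + 1, wy) - prev_image(wx - 1, wy)) / 2.0
            iy = (prev_image(wx, wy + 1) - prev_image(wx, wy - 1)) / 2.0
            samples.append((wx, wy, ix, iy))
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy

    det = gxx * gyy - gxy * gxy
    if abs(det) < 1e-12:
        return None  # Gradient matrix is singular: no usable texture.

    vx = vy = 0.0
    for _ in range(iterations):
        bx = by = 0.0
        for wx, wy, ix, iy in samples:
            # Intensity mismatch under the current translation guess.
            diff = prev_image(wx, wy) - next_image(wx + vx, wy + vy)
            bx += diff * ix
            by += diff * iy
        # Solve the 2x2 system G * step = b and refine the guess.
        vx += (gyy * bx - gxy * by) / det
        vy += (gxx * by - gxy * bx) / det
    return vx, vy


flow = lk_track_point(20, 15)
print("estimated translation:", flow)
```

In a pyramidal tracker, this update would run at the coarsest level first, and the result, rescaled, would seed the guess at the next finer level.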

Usage

Language: Python
  1. Import VPI module
    import vpi
  2. Initialization phase
    1. Create the Pyramidal Optical Flow LK object, feeding it the initial frame and the VPI array with the keypoints to track. The CUDA backend will be used to execute the algorithm.
      with vpi.Backend.CUDA:
          optflow = vpi.OpticalFlowPyrLK(frame, curFeatures, 4)
  3. Processing phase
    1. Fetch a new frame from input video sequence into a VPI image.
      while inVideo.read(input)[0]:
    2. Feed this VPI image into the OptFlow object. It returns the estimated keypoint positions in the passed frame, along with a vector that reports each keypoint's state, i.e., whether it is still being tracked.
          curFeatures, status = optflow(input)
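After each call, an application typically keeps only the keypoints that are still being tracked. The sketch below shows one way to do this with plain Python lists standing in for the returned keypoints and status vector. The helper `keep_tracked` is hypothetical, and the encoding used here (a status of 0 meaning "tracked") is an assumption that should be checked against the API reference.

```python
def keep_tracked(points, status, tracked_value=0):
    """Return only the points whose status marks them as tracked.

    tracked_value is an assumption for this sketch; confirm the actual
    status encoding in the API reference before relying on it.
    """
    if len(points) != len(status):
        raise ValueError("points and status must have the same length")
    return [p for p, s in zip(points, status) if s == tracked_value]


# Stand-in data: three keypoints, the second one lost.
points = [(10.5, 20.0), (33.0, 41.2), (57.8, 12.4)]
status = [0, 1, 0]

tracked = keep_tracked(points, status)
print(tracked)
```

If too few keypoints survive, the application can detect a fresh set of features instead of reusing the tracked ones.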
Language: C/C++
  1. Initialization phase
    1. Include the header that defines the needed functions and structures.
      #include <vpi/algo/OpticalFlowPyrLK.h>
    2. Create the stream where the algorithm will be submitted for execution.
      VPIStream stream;
      vpiStreamCreate(0, &stream);
    3. Define the required images, pyramids, and arrays.
      VPIImage prevImage = /* previous frame */;
      VPIPyramid pyrPrevFrame = /* pyramid out of previous frame */;
      VPIPyramid pyrCurFrame = /* pyramid for current frame */;
      VPIArray arrPrevPts = /* array with previous frame's keypoints, type VPI_ARRAY_TYPE_KEYPOINT */;
      VPIArray arrCurPts = /* array with current frame's keypoints, type VPI_ARRAY_TYPE_KEYPOINT */;
      VPIArray arrStatus = /* array with keypoint tracking status, type VPI_ARRAY_TYPE_U8 */;
      VPIArray scores = /* array with keypoint scores, type VPI_ARRAY_TYPE_U8 */;
    4. Create the payload that will contain all temporary buffers needed for processing. Its parameters are taken from the input pyramid and image used.
      int levels;
      vpiPyramidGetNumLevels(pyrPrevFrame, &levels);
      float scale;
      vpiPyramidGetScale(pyrPrevFrame, &scale);
      VPIImageFormat format;
      vpiImageGetFormat(prevImage, &format);
      int width, height;
      vpiImageGetSize(prevImage, &width, &height);
      VPIPayload optflow;
      vpiCreateOpticalFlowPyrLK(VPI_BACKEND_CUDA, width, height, format, levels, scale, &optflow);
    5. Define the configuration parameters that guide the LK tracking process.
      VPIOpticalFlowPyrLKParams lkParams;
      vpiInitOpticalFlowPyrLKParams(&lkParams);
  2. Processing phase
    1. Start the processing loop at the second frame. The algorithm takes the feature points from the previous frame and estimates their positions in the current frame.
      for (int idframe = 1; idframe < frame_count; ++idframe)
      {
    2. Fetch new frame from the input video.
      curImage = /* new frame from video sequence */;
    3. Generate image pyramid for the current image using the CUDA backend.
      vpiSubmitGaussianPyramidGenerator(stream, VPI_BACKEND_CUDA, curImage, pyrCurFrame);
    4. Submit the algorithm to be executed by the CUDA backend. It will go through all input feature points, and find the estimated points and tracking status in the next image. The user will decide whether to continue using the tracked feature points or re-generate a new set of feature points. In this example the tracked feature points are reused as input for the next frame.
      vpiSubmitOpticalFlowPyrLK(stream, VPI_BACKEND_CUDA, optflow, pyrPrevFrame, pyrCurFrame, arrPrevPts, arrCurPts, arrStatus, &lkParams);
    5. Wait until the processing is done.
      vpiStreamSync(stream);
    6. Prepare for the next iteration. Current iteration's *cur* buffers will be used as *prev* buffers for the next iteration.
      VPIImage tmpImg = prevImage;
      prevImage = curImage;
      curImage = tmpImg;
      VPIPyramid tmpPyr = pyrPrevFrame;
      pyrPrevFrame = pyrCurFrame;
      pyrCurFrame = tmpPyr;
      VPIArray tmpArray = arrPrevPts;
      arrPrevPts = arrCurPts;
      arrCurPts = tmpArray;
      }
  3. Cleanup phase
    1. Free resources held by the stream, the payload, and the input and output arrays.
      vpiStreamDestroy(stream);
      vpiPayloadDestroy(optflow);
      vpiPyramidDestroy(pyrPrevFrame);
      vpiPyramidDestroy(pyrCurFrame);
      vpiArrayDestroy(arrPrevPts);
      vpiArrayDestroy(arrCurPts);
      vpiArrayDestroy(arrStatus);

For more information, see Pyramidal LK Optical Flow in the "API Reference" section of VPI - Vision Programming Interface.
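The processing phase above reuses each iteration's current buffers as the previous buffers of the next iteration. The sketch below mirrors that loop structure in plain Python, without VPI. The names `run_tracking_loop` and `toy_track` are hypothetical, and the "frames" are just integer offsets so that the example stays self-contained.

```python
def toy_track(prev_frame, cur_frame, prev_pts):
    """Stand-in tracker: frames are integer offsets, so the motion is their difference."""
    shift = cur_frame - prev_frame
    cur_pts = [(x + shift, y) for x, y in prev_pts]
    status = [0] * len(cur_pts)  # 0 = tracked (an assumption of this sketch)
    return cur_pts, status


def run_tracking_loop(frames, initial_points, track):
    """Track points across frames, feeding each result into the next iteration."""
    prev_frame = frames[0]
    prev_pts = list(initial_points)
    history = [prev_pts]
    # Start from the second frame, as in the processing loop above.
    for cur_frame in frames[1:]:
        cur_pts, _status = track(prev_frame, cur_frame, prev_pts)
        # This iteration's "cur" data becomes the next iteration's "prev" data.
        prev_frame, prev_pts = cur_frame, cur_pts
        history.append(cur_pts)
    return history


history = run_tracking_loop([0, 2, 5], [(1.0, 1.0)], toy_track)
print(history)
```

Swapping references rather than copying data keeps the per-frame cost independent of the buffer sizes, which is the same reason the C example swaps handles.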

Limitations and Constraints

Constraints for specific backends supersede the ones specified for all backends.

CPU and CUDA backends

PVA

  • Not implemented.

VIC

  • Not implemented.

Performance

For information on how to use the performance table below, see Algorithm Performance Tables.
Before comparing measurements, consult Comparing Algorithm Elapsed Times.
For further information on how performance was benchmarked, see Performance Benchmark.


References

  1. B. D. Lucas and T. Kanade (1981), "An iterative image registration technique with an application to stereo vision."
    Proceedings of Imaging Understanding Workshop, pages 121–130.
  2. J. Y. Bouguet (2000), "Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm."
    Intel Corporation, Microprocessor Research Labs.