Overview

The Dense Optical Flow algorithm estimates the motion vectors in every 4x4 pixel block between the previous and current frames. Its uses include motion detection and object tracking.

The output below represents each vector in the HSV color space, where the hue is related to the motion direction, and the value is proportional to the speed.

[Figure: input frame and output motion vectors]
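The color coding described above can be sketched as follows. This is a minimal illustration, not the code used to render the documentation images: the function name `motion_to_rgb`, the `max_speed` normalization parameter, the linear angle-to-hue mapping, and the fixed full saturation are all assumptions made for the example.

```python
import colorsys
import math


def motion_to_rgb(dx, dy, max_speed):
    """Map one motion vector (in pixels) to an 8-bit RGB triple via HSV."""
    speed = math.hypot(dx, dy)
    # Hue encodes direction: the angle in [0, 2*pi) is scaled to [0, 1).
    angle = math.atan2(dy, dx) % (2.0 * math.pi)
    hue = angle / (2.0 * math.pi)
    # Value encodes speed, clamped to 1.0 (max_speed is a hypothetical knob).
    value = min(speed / max_speed, 1.0) if max_speed > 0 else 0.0
    # Full saturation is an assumption; the source only specifies hue and value.
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(round(c * 255) for c in (r, g, b))
```

With this mapping a stationary block is black, and blocks moving in opposite directions receive complementary hues.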

Implementation

The algorithm analyzes the content of two images, previous and current, and writes an estimate of the motion to an output image.

As shown below, the algorithm splits input images into 4x4 pixel blocks. Then, for each block, it estimates the content translation from the previous to the current frame, and writes the estimate as a motion vector to the corresponding pixel in the output image.

[Figure: Dense Optical Flow Estimation]
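The relationship between input pixels and output vectors can be sketched in a few lines. The helper names below are hypothetical and only illustrate the indexing. The round-up in the grid size matches the `(width + 3) / 4` computation used in the C usage steps, so partial blocks at the right and bottom edges still get a vector.

```python
BLOCK_SIZE = 4  # each output vector covers one 4x4 input pixel block


def motion_grid_size(width, height, block=BLOCK_SIZE):
    """Output image size, rounded up so partial edge blocks are covered."""
    return (width + block - 1) // block, (height + block - 1) // block


def block_of_pixel(x, y, block=BLOCK_SIZE):
    """Coordinates of the output vector that covers input pixel (x, y)."""
    return x // block, y // block


def pixels_of_block(bx, by, width, height, block=BLOCK_SIZE):
    """Input rectangle (x0, y0, x1, y1), end-exclusive, clipped to the image."""
    x0, y0 = bx * block, by * block
    return x0, y0, min(x0 + block, width), min(y0 + block, height)
```

For a 1920x1080 input, `motion_grid_size` gives a 480x270 motion vector image.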

The 2D motion vector is represented as an (X, Y) coordinate pair, with each coordinate in S10.5 signed fixed-point format, as shown below:

[Figure: S10.5 signed fixed-point format]

Conversion between S10.5 format and floating point format is done as follows:

\begin{align*} S_{10.5} &= \lfloor F \times 32 \rfloor \\ F &= S_{10.5} / 32 \end{align*}
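As a small worked example of the conversion, the sketch below scales by 2^5 = 32 because the format has 5 fractional bits, which gives a resolution of 1/32 pixel. The function names are illustrative only. Converting back to floating point is a plain division, so the fractional part is preserved.

```python
import math

FRACTION_BITS = 5
SCALE = 1 << FRACTION_BITS  # 32


def float_to_s10_5(value):
    """Quantize a floating-point coordinate to S10.5 (rounds toward -infinity)."""
    return math.floor(value * SCALE)


def s10_5_to_float(fixed):
    """Convert an S10.5 integer back to a floating-point coordinate."""
    return fixed / SCALE
```

For instance, a displacement of 1.5 pixels is stored as 48, and a stored value of -8 corresponds to -0.25 pixels.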

C API functions

For a list of limitations, constraints, and backends that implement the algorithm, consult the reference documentation of the following functions:

Function	Description
vpiCreateOpticalFlowDense	Creates payload for vpiSubmitOpticalFlowDense.
vpiSubmitOpticalFlowDense	Runs dense Optical Flow on two frames, outputting motion vectors.
vpiOpticalFlowDenseSetSGMParams	Sets the semi-global matching parameters to be used by the Dense Optical Flow operations with the given payload.
vpiOpticalFlowDenseGetSGMParams	Retrieves the semi-global matching parameters set up in the Dense Optical Flow payload.

Usage

Python:
1. Import the VPI module:
  import vpi
2. Fetch the first frame:
  prevImage = inVideo.read()[1]
3. Fetch the next frame, stopping when the video ends:
  while True:
      ok, curImage = inVideo.read()
      if not ok:
          break
4. Execute the algorithm using the OFA backend, passing it the previous and current frames:
      with vpi.Backend.OFA:
          motion = vpi.optflow_dense(prevImage, curImage)
5. Prepare for the next iteration by assigning the current frame to the previous frame:
      prevImage = curImage

C/C++:
Initialization phase:
1. Include the header that defines the needed functions and types:
  #include <vpi/algo/OpticalFlowDense.h>
  
2. Create the stream where the algorithm will be submitted for execution:
  VPIStream stream;
  vpiStreamCreate(0, &stream);
  
3. Define the motion vector image with block-linear memory layout. The motion vector is in the form [x, y], representing the estimated translation, with both coordinates in S10.5 format. The output dimensions are calculated taking into account that one 4x4 input pixel block corresponds to one output vector:
  VPIImage mvImage;
  int32_t mvWidth = (width + 3) / 4;
  int32_t mvHeight = (height + 3) / 4;
  vpiImageCreate(mvWidth, mvHeight, VPI_IMAGE_FORMAT_2S16_BL, 0, &mvImage);
  
4. Create the payload to contain temporary buffers. The payload is configured for processing by the OFA backend:
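  VPIPayload optflow;
  int32_t gridSize = 4;
  /* imgFmtBL is the image format of the input frames; quality is a VPIOpticalFlowQuality value. */
  /* The arguments &gridSize and 1 give the grid size array and the number of levels. */
  VPI_CHECK_STATUS(
      vpiCreateOpticalFlowDense(VPI_BACKEND_OFA, width, height, imgFmtBL, &gridSize, 1, quality, &optflow));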
  
5. Fetch the first frame:
  VPIImage prevImage = /* previous frame */;
Processing phase:
1. Start the processing loop at the second frame:
  for (int idframe = 1; idframe < frame_count; ++idframe)
  {
2. Fetch the current frame:
  VPIImage curImage = /* current frame */;
3. Submit the algorithm. Both the previous and current images are fed to the OFA backend, which generates one motion vector for each 4x4 pixel block:
  vpiSubmitOpticalFlowDense(stream, VPI_BACKEND_OFA, optflow, prevImage, curImage, mvImage);
  
4. (optional) If there are no more tasks to be submitted to the stream, wait until the stream finishes processing. Once the sync is done, you can use the output motion vectors calculated in this iteration:
  vpiStreamSync(stream);
  
5. Swap the previous and current images so that the current image becomes the next iteration's previous image:
  VPIImage tmpImg = prevImage;
  prevImage = curImage;
  curImage = tmpImg;
  }
Cleanup phase:
1. Free resources held by the stream, the payload, and the input and output images:
  vpiStreamDestroy(stream);
  vpiPayloadDestroy(optflow);
  vpiImageDestroy(prevImage);
  vpiImageDestroy(curImage);
  vpiImageDestroy(mvImage);
  

Consult the Dense Optical Flow sample for a complete example.

For more information, see Dense Optical Flow in the "C API Reference" section of VPI - Vision Programming Interface.
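Once the stream has been synchronized, the motion vectors are typically post-processed on the host. The sketch below shows one way to decode a row of vectors and flag moving blocks, in line with the motion detection use mentioned in the overview. It assumes the vectors are already available as a pitch-linear host buffer of interleaved little-endian 16-bit (x, y) pairs; how that buffer is obtained from the block-linear output image is outside this sketch, and the function names and the `min_speed` threshold are hypothetical.

```python
import struct


def decode_motion_row(row_bytes, count):
    """Decode `count` interleaved little-endian int16 (x, y) pairs to floats."""
    raw = struct.unpack_from("<%dh" % (2 * count), row_bytes)
    # Each component is S10.5 fixed point, so divide by 32 to get pixels.
    return [(raw[i] / 32.0, raw[i + 1] / 32.0) for i in range(0, 2 * count, 2)]


def moving_blocks(vectors, min_speed):
    """Indices of blocks whose speed (pixels per frame) reaches `min_speed`."""
    return [i for i, (dx, dy) in enumerate(vectors)
            if (dx * dx + dy * dy) ** 0.5 >= min_speed]
```

A row holding the raw pairs (48, -8) and (0, 0) decodes to (1.5, -0.25) and (0.0, 0.0), and only the first block passes a threshold of one pixel per frame.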

Performance

For information on how to use the performance table below, see Algorithm Performance Tables.
Before comparing measurements, consult Comparing Algorithm Elapsed Times.
For further information on how performance was benchmarked, see Performance Benchmark.

VPI - Vision Programming Interface

3.2 Release
