VPI - Vision Programming Interface

1.2 Release

Dense Optical Flow

Overview

The Dense Optical Flow algorithm estimates the motion vector of every 4x4 pixel block between the previous and the current frame. Typical uses include motion detection and object tracking.

The output below represents each vector in the HSV color space, where the hue encodes the motion direction and the value is proportional to the motion speed.

Figure: Input (left); output motion vectors (right).
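
To make this mapping concrete, the sketch below shows one way a motion vector could be converted to such an HSV color: direction becomes hue and speed becomes value. MotionVectorToHSV and maxSpeed are hypothetical names introduced for illustration, and the scaling is an assumption; the image above may have been rendered with different constants.

  #include <math.h>

  /* Hypothetical helper mapping one motion vector (in pixels) to an HSV
   * color. The normalization constant maxSpeed is an illustrative
   * assumption, not the exact value used to render the image above. */
  typedef struct { float h, s, v; } HSV;

  static HSV MotionVectorToHSV(float mx, float my, float maxSpeed)
  {
      const float PI = 3.14159265f;

      /* Hue encodes direction: map atan2's (-pi, pi] range to [0, 360). */
      float angle = atan2f(my, mx);
      float hue = (angle < 0.0f ? angle + 2.0f * PI : angle) * 180.0f / PI;

      /* Value encodes speed, clamped against the assumed maximum. */
      float speed = sqrtf(mx * mx + my * my);
      float value = fminf(speed / maxSpeed, 1.0f);

      HSV c = {hue, 1.0f, value};
      return c;
  }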

Implementation

The algorithm analyzes the content of two images, previous and current, and writes an estimate of the motion to an output image.

As shown below, the algorithm splits the input images into 4x4 pixel blocks. For each block, it estimates the translation of the content from the previous to the current frame and writes the estimate as a motion vector to the corresponding pixel in the output image. For example, a 1920x1080 input produces a 480x270 motion-vector image.

Figure: Dense Optical Flow Estimation.

The 2D motion vector is represented as an (X, Y) coordinate pair, with each coordinate in S10.5 signed fixed-point format, as shown below:

Figure: S10.5 signed fixed-point format.

Conversion between the S10.5 format and floating point is done as follows:

\begin{align*} S_{10.5} &= \lfloor F \times 32 \rfloor \\ F &= S_{10.5} / 32 \end{align*}
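
As a concrete illustration of these formulas, the helpers below convert between float and S10.5, using the scale factor 2^5 = 32 (10 integer bits, 5 fractional bits, plus a sign bit). FloatToS105 and S105ToFloat are hypothetical names introduced here for illustration.

  #include <math.h>
  #include <stdint.h>

  /* Encode: quantize a float into S10.5 by flooring F * 32. */
  static int16_t FloatToS105(float f)
  {
      return (int16_t)floorf(f * 32.0f);
  }

  /* Decode: divide by 32; this direction is exact, no flooring. */
  static float S105ToFloat(int16_t s)
  {
      return s / 32.0f;
  }

For example, FloatToS105(1.5f) yields 48, and S105ToFloat(48) recovers 1.5f; fractions finer than 1/32 pixel are lost in the encoding step.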

Usage

Language: Python
  1. Import VPI module
    import vpi
  2. Fetch the first frame.
    prevImage = inVideo.read()[1]
  3. Fetch the next frame.
    while inVideo.read(curImage)[0]:
  4. Execute the algorithm using the NVENC backend, passing it the previous and the current frame.
    with vpi.Backend.NVENC:
        motion = vpi.optflow_dense(prevImage, curImage)
  5. Prepare for the next iteration by assigning the current frame to the previous one.
    prevImage = curImage
Language: C/C++
  1. Initialization phase:
    1. Include the header that declares the functions and types that implement the dense optical flow:
      #include <vpi/algo/OpticalFlowDense.h>
    2. Create the stream where the algorithm will be submitted for execution:
      VPIStream stream;
      vpiStreamCreate(0, &stream);
    3. Define the motion vector image with block-linear memory layout. The motion vector is in the form [x, y], representing the estimated translation, with both coordinates in S10.5 format. The output dimensions are calculated taking into account that one 4x4 input pixel block corresponds to one output vector:
      VPIImage mvImage;
      int32_t mvWidth = (width + 3) / 4;
      int32_t mvHeight = (height + 3) / 4;
      vpiImageCreate(mvWidth, mvHeight, VPI_IMAGE_FORMAT_2S16_BL, 0, &mvImage);
    4. Create the payload to contain temporary buffers. The payload is configured for processing by the NVENC backend, given the input dimensions, the block-linear input image format imgFmtBL, and the desired quality, a VPIOpticalFlowQuality value:
      VPIPayload optflow;
      vpiCreateOpticalFlowDense(VPI_BACKEND_NVENC, width, height, imgFmtBL, quality, &optflow);
    5. Fetch the first frame:
      VPIImage prevImage = /* previous frame */;
  2. Processing phase:
    1. Start the processing loop at the second frame:
      for (int idframe = 1; idframe < frame_count; ++idframe)
      {
    2. Fetch the current frame:
      VPIImage curImage = /* current frame */;
    3. Submit the algorithm to the stream. It feeds both the previous and the current image to the NVIDIA encoder engine, which generates one motion vector per 4x4 pixel block:
      vpiSubmitOpticalFlowDense(stream, VPI_BACKEND_NVENC, optflow, prevImage, curImage, mvImage);
    4. (optional) If there are no more tasks to be submitted to the stream, wait until the stream finishes processing. Once the sync is done, you can use the output motion vectors calculated in this iteration:
      vpiStreamSync(stream);
    5. Swap the previous and current images so that the current image becomes the next iteration's previous image:
      VPIImage tmpImg = prevImage;
      prevImage = curImage;
      curImage = tmpImg;
      }
  3. Cleanup phase:
    1. Free resources held by the stream, the payload, and the input and output images:
      vpiStreamDestroy(stream);
      vpiPayloadDestroy(optflow);
      vpiImageDestroy(prevImage);
      vpiImageDestroy(curImage);
      vpiImageDestroy(mvImage);
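
Putting the three phases together, a complete pipeline could look like the sketch below. It uses only the VPI calls shown in the steps above; DenseOpticalFlowSketch and readFrame are hypothetical names introduced for illustration, error checking is omitted, and frame acquisition is assumed to produce block-linear images in the format the payload was created with.

  #include <stdint.h>
  #include <vpi/Image.h>
  #include <vpi/Stream.h>
  #include <vpi/algo/OpticalFlowDense.h>

  /* readFrame() is a hypothetical helper that returns the next video frame
   * as a block-linear VPIImage; it is not part of the VPI API. */
  extern VPIImage readFrame(void);

  void DenseOpticalFlowSketch(int32_t width, int32_t height,
                              VPIImageFormat imgFmtBL,
                              VPIOpticalFlowQuality quality, int frameCount)
  {
      VPIStream stream;
      vpiStreamCreate(0, &stream);

      /* One output vector per 4x4 input block, stored as S10.5 (x, y) pairs. */
      VPIImage mvImage;
      vpiImageCreate((width + 3) / 4, (height + 3) / 4,
                     VPI_IMAGE_FORMAT_2S16_BL, 0, &mvImage);

      VPIPayload optflow;
      vpiCreateOpticalFlowDense(VPI_BACKEND_NVENC, width, height,
                                imgFmtBL, quality, &optflow);

      VPIImage prevImage = readFrame();

      for (int idframe = 1; idframe < frameCount; ++idframe)
      {
          VPIImage curImage = readFrame();

          vpiSubmitOpticalFlowDense(stream, VPI_BACKEND_NVENC, optflow,
                                    prevImage, curImage, mvImage);
          vpiStreamSync(stream); /* mvImage now holds this pair's vectors. */

          /* Unlike the buffer swap above, this sketch fetches a fresh image
           * per frame, so the old previous frame is simply released. */
          vpiImageDestroy(prevImage);
          prevImage = curImage;
      }

      vpiStreamDestroy(stream);
      vpiPayloadDestroy(optflow);
      vpiImageDestroy(prevImage);
      vpiImageDestroy(mvImage);
  }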

Consult the Dense Optical Flow sample for a complete example.

For more information, see Dense Optical Flow in the "API Reference" section of VPI - Vision Programming Interface.

Limitations and Constraints

NVENC

  • Only supported on Jetson Xavier NX and Jetson AGX Xavier series.
  • The previous and current images must have the same dimensions and format.
  • The output motion vector image must have dimensions \((\lceil w/4 \rceil, \lceil h/4 \rceil)\), where the previous and current images' dimensions are \((w, h)\).
  • Accepted input types: block-linear formats only; see Dense Optical Flow in the API Reference for the complete list.
  • Accepted output types: VPI_IMAGE_FORMAT_2S16_BL.

Other backends

  • Not supported

Performance

For information on how to use the performance table below, see Algorithm Performance Tables.
Before comparing measurements, consult Comparing Algorithm Elapsed Times.
For further information on how performance was benchmarked, see Performance Benchmark.
