The Dense Optical Flow algorithm estimates a motion vector for every 4x4 pixel block between the previous and the current frame. Its uses include motion detection and object tracking.
The output below represents each vector in the HSV color space: the hue encodes the motion direction, and the value is proportional to the motion speed.
Input
Output motion vectors
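The direction-to-hue, speed-to-value mapping described above can be sketched in pure Python. This is an illustrative sketch only; the function name and the `max_speed` normalization are assumptions for the example, not part of the VPI API:

```python
import math

def motion_vector_to_hsv(x, y, max_speed=16.0):
    """Map a motion vector (in pixels/frame) to an HSV triple for visualization.

    Hue (0-360 degrees) encodes the direction, value (0-1) is proportional to
    the speed, clamped at max_speed; saturation is fixed at 1. max_speed is an
    illustrative normalization constant, not a VPI parameter.
    """
    hue = math.degrees(math.atan2(y, x)) % 360.0
    speed = math.hypot(x, y)
    value = min(speed / max_speed, 1.0)
    return hue, 1.0, value

# A vector pointing right at 8 px/frame: hue 0 (red), half brightness.
print(motion_vector_to_hsv(8.0, 0.0))  # -> (0.0, 1.0, 0.5)
```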
Implementation
The algorithm analyzes the content of two images, previous and current, and writes an estimate of the motion to an output image.
As shown below, the algorithm splits the input images into 4x4 pixel blocks. For each block, it estimates the translation of its content from the previous to the current frame and writes the estimate as a motion vector to the corresponding pixel of the output image.
Dense Optical Flow Estimation
The 2D motion vector is represented as an (X, Y) coordinate pair, with each coordinate in the S10.5 signed fixed-point format, as shown below:
S10.5 signed fixed-point format
Conversion between the S10.5 format and floating point is done as follows:
\begin{align*} S_{10.5} &= \lfloor F \times 32 \rfloor \\ F &= S_{10.5} / 32 \end{align*}
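The two conversions can be expressed directly in code. The helper names below are illustrative; only the arithmetic comes from the formulas above:

```python
import math

def float_to_s10_5(f):
    """Convert a float to S10.5 fixed point: floor(f * 32).

    S10.5 packs a sign bit, 10 integer bits, and 5 fractional bits,
    giving a resolution of 1/32 pixel.
    """
    return math.floor(f * 32)

def s10_5_to_float(s):
    """Convert an S10.5 fixed-point value back to float: s / 32."""
    return s / 32

print(float_to_s10_5(2.5))                    # -> 80
print(s10_5_to_float(80))                     # -> 2.5
print(s10_5_to_float(float_to_s10_5(-1.25)))  # -> -1.25
```

Values that are exact multiples of 1/32 round-trip losslessly; finer fractions are truncated toward negative infinity by the floor.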
Usage
Import the VPI module.
import vpi
Fetch the first frame.
prevImage = inVideo.read()[1]
Fetch the next frame, exiting the loop when the video ends.
while True:
    ret, curImage = inVideo.read()
    if not ret:
        break
Execute the algorithm using the NVENC backend, passing it the previous and the current frame.
    with vpi.Backend.NVENC:
        motion = vpi.optflow_dense(prevImage, curImage)
Prepare for the next iteration by assigning the current frame to the previous one.
    prevImage = curImage
Initialization phase:
Include the header that defines the needed functions and types:
Define the motion vector image with block-linear memory layout. The motion vector is in the form [x, y], representing the estimated translation, with both coordinates in S10.5 format. The output dimensions are calculated taking into account that one 4x4 input pixel block corresponds to one output vector:
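The output-dimension calculation described above can be sketched as follows. The function name and the `block` parameter are illustrative; the ceiling arithmetic matches the dimension constraint given in the Limitations and Constraints section:

```python
import math

def flow_output_size(w, h, block=4):
    """Motion-vector image dimensions: one output vector per 4x4 input block.

    Dimensions that are not multiples of the block size round up, so
    partial border blocks still get a vector.
    """
    return math.ceil(w / block), math.ceil(h / block)

print(flow_output_size(1920, 1080))  # -> (480, 270)
print(flow_output_size(1918, 1078))  # -> (480, 270)
```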
Submit the algorithm to the stream. It feeds both the previous and the current image to the NVIDIA encoder engine and generates a motion vector for each 4x4 pixel block:
(optional) If there are no more tasks to be submitted to the stream, wait until the stream finishes processing. Once the sync is done, you can use the output motion vectors calculated in this iteration:
For more information, see Dense Optical Flow in the "API Reference" section of VPI - Vision Programming Interface.
Limitations and Constraints
NVENC
Only supported on Jetson Xavier NX and Jetson AGX Xavier series.
The previous and current images must have the same dimensions and type.
The output motion vector image must have dimensions \((\lceil w/4 \rceil, \lceil h/4 \rceil)\), where the previous and current images' dimensions are \((w, h)\).