VPI - Vision Programming Interface

1.1 Release

Dense Optical Flow


The Dense Optical Flow algorithm estimates a motion vector for every 4x4 pixel block between the previous and current frames. Its uses include motion detection and object tracking.

The output below represents each vector in the HSV color space, where the hue is related to the motion direction, and the value is proportional to the speed.

Figure: Input and output motion vectors
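The HSV encoding described above can be sketched for a single motion vector as follows. This is a minimal illustration, not the rendering code used to produce the figure: the function name `motion_to_rgb`, the `max_speed` normalization, and the choice of full saturation are assumptions made here.

```python
import colorsys
import math

def motion_to_rgb(dx, dy, max_speed):
    """Map one motion vector (in pixels) to an 8-bit RGB color.

    Assumptions for this sketch: hue is the vector angle as a fraction of a
    full turn, saturation is fixed at 1, and value is speed / max_speed,
    clamped to 1.
    """
    speed = math.hypot(dx, dy)
    # Direction in [0, 1): angle from the +X axis, wrapped to a full turn.
    hue = (math.atan2(dy, dx) % (2 * math.pi)) / (2 * math.pi)
    # Brightness grows linearly with speed and saturates at max_speed.
    value = min(speed / max_speed, 1.0) if max_speed > 0 else 0.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(round(c * 255) for c in (r, g, b))
```

With this mapping a static block is black, and blocks moving in opposite directions get complementary hues.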


The algorithm analyzes the content of two images, previous and current, and writes an estimate of the motion to an output image.

As shown below, the algorithm splits input images into 4x4 pixel blocks. Then, for each block, it estimates the content translation from the previous to the current frame, and writes the estimate as a motion vector to the corresponding pixel in the output image.

Dense Optical Flow Estimation
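The block-to-vector correspondence can be sketched in a few lines. The helper names `motion_field_size` and `block_of_pixel` are hypothetical; rounding the output size up, so that partial blocks at the right and bottom edges still receive a vector, mirrors the `(width + 3) / 4` computation used in the C/C++ steps on this page.

```python
BLOCK_SIZE = 4  # one output motion vector per 4x4 input pixel block

def motion_field_size(width, height):
    """Size of the motion vector image for a width x height input frame."""
    return ((width + BLOCK_SIZE - 1) // BLOCK_SIZE,
            (height + BLOCK_SIZE - 1) // BLOCK_SIZE)

def block_of_pixel(x, y):
    """Output coordinates of the vector that covers input pixel (x, y)."""
    return x // BLOCK_SIZE, y // BLOCK_SIZE
```

For example, a 1920x1080 input yields a 480x270 motion vector image.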

The 2D motion vector is represented as an (X, Y) coordinate pair, with each coordinate in S10.5 signed fixed-point format, as shown below:

S10.5 signed fixed-point format

Conversion between S10.5 format and floating point format is done as follows:

\begin{align*} S_{10.5} &= \lfloor F \times 32 \rfloor \\ F &= S_{10.5} / 32 \end{align*}
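A minimal sketch of the conversion in both directions. The function names are hypothetical, and the range check assumes each coordinate is stored in a signed 16-bit integer (1 sign bit, 10 integer bits, 5 fractional bits). The float-to-fixed direction floors as in the formula above; the reverse direction is assumed here to be a plain division by 32, so that the five fractional bits are preserved.

```python
import math

FRACTION_BITS = 5
SCALE = 1 << FRACTION_BITS  # 32 steps per pixel

def float_to_s10_5(value):
    """Quantize a displacement in pixels to a raw S10.5 integer."""
    raw = math.floor(value * SCALE)
    # Assumed container: signed 16-bit, i.e. -1024.0 to +1023.96875 pixels.
    if not -32768 <= raw <= 32767:
        raise OverflowError("value does not fit in S10.5")
    return raw

def s10_5_to_float(raw):
    """Convert a raw S10.5 integer back to a displacement in pixels."""
    return raw / SCALE
```

The smallest representable step is 1/32 = 0.03125 pixel, so a raw value of 48 corresponds to a displacement of 1.5 pixels.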


  1. Import the VPI module.
    import vpi
  2. Fetch the first frame.
    prevImage = inVideo.read()[1]
  3. Fetch the next frame.
    while inVideo.read(curImage)[0]:
  4. Execute the algorithm using the NVENC backend, passing it the previous and the current frame.
    with vpi.Backend.NVENC:
        motion = vpi.optflow_dense(prevImage, curImage)
  5. Prepare for the next iteration by assigning the current frame to the previous frame.
    prevImage = curImage
  1. Initialization phase:
    1. Include the header that defines the needed functions and types:
      #include <vpi/algo/OpticalFlowDense.h>
    2. Create the stream where the algorithm will be submitted for execution:
      VPIStream stream;
      vpiStreamCreate(0, &stream);
    3. Define the motion vector image with block-linear memory layout. The motion vector is in the form [x, y], representing the estimated translation, with both coordinates in S10.5 format. The output dimensions are calculated taking into account that one 4x4 input pixel block corresponds to one output vector:
      VPIImage mvImage;
      int32_t mvWidth = (width + 3) / 4;
      int32_t mvHeight = (height + 3) / 4;
      vpiImageCreate(mvWidth, mvHeight, VPI_IMAGE_FORMAT_2S16_BL, 0, &mvImage);
    4. Create the payload to contain temporary buffers. The payload is configured for processing by the NVENC backend:
      VPIPayload optflow;
      vpiCreateOpticalFlowDense(VPI_BACKEND_NVENC, width, height, imgFmtBL, quality, &optflow);
    5. Fetch first frame:
      VPIImage prevImage = /* previous frame */;
  2. Processing phase:
    1. Start the processing loop at the second frame:
      for (int idframe = 1; idframe < frame_count; ++idframe)
    2. Fetch the current frame:
      VPIImage curImage = /* current frame */;
    3. Submit the algorithm. It feeds both the previous and current images to the NVIDIA encoder engine (NVENC), which generates a motion vector for each 4x4 pixel block:
      vpiSubmitOpticalFlowDense(stream, VPI_BACKEND_NVENC, optflow, prevImage, curImage, mvImage);
    4. (optional) If there are no more tasks to be submitted to the stream, wait until the stream finishes processing. Once the sync is done, you can use the output motion vectors calculated in this iteration:
      vpiStreamSync(stream);
    5. Swap the previous and current images so that the current image becomes the next iteration's previous image:
      VPIImage tmpImg = prevImage;
      prevImage = curImage;
      curImage = tmpImg;
  3. Cleanup phase:
    1. Free resources held by the stream, the payload, and the input and output images:
      vpiStreamDestroy(stream);
      vpiPayloadDestroy(optflow);
      vpiImageDestroy(prevImage);
      vpiImageDestroy(curImage);
      vpiImageDestroy(mvImage);

Consult the Dense Optical Flow sample for a complete example.
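Once the motion vectors have been copied to host memory, a simple motion detector can threshold their magnitude. The sketch below assumes the field is available as rows of raw (x, y) S10.5 integer pairs; the function name `moving_blocks` and the `min_speed` parameter are hypothetical and not part of the VPI API.

```python
S10_5_SCALE = 32  # raw S10.5 units per pixel

def moving_blocks(raw_field, min_speed):
    """Flag blocks whose estimated speed is at least min_speed pixels/frame.

    raw_field: rows of (x, y) pairs of raw S10.5 integers, one per 4x4 block.
    Returns rows of booleans with the same shape.
    """
    # Compare squared magnitudes in raw units to avoid a square root per block.
    threshold_sq = (min_speed * S10_5_SCALE) ** 2
    return [[x * x + y * y >= threshold_sq for (x, y) in row]
            for row in raw_field]
```

Each `True` entry marks one 4x4 input block whose content moved by at least `min_speed` pixels between the two frames.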

For more information, see Dense Optical Flow in the "API Reference" section of VPI - Vision Programming Interface.

Limitations and Constraints


Other backends

  • Not supported


For information on how to use the performance table below, see Algorithm Performance Tables.
Before comparing measurements, consult Comparing Algorithm Elapsed Times.
For further information on how performance was benchmarked, see Performance Benchmark.