VPI - Vision Programming Interface

3.0 Release

Dense Optical Flow

Overview

This application fetches frames from an input video source, runs the algorithm on the previous and current images, and then calculates the motion vectors for every 4x4 pixel block. The output motion vectors are mapped to the HSV color space, where hue relates to motion angle and value relates to motion speed, and the result is saved to a video file.
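The HSV mapping described above can be sketched for a single motion vector using only the Python standard library. The function name `flow_to_bgr` is hypothetical, and the clip value of 5 pixels is borrowed from the sample code further down. The sample itself does this on whole images with OpenCV, so treat this as an illustration rather than the sample's implementation.

```python
import colorsys
import math


def flow_to_bgr(dx, dy, clip=5.0):
    """Map one motion vector (in pixels) to an 8-bit BGR color.

    Hue encodes the motion direction, value encodes the speed,
    saturated at `clip` pixels. Saturation is fixed at 1.
    """
    # Direction in degrees, wrapped to [0, 360]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    # Speed, truncated so large motions do not wash out small ones
    magnitude = min(math.hypot(dx, dy), clip)
    # colorsys expects hue, saturation and value in [0, 1]
    r, g, b = colorsys.hsv_to_rgb(angle / 360.0, 1.0, magnitude / clip)
    return (int(b * 255), int(g * 255), int(r * 255))
```

A stationary block maps to black, and a block moving quickly along the positive x axis maps to pure red in BGR order.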

Instructions

The command line parameters are:

<backend> <input video> <quality> <gridsize> <numlevels>

where

  • backend: Backend that performs the processing. Only ofa is supported, and it is available only on Jetson AGX Orin.
  • input video: Input video file name. Accepts .mp4, .avi and possibly other formats, depending on OpenCV's support.
  • quality: Quality that the algorithm will use. Available options are low (fastest), medium (balanced performance and quality) and high (slowest).
  • gridsize: Size of the regular grid over the image; each grid cell yields one motion vector. Use 1 for a dense grid. The Python sample accepts 1, 2, 4 or 8.
  • numlevels: Number of pyramid levels used. The Python sample accepts 1 to 5.
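As a rough illustration of how gridsize affects the output, the motion vector image has one vector per grid cell, so its dimensions are the input dimensions divided by the grid size, rounded up. The helper name `motion_vector_size` is hypothetical; the arithmetic mirrors the output-size computation in the sample code below.

```python
def motion_vector_size(width, height, grid_size):
    """Return (width, height) of the motion vector image.

    Each grid cell yields one vector, and partial cells at the
    right and bottom edges still count, hence the ceiling division.
    """
    return ((width + grid_size - 1) // grid_size,
            (height + grid_size - 1) // grid_size)
```

For example, a 1920x1080 input with a grid size of 4 gives a 480x270 motion vector image, while a grid size of 1 keeps the input resolution.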

Here's one example for Jetson AGX Orin.

  • C++
    ./vpi_sample_13_optflow_dense ofa ../assets/pedestrians.mp4 high 1 5
  • Python
    python3 main.py ofa ../assets/pedestrians.mp4 high 1 5

The application will process pedestrians.mp4 and create denseoptflow_mv_ofa.mp4 (C++) or denseoptflow_mv_python3_ofa.mp4 (Python 3).

Results

Input video | Motion vector video

Source Code

For convenience, here's the code that is also installed in the samples directory.

Python:
import sys
import vpi
import numpy as np
from os import path
from argparse import ArgumentParser
from contextlib import contextmanager
import cv2


# ----------------------------
# Some utility functions

def process_motion_vectors(mv):
    with mv.rlock_cpu() as data:
        # convert S10.5 format to float
        flow = np.float32(data)/(1<<5)

    # Create an image where the motion vector angle is
    # mapped to a color hue, and intensity is proportional
    # to vector's magnitude
    magnitude, angle = cv2.cartToPolar(flow[:,:,0], flow[:,:,1], angleInDegrees=True)

    clip = 5.0
    cv2.threshold(magnitude, clip, clip, cv2.THRESH_TRUNC, magnitude)

    # build the hsv image
    hsv = np.ndarray([flow.shape[0], flow.shape[1], 3], np.float32)
    hsv[:,:,0] = angle
    hsv[:,:,1] = np.ones((angle.shape[0], angle.shape[1]), np.float32)
    hsv[:,:,2] = magnitude / clip

    # Convert HSV to BGR8
    bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return np.uint8(bgr*255)

# ----------------------------
# Parse command line arguments

parser = ArgumentParser()
parser.add_argument('backend', choices=['ofa'],
                    help='Backend to be used for processing')

parser.add_argument('input',
                    help='Input video to be processed')

parser.add_argument('quality', choices=['low', 'medium', 'high'],
                    help='Quality setting')

parser.add_argument('gridSize', type=int, choices=[1,2,4,8],
                    help='Grid size')

parser.add_argument('numLevels', type=int, choices=[1,2,3,4,5],
                    help='Number of pyramid levels')

args = parser.parse_args()

assert args.backend == 'ofa'
if args.backend == 'ofa':
    backend = vpi.Backend.OFA

if args.quality == "low":
    quality = vpi.OptFlowQuality.LOW
elif args.quality == "medium":
    quality = vpi.OptFlowQuality.MEDIUM
else:
    assert args.quality == "high"
    quality = vpi.OptFlowQuality.HIGH

# -----------------------------
# Open input and output videos

inVideo = cv2.VideoCapture(args.input)

fourcc = cv2.VideoWriter_fourcc(*'MPEG')
inSize = (int(inVideo.get(cv2.CAP_PROP_FRAME_WIDTH)), int(inVideo.get(cv2.CAP_PROP_FRAME_HEIGHT)))
fps = inVideo.get(cv2.CAP_PROP_FPS)

# Calculate the output dimensions based on the input's and the chosen grid size
outSize = ((inSize[0] + args.gridSize-1)//args.gridSize, (inSize[1]+args.gridSize-1)//args.gridSize)

outVideo = cv2.VideoWriter('denseoptflow_mv_python'+str(sys.version_info[0])+'_'+args.backend+'.mp4',
                           fourcc, fps, outSize)

#---------------------------------
# Main processing loop

prevFrame = None

idFrame = 0
while True:
    # Read one input frame
    ret, cvFrame = inVideo.read()
    if not ret:
        break

    # Convert it to Y8_ER_BL pyramid format to be used by VPI
    # No single backend can convert from OpenCV's BGR8 to Y8_ER_BL
    # required by the algorithm. We must do in two steps using CUDA and VIC.
    curFrame = vpi.asimage(cvFrame, vpi.Format.BGR8) \
        .convert(vpi.Format.Y8_ER, backend=vpi.Backend.CUDA) \
        .gaussian_pyramid(args.numLevels, backend=vpi.Backend.CUDA) \
        .convert(vpi.Format.Y8_ER_BL, backend=vpi.Backend.VIC)

    # Need at least 2 frames to start processing
    if prevFrame is not None:
        print("Processing frame {}".format(idFrame))

        # Calculate the motion vectors from previous to current frame
        with backend:
            motion_vectors = vpi.optflow_dense(prevFrame, curFrame, quality = quality, gridsize = args.gridSize)

        # Turn motion vectors into an image
        motion_image = process_motion_vectors(motion_vectors)

        # Save it to output video
        outVideo.write(motion_image)

    # Prepare next iteration
    prevFrame = curFrame
    idFrame += 1
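
Both listings convert the motion vectors from S10.5 fixed point to floating point by dividing by 1 << 5. A minimal sketch of that conversion follows, assuming S10.5 denotes a signed 16-bit value whose lowest 5 bits hold the fractional part. The helper name `s10_5_to_float` is hypothetical and is not part of VPI.

```python
def s10_5_to_float(raw):
    """Convert a raw S10.5 fixed-point integer to a float, in pixels.

    The 5 fractional bits give a resolution of 1/32 pixel, so the
    conversion is a division by 2**5.
    """
    return raw / (1 << 5)
```

Under this assumption, a raw value of 32 corresponds to a displacement of 1.0 pixel and a raw value of -48 to -1.5 pixels.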

C++:
#include <opencv2/core/version.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <vpi/OpenCVInterop.hpp>

#include <vpi/Array.h>
#include <vpi/Image.h>
#include <vpi/ImageFormat.h>
#include <vpi/Pyramid.h>
#include <vpi/Status.h>
#include <vpi/Stream.h>
#include <vpi/algo/ConvertImageFormat.h>
#include <vpi/algo/GaussianPyramid.h>
#include <vpi/algo/OpticalFlowDense.h>

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

#define CHECK_STATUS(STMT)                                    \
    do                                                        \
    {                                                         \
        VPIStatus status = (STMT);                            \
        if (status != VPI_SUCCESS)                            \
        {                                                     \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
            std::ostringstream ss;                            \
            ss << "line " << __LINE__ << ": ";                \
            ss << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());               \
        }                                                     \
    } while (0);

static void ProcessMotionVector(VPIImage mvImg, cv::Mat &outputImage)
{
    // Lock the input image to access it from CPU
    VPIImageData mvData;
    CHECK_STATUS(vpiImageLockData(mvImg, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &mvData));

    // Create a cv::Mat that points to the input image data
    cv::Mat mvImage;
    CHECK_STATUS(vpiImageDataExportOpenCVMat(mvData, &mvImage));

    // Convert S10.5 format to float
    cv::Mat flow(mvImage.size(), CV_32FC2);
    mvImage.convertTo(flow, CV_32F, 1.0f / (1 << 5));

    // Image not needed anymore, we can unlock it.
    CHECK_STATUS(vpiImageUnlock(mvImg));

    // Create an image where the motion vector angle is
    // mapped to a color hue, and intensity is proportional
    // to vector's magnitude.
    cv::Mat magnitude, angle;
    {
        cv::Mat flowChannels[2];
        split(flow, flowChannels);
        cv::cartToPolar(flowChannels[0], flowChannels[1], magnitude, angle, true);
    }

    float clip = 5;
    cv::threshold(magnitude, magnitude, clip, clip, cv::THRESH_TRUNC);

    // build hsv image
    cv::Mat _hsv[3], hsv, bgr;
    _hsv[0] = angle;
    _hsv[1] = cv::Mat::ones(angle.size(), CV_32F);
    _hsv[2] = magnitude / clip; // intensity must vary from 0 to 1
    merge(_hsv, 3, hsv);

    cv::cvtColor(hsv, bgr, cv::COLOR_HSV2BGR);
    bgr.convertTo(outputImage, CV_8U, 255.0);
}

int main(int argc, char *argv[])
{
    // OpenCV image that will be wrapped by a VPIImage.
    // Define it here so that it's destroyed *after* wrapper is destroyed
    cv::Mat cvPrevFrame, cvCurFrame;

    // VPI objects that will be used
    VPIStream stream         = NULL;
    VPIImage imgPrevFramePL  = NULL;
    VPIImage imgPrevFrameTmp = NULL;
    VPIImage imgCurFramePL   = NULL;
    VPIImage imgCurFrameTmp  = NULL;
    VPIImage imgMotionVecBL  = NULL;

    VPIPyramid prevPyrTmp = NULL;
    VPIPyramid prevPyrBL  = NULL;
    VPIPyramid curPyrTmp  = NULL;
    VPIPyramid curPyrBL   = NULL;

    VPIPayload payload = NULL;

    int retval = 0;

    try
    {
        if (argc != 6)
        {
            throw std::runtime_error(std::string("Usage: ") + argv[0] +
                                     " <ofa> <input_video> <low|medium|high> <gridsize> <numlevels>");
        }

        // Parse input parameters
        std::string strBackend    = argv[1];
        std::string strInputVideo = argv[2];
        std::string strQuality    = argv[3];
        std::string strGridSize   = argv[4];
        std::string strNumLevels  = argv[5];

        VPIOpticalFlowQuality quality;
        if (strQuality == "low")
        {
            quality = VPI_OPTICAL_FLOW_QUALITY_LOW;
        }
        else if (strQuality == "medium")
        {
            quality = VPI_OPTICAL_FLOW_QUALITY_MEDIUM;
        }
        else if (strQuality == "high")
        {
            quality = VPI_OPTICAL_FLOW_QUALITY_HIGH;
        }
        else
        {
            throw std::runtime_error("Unknown quality provided");
        }

        VPIBackend backend;
        if (strBackend == "ofa")
        {
            backend = VPI_BACKEND_OFA;
        }
        else
        {
            throw std::runtime_error("Backend '" + strBackend + "' not recognized, it must be ofa.");
        }

        char *endptr;
        int gridSize = strtol(strGridSize.c_str(), &endptr, 10);
        if (*endptr != '\0')
        {
            throw std::runtime_error("Syntax error parsing gridsize " + strGridSize);
        }

        int numLevels = strtol(strNumLevels.c_str(), &endptr, 10);
        if (*endptr != '\0')
        {
            throw std::runtime_error("Syntax error parsing numlevels " + strNumLevels);
        }

        // Load the input video
        cv::VideoCapture invid;
        if (!invid.open(strInputVideo))
        {
            throw std::runtime_error("Can't open '" + strInputVideo + "'");
        }

        // Create the stream where processing will happen. We'll use user-provided backend
        // for Optical Flow, and CUDA/VIC for image format conversions.
        CHECK_STATUS(vpiStreamCreate(backend | VPI_BACKEND_CUDA | VPI_BACKEND_VIC, &stream));

        // Fetch the first frame
        if (!invid.read(cvPrevFrame))
        {
            throw std::runtime_error("Cannot read frame from input video");
        }

        // Create the previous and current frame wrapper using the first frame. This wrapper will
        // be set to point to every new frame in the main loop.
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvPrevFrame, 0, &imgPrevFramePL));
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvPrevFrame, 0, &imgCurFramePL));

        // Define the image formats we'll use throughout this sample.
        VPIImageFormat imgFmt   = VPI_IMAGE_FORMAT_Y8_ER;
        VPIImageFormat imgFmtBL = VPI_IMAGE_FORMAT_Y8_ER_BL;

        int32_t width  = cvPrevFrame.cols;
        int32_t height = cvPrevFrame.rows;

        // Create Dense Optical Flow payload to be executed on the given backend
        std::vector<int32_t> pyrGridSize(numLevels, gridSize); // all levels will have the same grid size
        CHECK_STATUS(vpiCreateOpticalFlowDense(backend, width, height, imgFmtBL, &pyrGridSize[0], pyrGridSize.size(),
                                               quality, &payload));

        // The Dense Optical Flow on NVENC or OFA backends expects input to be in block-linear format.
        // Since Convert Image Format algorithm doesn't currently support direct BGR
        // pitch-linear (from OpenCV) to Y8 block-linear conversion, it must be done in two
        // passes, first from BGR/PL to Y8/PL using CUDA, then from Y8/PL to Y8/BL using VIC.
        // The temporary image buffer below will store the intermediate Y8/PL representation.
        CHECK_STATUS(vpiImageCreate(width, height, imgFmt, 0, &imgPrevFrameTmp));
        CHECK_STATUS(vpiImageCreate(width, height, imgFmt, 0, &imgCurFrameTmp));

        // Now create the final block-linear buffer that'll be used as input to the
        // algorithm.

        CHECK_STATUS(vpiPyramidCreate(width, height, imgFmt, pyrGridSize.size(), 0.5, 0, &prevPyrTmp));
        CHECK_STATUS(vpiPyramidCreate(width, height, imgFmt, pyrGridSize.size(), 0.5, 0, &curPyrTmp));

        CHECK_STATUS(vpiPyramidCreate(width, height, imgFmtBL, pyrGridSize.size(), 0.5, 0, &prevPyrBL));
        CHECK_STATUS(vpiPyramidCreate(width, height, imgFmtBL, pyrGridSize.size(), 0.5, 0, &curPyrBL));

        // Motion vector image width and height, align to be multiple of gridSize
        int32_t mvWidth  = (width + gridSize - 1) / gridSize;
        int32_t mvHeight = (height + gridSize - 1) / gridSize;

        // The output video will be heatmap of motion vector image
        int fourcc = cv::VideoWriter::fourcc('M', 'P', 'E', 'G');
        double fps = invid.get(cv::CAP_PROP_FPS);

        cv::VideoWriter outVideo("denseoptflow_mv_" + strBackend + ".mp4", fourcc, fps, cv::Size(mvWidth, mvHeight));
        if (!outVideo.isOpened())
        {
            throw std::runtime_error("Can't create output video");
        }

        // Create the output motion vector buffer
        CHECK_STATUS(vpiImageCreate(mvWidth, mvHeight, VPI_IMAGE_FORMAT_2S16_BL, 0, &imgMotionVecBL));

        // First convert the first frame to Y8_BL pyramid. It'll be used as previous frame when the algorithm is called.
        CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, imgPrevFramePL, imgPrevFrameTmp, nullptr));
        CHECK_STATUS(
            vpiSubmitGaussianPyramidGenerator(stream, VPI_BACKEND_CUDA, imgPrevFrameTmp, prevPyrTmp, VPI_BORDER_CLAMP));
        CHECK_STATUS(vpiSubmitConvertImageFormatPyramid(stream, VPI_BACKEND_VIC, prevPyrTmp, prevPyrBL, NULL));

        // Create an output image which holds the rendered motion vector image.
        cv::Mat mvOutputImage;

        // Fetch a new frame until video ends
        int idxFrame = 1;
        while (invid.read(cvCurFrame))
        {
            printf("Processing frame %d\n", idxFrame++);
            // Wrap frame into a VPIImage, reusing the existing imgCurFramePL.
            CHECK_STATUS(vpiImageSetWrappedOpenCVMat(imgCurFramePL, cvCurFrame));

            // Convert current frame to Y8_BL pyramid format
            CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, imgCurFramePL, imgCurFrameTmp, nullptr));
            CHECK_STATUS(vpiSubmitGaussianPyramidGenerator(stream, VPI_BACKEND_CUDA, imgCurFrameTmp, curPyrTmp,
                                                           VPI_BORDER_CLAMP));
            CHECK_STATUS(vpiSubmitConvertImageFormatPyramid(stream, VPI_BACKEND_VIC, curPyrTmp, curPyrBL, NULL));

            CHECK_STATUS(
                vpiSubmitOpticalFlowDensePyramid(stream, backend, payload, prevPyrBL, curPyrBL, imgMotionVecBL));

            // Wait for processing to finish.
            CHECK_STATUS(vpiStreamSync(stream));

            // Render the resulting motion vector in the output image
            ProcessMotionVector(imgMotionVecBL, mvOutputImage);

            // Save to output video
            outVideo << mvOutputImage;

            // Swap previous frame and next frame
            std::swap(cvPrevFrame, cvCurFrame);
            std::swap(imgPrevFramePL, imgCurFramePL);
            std::swap(prevPyrBL, curPyrBL);
        }
    }
    catch (std::exception &e)
    {
        std::cerr << e.what() << std::endl;
        retval = 1;
    }

    // Destroy all resources used
    vpiStreamDestroy(stream);
    vpiPayloadDestroy(payload);

    vpiImageDestroy(imgPrevFramePL);
    vpiImageDestroy(imgPrevFrameTmp);
    vpiImageDestroy(imgCurFramePL);
    vpiImageDestroy(imgCurFrameTmp);
    vpiImageDestroy(imgMotionVecBL);

    vpiPyramidDestroy(prevPyrTmp);
    vpiPyramidDestroy(prevPyrBL);
    vpiPyramidDestroy(curPyrTmp);
    vpiPyramidDestroy(curPyrBL);

    return retval;
}