{
 "cells": [
  {
   "cell_type": "markdown", "metadata": {},
   "source": [
    "# Optical Flow example\n",
    "This notebook shows how to use DALI to calculate optical flow for a given sequence of frames."
   ]
  },
  {
   "cell_type": "markdown", "metadata": {},
   "source": [
    "Let's start with some handy imports."
   ]
  },
  {
   "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [],
   "source": [
    "from __future__ import print_function\n",
    "from __future__ import division\n",
    "import os\n",
    "import numpy as np\n",
    "\n",
    "from nvidia.dali.pipeline import Pipeline\n",
    "import nvidia.dali.ops as ops\n",
    "\n",
    "from matplotlib import pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown", "metadata": {},
   "source": [
    "Setting the parameters.  \n",
    "As an example, we use the [Sintel trailer](https://durian.blender.org/), included in the [DALI_extra](https://github.com/NVIDIA/DALI_extra) repository. Feel free to try it with your own video data.\n",
    "\n",
    "The `DALI_EXTRA_PATH` environment variable should point to the location where the data from the [DALI_extra repository](https://github.com/NVIDIA/DALI_extra) was downloaded. Please make sure that the proper release tag is checked out."
   ]
  },
  {
   "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [],
   "source": [
    "batch_size = 1\n",
    "sequence_length = 10\n",
    "dali_extra_path = os.environ['DALI_EXTRA_PATH']\n",
    "video_filename = os.path.join(dali_extra_path, \"db\", \"optical_flow\", \"sintel_trailer\", \"sintel_trailer_short.mp4\")"
   ]
  },
  {
   "cell_type": "markdown", "metadata": {},
   "source": [
    "Functions used for optical flow visualization.  \n",
    "The code comes from [Tomrunia's GitHub](https://github.com/tomrunia/OpticalFlow_Visualization \"OpticalFlow_Visualization\")."
   ]
  },
  {
   "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [],
   "source": [
    "def make_colorwheel():\n",
    "    '''\n",
    "    Generates a color wheel for optical flow visualization as presented in:\n",
    "        Baker et al. \"A Database and Evaluation Methodology for Optical Flow\" (ICCV, 2007)\n",
    "        URL: http://vision.middlebury.edu/flow/flowEval-iccv07.pdf\n",
    "    According to the C++ source code of Daniel Scharstein\n",
    "    According to the Matlab source code of Deqing Sun\n",
    "    '''\n",
    "\n",
    "    RY = 15\n",
    "    YG = 6\n",
    "    GC = 4\n",
    "    CB = 11\n",
    "    BM = 13\n",
    "    MR = 6\n",
    "\n",
    "    ncols = RY + YG + GC + CB + BM + MR\n",
    "    colorwheel = np.zeros((ncols, 3))\n",
    "    col = 0\n",
    "\n",
    "    # RY\n",
    "    colorwheel[0:RY, 0] = 255\n",
    "    colorwheel[0:RY, 1] = np.floor(255 * np.arange(0, RY) / RY)\n",
    "    col = col + RY\n",
    "    # YG\n",
    "    colorwheel[col:col + YG, 0] = 255 - np.floor(255 * np.arange(0, YG) / YG)\n",
    "    colorwheel[col:col + YG, 1] = 255\n",
    "    col = col + YG\n",
    "    # GC\n",
    "    colorwheel[col:col + GC, 1] = 255\n",
    "    colorwheel[col:col + GC, 2] = np.floor(255 * np.arange(0, GC) / GC)\n",
    "    col = col + GC\n",
    "    # CB\n",
    "    colorwheel[col:col + CB, 1] = 255 - np.floor(255 * np.arange(CB) / CB)\n",
    "    colorwheel[col:col + CB, 2] = 255\n",
    "    col = col + CB\n",
    "    # BM\n",
    "    colorwheel[col:col + BM, 2] = 255\n",
    "    colorwheel[col:col + BM, 0] = np.floor(255 * np.arange(0, BM) / BM)\n",
    "    col = col + BM\n",
    "    # MR\n",
    "    colorwheel[col:col + MR, 2] = 255 - np.floor(255 * np.arange(MR) / MR)\n",
    "    colorwheel[col:col + MR, 0] = 255\n",
    "    return colorwheel\n",
    "\n",
    "\n",
    "def flow_compute_color(u, v, convert_to_bgr=False):\n",
    "    '''\n",
    "    Applies the flow color wheel to (possibly clipped) flow components u and v.\n",
    "    According to the C++ source code of Daniel Scharstein\n",
    "    According to the Matlab source code of Deqing Sun\n",
    "    :param u: np.ndarray, input horizontal flow\n",
    "    :param v: np.ndarray, input vertical flow\n",
    "    :param convert_to_bgr: bool, whether to change ordering and output BGR instead of RGB\n",
    "    :return:\n",
    "    '''\n",
    "\n",
    "    flow_image = np.zeros((u.shape[0], u.shape[1], 3), np.uint8)\n",
    "\n",
    "    colorwheel = make_colorwheel()  # shape [55x3]\n",
    "    ncols = colorwheel.shape[0]\n",
    "\n",
    "    rad = np.sqrt(np.square(u) + np.square(v))\n",
    "    a = np.arctan2(-v, -u) / np.pi\n",
    "\n",
    "    # Map the angle to a 0-based position on the color wheel\n",
    "    fk = (a + 1) / 2 * (ncols - 1)\n",
    "    k0 = np.floor(fk).astype(np.int32)\n",
    "    k1 = k0 + 1\n",
    "    k1[k1 == ncols] = 0  # wrap around to the first color\n",
    "    f = fk - k0\n",
    "\n",
    "    for i in range(colorwheel.shape[1]):\n",
    "        tmp = colorwheel[:, i]\n",
    "        col0 = tmp[k0] / 255.0\n",
    "        col1 = tmp[k1] / 255.0\n",
    "        col = (1 - f) * col0 + f * col1\n",
    "\n",
    "        idx = (rad <= 1)\n",
    "        col[idx] = 1 - rad[idx] * (1 - col[idx])\n",
    "        col[~idx] = col[~idx] * 0.75  # out of range?\n",
    "\n",
    "        # Note the 2-i => BGR instead of RGB\n",
    "        ch_idx = 2 - i if convert_to_bgr else i\n",
    "        flow_image[:, :, ch_idx] = np.floor(255 * col)\n",
    "\n",
    "    return flow_image\n",
    "\n",
    "\n",
    "def flow_to_color(flow_uv, clip_flow=None, convert_to_bgr=False):\n",
    "    '''\n",
    "    Expects a two dimensional flow image of shape [H,W,2]\n",
    "    According to the C++ source code of Daniel Scharstein\n",
    "    According to the Matlab source code of Deqing Sun\n",
    "    :param flow_uv: np.ndarray of shape [H,W,2]\n",
    "    :param clip_flow: float, maximum clipping value for flow\n",
    "    :return:\n",
    "    '''\n",
    "\n",
    "    assert flow_uv.ndim == 3, 'input flow must have three dimensions'\n",
    "    assert flow_uv.shape[2] == 2, 'input flow must have shape [H,W,2]'\n",
    "\n",
    "    if clip_flow is not None:\n",
    "        flow_uv = np.clip(flow_uv, 0, clip_flow)\n",
    "\n",
    "    u = flow_uv[:, :, 0]\n",
    "    v = flow_uv[:, :, 1]\n",
    "\n",
    "    rad = np.sqrt(np.square(u) + np.square(v))\n",
    "    rad_max = np.max(rad)\n",
    "\n",
    "    epsilon = 1e-5\n",
    "    u = u / (rad_max + epsilon)\n",
    "    v = v / (rad_max + epsilon)\n",
    "\n",
    "    return flow_compute_color(u, v, convert_to_bgr)"
   ]
  },
  {
   "cell_type": "markdown", "metadata": {},
   "source": [
    "## Using DALI\n",
    "### Define the Pipeline\n",
    "For advanced usage, refer to the SequenceReader and VideoReader documentation."
   ]
  },
  {
   "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [],
   "source": [
    "class OFPipeline(Pipeline):\n",
    "    def __init__(self, batch_size, num_threads, device_id):\n",
    "        super(OFPipeline, self).__init__(batch_size, num_threads, device_id, seed=16)\n",
    "\n",
    "        self.input = ops.VideoReader(device=\"gpu\", filenames=video_filename, sequence_length=sequence_length)\n",
    "        self.of_op = ops.OpticalFlow(device=\"gpu\", output_format=4)\n",
    "\n",
    "    def define_graph(self):\n",
    "        seq = self.input(name=\"Reader\")\n",
    "        of = self.of_op(seq.gpu())\n",
    "        return of"
   ]
  },
  {
   "cell_type": "markdown", "metadata": {},
   "source": [
    "### Build and run the DALI Pipeline"
   ]
  },
  {
   "cell_type": "code", "execution_count": 5, "metadata": {},
   "outputs": [
    { "name": "stdout", "output_type": "stream", "text": [ "(1, 10, 180, 320, 2)\n" ] }
   ],
   "source": [
    "pipe = OFPipeline(batch_size=batch_size, num_threads=1, device_id=0)\n",
    "pipe.build()\n",
    "pipe_out = pipe.run()\n",
    "flow_vector = pipe_out[0].as_cpu().as_array()\n",
    "print(flow_vector.shape)"
   ]
  },
  {
   "cell_type": "markdown", "metadata": {},
   "source": [
    "The output above is the shape of the calculated `flow_vector` (in NFHWC layout: batch, frames, height, width, channels). It contains 2 channels: the flow component along the `x` axis and the flow component along the `y` axis. The output resolution is determined by the `output_format` option passed to the `OpticalFlow` operator: for `output_format = 4`, a 4x4 grid is used for flow calculation, so the resolution in each spatial dimension is 4 times smaller than that of the input image.\n",
    "\n",
    "### Visualize results"
   ]
  },
  {
   "cell_type": "code", "execution_count": 6, "metadata": {},
   "outputs": [
    { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" },
    { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }
   ],
   "source": [
    "# Visualize the flow field of the middle frame of the first sequence\n",
    "of_result = flow_to_color(flow_vector[0][sequence_length // 2])\n",
    "plt.imshow(of_result)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" },
  "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}