{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Optical Flow\n",
    "This notebook shows how to use DALI to calculate optical flow for a given sequence of frames."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's start with some handy imports."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import numpy as np\n",
    "\n",
    "from nvidia.dali import pipeline_def\n",
    "import nvidia.dali.fn as fn\n",
    "\n",
    "from matplotlib import pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Setting metaparameters.\n",
    "As an example we use the [Sintel trailer](https://durian.blender.org/), included in the [DALI_extra](https://github.com/NVIDIA/DALI_extra) repository. Feel free to try the notebook with your own video data.\n",
    "\n",
    "The `DALI_EXTRA_PATH` environment variable should point to the location where the data from the [DALI extra repository](https://github.com/NVIDIA/DALI_extra) was downloaded. Please make sure that the proper release tag is checked out."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "batch_size = 1\n",
    "sequence_length = 10\n",
    "dali_extra_path = os.environ[\"DALI_EXTRA_PATH\"]\n",
    "video_filename = os.path.join(\n",
    "    dali_extra_path, \"db/optical_flow/sintel_trailer/sintel_trailer_short.mp4\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Functions used for optical flow visualization.\n",
    "The code comes from [tomrunia's GitHub repository](https://github.com/tomrunia/OpticalFlow_Visualization \"OpticalFlow_Visualization\")."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def make_colorwheel():\n",
    "    \"\"\"\n",
    "    Generates a color wheel for optical flow visualization as presented in:\n",
    "        Baker et al. \"A Database and Evaluation Methodology for Optical Flow\" (ICCV, 2007)\n",
    "        URL: http://vision.middlebury.edu/flow/flowEval-iccv07.pdf\n",
    "    According to the C++ source code of Daniel Scharstein\n",
    "    According to the Matlab source code of Deqing Sun\n",
    "    \"\"\"\n",
    "\n",
    "    RY = 15\n",
    "    YG = 6\n",
    "    GC = 4\n",
    "    CB = 11\n",
    "    BM = 13\n",
    "    MR = 6\n",
    "\n",
    "    ncols = RY + YG + GC + CB + BM + MR\n",
    "    colorwheel = np.zeros((ncols, 3))\n",
    "    col = 0\n",
    "\n",
    "    # RY\n",
    "    colorwheel[0:RY, 0] = 255\n",
    "    colorwheel[0:RY, 1] = np.floor(255 * np.arange(0, RY) / RY)\n",
    "    col = col + RY\n",
    "    # YG\n",
    "    colorwheel[col : col + YG, 0] = 255 - np.floor(255 * np.arange(0, YG) / YG)\n",
    "    colorwheel[col : col + YG, 1] = 255\n",
    "    col = col + YG\n",
    "    # GC\n",
    "    colorwheel[col : col + GC, 1] = 255\n",
    "    colorwheel[col : col + GC, 2] = np.floor(255 * np.arange(0, GC) / GC)\n",
    "    col = col + GC\n",
    "    # CB\n",
    "    colorwheel[col : col + CB, 1] = 255 - np.floor(255 * np.arange(CB) / CB)\n",
    "    colorwheel[col : col + CB, 2] = 255\n",
    "    col = col + CB\n",
    "    # BM\n",
    "    colorwheel[col : col + BM, 2] = 255\n",
    "    colorwheel[col : col + BM, 0] = np.floor(255 * np.arange(0, BM) / BM)\n",
    "    col = col + BM\n",
    "    # MR\n",
    "    colorwheel[col : col + MR, 2] = 255 - np.floor(255 * np.arange(MR) / MR)\n",
    "    colorwheel[col : col + MR, 0] = 255\n",
    "    return colorwheel\n",
    "\n",
    "\n",
    "def flow_compute_color(u, v, convert_to_bgr=False):\n",
    "    \"\"\"\n",
    "    Applies the flow color wheel to (possibly clipped) flow components u and v.\n",
    "    According to the C++ source code of Daniel Scharstein\n",
    "    According to the Matlab source code of Deqing Sun\n",
    "    :param u: np.ndarray, input horizontal flow\n",
    "    :param v: np.ndarray, input vertical flow\n",
    "    :param convert_to_bgr: bool, whether to change ordering and output BGR instead of RGB\n",
    "    :return:\n",
    "    \"\"\"\n",
    "\n",
    "    flow_image = np.zeros((u.shape[0], u.shape[1], 3), np.uint8)\n",
    "\n",
    "    colorwheel = make_colorwheel()  # shape [55x3]\n",
    "    ncols = 
colorwheel.shape[0]\n",
    "\n",
    "    rad = np.sqrt(np.square(u) + np.square(v))\n",
    "    a = np.arctan2(-v, -u) / np.pi\n",
    "\n",
    "    fk = (a + 1) / 2 * (ncols - 1)\n",
    "    k0 = np.floor(fk).astype(np.int32)\n",
    "    k1 = k0 + 1\n",
    "    k1[k1 == ncols] = 0\n",
    "    f = fk - k0\n",
    "\n",
    "    for i in range(colorwheel.shape[1]):\n",
    "        tmp = colorwheel[:, i]\n",
    "        col0 = tmp[k0] / 255.0\n",
    "        col1 = tmp[k1] / 255.0\n",
    "        col = (1 - f) * col0 + f * col1\n",
    "\n",
    "        idx = rad <= 1\n",
    "        col[idx] = 1 - rad[idx] * (1 - col[idx])\n",
    "        col[~idx] = col[~idx] * 0.75  # out of range?\n",
    "\n",
    "        # Note the 2-i => BGR instead of RGB\n",
    "        ch_idx = 2 - i if convert_to_bgr else i\n",
    "        flow_image[:, :, ch_idx] = np.floor(255 * col)\n",
    "\n",
    "    return flow_image\n",
    "\n",
    "\n",
    "def flow_to_color(flow_uv, clip_flow=None, convert_to_bgr=False):\n",
    "    \"\"\"\n",
    "    Expects a two dimensional flow image of shape [H,W,2]\n",
    "    According to the C++ source code of Daniel Scharstein\n",
    "    According to the Matlab source code of Deqing Sun\n",
    "    :param flow_uv: np.ndarray of shape [H,W,2]\n",
    "    :param clip_flow: float, maximum clipping value for flow\n",
    "    :param convert_to_bgr: bool, whether to change ordering and output BGR instead of RGB\n",
    "    :return: np.ndarray of shape [H,W,3] and dtype uint8\n",
    "    \"\"\"\n",
    "\n",
    "    assert flow_uv.ndim == 3, \"input flow must have three dimensions\"\n",
    "    assert flow_uv.shape[2] == 2, \"input flow must have shape [H,W,2]\"\n",
    "\n",
    "    if clip_flow is not None:\n",
    "        # Clip symmetrically so that negative flow components are preserved\n",
    "        flow_uv = np.clip(flow_uv, -clip_flow, clip_flow)\n",
    "\n",
    "    u = flow_uv[:, :, 0]\n",
    "    v = flow_uv[:, :, 1]\n",
    "\n",
    "    rad = np.sqrt(np.square(u) + np.square(v))\n",
    "    rad_max = np.max(rad)\n",
    "\n",
    "    epsilon = 1e-5\n",
    "    u = u / (rad_max + epsilon)\n",
    "    v = v / (rad_max + epsilon)\n",
    "\n",
    "    return flow_compute_color(u, v, convert_to_bgr)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using DALI\n",
    "### Define the Pipeline\n",
    "The pipeline below loads a video file and computes optical flow for the sequence of frames.\n",
    "For more information, please refer to the [readers.video](../../operations/nvidia.dali.fn.readers.video.html) 
and [optical_flow](../../operations/nvidia.dali.fn.optical_flow.html) documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "@pipeline_def\n",
    "def optical_flow_pipe():\n",
    "    video = fn.readers.video(\n",
    "        device=\"gpu\", filenames=video_filename, sequence_length=sequence_length\n",
    "    )\n",
    "    of = fn.optical_flow(video, output_grid=4)\n",
    "    return of"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Build and Run DALI Pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(9, 180, 320, 2)\n"
     ]
    }
   ],
   "source": [
    "pipe = optical_flow_pipe(batch_size=batch_size, num_threads=1, device_id=0)\n",
    "pipe.build()\n",
    "pipe_out = pipe.run()\n",
    "flow_vector = np.array(pipe_out[0][0].as_cpu())\n",
    "print(flow_vector.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Above you can see the shape of the calculated `flow_vector` (in FHWC layout, since a single sample was taken from the batch). The sequence of 10 frames yields 9 flow fields, one per pair of consecutive frames. Each flow field contains 2 channels: the flow component along the `x` axis and the flow component along the `y` axis. The output resolution is determined by the `output_grid` option passed to the `optical_flow` operator: for `output_grid = 4`, a 4x4 grid is used for flow calculation, so the output resolution in each spatial dimension is 4 times smaller than the resolution of the input image.\n",
    "\n",
    "### Visualize Results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "of_result = flow_to_color(flow_vector[sequence_length // 2])\n",
    "plt.imshow(of_result)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}