{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Normalize Operator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial shows how to use the Normalize operator.\n", "\n", "## Introduction\n", "Normalization is the process of shifting and scaling data values to match a desired distribution. The operator calculates the mean $\\mu$ and the standard deviation $\\sigma$ of the input and modifies the data as follows:\n", "\n", "$$Y_i = \\frac{X_i - \\mu}{\\sigma}$$\n", "\n", "Normalize also has more advanced features, which are explained later in this documentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the Normalize Operator\n", "\n", "We need some boilerplate code to import DALI and other useful libraries, and to visualize the results." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from nvidia.dali.pipeline import Pipeline\n", "import math\n", "import nvidia.dali.ops as ops\n", "import nvidia.dali.fn as fn\n", "import nvidia.dali.types as types\n", "import nvidia.dali.backend as backend\n", "\n", "batch_size = 10\n", "image_filename = \"../data/images\"\n", "\n", "import matplotlib.pyplot as plt\n", "import matplotlib.gridspec as gridspec\n", "\n", "def display(outputs, idx, columns = 2, captions = None):\n", "    # Show sample `idx` of every output in a grid with `columns` columns.\n", "    rows = int(math.ceil(len(outputs) / columns))\n", "    fig = plt.figure()\n", "    fig.set_size_inches(16, 6 * rows)\n", "    gs = gridspec.GridSpec(rows, columns)\n", "    for i, out in enumerate(outputs):\n", "        # GPU outputs must be copied to the CPU before plotting.\n", "        if isinstance(out, backend.TensorListGPU):\n", "            out = out.as_cpu()\n", "        plt.subplot(gs[i])\n", "        plt.axis(\"off\")\n", "        if captions is not None:\n", "            plt.title(captions[i])\n", "        plt.imshow(out.at(idx))\n", "\n", "def show(pipe, idx, columns = 2, captions = None):\n", "    # Build the pipeline, run it once, and display sample `idx`.\n", "    pipe.build()\n", "    display(pipe.run(), idx, columns, captions)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### A Simple 
Pipeline\n", "\n", "Create a simple pipeline that loads some images and normalizes them, treating the image data as a flat array of $3 \\cdot W \\cdot H$ numbers (3 for the RGB channels)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "pipe = Pipeline(batch_size=batch_size, num_threads=1, device_id=0)\n", "with pipe:\n", "    jpegs, _ = fn.readers.file(file_root=image_filename)\n", "    images = fn.decoders.image(jpegs, device=\"mixed\", output_type=types.RGB)\n", "    norm = fn.normalize(images)\n", "\n", "    pipe.set_outputs(images, norm)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).\n" ] }, { "output_type": "display_data", "data": { "text/plain": "