Data loading: MXNet recordIO

Overview

This example shows you how to use data stored in the MXNet recordIO format with DALI.

Creating an Index

To use data stored in the recordIO format, use the readers.mxnet operator. In addition to the arguments that are common to all readers, such as random_shuffle, this operator takes the path and index_path arguments:

  • path is a list of paths to recordIO files

  • index_path is a list (of size 1) that contains the path to the index file. This file, which has the .idx extension, is created automatically when you use MXNet’s im2rec.py utility. It can also be generated from an existing recordIO file with the rec2idx utility that is included with DALI (see the sketch below).
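
For reference, here is a minimal sketch of creating the index with rec2idx. The exact command-line signature is an assumption (it may differ between DALI versions), so check rec2idx --help before relying on it:

# Assumed usage: first argument is the input .rec file, second is the output .idx file.
! rec2idx train.rec train.idx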

The DALI_EXTRA_PATH environment variable should point to the location where the data from the DALI extra repository is downloaded.

Important: Ensure that you check out the correct release tag that corresponds to the installed version of DALI.

[1]:
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.fn as fn
import nvidia.dali.types as types
import numpy as np
import os.path

test_data_root = os.environ["DALI_EXTRA_PATH"]
base = os.path.join(test_data_root, "db", "recordio")
batch_size = 16

idx_files = [os.path.join(base, "train.idx")]
rec_files = [os.path.join(base, "train.rec")]
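
As an optional sanity check (a sketch, not part of the original example), verify that both files exist before building the pipeline:

# Fail early with a clear message if DALI_EXTRA_PATH points to the wrong location or tag.
for f in idx_files + rec_files:
    assert os.path.isfile(f), f"Missing {f}; check DALI_EXTRA_PATH and the release tag"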

Defining and Running the Pipeline

  1. Define a simple pipeline that reads the images stored in the recordIO format, decodes them, and prepares them for ingestion in a DL framework.

    Processing the images involves cropping, normalizing, and the HWC -> CHW layout conversion.

[2]:
pipe = Pipeline(batch_size=batch_size, num_threads=4, device_id=0)
with pipe:
    # Read encoded JPEGs and labels from the recordIO file.
    jpegs, labels = fn.readers.mxnet(path=rec_files, index_path=idx_files)
    # Decode on the GPU ("mixed" backend) to RGB.
    images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
    # Crop, normalize, and convert HWC -> CHW in one fused operator.
    output = fn.crop_mirror_normalize(
        images, dtype=types.FLOAT, crop=(224, 224), mean=[0.0, 0.0, 0.0], std=[1.0, 1.0, 1.0]
    )
    pipe.set_outputs(output, labels)
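
Equivalently, newer DALI releases provide the pipeline_def decorator. The sketch below (assuming DALI 1.0 or later) mirrors the cell above in the decorator style:

from nvidia.dali import pipeline_def

# Same pipeline expressed with the decorator API; the outputs are returned directly.
@pipeline_def(batch_size=batch_size, num_threads=4, device_id=0)
def recordio_pipeline():
    jpegs, labels = fn.readers.mxnet(path=rec_files, index_path=idx_files)
    images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
    output = fn.crop_mirror_normalize(
        images, dtype=types.FLOAT, crop=(224, 224), mean=[0.0, 0.0, 0.0], std=[1.0, 1.0, 1.0]
    )
    return output, labels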
  2. Let us now build and run the pipeline:

[3]:
pipe.build()
pipe_out = pipe.run()
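
Before plotting, you can inspect what came out of the pipeline. A small sketch (as_array() on the CPU label TensorList assumes uniform label shapes, which holds here):

imgs, lbls = pipe_out
print(imgs)             # TensorListGPU with the decoded, normalized CHW images
print(lbls.as_array())  # labels are returned on the CPU; convert to a NumPy array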
  3. To visualize the results, use the matplotlib library. It expects images in the HWC format, but the output of the pipeline is in CHW.

    Note: CHW is the preferred format for most Deep Learning frameworks.

  4. For visualization purposes, transpose the images back to the HWC layout.

[4]:
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt

%matplotlib inline


def show_images(image_batch):
    columns = 4
    # Ceiling division so that all batch_size images fit in the grid.
    rows = (batch_size + columns - 1) // columns
    fig = plt.figure(figsize=(32, (32 // columns) * rows))
    gs = gridspec.GridSpec(rows, columns)
    # Iterate over the batch only; rows * columns may exceed batch_size.
    for j in range(batch_size):
        plt.subplot(gs[j])
        plt.axis("off")
        img_chw = image_batch.at(j)
        # Transpose CHW -> HWC and rescale to [0, 1] for imshow.
        img_hwc = np.transpose(img_chw, (1, 2, 0)) / 255.0
        plt.imshow(img_hwc)
[5]:
images, labels = pipe_out
show_images(images.as_cpu())
[Figure: grid of decoded, cropped images from the recordIO dataset]
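
Beyond visualization, a pipeline like this is typically consumed through one of DALI's framework plugins. Below is a hedged sketch using the PyTorch plugin; note that reader_name requires the reader to be created with a name (name="Reader" is an assumption added here for illustration and is not set in the pipeline above):

from nvidia.dali.plugin.pytorch import DALIGenericIterator

# Assumes the reader was declared as fn.readers.mxnet(..., name="Reader").
loader = DALIGenericIterator([pipe], ["data", "label"], reader_name="Reader")
for batch in loader:
    images, labels = batch[0]["data"], batch[0]["label"]
    # ... training step would go here ...
    break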