Using the TensorFlow DALI Plugin with Sparse Tensors

Overview

Using a DALI data loading and augmentation pipeline with TensorFlow is straightforward.

However, sometimes the batch of data that the user wants to extract from the pipeline cannot be represented as a dense tensor, for example when each sample holds a different number of elements. In such a case, the DALI op uses a TensorFlow SparseTensor. Please keep in mind that SparseTensors are supported only for CPU-based pipelines.
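To make the idea of a sparse batch concrete, here is a minimal pure-Python sketch of how a batch with a variable number of items per sample maps onto the three fields of a sparse tensor (indices, values, dense_shape). The helper name ragged_to_coo and the label values are made up for illustration; this is not how DALI builds the tensor internally.

```python
def ragged_to_coo(batch):
    """Convert a batch of variable-length lists into COO-style sparse parts.

    Returns (indices, values, dense_shape), mirroring the three fields of a
    TensorFlow SparseTensor. Only 2-D (sample, item) batches are handled.
    """
    indices = []
    values = []
    for sample_idx, items in enumerate(batch):
        for item_idx, value in enumerate(items):
            indices.append([sample_idx, item_idx])
            values.append(value)
    # The dense shape is padded up to the longest sample in the batch.
    max_len = max((len(items) for items in batch), default=0)
    dense_shape = [len(batch), max_len]
    return indices, values, dense_shape


# Three images with 1, 0 and 2 labelled objects respectively (made-up labels).
labels_per_image = [[33], [], [12, 20]]
indices, values, dense_shape = ragged_to_coo(labels_per_image)
print(indices)      # [[0, 0], [2, 0], [2, 1]]
print(values)       # [33, 12, 20]
print(dense_shape)  # [3, 2]
```

The image without any objects simply contributes no entries, which is why such a batch cannot be stored as a dense tensor without padding.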

Defining the Data Loading Pipeline

First, we define a simple pipeline that returns data as sparse tensors. To achieve this, we will use the well-known COCO data set. Each image may have zero or more bounding boxes with labels describing the objects present in it. We want to return the images as normalized dense tensors, while labels and bounding boxes will be represented as sparse tensors. To begin, let us define some global parameters.

The DALI_EXTRA_PATH environment variable should point to the place where the data from the DALI extra repository is downloaded. Please make sure that the proper release tag is checked out.
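Because a missing or wrong DALI_EXTRA_PATH only shows up later as a reader error, it can help to check the paths up front. The following is a small sketch under that assumption; the helper name resolve_coco_paths is hypothetical, and the directory layout is the one used in the next cell.

```python
import os


def resolve_coco_paths(environ=os.environ):
    """Return (file_root, annotations_file) under DALI_EXTRA_PATH or raise."""
    root = environ.get("DALI_EXTRA_PATH")
    if not root:
        raise RuntimeError("DALI_EXTRA_PATH is not set")
    file_root = os.path.join(root, "db", "coco", "images")
    annotations_file = os.path.join(root, "db", "coco", "instances.json")
    # Report every missing path at once instead of failing on the first one.
    missing = [p for p in (file_root, annotations_file) if not os.path.exists(p)]
    if missing:
        raise FileNotFoundError("missing DALI extra data: " + ", ".join(missing))
    return file_root, annotations_file


try:
    file_root, annotations_file = resolve_coco_paths()
    print("COCO images:", file_root)
except (RuntimeError, FileNotFoundError) as err:
    print("Cannot run the example:", err)
```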

[1]:
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types
import os.path

test_data_root = os.environ['DALI_EXTRA_PATH']

BATCH_SIZE = 32
DEVICES = 1
file_root = os.path.join(test_data_root, 'db', 'coco', 'images')
annotations_file = os.path.join(test_data_root, 'db', 'coco', 'instances.json')

A pipeline with the COCO reader is created. Note that while the images are processed, the other data from COCO are passed through unchanged.

[2]:
class COCOPipeline(Pipeline):
    def __init__(self, batch_size, num_threads, device_id, num_gpus):
        super(COCOPipeline, self).__init__(batch_size, num_threads, device_id, seed = 15)
        self.input = ops.COCOReader(file_root = file_root, annotations_file = annotations_file,
                                     shard_id = device_id, num_shards = num_gpus, ratio=False, save_img_ids=True)
        self.decode = ops.ImageDecoder(device = "cpu", output_type = types.RGB)
        self.resize = ops.Resize(device = "cpu",
                                 interp_type = types.INTERP_LINEAR)
        self.cmn = ops.CropMirrorNormalize(device = "cpu",
                                            dtype = types.FLOAT,
                                            crop = (224, 224),
                                            mean = [128., 128., 128.],
                                            std = [1., 1., 1.])
        self.res_uniform = ops.random.Uniform(range = (256.,480.))
        self.uniform = ops.random.Uniform(range = (0.0, 1.0))
        self.cast = ops.Cast(device = "cpu",
                             dtype = types.INT32)

    def define_graph(self):
        inputs, bboxes, labels, im_ids = self.input()
        images = self.decode(inputs)
        images = self.resize(images, resize_shorter = self.res_uniform())
        output = self.cmn(images, crop_pos_x = self.uniform(),
                          crop_pos_y = self.uniform())
        output = self.cast(output)
        return (output, bboxes, labels, im_ids)

Next, we instantiate the pipelines with the right parameters. We create one pipeline per device, specifying the right device_id for each pipeline.

Unlike standalone DALI usage, we do not call pipeline.build and run the pipeline ourselves; instead, we pass the pipeline object to the TensorFlow operator.

[3]:
pipes = [COCOPipeline(batch_size=BATCH_SIZE, num_threads=2, device_id = device_id, num_gpus = DEVICES) for device_id in range(DEVICES)]
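Each pipeline passes its device_id as shard_id and the device count as num_shards to the reader, so every pipeline reads its own part of the data set. The sketch below illustrates one way such a split can be computed, assuming contiguous shards with floor-rounded boundaries; the function shard_range is hypothetical and the exact boundaries DALI uses should be checked against its documentation.

```python
def shard_range(dataset_size, shard_id, num_shards):
    """Return the [start, end) sample range for one shard.

    Assumes contiguous shards whose boundaries are
    floor(shard_id * dataset_size / num_shards).
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    start = shard_id * dataset_size // num_shards
    end = (shard_id + 1) * dataset_size // num_shards
    return start, end


# A made-up data set of 64 samples split across 3 devices.
for device_id in range(3):
    print(device_id, shard_range(64, device_id, 3))
# 0 (0, 21)
# 1 (21, 42)
# 2 (42, 64)
```

With DEVICES = 1, as in this example, the single shard covers the whole data set.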

Using DALI TensorFlow Plugin

Let’s start by importing TensorFlow and the DALI TensorFlow plugin as dali_tf.

[4]:
import tensorflow as tf
import nvidia.dali.plugin.tf as dali_tf
import time
try:
    from tensorflow.compat.v1 import GPUOptions
    from tensorflow.compat.v1 import ConfigProto
    from tensorflow.compat.v1 import Session
    from tensorflow.compat.v1 import placeholder
except ImportError:
    # Older TF versions don't have compat.v1 layer
    from tensorflow import GPUOptions
    from tensorflow import ConfigProto
    from tensorflow import Session
    from tensorflow import placeholder

try:
    tf.compat.v1.disable_eager_execution()
except AttributeError:
    pass


We can now use the nvidia.dali.plugin.tf.DALIIterator() method to get the TensorFlow op that will produce the tensors we will use in the TensorFlow graph.

For each DALI pipeline, we use daliop, which returns a tuple of TensorFlow tensors that we store as image, bounding boxes, labels and image IDs. To enable sparse tensor generation, the sparse argument needs to be set to True for each output that is to be represented as a sparse tensor.

[5]:
daliop = dali_tf.DALIIterator()

images = []
bboxes = []
labels = []
image_ids = []
for d in range(DEVICES):
    with tf.device('/cpu'):
        image, bbox, label, id = daliop(pipeline = pipes[d],
            shapes = [(BATCH_SIZE, 3, 224, 224), (), (), ()],
            dtypes = [tf.int32, tf.float32, tf.int32, tf.int32], sparse = [False, True, True])

        images.append(image)
        bboxes.append(bbox)
        labels.append(label)
        image_ids.append(id)

Using the Tensors in a Simple Tensorflow Graph

We will use the images, bboxes, labels and image_ids tensor lists in our TensorFlow graph definition. We then run a very simple one-op graph session that outputs a batch of data, and print the bounding boxes, labels and image IDs.

[6]:
with Session() as sess:
    all_img_per_sec = []
    total_batch_size = BATCH_SIZE * DEVICES

    start_time = time.time()

    # The actual run with our dali_tf tensors
    res_cpu = sess.run([images, bboxes, labels, image_ids])
print(res_cpu[1])
print(res_cpu[2])
print(res_cpu[3])
[SparseTensorValue(indices=array([[ 0,  0,  0],
       [ 0,  0,  1],
       [ 0,  0,  2],
       ...,
       [31,  4,  1],
       [31,  4,  2],
       [31,  4,  3]]), values=array([313., 168., 162., 120., 100., 216., 182., 237., 138.,  15., 404.,
       172., 215., 305.,  69.,  80., 248.,  64., 344., 311., 123.,  66.,
        95., 176., 194., 209.,  48., 207., 122., 178.,  47., 248., 400.,
       115., 176., 158.,  88., 217.,  91., 114.,  49., 148., 257., 184.,
        99.,  40., 361., 130.,  89.,  84., 259., 246., 213., 455., 270.,
       158., 144., 137.,  92., 150., 275.,  39., 286.,  32., 185.,  78.,
        12.,  90., 273.,  39., 275., 220., 180., 311., 226.,  12., 351.,
        96.,  85., 168., 178.,   9.,  23., 183., 167., 194., 355.,  90.,
        95., 193., 151., 226., 298., 315., 370.,  63., 381., 311., 210.,
       110., 247.,  84., 385., 175., 137.,  44., 161., 112., 282.,  15.,
       336., 130., 159., 332., 387.,  97., 100., 285., 300., 116., 374.,
        73., 142.,  20., 272.,  93., 348.,  62.,  22.,   1., 266., 226.,
       376.,  79., 143., 157., 285.,  69., 280., 232., 208., 143., 300.,
       107.,  62., 129., 350., 171., 166.,  93., 331., 183., 334.,   7.,
        95., 125., 221.,  54., 354.,  84., 240., 131., 258.,  22., 290.,
       173., 337.,  61., 460., 144.,  52., 187., 157., 221., 279., 150.,
       172., 306., 322.,  38., 263., 143., 325., 114.,  82.,  61., 317.,
       110., 280.,  88., 162.,  46., 222., 102., 258., 177., 103., 135.,
        83., 200., 338., 105., 286., 288., 428., 229.,  63.,  30.,  54.,
         3., 392., 338., 498., 169.,  63., 166.,  86., 237.,  61., 110.,
       397., 130.,  13.,  32.,   8.,  30., 232., 142.,  31., 189., 233.,
        29., 183.,  76., 339.,  79., 254.,  23., 309., 231., 234., 316.,
       262.,  61., 110., 152., 339.,  11., 188.,  19., 136., 202., 498.,
         1., 159., 124., 392., 197., 155.,  41.,  44.,  70., 335., 126.,
       239., 159.,  59., 344., 230.,   8., 288., 324., 185.,  88., 233.,
       116., 124.,   7.,  90.,  90.,  24., 156., 363., 219., 484., 262.,
       198., 186., 546., 381., 117.,  60., 246.,  96., 260., 248., 103.,
       108.,  17., 184., 134., 169., 236., 212., 177., 125., 268., 183.,
        95., 220., 298., 124., 143., 116., 247., 222., 347.,  44., 318.,
        80., 353., 211., 293.,  53.,  76.,  29.,  52., 172., 192.,  83.,
       198., 185.,  33., 221., 329., 149., 181., 298., 396., 102., 202.,
       136., 269., 222.,  13., 229., 236., 149., 311.,  14., 309., 183.,
       474., 359., 127.,  79., 258., 143., 189., 170., 348., 222., 211.,
        13., 129., 205., 190.,  61., 391., 142.,  14., 201.,  12., 172.,
       217.,  16.], dtype=float32), dense_shape=array([32,  5,  4]))]
[SparseTensorValue(indices=array([[ 0,  0],
       [ 1,  0],
       [ 2,  0],
       [ 3,  0],
       [ 3,  1],
       [ 3,  2],
       [ 3,  3],
       [ 4,  0],
       [ 5,  0],
       [ 5,  1],
       [ 5,  2],
       [ 6,  0],
       [ 6,  1],
       [ 6,  2],
       [ 7,  0],
       [ 7,  1],
       [ 7,  2],
       [ 7,  3],
       [ 8,  0],
       [ 8,  1],
       [ 8,  2],
       [ 8,  3],
       [ 8,  4],
       [ 9,  0],
       [ 9,  1],
       [10,  0],
       [10,  1],
       [10,  2],
       [11,  0],
       [11,  1],
       [12,  0],
       [12,  1],
       [12,  2],
       [12,  3],
       [12,  4],
       [13,  0],
       [14,  0],
       [14,  1],
       [14,  2],
       [14,  3],
       [14,  4],
       [15,  0],
       [15,  1],
       [15,  2],
       [16,  0],
       [16,  1],
       [16,  2],
       [16,  3],
       [17,  0],
       [17,  1],
       [17,  2],
       [17,  3],
       [17,  4],
       [18,  0],
       [18,  1],
       [19,  0],
       [19,  1],
       [19,  2],
       [19,  3],
       [20,  0],
       [20,  1],
       [20,  2],
       [20,  3],
       [20,  4],
       [21,  0],
       [21,  1],
       [22,  0],
       [22,  1],
       [22,  2],
       [23,  0],
       [24,  0],
       [24,  1],
       [24,  2],
       [25,  0],
       [25,  1],
       [25,  2],
       [26,  0],
       [27,  0],
       [27,  1],
       [27,  2],
       [27,  3],
       [28,  0],
       [29,  0],
       [29,  1],
       [29,  2],
       [29,  3],
       [29,  4],
       [30,  0],
       [30,  1],
       [31,  0],
       [31,  1],
       [31,  2],
       [31,  3],
       [31,  4]]), values=array([33, 34, 12, 20,  8, 34, 28, 49, 36, 70, 56, 23, 25, 24, 64,  1, 42,
       44, 73, 72,  5, 39,  8, 10, 14, 75, 50, 22, 77, 71, 31, 63, 32, 70,
       59, 27, 69, 74, 37, 14, 22, 45, 16, 60, 16, 78, 15, 30, 29, 58, 38,
       25, 79, 28, 74, 47, 67, 28,  1, 27, 11, 25, 17, 39, 31, 16, 32, 75,
       59, 72, 15, 58, 11, 18, 25, 72, 32, 44, 17, 45, 80, 77, 61, 68,  3,
       20, 45, 70, 47,  2, 42, 73, 51, 64], dtype=int32), dense_shape=array([32,  5]))]
[array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12],
       [13],
       [14],
       [15],
       [16],
       [17],
       [18],
       [19],
       [20],
       [21],
       [22],
       [23],
       [24],
       [25],
       [26],
       [27],
       [28],
       [29],
       [30],
       [31]], dtype=int32)]
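
The sparse bounding-box output printed above has indices of the form (image, box, coordinate). To work with it per image, the entries can be regrouped as in the following sketch. The helper group_boxes is hypothetical, and the small input reuses only the first few values from the output above with made-up indices.

```python
def group_boxes(indices, values, dense_shape):
    """Regroup a rank-3 COO tensor (image, box, coord) into per-image boxes."""
    num_images = int(dense_shape[0])
    num_coords = int(dense_shape[2])
    boxes = [[] for _ in range(num_images)]
    for (img, box, coord), value in zip(indices, values):
        # Grow the list of boxes for this image until index `box` exists.
        while len(boxes[img]) <= box:
            boxes[img].append([0.0] * num_coords)
        boxes[img][box][coord] = float(value)
    return boxes


# Image 0 has one box, image 1 has none, image 2 has two boxes.
indices = [[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 0, 3],
           [2, 0, 0], [2, 0, 1], [2, 0, 2], [2, 0, 3],
           [2, 1, 0], [2, 1, 1], [2, 1, 2], [2, 1, 3]]
values = [313., 168., 162., 120., 100., 216., 182., 237., 138., 15., 404., 172.]
boxes = group_boxes(indices, values, [3, 2, 4])
print(boxes[0])  # [[313.0, 168.0, 162.0, 120.0]]
print(boxes[1])  # []
print(boxes[2])  # [[100.0, 216.0, 182.0, 237.0], [138.0, 15.0, 404.0, 172.0]]
```

With the session result from the cell above, the same helper could presumably be applied to the fields of res_cpu[1][0], that is its indices, values and dense_shape.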

Let us check the output images with their augmentations! TensorFlow outputs NumPy arrays, so we can visualize them easily with matplotlib.

We define a show_images helper function that will display a sample of our batch.

The batch layout is NCHW, so we use transpose to get HWC images that matplotlib can show. We also add 128 to each value to undo the mean subtraction performed by CropMirrorNormalize.
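For readers unfamiliar with the layout change, the following pure-Python sketch shows what reordering a single image from CHW to HWC and adding the mean back amounts to. The notebook itself does this with NumPy's transpose; the helper chw_to_hwc and the tiny image are made up for illustration.

```python
def chw_to_hwc(image, offset=0):
    """Reorder a nested-list image from [C][H][W] to [H][W][C], adding offset."""
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    return [
        [[image[c][h][w] + offset for c in range(channels)] for w in range(width)]
        for h in range(height)
    ]


# 3 channels, 1 row, 2 columns; values are already mean-subtracted (mean = 128).
chw = [[[-128, 0]], [[-28, 0]], [[127, 0]]]
hwc = chw_to_hwc(chw, offset=128)
print(hwc)  # [[[0, 100, 255], [128, 128, 128]]]
```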

[7]:
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt
%matplotlib inline

def show_images(image_batch, nb_images):
    columns = 4
    rows = (nb_images + columns - 1) // columns  # ceiling division, so every image gets a grid cell
    fig = plt.figure(figsize = (32,(32 // columns) * rows))
    gs = gridspec.GridSpec(rows, columns)
    for j in range(nb_images):
        plt.subplot(gs[j])
        plt.axis("off")
        img = image_batch[0][j].transpose((1,2,0)) + 128
        plt.imshow(img.astype('uint8'))
show_images(res_cpu[0], 8)