Reinterpreting Tensors#

Sometimes the data in tensors needs to be interpreted as if it had different type or shape. For example, reading a binary file into memory produces a flat tensor of byte-valued data, which the application code may want to interpret as an array of data of specific shape and possibly different type.

DALI provides the following operations, which affect tensor metadata (shape, type, layout):

* reshape
* reinterpret
* squeeze
* expand_dims

These operations neither modify nor copy the data - the output tensor is just another view of the same region of memory, which makes them very cheap.
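For comparison, plain NumPy views behave the same way - a minimal sketch (NumPy only, not DALI) showing that reshaping and reinterpreting a buffer produce views that share the original memory:

```python
import numpy as np

# A flat buffer of 16 bytes, as might come from reading a binary file.
raw = np.arange(16, dtype=np.uint8)

# Reinterpreting as 32-bit integers and reshaping are both zero-copy views.
as_u32 = raw.view(np.uint32)  # 4 elements, same 16 bytes
as_2x8 = raw.reshape(2, 8)    # new shape, same memory

assert np.shares_memory(raw, as_u32)
assert np.shares_memory(raw, as_2x8)
```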

Fixed Output Shape#

This example demonstrates the simplest use of the reshape operation, assigning a new fixed shape to an existing tensor.

First, we’ll import DALI and other necessary modules, and define a utility for displaying the data, which will be used throughout this tutorial.

[1]:
import nvidia.dali as dali
import nvidia.dali.fn as fn
from nvidia.dali import pipeline_def
import nvidia.dali.types as types
import numpy as np


def show_result(outputs, names=["Input", "Output"], formatter=None):
    if not isinstance(outputs, tuple):
        return show_result((outputs,), names, formatter)

    outputs = [out.as_cpu() if hasattr(out, "as_cpu") else out for out in outputs]

    for i in range(len(outputs[0])):
        print(f"---------------- Sample #{i} ----------------")
        for o, out in enumerate(outputs):
            a = np.array(out[i])
            s = "x".join(str(x) for x in a.shape)
            title = names[o] if names is not None and o < len(names) else f"Output #{o}"
            l = out.layout()
            if l:
                l += " "
            print(f"{title} ({l}{s})")
            np.set_printoptions(formatter=formatter)
            print(a)


def rand_shape(dims, lo, hi):
    return list(np.random.randint(lo, hi, [dims]))

Now let’s define our pipeline - it takes data from an external source and returns it both in its original form and reshaped to a fixed square shape [5, 5]. Additionally, the output tensors’ layout is set to HW.

[2]:
@pipeline_def(device_id=0, num_threads=4, batch_size=3)
def example1(input_data):
    np.random.seed(1234)
    inp = fn.external_source(input_data, batch=False, dtype=types.INT32)
    return inp, fn.reshape(inp, shape=[5, 5], layout="HW")


pipe1 = example1(lambda: np.random.randint(0, 10, size=[25], dtype=np.int32))
pipe1.build()
show_result(pipe1.run())
---------------- Sample #0 ----------------
Input (25)
[3 6 5 4 8 9 1 7 9 6 8 0 5 0 9 6 2 0 5 2 6 3 7 0 9]
Output (HW 5x5)
[[3 6 5 4 8]
 [9 1 7 9 6]
 [8 0 5 0 9]
 [6 2 0 5 2]
 [6 3 7 0 9]]
---------------- Sample #1 ----------------
Input (25)
[0 3 2 3 1 3 1 3 7 1 7 4 0 5 1 5 9 9 4 0 9 8 8 6 8]
Output (HW 5x5)
[[0 3 2 3 1]
 [3 1 3 7 1]
 [7 4 0 5 1]
 [5 9 9 4 0]
 [9 8 8 6 8]]
---------------- Sample #2 ----------------
Input (25)
[6 3 1 2 5 2 5 6 7 4 3 5 6 4 6 2 4 2 7 9 7 7 2 9 7]
Output (HW 5x5)
[[6 3 1 2 5]
 [2 5 6 7 4]
 [3 5 6 4 6]
 [2 4 2 7 9]
 [7 7 2 9 7]]

As we can see, the numbers from flat input tensors have been rearranged into 5x5 matrices.

Reshape with Wildcards#

Let’s now consider a more advanced use case. Imagine you have a flattened array that represents a fixed number of columns, but the number of rows may vary from sample to sample. In that case, you can specify a wildcard dimension by setting its extent to -1. When using wildcards, the output is resized so that the total number of elements is the same as in the input.

[3]:
@pipeline_def(device_id=0, num_threads=4, batch_size=3)
def example2(input_data):
    np.random.seed(12345)
    inp = fn.external_source(input_data, batch=False, dtype=types.INT32)
    return inp, fn.reshape(inp, shape=[-1, 5])


pipe2 = example2(
    lambda: np.random.randint(0, 10, size=[5 * np.random.randint(3, 10)], dtype=np.int32)
)
pipe2.build()
show_result(pipe2.run())
---------------- Sample #0 ----------------
Input (25)
[5 1 4 9 5 2 1 6 1 9 7 6 0 2 9 1 2 6 7 7 7 8 7 1 7]
Output (5x5)
[[5 1 4 9 5]
 [2 1 6 1 9]
 [7 6 0 2 9]
 [1 2 6 7 7]
 [7 8 7 1 7]]
---------------- Sample #1 ----------------
Input (35)
[0 3 5 7 3 1 5 2 5 3 8 5 2 5 3 0 6 8 0 5 6 8 9 2 2 2 9 7 5 7 1 0 9 3 0]
Output (7x5)
[[0 3 5 7 3]
 [1 5 2 5 3]
 [8 5 2 5 3]
 [0 6 8 0 5]
 [6 8 9 2 2]
 [2 9 7 5 7]
 [1 0 9 3 0]]
---------------- Sample #2 ----------------
Input (30)
[0 6 2 1 5 8 6 5 1 0 5 8 2 9 4 7 9 5 2 4 8 2 5 6 5 9 6 1 9 5]
Output (6x5)
[[0 6 2 1 5]
 [8 6 5 1 0]
 [5 8 2 9 4]
 [7 9 5 2 4]
 [8 2 5 6 5]
 [9 6 1 9 5]]
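The wildcard semantics match NumPy’s -1 in reshape - the missing extent is inferred from the total element count. A quick plain-NumPy sketch for comparison:

```python
import numpy as np

flat = np.arange(35)        # 35 elements, meant as rows of 5 columns
rows = flat.reshape(-1, 5)  # -1 is inferred as 35 // 5 == 7

assert rows.shape == (7, 5)
```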

Removing and Adding Unit Dimensions#

There are two dedicated operators, squeeze and expand_dims, which can be used for removing and adding dimensions with unit extent. The following example demonstrates removing a redundant dimension as well as adding two new ones.

[4]:
@pipeline_def(device_id=0, num_threads=4, batch_size=3)
def example_squeeze_expand(input_data):
    np.random.seed(4321)
    inp = fn.external_source(input_data, batch=False, layout="CHW", dtype=types.INT32)
    squeezed = fn.squeeze(inp, axes=[0])
    expanded = fn.expand_dims(squeezed, axes=[0, 3], new_axis_names="FC")
    return inp, fn.squeeze(inp, axes=[0]), expanded


def single_channel_generator():
    return np.random.randint(0, 10, size=[1] + rand_shape(2, 1, 7), dtype=np.int32)


pipe_squeeze_expand = example_squeeze_expand(single_channel_generator)
pipe_squeeze_expand.build()
show_result(pipe_squeeze_expand.run())
---------------- Sample #0 ----------------
Input (CHW 1x6x3)
[[[8 2 1]
  [7 5 9]
  [2 4 6]
  [0 8 6]
  [5 3 1]
  [1 6 1]]]
Output (HW 6x3)
[[8 2 1]
 [7 5 9]
 [2 4 6]
 [0 8 6]
 [5 3 1]
 [1 6 1]]
Output #2 (FHWC 1x6x3x1)
[[[[8]
   [2]
   [1]]

  [[7]
   [5]
   [9]]

  [[2]
   [4]
   [6]]

  [[0]
   [8]
   [6]]

  [[5]
   [3]
   [1]]

  [[1]
   [6]
   [1]]]]
---------------- Sample #1 ----------------
Input (CHW 1x2x2)
[[[6 9]
  [0 9]]]
Output (HW 2x2)
[[6 9]
 [0 9]]
Output #2 (FHWC 1x2x2x1)
[[[[6]
   [9]]

  [[0]
   [9]]]]
---------------- Sample #2 ----------------
Input (CHW 1x2x6)
[[[4 4 6 6 6 3]
  [8 2 1 7 9 7]]]
Output (HW 2x6)
[[4 4 6 6 6 3]
 [8 2 1 7 9 7]]
Output #2 (FHWC 1x2x6x1)
[[[[4]
   [4]
   [6]
   [6]
   [6]
   [3]]

  [[8]
   [2]
   [1]
   [7]
   [9]
   [7]]]]
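The plain-NumPy counterparts are np.squeeze and np.expand_dims; a small sketch of the same shape transformations (NumPy uses the axis argument where DALI uses axes):

```python
import numpy as np

chw = np.zeros((1, 6, 3), dtype=np.int32)  # single-channel CHW data
hw = np.squeeze(chw, axis=0)               # drop the unit channel dim -> (6, 3)
fhwc = np.expand_dims(hw, axis=(0, 3))     # add frame and channel dims -> (1, 6, 3, 1)

assert hw.shape == (6, 3)
assert fhwc.shape == (1, 6, 3, 1)
```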

Rearranging Dimensions#

Reshape allows you to swap, insert or remove dimensions. The src_dims argument specifies which source dimension is used for a given output dimension. You can also insert a new dimension by specifying -1 as the source dimension index.

[5]:
@pipeline_def(device_id=0, num_threads=4, batch_size=3)
def example_reorder(input_data):
    np.random.seed(4321)
    inp = fn.external_source(input_data, batch=False, dtype=types.INT32)
    return inp, fn.reshape(inp, src_dims=[1, 0])


pipe_reorder = example_reorder(
    lambda: np.random.randint(0, 10, size=rand_shape(2, 1, 7), dtype=np.int32)
)
pipe_reorder.build()
show_result(pipe_reorder.run())
---------------- Sample #0 ----------------
Input (6x3)
[[8 2 1]
 [7 5 9]
 [2 4 6]
 [0 8 6]
 [5 3 1]
 [1 6 1]]
Output (3x6)
[[8 2 1 7 5 9]
 [2 4 6 0 8 6]
 [5 3 1 1 6 1]]
---------------- Sample #1 ----------------
Input (2x2)
[[6 9]
 [0 9]]
Output (2x2)
[[6 9]
 [0 9]]
---------------- Sample #2 ----------------
Input (2x6)
[[4 4 6 6 6 3]
 [8 2 1 7 9 7]]
Output (6x2)
[[4 4]
 [6 6]
 [6 3]
 [8 2]
 [1 7]
 [9 7]]
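Note that src_dims=[1, 0] only swaps the extents - the flat data is reread in the new shape, it is not transposed. In NumPy terms the result matches a reshape with swapped extents, not .T; a small sketch:

```python
import numpy as np

a = np.array([[8, 2, 1],
              [7, 5, 9],
              [2, 4, 6],
              [0, 8, 6],
              [5, 3, 1],
              [1, 6, 1]])

swapped = a.reshape(a.shape[1], a.shape[0])  # 3x6 view of the same flat data

assert swapped[0].tolist() == [8, 2, 1, 7, 5, 9]
assert not np.array_equal(swapped, a.T)      # a reshape, not a transpose
```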

Adding and Removing Dimensions#

Dimensions can be added or removed by specifying the src_dims argument or by using the dedicated squeeze and expand_dims operators.

The following example reinterprets single-channel data from CHW to HWC layout by discarding the leading dimension and adding a new trailing dimension. It also specifies the output layout.

[6]:
@pipeline_def(device_id=0, num_threads=4, batch_size=3)
def example_remove_add(input_data):
    np.random.seed(4321)
    inp = fn.external_source(input_data, batch=False, layout="CHW", dtype=types.INT32)
    return inp, fn.reshape(
        inp,
        src_dims=[1, 2, -1],  # select H and W, and add a new dimension at the end
        layout="HWC",  # specify the output layout
    )


pipe_remove_add = example_remove_add(lambda: np.random.randint(0, 10, [1, 4, 3], dtype=np.int32))
pipe_remove_add.build()
show_result(pipe_remove_add.run())
---------------- Sample #0 ----------------
Input (CHW 1x4x3)
[[[2 8 2]
  [1 7 5]
  [9 2 4]
  [6 0 8]]]
Output (HWC 4x3x1)
[[[2]
  [8]
  [2]]

 [[1]
  [7]
  [5]]

 [[9]
  [2]
  [4]]

 [[6]
  [0]
  [8]]]
---------------- Sample #1 ----------------
Input (CHW 1x4x3)
[[[6 5 3]
  [1 1 6]
  [1 1 9]
  [6 9 0]]]
Output (HWC 4x3x1)
[[[6]
  [5]
  [3]]

 [[1]
  [1]
  [6]]

 [[1]
  [1]
  [9]]

 [[6]
  [9]
  [0]]]
---------------- Sample #2 ----------------
Input (CHW 1x4x3)
[[[9 9 5]
  [4 4 6]
  [6 6 3]
  [8 2 1]]]
Output (HWC 4x3x1)
[[[9]
  [9]
  [5]]

 [[4]
  [4]
  [6]]

 [[6]
  [6]
  [3]]

 [[8]
  [2]
  [1]]]
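An equivalent plain-NumPy view of the same CHW-to-HWC reinterpretation (assuming single-channel input, so the leading extent is 1 and can be dropped):

```python
import numpy as np

chw = np.arange(12, dtype=np.int32).reshape(1, 4, 3)  # CHW data with C == 1
hwc = chw.reshape(chw.shape[1], chw.shape[2], 1)      # select H, W; append unit C

assert hwc.shape == (4, 3, 1)
assert np.shares_memory(chw, hwc)  # still a view, no copy
```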

Relative Shape#

The output shape may be calculated in relative terms, with a new extent being a multiple of a source extent. For example, you may want to combine two consecutive rows into one - doubling the number of columns and halving the number of rows. Relative shapes can be combined with dimension rearranging, in which case the new output extent is a multiple of a different source extent.

The example below reinterprets the input as having twice as many columns as the input had rows (and, correspondingly, half as many rows as the input had columns).

[7]:
@pipeline_def(device_id=0, num_threads=4, batch_size=3)
def example_rel_shape(input_data):
    np.random.seed(1234)
    inp = fn.external_source(input_data, batch=False, dtype=types.INT32)
    return inp, fn.reshape(inp, rel_shape=[0.5, 2], src_dims=[1, 0])


pipe_rel_shape = example_rel_shape(
    lambda: np.random.randint(
        0, 10, [np.random.randint(1, 7), 2 * np.random.randint(1, 5)], dtype=np.int32
    )
)

pipe_rel_shape.build()
show_result(pipe_rel_shape.run())
---------------- Sample #0 ----------------
Input (4x6)
[[5 4 8 9 1 7]
 [9 6 8 0 5 0]
 [9 6 2 0 5 2]
 [6 3 7 0 9 0]]
Output (3x8)
[[5 4 8 9 1 7 9 6]
 [8 0 5 0 9 6 2 0]
 [5 2 6 3 7 0 9 0]]
---------------- Sample #1 ----------------
Input (4x6)
[[3 1 3 1 3 7]
 [1 7 4 0 5 1]
 [5 9 9 4 0 9]
 [8 8 6 8 6 3]]
Output (3x8)
[[3 1 3 1 3 7 1 7]
 [4 0 5 1 5 9 9 4]
 [0 9 8 8 6 8 6 3]]
---------------- Sample #2 ----------------
Input (2x6)
[[5 2 5 6 7 4]
 [3 5 6 4 6 2]]
Output (3x4)
[[5 2 5 6]
 [7 4 3 5]
 [6 4 6 2]]
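In NumPy terms, rel_shape=[0.5, 2] combined with src_dims=[1, 0] computes each output extent from the swapped input extents; a sketch of the shape arithmetic on a 4x6 input:

```python
import numpy as np

a = np.arange(24, dtype=np.int32).reshape(4, 6)  # r=4 rows, c=6 cols
r, c = a.shape
out = a.reshape(c // 2, 2 * r)  # extents [0.5 * c, 2 * r], flat data reread

assert out.shape == (3, 8)
assert out.size == a.size  # the element count is preserved
```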

Reinterpreting Data Type#

The reinterpret operation can view the data as if it were of a different type. When a new shape is not specified, the innermost dimension is resized accordingly.

[8]:
@pipeline_def(device_id=0, num_threads=4, batch_size=3)
def example_reinterpret(input_data):
    np.random.seed(1234)
    inp = fn.external_source(input_data, batch=False, dtype=types.UINT8)
    return inp, fn.reinterpret(inp, dtype=dali.types.UINT32)


pipe_reinterpret = example_reinterpret(
    lambda: np.random.randint(
        0, 255, [np.random.randint(1, 7), 4 * np.random.randint(1, 5)], dtype=np.uint8
    )
)

pipe_reinterpret.build()


def hex_bytes(x):
    f = f"0x{{:0{2*x.nbytes}x}}"
    return f.format(x)


show_result(pipe_reinterpret.run(), formatter={"int": hex_bytes})
---------------- Sample #0 ----------------
Input (4x12)
[[0x35 0xdc 0x5d 0xd1 0xcc 0xec 0x0e 0x70 0x74 0x5d 0xb3 0x9c]
 [0x98 0x42 0x0d 0xc9 0xf9 0xd7 0x77 0xc5 0x8f 0x7e 0xac 0xc7]
 [0xb1 0xda 0x54 0xdc 0x17 0xa1 0xc8 0x45 0xe9 0x24 0x90 0x26]
 [0x9a 0x5c 0xc6 0x46 0x1e 0x20 0xd2 0x32 0xab 0x7e 0x47 0xcd]]
Output (4x3)
[[0xd15ddc35 0x700eeccc 0x9cb35d74]
 [0xc90d4298 0xc577d7f9 0xc7ac7e8f]
 [0xdc54dab1 0x45c8a117 0x269024e9]
 [0x46c65c9a 0x32d2201e 0xcd477eab]]
---------------- Sample #1 ----------------
Input (5x4)
[[0x1a 0x1f 0x3d 0xe0]
 [0x76 0x35 0xbb 0x1d]
 [0xba 0xe9 0x99 0x5b]
 [0x78 0xe8 0x4d 0x03]
 [0x70 0x37 0x41 0x80]]
Output (5x1)
[[0xe03d1f1a]
 [0x1dbb3576]
 [0x5b99e9ba]
 [0x034de878]
 [0x80413770]]
---------------- Sample #2 ----------------
Input (5x8)
[[0x50 0x6d 0xbd 0x54 0xc9 0xa3 0x73 0xb6]
 [0x7f 0xc9 0x79 0xcd 0xf6 0xc0 0xc8 0x5e]
 [0xfe 0x09 0x27 0x19 0xaf 0x8d 0xaa 0x8f]
 [0x32 0x96 0x55 0x0e 0xf0 0x0e 0xca 0x80]
 [0xfb 0x56 0x52 0x71 0x4c 0x54 0x86 0x03]]
Output (5x2)
[[0x54bd6d50 0xb673a3c9]
 [0xcd79c97f 0x5ec8c0f6]
 [0x192709fe 0x8faa8daf]
 [0x0e559632 0x80ca0ef0]
 [0x715256fb 0x0386544c]]
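The byte packing visible above follows the platform’s byte order (little-endian here), just like NumPy’s ndarray.view; a plain-NumPy sketch of the same reinterpretation, using the first four bytes of Sample #0:

```python
import numpy as np

raw = np.array([[0x35, 0xDC, 0x5D, 0xD1, 0xCC, 0xEC, 0x0E, 0x70]], dtype=np.uint8)
as_u32 = raw.view(np.uint32)  # innermost extent shrinks: 8 bytes -> 2 words

assert as_u32.shape == (1, 2)
assert as_u32[0, 0] == 0xD15DDC35  # little-endian: first byte is least significant
```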