Operation Reference

The data processing graph within a DALI Pipeline is defined by calling operation functions. They accept and return instances of DataNode, which are symbolic representations of batches of Tensors. The operation functions cannot be used to process data directly.

The constraints for defining the processing pipeline are described in the Pipeline documentation.
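To illustrate the point that operation functions return symbolic DataNode objects rather than processed data, here is a minimal sketch. It assumes the nvidia-dali Python package is installed and uses `pipeline_def`, `device_id=None` (CPU-only execution) and `as_array()` from DALI's Python API, none of which are described on this page. `run_demo`, `demo_pipeline` and `HAVE_DALI` are hypothetical names chosen for the example. The demo is skipped when the package cannot be imported.

```python
# Assumes the nvidia-dali package; the demo is skipped if it is not installed.
try:
    from nvidia.dali import fn, pipeline_def
    from nvidia.dali.data_node import DataNode
    HAVE_DALI = True
except ImportError:
    HAVE_DALI = False


def run_demo():
    """Build a small CPU-only pipeline and return (was_symbolic, batch_array)."""
    seen = {}

    # device_id=None requests a CPU-only pipeline, so no GPU is needed.
    @pipeline_def(batch_size=4, num_threads=1, device_id=None)
    def demo_pipeline():
        # This call only adds a node to the graph and returns a DataNode.
        values = fn.random.uniform(range=[0.0, 1.0], shape=[3])
        seen["symbolic"] = isinstance(values, DataNode)
        return values

    pipe = demo_pipeline()
    pipe.build()
    (batch,) = pipe.run()  # real data exists only after the pipeline runs
    return seen["symbolic"], batch.as_array()


result = run_demo() if HAVE_DALI else None
if result is not None:
    symbolic, array = result
    print(symbolic, array.shape)
```

The operation function is called once, while the graph is being defined. The numeric values become available only from the outputs of `pipe.run()`.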

The following table lists all operations available in DALI:

| Function | Device support | Short description |
| --- | --- | --- |
| audio_decoder | CPU | Legacy alias for decoders.audio(). |
| batch_permutation | CPU | Produces a batch of random integers which can be used as indices for indexing samples in the batch. |
| bb_flip | CPU, GPU | Flips bounding boxes horizontally or vertically (mirror). |
| bbox_paste | CPU | Transforms bounding boxes so that the boxes remain in the same place in the image after the image is pasted on a larger canvas. |
| box_encoder | CPU, GPU | Encodes the input bounding boxes and labels using a set of default boxes (anchors) passed as an argument. |
| brightness | CPU, GPU | Adjusts the brightness of the images. |
| brightness_contrast | CPU, GPU | Adjusts the brightness and contrast of the images. |
| caffe2_reader | CPU | Legacy alias for readers.caffe2(). |
| caffe_reader | CPU | Legacy alias for readers.caffe(). |
| cast | CPU, GPU | Casts a tensor to a different type. |
| cat | CPU, GPU | Joins the input tensors along an existing axis. |
| coco_reader | CPU | Legacy alias for readers.coco(). |
| coin_flip | CPU, GPU | Generates random boolean values following a Bernoulli distribution. |
| color_space_conversion | CPU, GPU | Converts between various image color models. |
| color_twist | CPU, GPU | Adjusts hue, saturation, brightness and contrast of the image. |
| constant | CPU, GPU | Produces a batch of constant tensors. |
| contrast | CPU, GPU | Adjusts the contrast of the images. |
| coord_flip | CPU, GPU | Transforms vectors or points by flipping (reflecting) their coordinates with respect to a given center. |
| coord_transform | CPU, GPU | Applies a linear transformation to points or vectors. |
| copy | CPU, GPU | Creates a copy of the input tensor. |
| crop | CPU, GPU | Crops the images with the specified window dimensions and window position (upper left corner). |
| crop_mirror_normalize | CPU, GPU | Performs fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting. |
| dl_tensor_python_function | CPU, GPU | Executes a Python function that operates on DLPack tensors. |
| dump_image | CPU, GPU | Saves images in batch to disk in PPM format. |
| element_extract | CPU, GPU | Extracts one or more elements from the input sequence. |
| erase | CPU, GPU | Erases one or more regions from the input tensors. |
| expand_dims | CPU, GPU | Inserts new dimension(s) with extent 1 into the data shape. |
| external_source | CPU, GPU | Allows externally provided data to be passed as an input to the pipeline. |
| fast_resize_crop_mirror | CPU | Performs a fused resize, crop, mirror operation. |
| file_reader | CPU | Legacy alias for readers.file(). |
| flip | CPU, GPU | Flips the images in selected dimensions (horizontal, vertical, and depthwise). |
| gaussian_blur | CPU, GPU | Applies a Gaussian Blur to the input. |
| get_property | CPU, GPU | Returns a property of the tensor passed as an input. |
| grid_mask | CPU, GPU | Performs the gridmask augmentation (https://arxiv.org/abs/2001.04086). |
| hsv | CPU, GPU | Adjusts hue, saturation and value (brightness) of the images. |
| hue | CPU, GPU | Changes the hue level of the image. |
| image_decoder | CPU, Mixed | Legacy alias for decoders.image(). |
| image_decoder_crop | CPU, Mixed | Legacy alias for decoders.image_crop(). |
| image_decoder_random_crop | CPU, Mixed | Legacy alias for decoders.image_random_crop(). |
| image_decoder_slice | CPU, Mixed | Legacy alias for decoders.image_slice(). |
| jitter | GPU | Performs a random Jitter augmentation. |
| jpeg_compression_distortion | CPU, GPU | Introduces JPEG compression artifacts to RGB images. |
| laplacian | CPU, GPU | Computes the Laplacian of an input. |
| lookup_table | CPU, GPU | Maps the input to output by using a lookup table that is specified by keys and values, and a default_value for unspecified keys. |
| mel_filter_bank | CPU, GPU | Converts a spectrogram to a mel spectrogram by applying a bank of triangular filters. |
| mfcc | CPU, GPU | Computes Mel Frequency Cepstral Coefficients (MFCC) from a mel spectrogram. |
| multi_paste | CPU, GPU | Performs multiple pastes from the image batch to each of the outputs. |
| mxnet_reader | CPU | Legacy alias for readers.mxnet(). |
| nemo_asr_reader | CPU | Legacy alias for readers.nemo_asr(). |
| nonsilent_region | CPU, GPU | Performs leading and trailing silence detection in an audio buffer. |
| normal_distribution | CPU, GPU | Generates random numbers following a normal distribution. |
| normalize | CPU, GPU | Normalizes the input by removing the mean and dividing by the standard deviation. |
| numba_function | CPU | Invokes an njit-compiled Numba function. |
| numpy_reader | CPU, GPU | Legacy alias for readers.numpy(). |
| one_hot | CPU, GPU | Produces a one-hot encoding of the input. |
| optical_flow | GPU | Calculates the optical flow between images in the input. |
| pad | CPU, GPU | Pads all samples with the fill_value in the specified axes to match the biggest extent in the batch for those axes or to match the minimum shape specified. |
| paste | GPU | Pastes the input images on a larger canvas, where the canvas size is equal to input size * ratio. |
| peek_image_shape | CPU | Obtains the shape of the encoded image. |
| per_frame | CPU, GPU | Marks the input tensor as a sequence. |
| permute_batch | CPU, GPU | Returns a batch of tensors constructed by selecting tensors from the input based on indices given in the indices argument. |
| power_spectrum | CPU | Calculates the power spectrum of the signal. |
| preemphasis_filter | CPU, GPU | Applies a preemphasis filter to the input data. |
| python_function | CPU, GPU | Executes a Python function. |
| random_bbox_crop | CPU | Applies a prospective random crop to an image coordinate space while keeping the bounding boxes, and optionally labels, consistent. |
| random_resized_crop | CPU, GPU | Performs a crop with a randomly selected area and aspect ratio and resizes it to the specified size. |
| reinterpret | CPU, GPU | Treats content of the input as if it had a different type, shape, and/or layout. |
| reshape | CPU, GPU | Treats content of the input as if it had a different shape and/or layout. |
| resize | CPU, GPU | Resizes images. |
| resize_crop_mirror | CPU | Performs a fused resize, crop, mirror operation. Both fixed and random resizing and cropping are supported. |
| roi_random_crop | CPU | Produces a fixed shape cropping window, randomly placed so that as much as possible of the provided region of interest (ROI) is contained in it. |
| rotate | CPU, GPU | Rotates the images by the specified angle. |
| saturation | CPU, GPU | Changes the saturation level of the image. |
| sequence_reader | CPU | Legacy alias for readers.sequence(). |
| sequence_rearrange | CPU, GPU | Rearranges frames in a sequence. |
| shapes | CPU, GPU | Returns the shapes of inputs. |
| slice | CPU, GPU | Extracts a subtensor, or slice. |
| spectrogram | CPU, GPU | Produces a spectrogram from a 1D signal (for example, audio). |
| sphere | CPU, GPU | Performs a sphere augmentation. |
| squeeze | CPU, GPU | Removes the dimensions given as axes or axis_names. |
| ssd_random_crop | CPU | Performs a random crop with bounding boxes where Intersection Over Union (IoU) meets a randomly selected threshold between 0 and 1. |
| stack | CPU, GPU | Joins the input tensors along a new axis. |
| tfrecord_reader | CPU | Legacy alias for readers.tfrecord(). |
| to_decibels | CPU, GPU | Converts a magnitude (real, positive) to the decibel scale. |
| torch_python_function | CPU, GPU | Executes a function that operates on Torch tensors. |
| transpose | CPU, GPU | Transposes the tensors by reordering the dimensions based on the perm parameter. |
| uniform | CPU, GPU | Generates random numbers following a uniform distribution. |
| video_reader | GPU | Legacy alias for readers.video(). |
| video_reader_resize | GPU | Legacy alias for readers.video_resize(). |
| warp_affine | CPU, GPU | Applies an affine transformation to the images. |
| water | CPU, GPU | Performs a water augmentation, which makes the image appear to be underwater. |
| decoders.audio | CPU | Decodes waveforms from encoded audio data. |
| decoders.image | CPU, Mixed | Decodes images. |
| decoders.image_crop | CPU, Mixed | Decodes images and extracts regions-of-interest (ROI) that are specified by fixed window dimensions and variable anchors. |
| decoders.image_random_crop | CPU, Mixed | Decodes images and randomly crops them. |
| decoders.image_slice | CPU, Mixed | Decodes images and extracts regions of interest. |
| experimental.audio_resample | CPU | Resamples an audio signal. |
| experimental.readers.video | CPU, GPU | Loads and decodes video files using FFmpeg. |
| noise.gaussian | CPU, GPU | Applies Gaussian noise to the input. |
| noise.salt_and_pepper | CPU, GPU | Applies salt-and-pepper noise to the input. |
| noise.shot | CPU, GPU | Applies shot noise to the input. |
| random.coin_flip | CPU, GPU | Generates random boolean values following a Bernoulli distribution. |
| random.normal | CPU, GPU | Generates random numbers following a normal distribution. |
| random.uniform | CPU, GPU | Generates random numbers following a uniform distribution. |
| readers.caffe | CPU | Reads (Image, label) pairs from a Caffe LMDB. |
| readers.caffe2 | CPU | Reads sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB). |
| readers.coco | CPU | Reads data from a COCO dataset that is composed of a directory with images and annotation JSON files. |
| readers.file | CPU | Reads file contents and returns file-label pairs. |
| readers.mxnet | CPU | Reads the data from an MXNet RecordIO. |
| readers.nemo_asr | CPU | Reads automatic speech recognition (ASR) data (audio, text) from an NVIDIA NeMo compatible manifest. |
| readers.numpy | CPU, GPU | Reads NumPy arrays from a directory. |
| readers.sequence | CPU | Reads [Frame] sequences from a directory representing a collection of streams. |
| readers.tfrecord | CPU | Reads samples from a TensorFlow TFRecord file. |
| readers.video | GPU | Loads and decodes video files using FFmpeg and NVDECODE, which is the hardware-accelerated video decoding feature in the NVIDIA(R) GPU. |
| readers.video_resize | GPU | Loads, decodes and resizes video files with FFmpeg and NVDECODE, which is NVIDIA GPU’s hardware-accelerated video decoding. |
| readers.webdataset | CPU | A reader for the webdataset format. |
| reductions.max | CPU, GPU | Gets maximal input element along provided axes. |
| reductions.mean | CPU, GPU | Gets mean of elements along provided axes. |
| reductions.mean_square | CPU, GPU | Gets mean square of elements along provided axes. |
| reductions.min | CPU, GPU | Gets minimal input element along provided axes. |
| reductions.rms | CPU, GPU | Gets root mean square of elements along provided axes. |
| reductions.std_dev | CPU, GPU | Gets standard deviation of elements along provided axes. |
| reductions.sum | CPU, GPU | Gets sum of elements along provided axes. |
| reductions.variance | CPU, GPU | Gets variance of elements along provided axes. |
| segmentation.random_mask_pixel | CPU | Selects random pixel coordinates in a mask, sampled from a uniform distribution. |
| segmentation.random_object_bbox | CPU | Randomly selects an object from a mask and returns its bounding box. |
| segmentation.select_masks | CPU | Selects a subset of polygons by their mask ids. |
| transforms.combine | CPU | Combines two or more affine transforms. |
| transforms.crop | CPU | Produces an affine transform matrix that maps a reference coordinate space to another one. |
| transforms.rotation | CPU | Produces a rotation affine transform matrix. |
| transforms.scale | CPU | Produces a scale affine transform matrix. |
| transforms.shear | CPU | Produces a shear affine transform matrix. |
| transforms.translation | CPU | Produces a translation affine transform matrix. |
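
Several entries in the table are legacy aliases for functions that now live in the decoders and readers submodules. When updating older pipeline code, a small lookup like the sketch below may help translate the old names. The mapping is copied from the table. `LEGACY_ALIASES` and `modern_name` are hypothetical names for this illustration and are not part of DALI.

```python
# Legacy alias -> current function name, copied from the operations table.
# LEGACY_ALIASES and modern_name are illustrative helpers, not DALI APIs.
LEGACY_ALIASES = {
    "audio_decoder": "decoders.audio",
    "caffe2_reader": "readers.caffe2",
    "caffe_reader": "readers.caffe",
    "coco_reader": "readers.coco",
    "file_reader": "readers.file",
    "image_decoder": "decoders.image",
    "image_decoder_crop": "decoders.image_crop",
    "image_decoder_random_crop": "decoders.image_random_crop",
    "image_decoder_slice": "decoders.image_slice",
    "mxnet_reader": "readers.mxnet",
    "nemo_asr_reader": "readers.nemo_asr",
    "numpy_reader": "readers.numpy",
    "sequence_reader": "readers.sequence",
    "tfrecord_reader": "readers.tfrecord",
    "video_reader": "readers.video",
    "video_reader_resize": "readers.video_resize",
}


def modern_name(name: str) -> str:
    """Return the current name for a legacy alias, or the name unchanged."""
    return LEGACY_ALIASES.get(name, name)


print(modern_name("image_decoder"))  # decoders.image
print(modern_name("resize"))         # resize (not an alias, returned as-is)
```

A plain dictionary is used instead of a renaming rule because the aliases do not follow a single pattern: `video_reader_resize` maps to `readers.video_resize`, while `image_decoder_crop` maps to `decoders.image_crop`.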