Operation Reference
The data processing graph within a DALI Pipeline is defined by calling operation
functions. They accept and return instances of DataNode, which are symbolic
representations of batches of Tensors.
The operation functions cannot be used to process data directly.
The constraints for defining the processing pipeline are described in the Pipeline documentation.
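The distinction between defining a graph and processing data can be illustrated with a small toy model. The sketch below is plain Python and does not use DALI; `ToyNode`, `source`, `flip`, `double`, `KERNELS`, and `run` are hypothetical names invented for this illustration and say nothing about how DALI is implemented. It only shows the general idea: calling an operation function returns a symbolic node, and values are produced later by a separate execution step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Batch = List[List[int]]  # a batch is a list of samples; each sample is a list


@dataclass(frozen=True)
class ToyNode:
    """Symbolic handle for a batch: records how to compute it, holds no data."""
    op: str
    inputs: Tuple["ToyNode", ...] = ()


# Hypothetical "operation functions": they only describe the graph.
def source(name: str) -> ToyNode:
    return ToyNode(op=f"source:{name}")


def flip(node: ToyNode) -> ToyNode:
    return ToyNode(op="flip", inputs=(node,))


def double(node: ToyNode) -> ToyNode:
    return ToyNode(op="double", inputs=(node,))


# The code that actually touches data, keyed by operation name.
KERNELS: Dict[str, Callable[[Batch], Batch]] = {
    "flip": lambda batch: [sample[::-1] for sample in batch],
    "double": lambda batch: [[2 * v for v in sample] for sample in batch],
}


def run(node: ToyNode, feeds: Dict[str, Batch]) -> Batch:
    """Execute the graph that ends at `node`, reading sources from `feeds`."""
    if node.op.startswith("source:"):
        return feeds[node.op.split(":", 1)[1]]
    args = [run(inp, feeds) for inp in node.inputs]
    return KERNELS[node.op](*args)


graph = double(flip(source("x")))          # builds a graph, computes nothing
print(graph.op)                            # → double
print(run(graph, {"x": [[1, 2, 3], [4, 5]]}))  # → [[6, 4, 2], [10, 8]]
```

In this toy model, `flip(...)` returns a `ToyNode`, not flipped data, which mirrors the statement above that operation functions cannot be used to process data directly.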
The following table lists all operations available in DALI:
| Function | Device support | Short description |
|---|---|---|
| | CPU | Legacy alias for |
| | CPU | Produces a batch of random integers which can be used as indices for indexing samples in the batch. |
| | CPU, GPU | Flips bounding boxes horizontally or vertically (mirror). |
| | CPU | Transforms bounding boxes so that the boxes remain in the same place in the image after the image is pasted on a larger canvas. |
| | CPU, GPU | Encodes the input bounding boxes and labels using a set of default boxes (anchors) passed as an argument. |
| | CPU, GPU | Adjusts the brightness of the images. |
| | CPU, GPU | Adjusts the brightness and contrast of the images. |
| | CPU | Legacy alias for |
| | CPU | Legacy alias for |
| | CPU, GPU | Casts the tensor to a different type. |
| | CPU, GPU | Joins the input tensors along an existing axis. |
| | CPU | Legacy alias for |
| | CPU, GPU | Generates random boolean values following a Bernoulli distribution. |
| | CPU, GPU | Converts between various image color models. |
| | CPU, GPU | Adjusts hue, saturation, brightness and contrast of the image. |
| | CPU, GPU | Produces a batch of constant tensors. |
| | CPU, GPU | Adjusts the contrast of the images. |
| | CPU, GPU | Transforms vectors or points by flipping (reflecting) their coordinates with respect to a given center. |
| | CPU, GPU | Applies a linear transformation to points or vectors. |
| | CPU, GPU | Creates a copy of the input tensor. |
| | CPU, GPU | Crops the images with the specified window dimensions and window position (upper left corner). |
| | CPU, GPU | Performs fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting. |
| | CPU, GPU | Executes a Python function that operates on DLPack tensors. |
| | CPU, GPU | Saves the images in a batch to disk in PPM format. |
| | CPU, GPU | Extracts one or more elements from the input sequence. |
| | CPU, GPU | Erases one or more regions from the input tensors. |
| | CPU, GPU | Inserts new dimension(s) with extent 1 into the data shape. |
| | CPU, GPU | Allows externally provided data to be passed as an input to the pipeline. |
| | CPU | Performs a fused resize, crop, mirror operation. |
| | CPU | Legacy alias for |
| | CPU, GPU | Flips the images in selected dimensions (horizontal, vertical, and depthwise). |
| | CPU, GPU | Applies a Gaussian blur to the input. |
| | CPU, GPU | Returns a property of the tensor passed as an input. |
| | CPU, GPU | Performs the gridmask augmentation (https://arxiv.org/abs/2001.04086). |
| | CPU, GPU | Adjusts hue, saturation and value (brightness) of the images. |
| | CPU, GPU | Changes the hue level of the image. |
| | CPU, Mixed | Legacy alias for |
| | CPU, Mixed | Legacy alias for |
| | CPU, Mixed | Legacy alias for |
| | CPU, Mixed | Legacy alias for |
| | GPU | Performs a random jitter augmentation. |
| | CPU, GPU | Introduces JPEG compression artifacts to RGB images. |
| | CPU, GPU | Computes the Laplacian of an input. |
| | CPU, GPU | Maps the input to output by using a lookup table that is specified by |
| | CPU, GPU | Converts a spectrogram to a mel spectrogram by applying a bank of triangular filters. |
| | CPU, GPU | Computes Mel Frequency Cepstral Coefficients (MFCC) from a mel spectrogram. |
| | CPU, GPU | Performs multiple pastes from the image batch to each of the outputs. |
| | CPU | Legacy alias for |
| | CPU | Legacy alias for |
| | CPU | Performs leading and trailing silence detection in an audio buffer. |
| | CPU, GPU | Generates random numbers following a normal distribution. |
| | CPU, GPU | Normalizes the input by removing the mean and dividing by the standard deviation. |
| | CPU | Invokes an njit-compiled Numba function. |
| | CPU, GPU | Legacy alias for |
| | CPU, GPU | Produces a one-hot encoding of the input. |
| | GPU | Calculates the optical flow between images in the input. |
| | CPU, GPU | Pads all samples with the |
| | GPU | Pastes the input images on a larger canvas, where the canvas size is equal to |
| | CPU | Obtains the shape of the encoded image. |
| | CPU | Marks the input tensor as a sequence. |
| | CPU, GPU | Returns a batch of tensors constructed by selecting tensors from the input based on indices given in |
| | CPU | Calculates the power spectrum of the signal. |
| | CPU, GPU | Applies a preemphasis filter to the input data. |
| | CPU, GPU | Executes a Python function. |
| | CPU | Applies a prospective random crop to an image coordinate space while keeping the bounding boxes, and optionally labels, consistent. |
| | CPU, GPU | Performs a crop with a randomly selected area and aspect ratio and resizes it to the specified size. |
| | CPU, GPU | Treats content of the input as if it had a different type, shape, and/or layout. |
| | CPU, GPU | Treats content of the input as if it had a different shape and/or layout. |
| | CPU, GPU | Resizes images. |
| | CPU | Performs a fused resize, crop, mirror operation. Both fixed and random resizing and cropping are supported. |
| | CPU | Produces a fixed shape cropping window, randomly placed so that as much of the provided region of interest (ROI) as possible is contained in it. |
| | CPU, GPU | Rotates the images by the specified angle. |
| | CPU, GPU | Changes the saturation level of the image. |
| | CPU | Legacy alias for |
| | CPU, GPU | Rearranges frames in a sequence. |
| | CPU, GPU | Returns the shapes of inputs. |
| | CPU, GPU | Extracts a subtensor, or slice. |
| | CPU, GPU | Produces a spectrogram from a 1D signal (for example, audio). |
| | CPU, GPU | Performs a sphere augmentation. |
| | CPU, GPU | Removes the dimensions given as |
| | CPU | Performs a random crop with bounding boxes where Intersection Over Union (IoU) meets a randomly selected threshold between 0 and 1. |
| | CPU, GPU | Joins the input tensors along a new axis. |
| | CPU | Legacy alias for |
| | CPU, GPU | Converts a magnitude (real, positive) to the decibel scale. |
| | CPU, GPU | Executes a function that is operating on Torch tensors. |
| | CPU, GPU | Transposes the tensors by reordering the dimensions based on the |
| | CPU, GPU | Generates random numbers following a uniform distribution. |
| | GPU | Legacy alias for |
| | GPU | Legacy alias for |
| | CPU, GPU | Applies an affine transformation to the images. |
| | CPU, GPU | Performs a water augmentation, which makes the image appear to be underwater. |
| | CPU | Decodes waveforms from encoded audio data. |
| | CPU, Mixed | Decodes images. |
| | CPU, Mixed | Decodes images and extracts regions-of-interest (ROI) that are specified by fixed window dimensions and variable anchors. |
| | CPU, Mixed | Decodes images and randomly crops them. |
| | CPU, Mixed | Decodes images and extracts regions of interest. |
| | CPU, GPU | Loads and decodes video files using FFmpeg. |
| | CPU, GPU | Applies Gaussian noise to the input. |
| | CPU, GPU | Applies salt-and-pepper noise to the input. |
| | CPU, GPU | Applies shot noise to the input. |
| | CPU, GPU | Generates random boolean values following a Bernoulli distribution. |
| | CPU, GPU | Generates random numbers following a normal distribution. |
| | CPU, GPU | Generates random numbers following a uniform distribution. |
| | CPU | Reads (image, label) pairs from a Caffe LMDB. |
| | CPU | Reads sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB). |
| | CPU | Reads data from a COCO dataset that is composed of a directory with images and annotation JSON files. |
| | CPU | Reads file contents and returns file-label pairs. |
| | CPU | Reads the data from an MXNet RecordIO. |
| | CPU | Reads automatic speech recognition (ASR) data (audio, text) from an NVIDIA NeMo compatible manifest. |
| | CPU, GPU | Reads NumPy arrays from a directory. |
| | CPU | Reads [Frame] sequences from a directory representing a collection of streams. |
| | CPU | Reads samples from a TensorFlow TFRecord file. |
| | GPU | Loads and decodes video files using FFmpeg and NVDECODE, which is the hardware-accelerated video decoding feature in the NVIDIA(R) GPU. |
| | GPU | Loads, decodes and resizes video files with FFmpeg and NVDECODE, which is the hardware-accelerated video decoding feature in the NVIDIA(R) GPU. |
| | CPU | A reader for the webdataset format. |
| | CPU, GPU | Gets maximal input element along provided axes. |
| | CPU, GPU | Gets mean of elements along provided axes. |
| | CPU, GPU | Gets mean square of elements along provided axes. |
| | CPU, GPU | Gets minimal input element along provided axes. |
| | CPU, GPU | Gets root mean square of elements along provided axes. |
| | CPU, GPU | Gets standard deviation of elements along provided axes. |
| | CPU, GPU | Gets sum of elements along provided axes. |
| | CPU, GPU | Gets variance of elements along provided axes. |
| | CPU | Selects random pixel coordinates in a mask, sampled from a uniform distribution. |
| | CPU | Randomly selects an object from a mask and returns its bounding box. |
| | CPU | Selects a subset of polygons by their mask ids. |
| | CPU | Combines two or more affine transforms. |
| | CPU | Produces an affine transform matrix that maps a reference coordinate space to another one. |
| | CPU | Produces a rotation affine transform matrix. |
| | CPU | Produces a scale affine transform matrix. |
| | CPU | Produces a shear affine transform matrix. |
| | CPU | Produces a translation affine transform matrix. |
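
The last rows of the table describe operations that produce and combine affine transform matrices. As a rough illustration of the underlying arithmetic, the sketch below builds 2D scale and translation matrices and combines them in plain Python. It does not call DALI; the helper names (`scale_2d`, `translation_2d`, `combine`, `apply`), the 2x3 matrix layout, and the convention that transforms are applied in the order listed are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def scale_2d(sx: float, sy: float) -> Matrix:
    """2x3 affine matrix that scales x by sx and y by sy."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0]]


def translation_2d(tx: float, ty: float) -> Matrix:
    """2x3 affine matrix that shifts points by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty]]


def _to_homogeneous(m: Matrix) -> Matrix:
    """Append the row [0, 0, 1] so the matrix can be multiplied as 3x3."""
    return [list(m[0]), list(m[1]), [0.0, 0.0, 1.0]]


def _matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def combine(*transforms: Matrix) -> Matrix:
    """Combine transforms so the result applies them in the order listed."""
    result = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for t in transforms:
        # A later transform multiplies on the left, so it is applied later.
        result = _matmul(_to_homogeneous(t), result)
    return result[:2]  # drop the constant [0, 0, 1] row


def apply(m: Matrix, point: Sequence[float]) -> Tuple[float, float]:
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Scale first, then translate: (1, 1) -> (2, 3) -> (12, 23)
m = combine(scale_2d(2.0, 3.0), translation_2d(10.0, 20.0))
print(apply(m, (1.0, 1.0)))  # → (12.0, 23.0)
```

Reversing the argument order gives a different matrix, since the translation would then be scaled as well; this is why the order convention has to be stated explicitly.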