Operation Reference
The data processing graph within a DALI Pipeline is defined by calling operation
functions. They accept and return instances of DataNode, which are symbolic
representations of batches of Tensors.
The operation functions cannot be used to process data directly.
The constraints for defining the processing pipeline are described in the Pipeline documentation.
The following table lists all operations available in DALI:
| Function | Device support | Short description |
|---|---|---|
| | CPU | Legacy alias for |
| | CPU, GPU | Resamples an audio signal. |
| | CPU | Produces a batch of random integers which can be used as indices for indexing samples in the batch. |
| | CPU, GPU | Flips bounding boxes horizontally or vertically (mirror). |
| | CPU | Transforms bounding boxes so that the boxes remain in the same place in the image after the image is pasted on a larger canvas. |
| | CPU, GPU | Encodes the input bounding boxes and labels using a set of default boxes (anchors) passed as an argument. |
| | CPU, GPU | Adjusts the brightness of the images. |
| | CPU, GPU | Adjusts the brightness and contrast of the images. |
| | CPU | Legacy alias for |
| | CPU | Legacy alias for |
| | CPU, GPU | Casts a tensor to a different type. |
| | CPU, GPU | Casts the first tensor to the type of the second tensor. |
| | CPU, GPU | Joins the input tensors along an existing axis. |
| | CPU | Legacy alias for |
| | CPU, GPU | Generates random boolean values following a Bernoulli distribution. |
| | CPU, GPU | Converts between various image color models. |
| | CPU, GPU | Adjusts hue, saturation, brightness and contrast of the image. |
| | CPU, GPU | Adjusts the contrast of the images. |
| | CPU, GPU | Transforms vectors or points by flipping (reflecting) their coordinates with respect to a given center. |
| | CPU, GPU | Applies a linear transformation to points or vectors. |
| | CPU, GPU | Creates a copy of the input tensor. |
| | CPU, GPU | Crops the images with the specified window dimensions and window position (upper left corner). |
| | CPU, GPU | Performs fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting. |
| | CPU, GPU | Executes a Python function that operates on DLPack tensors. |
| | CPU, GPU | Saves the images in the batch to disk in PPM format. |
| | CPU, GPU | Extracts one or more elements from the input sequence. |
| | CPU, GPU | Erases one or more regions from the input tensors. |
| | CPU, GPU | Inserts new dimension(s) with extent 1 into the data shape. |
| | CPU, GPU | Allows externally provided data to be passed as an input to the pipeline. |
| | CPU, GPU | Legacy alias for ResizedCropMirror, with antialiasing disabled by default. |
| | CPU | Legacy alias for |
| | CPU, GPU | Flips the images in selected dimensions (horizontal, vertical, and depthwise). |
| | CPU | Returns new data of given shape and type, filled with a fill value. |
| | CPU | Returns new data with the same shape and type as the input data, filled with a fill_value. |
| | CPU, GPU | Applies a Gaussian blur to the input. |
| | CPU, GPU | Returns a property of the tensor passed as an input. |
| | CPU, GPU | Performs the GridMask augmentation (https://arxiv.org/abs/2001.04086). |
| | CPU, GPU | Adjusts hue, saturation and value (brightness) of the images. |
| | CPU, GPU | Changes the hue level of the image. |
| | CPU, Mixed | Legacy alias for |
| | CPU, Mixed | Legacy alias for |
| | CPU, Mixed | Legacy alias for |
| | CPU, Mixed | Legacy alias for |
| | GPU | Performs a random jitter augmentation. |
| | CPU, GPU | Introduces JPEG compression artifacts to RGB images. |
| | CPU, GPU | Computes the Laplacian of an input. |
| | CPU, GPU | Maps the input to output by using a lookup table that is specified by |
| | CPU, GPU | Converts a spectrogram to a mel spectrogram by applying a bank of triangular filters. |
| | CPU, GPU | Computes Mel Frequency Cepstral Coefficients (MFCC) from a mel spectrogram. |
| | CPU, GPU | Performs multiple pastes from image batch to each of the outputs. |
| | CPU | Legacy alias for |
| | CPU | Legacy alias for |
| | CPU, GPU | Performs leading and trailing silence detection in an audio buffer. |
| | CPU, GPU | Generates random numbers following a normal distribution. |
| | CPU, GPU | Normalizes the input by removing the mean and dividing by the standard deviation. |
| | CPU, GPU | Invokes an njit-compiled Numba function. |
| | CPU, GPU | Legacy alias for |
| | CPU, GPU | Produces a one-hot encoding of the input. |
| | CPU | Returns new data of given shape and type, filled with ones. |
| | CPU | Returns new data with the same shape and type as the input array, filled with ones. |
| | GPU | Calculates the optical flow between images in the input. |
| | CPU, GPU | Pads all samples with the |
| | GPU | Pastes the input images on a larger canvas, where the canvas size is equal to |
| | CPU | Obtains the shape of the encoded image. |
| | CPU, GPU | Marks the input tensor as a sequence. |
| | CPU, GPU | Returns a batch of tensors constructed by selecting tensors from the input based on indices given in |
| | CPU | Calculates the power spectrum of the signal. |
| | CPU, GPU | Applies a preemphasis filter to the input data. |
| | CPU, GPU | Executes a Python function. |
| | CPU | Applies a prospective random crop to an image coordinate space while keeping the bounding boxes, and optionally labels, consistent. |
| | CPU | Produces a cropping window with a randomly selected area and aspect ratio. |
| | CPU, GPU | Performs a crop with a randomly selected area and aspect ratio and resizes it to the specified size. |
| | CPU, GPU | Treats content of the input as if it had a different type, shape, and/or layout. |
| | CPU, GPU | Treats content of the input as if it had a different shape and/or layout. |
| | CPU, GPU | Resizes images. |
| | CPU, GPU | Performs a fused resize, crop, mirror operation. |
| | CPU | Produces a fixed shape cropping window, randomly placed so that as much of the provided region of interest (ROI) as possible is contained in it. |
| | CPU, GPU | Rotates the images by the specified angle. |
| | CPU, GPU | Changes the saturation level of the image. |
| | CPU | Legacy alias for |
| | CPU, GPU | Rearranges frames in a sequence. |
| | CPU, GPU | Returns the shapes of inputs. |
| | CPU, GPU | Extracts a subtensor, or slice. |
| | CPU, GPU | Produces a spectrogram from a 1D signal (for example, audio). |
| | CPU, GPU | Performs a sphere augmentation. |
| | CPU, GPU | Removes the dimensions given as |
| | CPU | Performs a random crop with bounding boxes where Intersection Over Union (IoU) meets a randomly selected threshold between 0 and 1. |
| | CPU, GPU | Joins the input tensors along a new axis. |
| | CPU | Legacy alias for |
| | CPU, GPU | Converts a magnitude (real, positive) to the decibel scale. |
| | CPU, GPU | Executes a function that operates on Torch tensors. |
| | CPU, GPU | Transposes the tensors by reordering the dimensions based on the |
| | CPU, GPU | Generates random numbers following a uniform distribution. |
| | GPU | Legacy alias for |
| | GPU | Legacy alias for |
| | CPU, GPU | Applies an affine transformation to the images. |
| | CPU, GPU | Performs a water augmentation, which makes the image appear to be underwater. |
| | CPU | Returns new data of given shape and type, filled with zeros. |
| | CPU | Returns new data with the same shape and type as the input array, filled with zeros. |
| | CPU | Decodes waveforms from encoded audio data. |
| | CPU, Mixed | Decodes images. |
| | CPU, Mixed | Decodes images and extracts regions-of-interest (ROI) that are specified by fixed window dimensions and variable anchors. |
| | CPU, Mixed | Decodes images and randomly crops them. |
| | CPU, Mixed | Decodes images and extracts regions of interest. |
| | CPU, GPU | Legacy alias for |
| | GPU | Performs image demosaicing/debayering. |
| | GPU | Performs a dilation operation on the input image. |
| | CPU, GPU | Performs grayscale/per-channel histogram equalization. |
| | GPU | Performs an erosion operation on the input image. |
| | CPU, GPU | Convolves the image with the provided filter. |
| | GPU | Inflates/decompresses the input using the specified decompression algorithm. |
| | GPU | Median blur performs smoothing of an image or sequence of images by replacing each pixel with the median color of a surrounding rectangular region. |
| | CPU | Obtains the shape of the encoded image. |
| | GPU | The remap operation applies a generic geometrical transformation to an image. In other words, it takes pixels from one place in the input image and puts them in another place in the output image. The transformation is described by |
| | CPU, GPU | Resizes tensors. |
| | GPU | Performs a perspective transform on the images. |
| | CPU, Mixed | Decodes images. |
| | CPU, Mixed | Decodes images and extracts regions-of-interest (ROI) that are specified by fixed window dimensions and variable anchors. |
| | CPU, Mixed | Decodes images and randomly crops them. |
| | CPU, Mixed | Decodes images and extracts regions of interest. |
| | CPU, Mixed | Decodes a video file from a memory buffer (e.g. provided by external source). |
| | CPU, Mixed | Streams and decodes a video from a memory buffer. Intended for long, high-resolution videos. |
| | CPU, GPU | Reads FITS image HDUs from a directory. |
| | CPU, GPU | Loads and decodes video files using FFmpeg. |
| | CPU | Reads raw file contents from an encoded filename represented by a 1D byte array. |
| | CPU, GPU | Applies Gaussian noise to the input. |
| | CPU, GPU | Applies salt-and-pepper noise to the input. |
| | CPU, GPU | Applies shot noise to the input. |
| | Mixed | Decodes a video file from a memory buffer (e.g. provided by external source). |
| | CPU | Generates a random number from |
| | CPU | Generates a random sample from a given 1D array. |
| | CPU, GPU | Generates random boolean values following a Bernoulli distribution. |
| | CPU, GPU | Generates random numbers following a normal distribution. |
| | CPU, GPU | Generates random numbers following a uniform distribution. |
| | CPU | Reads (image, label) pairs from a Caffe LMDB. |
| | CPU | Reads sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB). |
| | CPU | Reads data from a COCO dataset that is composed of a directory with images and annotation JSON files. |
| | CPU | Reads file contents and returns file-label pairs. |
| | CPU | Reads the data from an MXNet RecordIO. |
| | CPU | Reads automatic speech recognition (ASR) data (audio, text) from an NVIDIA NeMo compatible manifest. |
| | CPU, GPU | Reads NumPy arrays from a directory. |
| | CPU | Reads [Frame] sequences from a directory representing a collection of streams. |
| | CPU | Reads samples from a TensorFlow TFRecord file. |
| | GPU | Loads and decodes video files using FFmpeg and NVDECODE, which is the hardware-accelerated video decoding feature in the NVIDIA(R) GPU. |
| | GPU | Loads, decodes and resizes video files with FFmpeg and NVDECODE, which is NVIDIA GPU’s hardware-accelerated video decoding. |
| | CPU | A reader for the webdataset format. |
| | CPU, GPU | Gets maximal input element along provided axes. |
| | CPU, GPU | Gets mean of elements along provided axes. |
| | CPU, GPU | Gets mean square of elements along provided axes. |
| | CPU, GPU | Gets minimal input element along provided axes. |
| | CPU, GPU | Gets root mean square of elements along provided axes. |
| | CPU, GPU | Gets standard deviation of elements along provided axes. |
| | CPU, GPU | Gets sum of elements along provided axes. |
| | CPU, GPU | Gets variance of elements along provided axes. |
| | CPU | Selects random pixel coordinates in a mask, sampled from a uniform distribution. |
| | CPU | Randomly selects an object from a mask and returns its bounding box. |
| | CPU | Selects a subset of polygons by their mask ids. |
| | CPU | Combines two or more affine transforms. |
| | CPU | Produces an affine transform matrix that maps a reference coordinate space to another one. |
| | CPU | Produces a rotation affine transform matrix. |
| | CPU | Produces a scale affine transform matrix. |
| | CPU | Produces a shear affine transform matrix. |
| | CPU | Produces a translation affine transform matrix. |