Operation Reference

The data processing graph within a DALI Pipeline is defined by calling operation functions. They accept and return instances of DataNode, which are symbolic representations of batches of Tensors. The operation functions cannot be used to process data directly.
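The distinction between defining the graph and processing data can be illustrated with a toy model in plain Python. This is a conceptual sketch only: `SymbolicNode`, `source`, `scale`, and `run` are hypothetical names invented for this example and do not reflect DALI's API or implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SymbolicNode:
    """Hypothetical stand-in for a DataNode: records an operation, holds no data."""
    name: str
    func: Callable[..., List[float]]
    inputs: Tuple["SymbolicNode", ...] = ()


def source(name, batch):
    # A graph input; the batch is only produced when the graph is run.
    return SymbolicNode(name, lambda: list(batch))


def scale(node, factor):
    # Calling an "operation" only adds a node to the graph; nothing is computed.
    return SymbolicNode("scale", lambda xs: [x * factor for x in xs], (node,))


def run(node):
    # Execution step: evaluate the inputs first, then apply this node's function.
    args = [run(parent) for parent in node.inputs]
    return node.func(*args)


data = source("data", [1.0, 2.0, 3.0])
doubled = scale(data, 2.0)   # still symbolic, no values exist yet

result = run(doubled)        # values are produced only here
```

In this sketch `doubled` is a description of work to be done, much as a DataNode is a symbolic placeholder for a batch. The values appear only when the graph is executed, which in DALI is the job of the pipeline.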

The constraints for defining the processing pipeline are described in the Pipeline documentation.

The following table lists all operations available in DALI:

Function	Device support	Short description
`audio_decoder`	CPU	Legacy alias for `decoders.audio()`.
`audio_resample`	CPU, GPU	Resamples an audio signal.
`batch_permutation`	CPU	Produces a batch of random integers which can be used as indices for indexing samples in the batch.
`bb_flip`	CPU, GPU	Flips bounding boxes horizontally or vertically (mirror).
`bbox_paste`	CPU	Transforms bounding boxes so that the boxes remain in the same place in the image after the image is pasted on a larger canvas.
`box_encoder`	CPU, GPU	Encodes the input bounding boxes and labels using a set of default boxes (anchors) passed as an argument.
`brightness`	CPU, GPU	Adjusts the brightness of the images.
`brightness_contrast`	CPU, GPU	Adjusts the brightness and contrast of the images.
`caffe2_reader`	CPU	Legacy alias for `readers.caffe2()`.
`caffe_reader`	CPU	Legacy alias for `readers.caffe()`.
`cast`	CPU, GPU	Cast a tensor to a different type.
`cast_like`	CPU, GPU	Cast the first tensor to the type of the second tensor.
`cat`	CPU, GPU	Joins the input tensors along an existing axis.
`coco_reader`	CPU	Legacy alias for `readers.coco()`.
`coin_flip`	CPU, GPU	Generates random boolean values following a Bernoulli distribution.
`color_space_conversion`	CPU, GPU	Converts between various image color models.
`color_twist`	CPU, GPU	Adjusts hue, saturation, brightness and contrast of the image.
`contrast`	CPU, GPU	Adjusts the contrast of the images.
`coord_flip`	CPU, GPU	Transforms vectors or points by flipping (reflecting) their coordinates with respect to a given center.
`coord_transform`	CPU, GPU	Applies a linear transformation to points or vectors.
`copy`	CPU, GPU	Creates a copy of the input tensor.
`crop`	CPU, GPU	Crops the images with the specified window dimensions and window position (upper left corner).
`crop_mirror_normalize`	CPU, GPU	Performs fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting.
`dl_tensor_python_function`	CPU, GPU	Executes a Python function that operates on DLPack tensors.
`dump_image`	CPU, GPU	Saves the images in a batch to disk in PPM format.
`element_extract`	CPU, GPU	Extracts one or more elements from the input sequence.
`erase`	CPU, GPU	Erases one or more regions from the input tensors.
`expand_dims`	CPU, GPU	Insert new dimension(s) with extent 1 to the data shape.
`external_source`	CPU, GPU	Allows externally provided data to be passed as an input to the pipeline.
`fast_resize_crop_mirror`	CPU, GPU	Legacy alias for ResizedCropMirror, with antialiasing disabled by default.
`file_reader`	CPU	Legacy alias for `readers.file()`.
`flip`	CPU, GPU	Flips the images in selected dimensions (horizontal, vertical, and depthwise).
`full`	CPU	Returns new data of given shape and type, filled with a fill value.
`full_like`	CPU	Returns new data with the same shape and type as the input data, filled with a fill_value.
`gaussian_blur`	CPU, GPU	Applies a Gaussian Blur to the input.
`get_property`	CPU, GPU	Returns a property of the tensor passed as an input.
`grid_mask`	CPU, GPU	Performs the gridmask augmentation (https://arxiv.org/abs/2001.04086).
`hsv`	CPU, GPU	Adjusts hue, saturation and value (brightness) of the images.
`hue`	CPU, GPU	Changes the hue level of the image.
`image_decoder`	CPU, Mixed	Legacy alias for `decoders.image()`.
`image_decoder_crop`	CPU, Mixed	Legacy alias for `decoders.image_crop()`.
`image_decoder_random_crop`	CPU, Mixed	Legacy alias for `decoders.image_random_crop()`.
`image_decoder_slice`	CPU, Mixed	Legacy alias for `decoders.image_slice()`.
`jitter`	GPU	Performs a random Jitter augmentation.
`jpeg_compression_distortion`	CPU, GPU	Introduces JPEG compression artifacts to RGB images.
`laplacian`	CPU, GPU	Computes the Laplacian of an input.
`lookup_table`	CPU, GPU	Maps the input to output by using a lookup table that is specified by keys and values, and a default_value for unspecified keys.
`mel_filter_bank`	CPU, GPU	Converts a spectrogram to a mel spectrogram by applying a bank of triangular filters.
`mfcc`	CPU, GPU	Computes Mel Frequency Cepstral Coefficients (MFCC) from a mel spectrogram.
`multi_paste`	CPU, GPU	Performs multiple pastes from image batch to each of the outputs.
`mxnet_reader`	CPU	Legacy alias for `readers.mxnet()`.
`nemo_asr_reader`	CPU	Legacy alias for `readers.nemo_asr()`.
`nonsilent_region`	CPU, GPU	Performs leading and trailing silence detection in an audio buffer.
`normal_distribution`	CPU, GPU	Generates random numbers following a normal distribution.
`normalize`	CPU, GPU	Normalizes the input by removing the mean and dividing by the standard deviation.
`numba_function`	CPU, GPU	Invokes an njit-compiled Numba function.
`numpy_reader`	CPU, GPU	Legacy alias for `readers.numpy()`.
`one_hot`	CPU, GPU	Produces a one-hot encoding of the input.
`ones`	CPU	Returns new data of given shape and type, filled with ones.
`ones_like`	CPU	Returns new data with the same shape and type as the input array, filled with ones.
`optical_flow`	GPU	Calculates the optical flow between images in the input.
`pad`	CPU, GPU	Pads all samples with the fill_value in the specified axes to match the biggest extent in the batch for those axes or to match the minimum shape specified.
`paste`	GPU	Pastes the input images on a larger canvas, where the canvas size is equal to `input size * ratio`.
`peek_image_shape`	CPU	Obtains the shape of the encoded image.
`per_frame`	CPU, GPU	Marks the input tensor as a sequence.
`permute_batch`	CPU, GPU	Returns a batch of tensors constructed by selecting tensors from the input based on indices given in indices argument.
`power_spectrum`	CPU	Calculates power spectrum of the signal.
`preemphasis_filter`	CPU, GPU	Applies preemphasis filter to the input data.
`python_function`	CPU, GPU	Executes a Python function.
`random_bbox_crop`	CPU	Applies a prospective random crop to an image coordinate space while keeping the bounding boxes, and optionally labels, consistent.
`random_crop_generator`	CPU	Produces a cropping window with a randomly selected area and aspect ratio.
`random_resized_crop`	CPU, GPU	Performs a crop with a randomly selected area and aspect ratio and resizes it to the specified size.
`reinterpret`	CPU, GPU	Treats content of the input as if it had a different type, shape, and/or layout.
`reshape`	CPU, GPU	Treats content of the input as if it had a different shape and/or layout.
`resize`	CPU, GPU	Resize images.
`resize_crop_mirror`	CPU, GPU	Performs a fused resize, crop, mirror operation.
`roi_random_crop`	CPU	Produces a fixed shape cropping window, randomly placed so that as much of the provided region of interest (ROI) is contained in it.
`rotate`	CPU, GPU	Rotates the images by the specified angle.
`saturation`	CPU, GPU	Changes the saturation level of the image.
`sequence_reader`	CPU	Legacy alias for `readers.sequence()`.
`sequence_rearrange`	CPU, GPU	Rearranges frames in a sequence.
`shapes`	CPU, GPU	Returns the shapes of tensors in the input batch.
`slice`	CPU, GPU	Extracts a subtensor, or slice.
`spectrogram`	CPU, GPU	Produces a spectrogram from a 1D signal (for example, audio).
`sphere`	CPU, GPU	Performs a sphere augmentation.
`squeeze`	CPU, GPU	Removes the dimensions given as axes or axis_names.
`ssd_random_crop`	CPU	Performs a random crop with bounding boxes where Intersection Over Union (IoU) meets a randomly selected threshold between 0 and 1.
`stack`	CPU, GPU	Joins the input tensors along a new axis.
`tfrecord_reader`	CPU	Legacy alias for `readers.tfrecord()`.
`to_decibels`	CPU, GPU	Converts a magnitude (real, positive) to the decibel scale.
`torch_python_function`	CPU, GPU	Executes a function that is operating on Torch tensors.
`transpose`	CPU, GPU	Transposes the tensors by reordering the dimensions based on the perm parameter.
`uniform`	CPU, GPU	Generates random numbers following a uniform distribution.
`video_reader`	GPU	Legacy alias for `readers.video()`.
`video_reader_resize`	GPU	Legacy alias for `readers.video_resize()`.
`warp_affine`	CPU, GPU	Applies an affine transformation to the images.
`water`	CPU, GPU	Performs a water augmentation, which makes the image appear to be underwater.
`zeros`	CPU	Returns new data of given shape and type, filled with zeros.
`zeros_like`	CPU	Returns new data with the same shape and type as the input array, filled with zeros.
`decoders.audio`	CPU	Decodes waveforms from encoded audio data.
`decoders.image`	CPU, Mixed	Decodes images.
`decoders.image_crop`	CPU, Mixed	Decodes images and extracts regions-of-interest (ROI) that are specified by fixed window dimensions and variable anchors.
`decoders.image_random_crop`	CPU, Mixed	Decodes images and randomly crops them.
`decoders.image_slice`	CPU, Mixed	Decodes images and extracts regions of interest.
`experimental.audio_resample`	CPU, GPU	Legacy alias for `audio_resample()`.
`experimental.debayer`	CPU, GPU	Performs image demosaicing/debayering.
`experimental.dilate`	GPU	Performs a dilation operation on the input image.
`experimental.equalize`	CPU, GPU	Performs grayscale/per-channel histogram equalization.
`experimental.erode`	GPU	Performs an erosion operation on the input image.
`experimental.filter`	CPU, GPU	Convolves the image with the provided filter.
`experimental.inflate`	GPU	Inflates/decompresses the input using specified decompression algorithm.
`experimental.median_blur`	GPU	Median blur performs smoothing of an image or sequence of images by replacing each pixel with the median color of a surrounding rectangular region.
`experimental.peek_image_shape`	CPU	Obtains the shape of the encoded image.
`experimental.remap`	GPU	Applies a generic geometrical transformation to an image. In other words, it takes pixels from one place in the input image and puts them in another place in the output image. The transformation is described by the `mapx` and `mapy` parameters.
`experimental.resize`	GPU	Resize images.
`experimental.tensor_resize`	CPU, GPU	Resize tensors.
`experimental.warp_perspective`	CPU, GPU	Performs a perspective transform on the images.
`experimental.decoders.image`	CPU, Mixed	Decodes images.
`experimental.decoders.image_crop`	CPU, Mixed	Decodes images and extracts regions-of-interest (ROI) that are specified by fixed window dimensions and variable anchors.
`experimental.decoders.image_random_crop`	CPU, Mixed	Decodes images and randomly crops them.
`experimental.decoders.image_slice`	CPU, Mixed	Decodes images and extracts regions of interest.
`experimental.decoders.video`	CPU, Mixed	Decodes videos from in-memory streams.
`experimental.inputs.video`	CPU, Mixed	Streams and decodes a video from a memory buffer. To be used with long and high resolution videos.
`experimental.readers.fits`	CPU, GPU	Reads FITS image HDUs from a directory.
`experimental.readers.video`	CPU, GPU	Loads and decodes video files from disk.
`io.file.read`	CPU	Reads raw file contents from an encoded filename represented by a 1D byte array.
`noise.gaussian`	CPU, GPU	Applies Gaussian noise to the input.
`noise.salt_and_pepper`	CPU, GPU	Applies salt-and-pepper noise to the input.
`noise.shot`	CPU, GPU	Applies shot noise to the input.
`plugin.video.decoder`	Mixed	Decodes a video file from a memory buffer (e.g. provided by external source).
`random.beta`	CPU	Generates a random number from `[0, 1]` range following the beta distribution.
`random.choice`	CPU	Generates a random sample from a given 1D array.
`random.coin_flip`	CPU, GPU	Generates random boolean values following a Bernoulli distribution.
`random.normal`	CPU, GPU	Generates random numbers following a normal distribution.
`random.uniform`	CPU, GPU	Generates random numbers following a uniform distribution.
`readers.caffe`	CPU	Reads (Image, label) pairs from a Caffe LMDB.
`readers.caffe2`	CPU	Reads sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB).
`readers.coco`	CPU	Reads data from a COCO dataset that is composed of a directory with images and annotation JSON files.
`readers.file`	CPU	Reads file contents and returns file-label pairs.
`readers.mxnet`	CPU	Reads the data from an MXNet RecordIO.
`readers.nemo_asr`	CPU	Reads automatic speech recognition (ASR) data (audio, text) from an NVIDIA NeMo compatible manifest.
`readers.numpy`	CPU, GPU	Reads NumPy arrays from a directory.
`readers.sequence`	CPU	Reads [Frame] sequences from a directory representing a collection of streams.
`readers.tfrecord`	CPU	Reads samples from a TensorFlow TFRecord file.
`readers.video`	GPU	Loads and decodes video files using FFmpeg and NVDECODE, which is the hardware-accelerated video decoding feature in the NVIDIA(R) GPU.
`readers.video_resize`	GPU	Loads, decodes and resizes video files with FFmpeg and NVDECODE, which is NVIDIA GPU’s hardware-accelerated video decoding.
`readers.webdataset`	CPU	A reader for the webdataset format.
`reductions.max`	CPU, GPU	Gets maximal input element along provided axes.
`reductions.mean`	CPU, GPU	Gets mean of elements along provided axes.
`reductions.mean_square`	CPU, GPU	Gets mean square of elements along provided axes.
`reductions.min`	CPU, GPU	Gets minimal input element along provided axes.
`reductions.rms`	CPU, GPU	Gets root mean square of elements along provided axes.
`reductions.std_dev`	CPU, GPU	Gets standard deviation of elements along provided axes.
`reductions.sum`	CPU, GPU	Gets sum of elements along provided axes.
`reductions.variance`	CPU, GPU	Gets variance of elements along provided axes.
`segmentation.random_mask_pixel`	CPU	Selects random pixel coordinates in a mask, sampled from a uniform distribution.
`segmentation.random_object_bbox`	CPU	Randomly selects an object from a mask and returns its bounding box.
`segmentation.select_masks`	CPU	Selects a subset of polygons by their mask ids.
`transforms.combine`	CPU	Combines two or more affine transforms.
`transforms.crop`	CPU	Produces an affine transform matrix that maps a reference coordinate space to another one.
`transforms.rotation`	CPU	Produces a rotation affine transform matrix.
`transforms.scale`	CPU	Produces a scale affine transform matrix.
`transforms.shear`	CPU	Produces a shear affine transform matrix.
`transforms.translation`	CPU	Produces a translation affine transform matrix.
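Several entries in the table are legacy aliases for operations that now live in a submodule. When updating older pipeline code, the renames can be collected in a small lookup, sketched below. The mapping is copied from the table above. `LEGACY_ALIASES` and `current_name` are hypothetical helpers written for this example and are not part of DALI. `fast_resize_crop_mirror` is left out on purpose, because the table notes that it differs from its target in a default setting and is not a pure rename.

```python
# Legacy aliases and their current names, as listed in the table above.
LEGACY_ALIASES = {
    "audio_decoder": "decoders.audio",
    "caffe2_reader": "readers.caffe2",
    "caffe_reader": "readers.caffe",
    "coco_reader": "readers.coco",
    "file_reader": "readers.file",
    "image_decoder": "decoders.image",
    "image_decoder_crop": "decoders.image_crop",
    "image_decoder_random_crop": "decoders.image_random_crop",
    "image_decoder_slice": "decoders.image_slice",
    "mxnet_reader": "readers.mxnet",
    "nemo_asr_reader": "readers.nemo_asr",
    "numpy_reader": "readers.numpy",
    "sequence_reader": "readers.sequence",
    "tfrecord_reader": "readers.tfrecord",
    "video_reader": "readers.video",
    "video_reader_resize": "readers.video_resize",
    "experimental.audio_resample": "audio_resample",
}


def current_name(op_name: str) -> str:
    """Return the non-legacy name for op_name, or op_name itself if it is not an alias."""
    return LEGACY_ALIASES.get(op_name, op_name)


print(current_name("file_reader"))  # readers.file
print(current_name("resize"))       # resize
```

Names that are not aliases pass through unchanged, so the helper can be applied to every operation name found in existing code.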