Supported operations

class nvidia.dali.ops.Brightness(**kwargs)

Changes the brightness of an image.

Parameters:

brightness (float or float tensor, optional, default = 1.0) –
Brightness change factor. Values >= 0 are accepted. For example:
- 0 - black image
- 1 - no change
- 2 - double the brightness
image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of the input and output image.

class nvidia.dali.ops.Caffe2Reader(**kwargs)

Read sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB).

Parameters:

path (str) – Path to Caffe2 LMDB directory.
additional_inputs (int, optional, default = 0) – Additional auxiliary data tensors provided for each sample.
bbox (bool, optional, default = False) – Denotes if bounding-box information is present.
initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
label_type (int, optional, default = 0) –
Type of label stored in dataset.
- 0 = SINGLE_LABEL : single integer label for multi-class classification
- 1 = MULTI_LABEL_SPARSE : sparse active label indices for multi-label classification
- 2 = MULTI_LABEL_DENSE : dense label embedding vector for label embedding regression
- 3 = MULTI_LABEL_WEIGHTED_SPARSE : sparse active label indices with per-label weights for multi-label classification.
num_labels (int, optional, default = 1) – Number of classes in dataset. Required when sparse labels are used.
num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
shard_id (int, optional, default = 0) – Id of the part to read.
tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
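
The num_shards and shard_id parameters describe a partition of the dataset across readers. A minimal sketch of one way such a partition could be computed, assuming contiguous shards of near-equal size (the reader's actual split is not specified here, and shard_range is a hypothetical helper, not part of DALI):

```python
def shard_range(num_samples, num_shards, shard_id):
    """Return the [start, end) sample indices of one shard.

    Assumption: shards are contiguous and differ in size by at most one sample.
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    start = num_samples * shard_id // num_shards
    end = num_samples * (shard_id + 1) // num_shards
    return start, end


# Ten samples split across four readers, one shard per GPU.
ranges = [shard_range(10, 4, shard_id) for shard_id in range(4)]
print(ranges)  # [(0, 2), (2, 5), (5, 7), (7, 10)]
```

Every sample falls into exactly one shard, so each reader sees a disjoint part of the data.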

class nvidia.dali.ops.CaffeReader(**kwargs)

Read (Image, label) pairs from a Caffe LMDB

Parameters:

path (str) – Path to Caffe LMDB directory.
initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
shard_id (int, optional, default = 0) – Id of the part to read.
tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.

class nvidia.dali.ops.Cast(**kwargs)

Cast tensor to a different type

Parameters:	dtype (nvidia.dali.types.DALIDataType) – Output data type.

class nvidia.dali.ops.CoinFlip(**kwargs)

Produce a tensor filled with 0s and 1s, the results of random coin flips, usable as an argument for select ops.

Parameters:	probability (float, optional, default = 0.5) – Probability of returning 1.

class nvidia.dali.ops.Contrast(**kwargs)

Changes the color contrast of the image.

Parameters:

contrast (float or float tensor, optional, default = 1.0) –
Contrast change factor. Values >= 0 are accepted. For example:
- 0 - gray image
- 1 - no change
- 2 - double the contrast
image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of the input and output image.

class nvidia.dali.ops.Copy(**kwargs)

Make a copy of the input tensor.

class nvidia.dali.ops.CropMirrorNormalize(**kwargs)

Perform fused cropping, normalization, optional format conversion (NHWC to NCHW), and type casting. Normalization takes the input image and produces the output using the formula

output = (input - mean) / std

Parameters:

crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
mean (float or list of float) – Mean pixel values for image normalization.
std (float or list of float) – Standard deviation values for image normalization.
crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
mirror (int or int tensor, optional, default = 0) –
Mask for horizontal flip.
- 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.
output_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NCHW) – Output tensor data layout
pad_output (bool, optional, default = False) – Whether to pad the output so that the number of channels is a multiple of 4.
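
The formula and parameters above can be illustrated for a single image in plain Python. This is only a sketch of the semantics, not the operator's implementation: crop_mirror_normalize is a hypothetical helper, and the mapping from crop_pos_x / crop_pos_y to a pixel offset (position times the available slack) is an assumption.

```python
def crop_mirror_normalize(image, crop, mean, std,
                          crop_pos_x=0.5, crop_pos_y=0.5, mirror=0):
    """Crop, optionally flip, normalize, and convert HWC to CHW.

    image: nested lists indexed as image[y][x][channel] (HWC layout).
    Returns nested lists indexed as out[channel][y][x] (CHW layout).
    """
    crop_h, crop_w = crop
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    # Assumption: a position in [0.0, 1.0] scales the slack left after cropping.
    top = int(crop_pos_y * (height - crop_h))
    left = int(crop_pos_x * (width - crop_w))
    out = []
    for c in range(channels):
        plane = []
        for y in range(top, top + crop_h):
            # output = (input - mean) / std, applied per channel
            row = [(image[y][x][c] - mean[c]) / std[c]
                   for x in range(left, left + crop_w)]
            if mirror:
                row.reverse()  # horizontal flip
            plane.append(row)
        out.append(plane)
    return out


# A 2x4 single-channel image with pixel values 0..7.
image = [[[0], [1], [2], [3]],
         [[4], [5], [6], [7]]]
result = crop_mirror_normalize(image, crop=(2, 2), mean=[1.0], std=[2.0], mirror=1)
print(result)  # [[[0.5, 0.0], [2.5, 2.0]]]
```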

class nvidia.dali.ops.DummyOp(**kwargs)

Dummy operator for testing

Parameters:	num_outputs (int, optional, default = 2) – Number of outputs.

class nvidia.dali.ops.DumpImage(**kwargs)

Save images in batch to disk in PPM format. Useful for debugging.

Parameters:

input_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NHWC) – Layout of input images.
suffix (str, optional, default = '') – Suffix to be added to output file names.

class nvidia.dali.ops.ExternalSource(**kwargs)

Allows externally provided data to be passed as an input to the pipeline.

class nvidia.dali.ops.FastResizeCropMirror(**kwargs)

Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping. Backprojects the desired crop through the resize operation to reduce the amount of work performed.

Parameters:

crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
mirror (int or int tensor, optional, default = 0) –
Mask for horizontal flip.
- 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If resize_y is left at 0, the op will keep the aspect ratio of the original image.
resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If resize_x is left at 0, the op will keep the aspect ratio of the original image.

class nvidia.dali.ops.FileReader(**kwargs)

Read (Image, label) pairs from a directory

Parameters:

file_root (str) – Path to a directory containing data files.
file_list (str, optional, default = '') – Path to a file containing a list of (file, label) pairs. Leave empty to traverse the file_root directory to obtain files and labels.
initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
shard_id (int, optional, default = 0) – Id of the part to read.
tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
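
As an illustration of the file_list idea, the sketch below parses a list of file/label pairs. The exact file format is not described here, so this assumes one pair per line, separated by whitespace, with an integer label; parse_file_list is a hypothetical helper, not part of DALI.

```python
def parse_file_list(lines):
    """Parse 'file label' lines into (path, label) tuples.

    Assumptions: one pair per line, whitespace separated, integer labels.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so paths may contain spaces.
        path, label = line.rsplit(None, 1)
        pairs.append((path, int(label)))
    return pairs


pairs = parse_file_list(["cat/img_001.jpg 0", "dog/img_002.jpg 1", ""])
print(pairs)  # [('cat/img_001.jpg', 0), ('dog/img_002.jpg', 1)]
```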

class nvidia.dali.ops.HostDecoder(**kwargs)

Decode images on the host using OpenCV. When applicable, it will pass execution to faster, format-specific decoders (like libjpeg-turbo). Output of the decoder is in HWC ordering.

Parameters:	output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image.

class nvidia.dali.ops.Hue(**kwargs)

Changes the hue level of the image.

Parameters:

hue (float or float tensor, optional, default = 0.0) – Hue change, expressed as an angle.
image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of the input and output image.

class nvidia.dali.ops.Jitter(**kwargs)

Perform a random jitter augmentation. The output image is produced by moving each pixel by a random amount, bounded by half of the nDegree parameter, in both the x and y dimensions.

Parameters:

fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
nDegree (int, optional, default = 2) – Each pixel is moved by a random amount in range [-nDegree/2, nDegree/2].

class nvidia.dali.ops.MXNetReader(**kwargs)

Read sample data from an MXNet RecordIO file.

Parameters:

index_path (str or list of str) – List (of length 1) containing a path to the index (.idx) file. It is generated by MXNet’s im2rec.py script together with the RecordIO file. It can also be generated using the rec2idx script distributed with DALI.
path (str or list of str) – List of paths to RecordIO files.
initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
shard_id (int, optional, default = 0) – Id of the part to read.
tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.

class nvidia.dali.ops.NormalizePermute(**kwargs)

Perform fused normalization, format conversion from NHWC to NCHW, and type casting. Normalization takes the input image and produces the output using the formula

output = (input - mean) / std

Parameters:

height (int) – Height of the input image.
mean (float or list of float) – Mean pixel values for image normalization.
std (float or list of float) – Standard deviation values for image normalization.
width (int) – Width of the input image.
image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.

class nvidia.dali.ops.RandomResizedCrop(**kwargs)

Perform a crop with a randomly chosen area and aspect ratio, then resize it to the given size.

Parameters:

size (int or list of int) – Size of resized image.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
num_attempts (int, optional, default = 10) – Maximum number of attempts used to choose random area and aspect ratio.
random_area (float or list of float, optional, default = [0.08, 1.0]) – Range from which to choose random area factor A. Before resizing, the cropped image’s area will be equal to A * original image’s area.
random_aspect_ratio (float or list of float, optional, default = [0.75, 1.333333]) – Range from which to choose random aspect ratio.
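
The interplay of random_area, random_aspect_ratio, and num_attempts can be sketched as a rejection loop. This follows the common random-resized-crop recipe and is not taken from the operator's source: the uniform sampling, the rounding, and the fallback to the full image are assumptions, and random_crop_window is a hypothetical helper.

```python
import math
import random


def random_crop_window(height, width, random_area=(0.08, 1.0),
                       random_aspect_ratio=(0.75, 1.333333),
                       num_attempts=10, rng=random):
    """Return (top, left, crop_h, crop_w) of a randomly chosen crop window."""
    for _ in range(num_attempts):
        # Area factor A: the crop covers A * (original image area).
        area = rng.uniform(*random_area) * height * width
        ratio = rng.uniform(*random_aspect_ratio)  # width / height
        crop_w = int(round(math.sqrt(area * ratio)))
        crop_h = int(round(math.sqrt(area / ratio)))
        # Reject windows that do not fit inside the image and try again.
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # Assumed fallback once all attempts are used up: the whole image.
    return 0, 0, height, width


top, left, crop_h, crop_w = random_crop_window(480, 640, rng=random.Random(0))
print(top, left, crop_h, crop_w)
```

The window would then be resized to size; the resize step itself is omitted from this sketch.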

class nvidia.dali.ops.Resize(**kwargs)

Resize images.

Parameters:

image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If resize_y is left at 0, the op will keep the aspect ratio of the original image.
resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If resize_x is left at 0, the op will keep the aspect ratio of the original image.
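
The rules for resize_shorter, resize_x, and resize_y can be summarized in a small function that computes the output shape. This is a sketch of the documented behavior only: resized_shape is a hypothetical helper, and both the rounding and the handling of the all-zero case are assumptions.

```python
def resized_shape(height, width, resize_shorter=0.0, resize_x=0.0, resize_y=0.0):
    """Return the (height, width) of the resized image."""
    if resize_shorter and (resize_x or resize_y):
        raise ValueError("resize_shorter is mutually exclusive with resize_x and resize_y")
    if resize_shorter:
        # Scale so the shorter side matches, keeping the aspect ratio.
        shorter = min(height, width)
        return (round(height * resize_shorter / shorter),
                round(width * resize_shorter / shorter))
    if resize_x and resize_y:
        return round(resize_y), round(resize_x)
    if resize_x:
        # resize_y left at 0: derive the height from the aspect ratio.
        return round(height * resize_x / width), round(resize_x)
    if resize_y:
        # resize_x left at 0: derive the width from the aspect ratio.
        return round(resize_y), round(width * resize_y / height)
    return height, width  # assumption: nothing requested, shape unchanged


print(resized_shape(480, 640, resize_shorter=256))  # (256, 341)
print(resized_shape(480, 640, resize_x=320))        # (240, 320)
```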
save_attrs (bool, optional, default = False) – Save reshape attributes for testing.

class nvidia.dali.ops.ResizeCropMirror(**kwargs)

Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping.

Parameters:

crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
mirror (int or int tensor, optional, default = 0) –
Mask for horizontal flip.
- 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If resize_y is left at 0, the op will keep the aspect ratio of the original image.
resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If resize_x is left at 0, the op will keep the aspect ratio of the original image.

class nvidia.dali.ops.Rotate(**kwargs)

Rotate the image.

Parameters:

angle (float or float tensor) – Rotation angle.
fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation

class nvidia.dali.ops.Saturation(**kwargs)

Changes the saturation level of the image.

Parameters:

image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of the input and output image.
saturation (float or float tensor, optional, default = 1.0) –
Saturation change factor. Values >= 0 are supported. For example:
- 0 - completely desaturated image
- 1 - no change to the image’s saturation

class nvidia.dali.ops.Sphere(**kwargs)

Perform a sphere augmentation.

Parameters:

fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation

class nvidia.dali.ops.TFRecordReader(path, index_path, features, **kwargs)

Read sample data from a TensorFlow TFRecord file.

Parameters:

features (dict of (string, nvidia.dali.tfrecord.Feature)) – Dictionary of names and configurations of the features present in the TFRecord file. Typically obtained using the helper functions dali.tfrecord.FixedLenFeature and dali.tfrecord.VarLenFeature, which are equivalent to TensorFlow’s tf.FixedLenFeature and tf.VarLenFeature respectively.
index_path (str or list of str) – List of paths to index files (one index file for every TFRecord file). Index files may be obtained from TFRecord files using the tfrecord2idx script distributed with DALI.
path (str or list of str) – List of paths to TFRecord files.
initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
shard_id (int, optional, default = 0) – Id of the part to read.
tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.

class nvidia.dali.ops.Uniform(**kwargs)

Produce tensor filled with uniformly distributed random numbers.

Parameters:	range (float or list of float, optional, default = [-1.0, 1.0]) – Range of produced random numbers.

class nvidia.dali.ops.WarpAffine(**kwargs)

Apply an affine transformation to the image.

Parameters:

matrix (float or list of float) –
Matrix of the transform (dst -> src). Given a list of values (M11, M12, M13, M21, M22, M23), this operation produces a new image using the formula

dst(x,y) = src(M11 * x + M12 * y + M13, M21 * x + M22 * y + M23)

It is equivalent to OpenCV’s warpAffine operation with the WARP_INVERSE_MAP flag set.
fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
use_image_center (bool, optional, default = False) – Whether to use the image center as the center of the transformation. When this is True, coordinates are calculated from the center of the image.
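
Because the matrix maps destination coordinates to source coordinates, its effect can be counterintuitive. The sketch below applies the formula above to a small single-channel image in plain Python. It assumes nearest-neighbor sampling with rounded coordinates and leaves out mask and use_image_center; warp_affine_nn is a hypothetical helper, not the operator's implementation.

```python
def warp_affine_nn(src, matrix, fill_value=0.0):
    """Apply dst(x, y) = src(M11*x + M12*y + M13, M21*x + M22*y + M23).

    src is indexed as src[y][x]; pixels that map outside the source
    image are set to fill_value.
    """
    m11, m12, m13, m21, m22, m23 = matrix
    height, width = len(src), len(src[0])
    dst = []
    for y in range(height):
        row = []
        for x in range(width):
            # Map the destination pixel back to a source location.
            sx = int(round(m11 * x + m12 * y + m13))
            sy = int(round(m21 * x + m22 * y + m23))
            if 0 <= sx < width and 0 <= sy < height:
                row.append(src[sy][sx])
            else:
                row.append(fill_value)
        dst.append(row)
    return dst


src = [[1, 2, 3],
       [4, 5, 6]]
# M13 = 1 reads each destination pixel from one column to the right,
# so the image content moves one pixel to the left.
shifted = warp_affine_nn(src, (1, 0, 1, 0, 1, 0))
print(shifted)  # [[2, 3, 0.0], [5, 6, 0.0]]
```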

class nvidia.dali.ops.Water(**kwargs)

Perform a water augmentation (make image appear to be underwater).

Parameters:

ampl_x (float, optional, default = 10.0) – Amplitude of the wave in x direction.
ampl_y (float, optional, default = 10.0) – Amplitude of the wave in y direction.
fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
freq_x (float, optional, default = 0.049087) – Frequency of the wave in x direction.
freq_y (float, optional, default = 0.049087) – Frequency of the wave in y direction.
interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
phase_x (float, optional, default = 0.0) – Phase of the wave in x direction.
phase_y (float, optional, default = 0.0) – Phase of the wave in y direction.

class nvidia.dali.ops.nvJPEGDecoder(**kwargs)

Decode JPEG images using the nvJPEG library. Output of the decoder is on the GPU and uses HWC ordering.

Parameters:

output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image.
use_batched_decode (bool, optional, default = False) – Use nvJPEG’s batched decoding API.