Supported operations

class nvidia.dali.ops.Brightness(**kwargs)

Changes the brightness of an image.

Parameters:
  • brightness (float or float tensor, optional, default = 1.0) –

    Brightness change factor. Values >= 0 are accepted. For example:

    • 0 - black image
    • 1 - no change
    • 2 - double the brightness
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
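As a rough illustration of the factor semantics listed above, the following pure-Python sketch scales 8-bit pixel values and clamps them to the 0-255 range. The helper name `scale_brightness`, the multiplicative model, and the clamping are assumptions made for illustration; this is not DALI's implementation.

```python
def scale_brightness(pixels, brightness=1.0):
    """Multiply each 8-bit pixel value by `brightness`, clamping to [0, 255]."""
    if brightness < 0:
        raise ValueError("brightness must be >= 0")
    # Assumption: brightness acts as a plain multiplier with saturation.
    return [min(255, max(0, int(round(p * brightness)))) for p in pixels]


row = [0, 64, 128, 200]
print(scale_brightness(row, 0))  # every value becomes 0 (black)
print(scale_brightness(row, 1))  # unchanged
print(scale_brightness(row, 2))  # doubled, saturating at 255
```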
class nvidia.dali.ops.Caffe2Reader(**kwargs)

Read sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB).

Parameters:
  • path (str) – Path to Caffe2 LMDB directory.
  • additional_inputs (int, optional, default = 0) – Additional auxiliary data tensors provided for each sample.
  • bbox (bool, optional, default = False) – Denotes if bounding-box information is present.
  • initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
  • label_type (int, optional, default = 0) –

    Type of label stored in dataset.

    • 0 = SINGLE_LABEL : single integer label for multi-class classification
    • 1 = MULTI_LABEL_SPARSE : sparse active label indices for multi-label classification
    • 2 = MULTI_LABEL_DENSE : dense label embedding vector for label embedding regression
    • 3 = MULTI_LABEL_WEIGHTED_SPARSE : sparse active label indices with per-label weights for multi-label classification.
  • num_labels (int, optional, default = 1) – Number of classes in dataset. Required when sparse labels are used.
  • num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multi-GPU training).
  • random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
  • shard_id (int, optional, default = 0) – Id of the part to read.
  • tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
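The reason num_labels is needed for sparse label types can be seen by expanding sparse indices into a dense vector, whose length cannot be inferred from the indices alone. This is a minimal pure-Python sketch; the helper `sparse_to_dense` is hypothetical, and the layout the reader actually uses for its label output is not specified above.

```python
def sparse_to_dense(active_indices, num_labels, weights=None):
    """Expand sparse active label indices into a dense vector of length num_labels."""
    dense = [0.0] * num_labels
    if weights is None:
        # Unweighted sparse labels: every active label counts as 1.0.
        weights = [1.0] * len(active_indices)
    for index, weight in zip(active_indices, weights):
        if not 0 <= index < num_labels:
            raise ValueError(f"label index {index} outside [0, {num_labels})")
        dense[index] = weight
    return dense


# MULTI_LABEL_SPARSE-style labels: classes 1 and 3 are active out of 5.
print(sparse_to_dense([1, 3], num_labels=5))
# MULTI_LABEL_WEIGHTED_SPARSE-style labels: same indices with per-label weights.
print(sparse_to_dense([1, 3], num_labels=5, weights=[0.25, 0.75]))
```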
class nvidia.dali.ops.CaffeReader(**kwargs)

Read (image, label) pairs from a Caffe LMDB.

Parameters:
  • path (str) – Path to Caffe LMDB directory.
  • initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
  • num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multi-GPU training).
  • random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
  • shard_id (int, optional, default = 0) – Id of the part to read.
  • tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
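The interaction of num_shards and shard_id can be sketched as a partition of sample indices, where each worker (for example, one per GPU) reads only its own part. The contiguous split and the helper name `shard_range` are assumptions for illustration; the reader's actual partitioning scheme is not described above.

```python
def shard_range(num_samples, num_shards=1, shard_id=0):
    """Return the [start, end) sample indices assigned to one shard."""
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    # Integer arithmetic keeps shard sizes within one sample of each other.
    start = num_samples * shard_id // num_shards
    end = num_samples * (shard_id + 1) // num_shards
    return start, end


# Ten samples split across four shards.
for shard_id in range(4):
    print(shard_id, shard_range(10, num_shards=4, shard_id=shard_id))
```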
class nvidia.dali.ops.Cast(**kwargs)

Cast tensor to a different type

Parameters:
  • dtype (nvidia.dali.types.DALIDataType) – Output data type.
class nvidia.dali.ops.CoinFlip(**kwargs)

Produce a tensor filled with 0s and 1s, the results of random coin flips, usable as an argument for select ops.

Parameters:
  • probability (float, optional, default = 0.5) – Probability of returning 1.
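The behaviour described above can be mimicked in plain Python to show how the 0/1 values serve as a per-sample switch, much as they would when passed to an argument such as mirror. The function `coin_flip` below is an illustrative stand-in built on the standard `random` module, not the DALI operator.

```python
import random


def coin_flip(batch_size, probability=0.5, rng=None):
    """Return batch_size values, each 1 with the given probability and 0 otherwise."""
    rng = rng or random.Random()
    return [1 if rng.random() < probability else 0 for _ in range(batch_size)]


# Use the flips as a per-sample mask: reverse a row only where the flip is 1.
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flips = coin_flip(len(rows), probability=0.5, rng=random.Random(0))
mirrored = [row[::-1] if flip else row for row, flip in zip(rows, flips)]
print(flips, mirrored)
```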
class nvidia.dali.ops.Contrast(**kwargs)

Changes the color contrast of the image.

Parameters:
  • contrast (float or float tensor, optional, default = 1.0) –

    Contrast change factor. Values >= 0 are accepted. For example:

    • 0 - gray image
    • 1 - no change
    • 2 - double the contrast
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
class nvidia.dali.ops.Copy(**kwargs)

Make a copy of the input tensor

class nvidia.dali.ops.Crop(**kwargs)

Perform a random crop.

Parameters:
  • crop (int or list of int) –
    Size of the cropped image. If only a single value c is provided,
    the resulting crop will be square with size (c,c)
  • crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0)
  • crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0)
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
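The normalized crop positions can be read as a fraction of the free space left once the crop size is subtracted from the image size, so 0.0 aligns the crop with the top or left edge, 0.5 centers it, and 1.0 aligns it with the bottom or right edge. The sketch below works on one axis at a time; the helper `crop_offset` and the truncation to an integer are assumptions for illustration.

```python
def crop_offset(image_size, crop_size, crop_pos=0.5):
    """Map a normalized crop position in [0.0, 1.0] to a pixel offset along one axis."""
    if not 0.0 <= crop_pos <= 1.0:
        raise ValueError("crop_pos must be in [0.0, 1.0]")
    if crop_size > image_size:
        raise ValueError("crop is larger than the image")
    # Assumption: the offset is truncated toward zero.
    return int(crop_pos * (image_size - crop_size))


# A 224-pixel crop along the width of a 640-pixel-wide image.
print(crop_offset(640, 224, 0.0))  # left-aligned
print(crop_offset(640, 224, 0.5))  # centered
print(crop_offset(640, 224, 1.0))  # right-aligned
```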
class nvidia.dali.ops.CropCastPermute(**kwargs)

Perform a random crop, data type cast and permute (from NHWC to NCHW).

Parameters:
  • crop (int or list of int) –
    Size of the cropped image. If only a single value c is provided,
    the resulting crop will be square with size (c,c)
  • crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0)
  • crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0)
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
  • output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) –
    Output data type. If NO_TYPE is specified, the output data type is inferred
    from the input data type.
  • output_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NCHW) – Output tensor data layout
class nvidia.dali.ops.CropMirrorNormalize(**kwargs)

Perform fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting. Normalization takes the input image and produces the output using the formula:

output = (input - mean) / std

Parameters:
  • crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
  • mean (float or list of float) – Mean pixel values for image normalization.
  • std (float or list of float) – Standard deviation values for image normalization.
  • crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
  • crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
  • mirror (int or int tensor, optional, default = 0) –

    Mask for horizontal flip.

    • 0 - do not perform horizontal flip for this image
    • 1 - perform horizontal flip for this image.
  • output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.
  • output_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NCHW) – Output tensor data layout
  • pad_output (bool, optional, default = False) – Whether to pad the output so that the number of channels is a multiple of 4.
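The normalization formula, the optional horizontal flip, and the HWC-to-CHW layout change can be followed on a tiny image held in nested lists. This is a minimal pure-Python sketch for a single image; the function `mirror_normalize` is hypothetical, and cropping, type casting, and padding are left out.

```python
def mirror_normalize(image_hwc, mean, std, mirror=0):
    """Apply (input - mean) / std per channel, optionally flip, and return CHW data.

    image_hwc is a list of rows; each row is a list of pixels; each pixel is a
    list of channel values.
    """
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    output_chw = []
    for c in range(channels):
        plane = []
        for y in range(height):
            row = [(image_hwc[y][x][c] - mean[c]) / std[c] for x in range(width)]
            if mirror:
                row.reverse()  # horizontal flip
            plane.append(row)
        output_chw.append(plane)
    return output_chw


image = [[[10, 20], [30, 40]]]  # H=1, W=2, C=2
print(mirror_normalize(image, mean=[10, 0], std=[10, 10]))
print(mirror_normalize(image, mean=[10, 0], std=[10, 10], mirror=1))
```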
class nvidia.dali.ops.DummyOp(**kwargs)

Dummy operator for testing

Parameters:
  • num_outputs (int, optional, default = 2) – Number of outputs.
class nvidia.dali.ops.DumpImage(**kwargs)

Save images in batch to disk in PPM format. Useful for debugging.

Parameters:
  • input_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NHWC) – Layout of input images.
  • suffix (str, optional, default = '') – Suffix to be added to output file names.
class nvidia.dali.ops.ExternalSource(**kwargs)

Allows externally provided data to be passed as an input to the pipeline; see nvidia.dali.pipeline.Pipeline.feed_input() and nvidia.dali.pipeline.Pipeline.iter_setup(). Currently, this operator is not supported in TensorFlow.
class nvidia.dali.ops.FastResizeCropMirror(**kwargs)

Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping. Backprojects the desired crop through the resize operation to reduce the amount of work performed.

Parameters:
  • crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
  • crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
  • crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
  • mirror (int or int tensor, optional, default = 0) –

    Mask for horizontal flip.

    • 0 - do not perform horizontal flip for this image
    • 1 - perform horizontal flip for this image.
  • resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
  • resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
  • resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
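One plausible reading of "backprojects the desired crop through the resize operation" is that the crop rectangle, given in resized-image pixels, is mapped back to source-image coordinates so that only that source region needs to be resampled. The sketch below shows that coordinate mapping only; the function `backproject_crop` and its argument names are assumptions, and this is not the operator's actual code.

```python
def backproject_crop(src_w, src_h, resized_w, resized_h,
                     crop_x, crop_y, crop_w, crop_h):
    """Map a crop rectangle in resized-image pixels back to source-image coordinates."""
    scale_x = src_w / resized_w
    scale_y = src_h / resized_h
    return (crop_x * scale_x, crop_y * scale_y, crop_w * scale_x, crop_h * scale_y)


# A 1280x960 source resized to 320x240, then cropped to 224x224 at (48, 8).
print(backproject_crop(1280, 960, 320, 240, 48, 8, 224, 224))
```

Resampling only the back-projected region to the crop size avoids computing resized pixels that the crop would discard.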
class nvidia.dali.ops.FileReader(**kwargs)

Read (image, label) pairs from a directory.

Parameters:
  • file_root (str) – Path to a directory containing data files.
  • file_list (str, optional, default = '') – Path to a file containing a list of (file, label) pairs. Leave empty to traverse the file_root directory to obtain files and labels.
  • initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
  • num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multi-GPU training).
  • random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
  • shard_id (int, optional, default = 0) – Id of the part to read.
  • tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
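The role of a shuffling buffer such as the one sized by initial_fill can be sketched with a generic buffered shuffle: the buffer is filled first, then each emitted sample is drawn at random from the buffer and replaced by the next sample read. This is a common technique shown under that assumption; the reader's actual shuffling strategy is not described above, and `buffered_shuffle` is a hypothetical helper.

```python
import random


def buffered_shuffle(samples, initial_fill=1024, rng=None):
    """Yield samples in a partially shuffled order using a fixed-size buffer."""
    rng = rng or random.Random()
    iterator = iter(samples)
    buffer = []
    # Fill the buffer before emitting anything.
    for sample in iterator:
        buffer.append(sample)
        if len(buffer) >= initial_fill:
            break
    # Emit a random buffered sample and put the next incoming one in its place.
    for sample in iterator:
        index = rng.randrange(len(buffer))
        yield buffer[index]
        buffer[index] = sample
    # Drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


print(list(buffered_shuffle(range(10), initial_fill=4, rng=random.Random(0))))
```

A larger buffer mixes samples from further apart in the dataset; with a buffer of one element the original order is preserved.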
class nvidia.dali.ops.Flip(**kwargs)

Flip the image on the horizontal and/or vertical axes.

Parameters:
  • fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
  • horizontal (bool or bool tensor, optional, default = True) – Perform a horizontal flip. Default value is True.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
  • mask (int or int tensor, optional, default = 1) –

    Whether to apply this augmentation to the input image.

    • 0 - do not apply this transformation
    • 1 - apply this transformation
  • vertical (bool or bool tensor, optional, default = False) – Perform a vertical flip. Default value is False.
class nvidia.dali.ops.HostDecoder(**kwargs)

Decode images on the host using OpenCV. When applicable, it will pass execution to faster, format-specific decoders (like libjpeg-turbo). Output of the decoder is in HWC ordering.

Parameters:
  • output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image.
class nvidia.dali.ops.Hue(**kwargs)

Changes the hue level of the image.

Parameters:
  • hue (float or float tensor, optional, default = 0.0) – Hue change, in degrees.
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
class nvidia.dali.ops.Jitter(**kwargs)

Perform a random jitter augmentation. The output image is produced by moving each pixel by a random amount bounded by half of the nDegree parameter (in both x and y dimensions).

Parameters:
  • fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
  • mask (int or int tensor, optional, default = 1) –

    Whether to apply this augmentation to the input image.

    • 0 - do not apply this transformation
    • 1 - apply this transformation
  • nDegree (int, optional, default = 2) – Each pixel is moved by a random amount in range [-nDegree/2, nDegree/2].
class nvidia.dali.ops.MXNetReader(**kwargs)

Read sample data from an MXNet RecordIO file.

Parameters:
  • index_path (str or list of str) – List (of length 1) containing a path to the index (.idx) file. It is generated by MXNet’s im2rec.py script together with the RecordIO file. It can also be generated using the rec2idx script distributed with DALI.
  • path (str or list of str) – List of paths to RecordIO files.
  • initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
  • num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multi-GPU training).
  • random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
  • shard_id (int, optional, default = 0) – Id of the part to read.
  • tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
class nvidia.dali.ops.NormalizePermute(**kwargs)

Perform fused normalization, format conversion from NHWC to NCHW and type casting. Normalization takes the input image and produces the output using the formula:

output = (input - mean) / std

Parameters:
  • height (int) – Height of the input image.
  • mean (float or list of float) – Mean pixel values for image normalization.
  • std (float or list of float) – Standard deviation values for image normalization.
  • width (int) – Width of the input image.
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
  • output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.
class nvidia.dali.ops.RandomResizedCrop(**kwargs)

Perform a crop with a randomly chosen area and aspect ratio, then resize it to the given size.

Parameters:
  • size (int or list of int) – Size of resized image.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
  • num_attempts (int, optional, default = 10) – Maximum number of attempts used to choose random area and aspect ratio.
  • random_area (float or list of float, optional, default = [0.08, 1.0]) – Range from which to choose random area factor A. Before resizing, the cropped image’s area will be equal to A * original image’s area.
  • random_aspect_ratio (float or list of float, optional, default = [0.75, 1.333333]) – Range from which to choose random aspect ratio.
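How random_area, random_aspect_ratio, and num_attempts might interact can be sketched as a rejection loop: draw an area factor and an aspect ratio, derive a crop size, and retry if the crop does not fit. The uniform sampling, the full-image fallback, and the function `random_crop_window` are all assumptions for illustration (the sketch also handles only the range form of both parameters); the operator's exact sampling rules are not given above.

```python
import math
import random


def random_crop_window(width, height, random_area=(0.08, 1.0),
                       random_aspect_ratio=(0.75, 1.333333),
                       num_attempts=10, rng=None):
    """Return (x, y, crop_w, crop_h) for a random crop that fits inside the image."""
    rng = rng or random.Random()
    for _ in range(num_attempts):
        area = rng.uniform(*random_area) * width * height
        ratio = rng.uniform(*random_aspect_ratio)  # crop_w / crop_h
        crop_w = int(round(math.sqrt(area * ratio)))
        crop_h = int(round(math.sqrt(area / ratio)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            x = rng.randint(0, width - crop_w)
            y = rng.randint(0, height - crop_h)
            return x, y, crop_w, crop_h
    # Assumed fallback when every attempt fails: use the whole image.
    return 0, 0, width, height


print(random_crop_window(640, 480, rng=random.Random(0)))
```

The selected window would then be resized to size.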
class nvidia.dali.ops.Resize(**kwargs)

Resize images.

Parameters:
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
  • resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
  • resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
  • resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
  • save_attrs (bool, optional, default = False) – Save reshape attributes for testing.
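The rules for resize_shorter, resize_x, and resize_y described above can be summarized as a small function that computes the output size. The helper `resized_shape`, the rounding, and the error raised when no size is given are assumptions for illustration, not the operator's code.

```python
def resized_shape(width, height, resize_x=0.0, resize_y=0.0, resize_shorter=0.0):
    """Return the (width, height) of the resized image under the rules above."""
    if resize_shorter and (resize_x or resize_y):
        raise ValueError("resize_shorter is mutually exclusive with resize_x and resize_y")
    if resize_shorter:
        # Scale so that the shorter side equals resize_shorter, keeping the aspect ratio.
        scale = resize_shorter / min(width, height)
        return round(width * scale), round(height * scale)
    if resize_x and resize_y:
        return round(resize_x), round(resize_y)
    if resize_x:
        # resize_y left at 0: keep the aspect ratio.
        return round(resize_x), round(height * resize_x / width)
    if resize_y:
        # resize_x left at 0: keep the aspect ratio.
        return round(width * resize_y / height), round(resize_y)
    # Assumption for this sketch: at least one size must be given.
    raise ValueError("specify resize_shorter, resize_x, or resize_y")


print(resized_shape(640, 480, resize_shorter=240))
print(resized_shape(640, 480, resize_x=320))
print(resized_shape(640, 480, resize_x=100, resize_y=50))
```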
class nvidia.dali.ops.ResizeCropMirror(**kwargs)

Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping.

Parameters:
  • crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
  • crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
  • crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
  • mirror (int or int tensor, optional, default = 0) –

    Mask for horizontal flip.

    • 0 - do not perform horizontal flip for this image
    • 1 - perform horizontal flip for this image.
  • resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
  • resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
  • resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
class nvidia.dali.ops.Rotate(**kwargs)

Rotate the image.

Parameters:
  • angle (float or float tensor) – Rotation angle.
  • fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
  • mask (int or int tensor, optional, default = 1) –

    Whether to apply this augmentation to the input image.

    • 0 - do not apply this transformation
    • 1 - apply this transformation
class nvidia.dali.ops.Saturation(**kwargs)

Changes saturation level of the image.

Parameters:
  • image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
  • saturation (float or float tensor, optional, default = 1.0) –

    Saturation change factor. Values >= 0 are supported. For example:

    • 0 - completely desaturated image
    • 1 - no change to image’s saturation
class nvidia.dali.ops.Sphere(**kwargs)

Perform a sphere augmentation.

Parameters:
  • fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
  • mask (int or int tensor, optional, default = 1) –

    Whether to apply this augmentation to the input image.

    • 0 - do not apply this transformation
    • 1 - apply this transformation
class nvidia.dali.ops.TFRecordReader(path, index_path, features, **kwargs)

Read sample data from a TensorFlow TFRecord file.

Parameters:
  • features (dict of (string, nvidia.dali.tfrecord.Feature)) – Dictionary mapping feature names to the configuration of the features present in the TFRecord file. Typically built with the helper functions dali.tfrecord.FixedLenFeature and dali.tfrecord.VarLenFeature, which are equivalent to TensorFlow’s tf.FixedLenFeature and tf.VarLenFeature respectively.
  • index_path (str or list of str) – List of paths to index files (1 index file for every TFRecord file). Index files may be obtained from TFRecord files using the tfrecord2idx script distributed with DALI.
  • path (str or list of str) – List of paths to TFRecord files.
  • initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
  • num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multi-GPU training).
  • random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
  • shard_id (int, optional, default = 0) – Id of the part to read.
  • tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
class nvidia.dali.ops.Uniform(**kwargs)

Produce tensor filled with uniformly distributed random numbers.

Parameters:
  • range (float or list of float, optional, default = [-1.0, 1.0]) – Range of produced random numbers.
class nvidia.dali.ops.WarpAffine(**kwargs)

Apply an affine transformation to the image.

Parameters:
  • matrix (float or list of float) –

    Matrix of the transform (dst -> src). Given a list of values (M11, M12, M13, M21, M22, M23), this operation produces a new image using the formula:

    dst(x,y) = src(M11 * x + M12 * y + M13, M21 * x + M22 * y + M23)

    It is equivalent to OpenCV’s warpAffine operation with the WARP_INVERSE_MAP flag set.

  • fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
  • mask (int or int tensor, optional, default = 1) –

    Whether to apply this augmentation to the input image.

    • 0 - do not apply this transformation
    • 1 - apply this transformation
  • use_image_center (bool, optional, default = False) – Whether to use the image center as the center of the transformation. When this is True, coordinates are calculated from the center of the image.
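Because the matrix maps destination coordinates to source coordinates, a positive translation in the matrix moves image content in the opposite direction. The pure-Python sketch below applies the formula for matrix to a small single-channel image with nearest-neighbor sampling and fill_value padding; the function `warp_affine` is illustrative only and ignores mask and use_image_center.

```python
def warp_affine(src, matrix, fill_value=0.0):
    """Apply dst(x, y) = src(M11*x + M12*y + M13, M21*x + M22*y + M23).

    src is a 2D list indexed as src[y][x]; matrix is (M11, M12, M13, M21, M22, M23).
    """
    m11, m12, m13, m21, m22, m23 = matrix
    height, width = len(src), len(src[0])
    dst = []
    for y in range(height):
        row = []
        for x in range(width):
            # Nearest-neighbor lookup of the source pixel for this destination pixel.
            src_x = int(round(m11 * x + m12 * y + m13))
            src_y = int(round(m21 * x + m22 * y + m23))
            if 0 <= src_x < width and 0 <= src_y < height:
                row.append(src[src_y][src_x])
            else:
                row.append(fill_value)  # outside the source image
        dst.append(row)
    return dst


image = [[1, 2, 3],
         [4, 5, 6]]
# dst(x, y) = src(x + 1, y): the content shifts one pixel to the left.
print(warp_affine(image, (1, 0, 1, 0, 1, 0)))
```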
class nvidia.dali.ops.Water(**kwargs)

Perform a water augmentation (make image appear to be underwater).

Parameters:
  • ampl_x (float, optional, default = 10.0) – Amplitude of the wave in x direction.
  • ampl_y (float, optional, default = 10.0) – Amplitude of the wave in y direction.
  • fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
  • freq_x (float, optional, default = 0.049087) – Frequency of the wave in x direction.
  • freq_y (float, optional, default = 0.049087) – Frequency of the wave in y direction.
  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
  • mask (int or int tensor, optional, default = 1) –

    Whether to apply this augmentation to the input image.

    • 0 - do not apply this transformation
    • 1 - apply this transformation
  • phase_x (float, optional, default = 0.0) – Phase of the wave in x direction.
  • phase_y (float, optional, default = 0.0) – Phase of the wave in y direction.
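If the frequency values are angular frequencies in radians per pixel (an assumption; the units are not stated above), the default of 0.049087 corresponds to a wave period of about 128 pixels, since 2π / 128 ≈ 0.049087. The sketch below computes that period and evaluates a sinusoidal offset of the assumed form ampl * sin(freq * position + phase); the exact displacement formula used by the operator, and which coordinate drives which offset, are not given above.

```python
import math


def wave_period(freq):
    """Period in pixels of a wave whose angular frequency is `freq` radians per pixel."""
    return 2 * math.pi / freq


def wave_offset(position, ampl=10.0, freq=0.049087, phase=0.0):
    """Assumed displacement model: ampl * sin(freq * position + phase)."""
    return ampl * math.sin(freq * position + phase)


print(wave_period(0.049087))  # close to 128 pixels
print(wave_offset(32))        # near the peak amplitude, a quarter period in
```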
class nvidia.dali.ops.nvJPEGDecoder(**kwargs)

Decode JPEG images using the nvJPEG library. Output of the decoder is on the GPU and uses HWC ordering.

Parameters:
  • output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image.
  • use_batched_decode (bool, optional, default = False) – Use nvJPEG’s batched decoding API.