Supported operations
- 
class nvidia.dali.ops.Brightness(**kwargs)
- Changes the brightness of an image. - Parameters: - brightness (float or float tensor, optional, default = 1.0) – Brightness change factor. Values >= 0 are accepted. For example: - 0 - black image
- 1 - no change
- 2 - brightness doubled
 
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
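The brightness factor semantics listed above (0 gives a black image, 1 leaves the image unchanged, 2 doubles the brightness) can be illustrated with a plain-Python sketch. It assumes 8-bit pixel values and a simple multiply-and-clamp rule; the helper scale_brightness is hypothetical and not part of DALI, whose actual kernel may differ.

```python
def scale_brightness(pixels, factor):
    """Scale 8-bit pixel values by `factor`, clamping to [0, 255].

    Assumption: brightness acts as a plain multiplicative factor.
    This is an illustration only, not DALI's implementation.
    """
    if factor < 0:
        raise ValueError("brightness factor must be >= 0")
    return [min(255, max(0, int(round(p * factor)))) for p in pixels]


row = [0, 64, 128, 200]
black = scale_brightness(row, 0.0)    # factor 0: every pixel becomes 0
same = scale_brightness(row, 1.0)     # factor 1: values are unchanged
doubled = scale_brightness(row, 2.0)  # factor 2: doubled, then clamped
```

Clamping is what keeps bright pixels from overflowing the 8-bit range when the factor is greater than 1.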
 
- 
class nvidia.dali.ops.Caffe2Reader(**kwargs)
- Read sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB). - Parameters: - path (str) – Path to Caffe2 LMDB directory.
- additional_inputs (int, optional, default = 0) – Additional auxiliary data tensors provided for each sample.
- bbox (bool, optional, default = False) – Denotes if bounding-box information is present.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- label_type (int, optional, default = 0) – Type of label stored in dataset. - 0 = SINGLE_LABEL : single integer label for multi-class classification
- 1 = MULTI_LABEL_SPARSE : sparse active label indices for multi-label classification
- 2 = MULTI_LABEL_DENSE : dense label embedding vector for label embedding regression
- 3 = MULTI_LABEL_WEIGHTED_SPARSE : sparse active label indices with per-label weights for multi-label classification.
 
- num_labels (int, optional, default = 1) – Number of classes in dataset. Required when sparse labels are used.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
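The num_shards and shard_id parameters describe splitting a dataset into parts so that each GPU reads only its own part. A minimal sketch of one possible partitioning follows; the contiguous split and the helper shard_indices are assumptions made for illustration, and the reader's real partitioning scheme may differ.

```python
def shard_indices(num_samples, num_shards, shard_id):
    """Return the sample indices belonging to one shard.

    Assumption: shards are contiguous, near-equal ranges of indices.
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    start = num_samples * shard_id // num_shards
    end = num_samples * (shard_id + 1) // num_shards
    return list(range(start, end))


# Ten samples split across three readers (for example, three GPUs).
shards = [shard_indices(10, 3, i) for i in range(3)]
```

Every sample lands in exactly one shard, so the readers together cover the dataset without overlap.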
 
- 
class nvidia.dali.ops.CaffeReader(**kwargs)
- Read (Image, label) pairs from a Caffe LMDB - Parameters: - path (str) – Path to Caffe LMDB directory.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
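The initial_fill parameter is described as the size of the buffer used for shuffling. One common way such a buffer works is sketched below: the buffer is filled first, then each incoming sample replaces a randomly chosen buffered one, which is emitted. The function buffered_shuffle and its exact policy are assumptions for illustration, not the reader's actual code.

```python
import random


def buffered_shuffle(samples, buffer_size, seed=0):
    """Yield samples in a locally shuffled order using a fixed-size buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be >= 1")
    rng = random.Random(seed)
    buffer = []
    for sample in samples:
        if len(buffer) < buffer_size:
            buffer.append(sample)       # initial fill phase
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]               # emit a random buffered sample
        buffer[idx] = sample            # and put the new one in its place
    rng.shuffle(buffer)                 # drain what is left at the end
    yield from buffer


shuffled = list(buffered_shuffle(range(20), buffer_size=8))
```

A larger buffer mixes samples over a wider window at the cost of more memory; a buffer of size 1 preserves the input order.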
 
- 
class nvidia.dali.ops.Cast(**kwargs)
- Cast tensor to a different type - Parameters: - dtype (nvidia.dali.types.DALIDataType) – Output data type. 
- 
class nvidia.dali.ops.CoinFlip(**kwargs)
- Produce tensor filled with 0s and 1s - results of random coin flip, usable as an argument for select ops. - Parameters: - probability (float, optional, default = 0.5) – Probability of returning 1. 
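The behaviour described above amounts to one Bernoulli draw per sample. The following plain-Python sketch shows the idea; coin_flip is a hypothetical stand-in, not the DALI operator, and the seeding is an assumption added for reproducibility.

```python
import random


def coin_flip(batch_size, probability=0.5, seed=None):
    """Return a list of 0s and 1s, each 1 with the given probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    rng = random.Random(seed)
    return [1 if rng.random() < probability else 0 for _ in range(batch_size)]


flips = coin_flip(8, probability=0.5, seed=42)
never = coin_flip(5, probability=0.0)
always = coin_flip(5, probability=1.0)
```

A per-sample mask like this is the kind of value that parameters typed "int or int tensor", such as mirror or mask, can accept.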
- 
class nvidia.dali.ops.Contrast(**kwargs)
- Changes the color contrast of the image. - Parameters: - contrast (float or float tensor, optional, default = 1.0) – Contrast change factor. Values >= 0 are accepted. For example: - 0 - gray image
- 1 - no change
- 2 - contrast doubled
 
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
 
- 
class nvidia.dali.ops.Copy(**kwargs)
- Make a copy of the input tensor 
- 
class nvidia.dali.ops.Crop(**kwargs)
- Perform a random crop. - Parameters: - crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c).
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0)
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0)
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
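The crop_pos_x and crop_pos_y parameters place the crop window in normalized coordinates from 0.0 to 1.0. A sketch of how such a position could map to pixel offsets is shown below. The helper crop_window, the (height, width) order for a list-valued crop, and the rounding rule are all assumptions, not confirmed DALI behaviour.

```python
def crop_window(image_h, image_w, crop, crop_pos_x=0.5, crop_pos_y=0.5):
    """Return (x0, y0, crop_w, crop_h) for a crop inside the image.

    Assumption: position 0.0 aligns the window with the left/top edge,
    1.0 with the right/bottom edge, and 0.5 centers it.
    """
    # A single value c means a square (c, c) crop.
    crop_h, crop_w = (crop, crop) if isinstance(crop, int) else crop
    if crop_h > image_h or crop_w > image_w:
        raise ValueError("crop is larger than the image")
    x0 = int(round(crop_pos_x * (image_w - crop_w)))
    y0 = int(round(crop_pos_y * (image_h - crop_h)))
    return x0, y0, crop_w, crop_h


centered = crop_window(480, 640, 224)
top_left = crop_window(480, 640, 224, crop_pos_x=0.0, crop_pos_y=0.0)
bottom_right = crop_window(480, 640, 224, crop_pos_x=1.0, crop_pos_y=1.0)
```

Feeding per-sample random positions instead of the default 0.5 is one way a crop of this kind becomes random.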
 
- 
class nvidia.dali.ops.CropCastPermute(**kwargs)
- Perform a random crop, data type cast and permute (from NHWC to NCHW). - Parameters: - crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c).
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0)
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0)
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
- output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type. If NO_TYPE is specified, the output data type is inferred from the input data type.
- output_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NCHW) – Output tensor data layout
 
- 
class nvidia.dali.ops.CropMirrorNormalize(**kwargs)
- Perform fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting. Normalization takes the input image and produces the output using the formula output = (input - mean) / std. - Parameters: - crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
- mean (float or list of float) – Mean pixel values for image normalization.
- std (float or list of float) – Standard deviation values for image normalization.
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- mirror (int or int tensor, optional, default = 0) – Mask for horizontal flip. - 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
 
- output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.
- output_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NCHW) – Output tensor data layout
- pad_output (bool, optional, default = False) – Whether to pad the output to number of channels being multiple of 4.
 
- 
class nvidia.dali.ops.DummyOp(**kwargs)
- Dummy operator for testing - Parameters: - num_outputs (int, optional, default = 2) – Number of outputs. 
- 
class nvidia.dali.ops.DumpImage(**kwargs)
- Save images in batch to disk in PPM format. Useful for debugging. - Parameters: - input_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NHWC) – Layout of input images.
- suffix (str, optional, default = '') – Suffix to be added to output file names.
 
- 
class nvidia.dali.ops.ExternalSource(**kwargs)
- Allows externally provided data to be passed as an input to the pipeline; see nvidia.dali.pipeline.Pipeline.feed_input() and nvidia.dali.pipeline.Pipeline.iter_setup(). Currently, this operator is not supported in TensorFlow.
 
- 
class nvidia.dali.ops.FastResizeCropMirror(**kwargs)
- Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping. Backprojects the desired crop through the resize operation to reduce the amount of work performed. - Parameters: - crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- mirror (int or int tensor, optional, default = 0) – Mask for horizontal flip. - 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
 
- resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
- resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
- resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
 
- 
class nvidia.dali.ops.FileReader(**kwargs)
- Read (Image, label) pairs from a directory - Parameters: - file_root (str) – Path to a directory containing data files.
- file_list (str, optional, default = '') – Path to a file containing a list of "file label" pairs (leave empty to traverse the file_root directory to obtain files and labels).
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
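The file_list parameter points to a text file of file/label pairs. The sketch below parses one plausible layout, with one whitespace-separated pair per line and an integer label. That layout, the sample paths, and the helper parse_file_list are assumptions for illustration; check the format your DALI version expects.

```python
def parse_file_list(text):
    """Parse lines of the form '<relative path> <integer label>'."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so paths may contain spaces.
        path, label = line.rsplit(None, 1)
        pairs.append((path, int(label)))
    return pairs


# Hypothetical listing; paths would be relative to file_root.
listing = "cats/001.jpg 0\ndogs/017.jpg 1\n"
pairs = parse_file_list(listing)
```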
 
- 
class nvidia.dali.ops.Flip(**kwargs)
- Flip the image on the horizontal and/or vertical axes. - Parameters: - fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- horizontal (bool or bool tensor, optional, default = True) – Perform a horizontal flip. Default value is True.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) – Whether to apply this augmentation to the input image. - 0 - do not apply this transformation
- 1 - apply this transformation
 
- vertical (bool or bool tensor, optional, default = False) – Perform a vertical flip. Default value is False.
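The effect of the horizontal and vertical flags can be shown on a tiny image stored as a list of rows. This is a plain-Python sketch with a hypothetical helper flip; it only mirrors the default values listed above and is not the DALI kernel.

```python
def flip(image, horizontal=True, vertical=False):
    """Flip a row-major image (list of rows) along the requested axes."""
    rows = image[::-1] if vertical else image          # reverse row order
    return [row[::-1] if horizontal else list(row)     # reverse each row
            for row in rows]


image = [[1, 2, 3],
         [4, 5, 6]]
h_flipped = flip(image)                                  # defaults: horizontal only
v_flipped = flip(image, horizontal=False, vertical=True)
both = flip(image, horizontal=True, vertical=True)
```

Flipping on both axes is equivalent to a 180-degree rotation of the image.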
 
- 
class nvidia.dali.ops.HostDecoder(**kwargs)
- Decode images on the host using OpenCV. When applicable, it will pass execution to faster, format-specific decoders (like libjpeg-turbo). Output of the decoder is in HWC ordering. - Parameters: - output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image. 
- 
class nvidia.dali.ops.Hue(**kwargs)
- Changes the hue level of the image. - Parameters: - hue (float or float tensor, optional, default = 0.0) – Hue change, in degrees.
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
 
- 
class nvidia.dali.ops.Jitter(**kwargs)
- Perform a random jitter augmentation. The output image is produced by moving each pixel by a random amount bounded by half of the nDegree parameter (in both x and y dimensions). - Parameters: - fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) – Whether to apply this augmentation to the input image. - 0 - do not apply this transformation
- 1 - apply this transformation
 
- nDegree (int, optional, default = 2) – Each pixel is moved by a random amount in range [-nDegree/2, nDegree/2].
 
- 
class nvidia.dali.ops.MXNetReader(**kwargs)
- Read sample data from an MXNet RecordIO file. - Parameters: - index_path (str or list of str) – List (of length 1) containing a path to the index (.idx) file. It is generated by MXNet’s im2rec.py script together with the RecordIO file. It can also be generated using the rec2idx script distributed with DALI.
- path (str or list of str) – List of paths to RecordIO files.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
 
- 
class nvidia.dali.ops.NormalizePermute(**kwargs)
- Perform fused normalization, format conversion from NHWC to NCHW and type casting. Normalization takes the input image and produces the output using the formula output = (input - mean) / std. - Parameters: - height (int) – Height of the input image.
- mean (float or list of float) – Mean pixel values for image normalization.
- std (float or list of float) – Standard deviation values for image normalization.
- width (int) – Width of the input image.
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.
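The formula output = (input - mean) / std and the HWC to CHW permutation can be traced on a tiny image with nested lists. This is a sketch only: normalize_permute is a hypothetical helper, it handles a single image rather than a batch, and it assumes one mean and one std value per channel.

```python
def normalize_permute(image_hwc, mean, std):
    """Normalize each channel and reorder from HWC to CHW.

    image_hwc[y][x][c] is the input pixel; result[c][y][x] is the output.
    """
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [
            [(image_hwc[y][x][c] - mean[c]) / std[c] for x in range(width)]
            for y in range(height)
        ]
        for c in range(channels)
    ]


# One row, two pixels, three channels (H=1, W=2, C=3).
image = [[[10, 20, 30], [40, 50, 60]]]
out = normalize_permute(image, mean=[10, 20, 30], std=[10, 10, 10])
```

The output has shape (C, H, W) = (3, 1, 2), so each channel becomes its own plane.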
 
- 
class nvidia.dali.ops.RandomResizedCrop(**kwargs)
- Perform a crop with randomly chosen area and aspect ratio, then resize it to given size. - Parameters: - size (int or list of int) – Size of resized image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- num_attempts (int, optional, default = 10) – Maximum number of attempts used to choose random area and aspect ratio.
- random_area (float or list of float, optional, default = [0.08, 1.0]) – Range from which to choose random area factor A. Before resizing, the cropped image’s area will be equal to A * original image’s area.
- random_aspect_ratio (float or list of float, optional, default = [0.75, 1.333333]) – Range from which to choose random aspect ratio.
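The interplay of random_area, random_aspect_ratio, and num_attempts can be sketched as a rejection loop: draw an area factor and an aspect ratio, derive a crop size, and retry if it does not fit. The helper sample_crop, the uniform sampling of the aspect ratio, its width/height interpretation, and the whole-image fallback are assumptions for illustration rather than the operator's documented behaviour.

```python
import math
import random


def sample_crop(image_h, image_w, random_area=(0.08, 1.0),
                random_aspect_ratio=(0.75, 1.333333),
                num_attempts=10, seed=None):
    """Return (x0, y0, crop_w, crop_h) for a randomly sized crop."""
    rng = random.Random(seed)
    for _ in range(num_attempts):
        area = rng.uniform(*random_area) * image_h * image_w
        ratio = rng.uniform(*random_aspect_ratio)  # assumed width / height
        crop_w = int(round(math.sqrt(area * ratio)))
        crop_h = int(round(math.sqrt(area / ratio)))
        if 0 < crop_w <= image_w and 0 < crop_h <= image_h:
            x0 = rng.randint(0, image_w - crop_w)
            y0 = rng.randint(0, image_h - crop_h)
            return x0, y0, crop_w, crop_h
    # Assumed fallback when no attempt fits: use the whole image.
    return 0, 0, image_w, image_h


windows = [sample_crop(480, 640, seed=s) for s in range(50)]
```

The selected window would then be resized to the requested size.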
 
- 
class nvidia.dali.ops.Resize(**kwargs)
- Resize images. - Parameters: - image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
- resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
- resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
- save_attrs (bool, optional, default = False) – Save reshape attributes for testing.
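The rules for resize_shorter, resize_x, and resize_y described above can be summarized as a small function that computes the output shape. This is a sketch under assumptions: resized_shape is a hypothetical helper, the rounding rule is a guess, and returning the input shape when nothing is specified is an assumption.

```python
def resized_shape(in_h, in_w, resize_x=0.0, resize_y=0.0, resize_shorter=0.0):
    """Return the (height, width) of the resized image."""
    if resize_shorter and (resize_x or resize_y):
        raise ValueError("resize_shorter excludes resize_x and resize_y")
    if resize_shorter:
        # Scale so the shorter side matches, keeping the aspect ratio.
        scale = resize_shorter / min(in_h, in_w)
        return int(round(in_h * scale)), int(round(in_w * scale))
    if resize_x and resize_y:
        return int(round(resize_y)), int(round(resize_x))
    if resize_x:
        # resize_y left at 0: keep the aspect ratio.
        return int(round(in_h * resize_x / in_w)), int(round(resize_x))
    if resize_y:
        # resize_x left at 0: keep the aspect ratio.
        return int(round(resize_y)), int(round(in_w * resize_y / in_h))
    return in_h, in_w  # assumption: nothing requested, shape unchanged


by_shorter = resized_shape(480, 640, resize_shorter=240)
by_x = resized_shape(480, 640, resize_x=320)
by_both = resized_shape(480, 640, resize_x=100, resize_y=50)
```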
 
- 
class nvidia.dali.ops.ResizeCropMirror(**kwargs)
- Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping. - Parameters: - crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c)
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- mirror (int or int tensor, optional, default = 0) – Mask for horizontal flip. - 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
 
- resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
- resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
- resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
 
- 
class nvidia.dali.ops.Rotate(**kwargs)
- Rotate the image. - Parameters: - angle (float or float tensor) – Rotation angle.
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) – Whether to apply this augmentation to the input image. - 0 - do not apply this transformation
- 1 - apply this transformation
 
 
- 
class nvidia.dali.ops.Saturation(**kwargs)
- Changes saturation level of the image. - Parameters: - image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
- saturation (float or float tensor, optional, default = 1.0) – Saturation change factor. Values >= 0 are supported. For example: - 0 - completely desaturated image
- 1 - no change to image’s saturation
 
 
- 
class nvidia.dali.ops.Sphere(**kwargs)
- Perform a sphere augmentation. - Parameters: - fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) – Whether to apply this augmentation to the input image. - 0 - do not apply this transformation
- 1 - apply this transformation
 
 
- 
class nvidia.dali.ops.TFRecordReader(path, index_path, features, **kwargs)
- Read sample data from a TensorFlow TFRecord file. - Parameters: - features (dict of (string, nvidia.dali.tfrecord.Feature)) – Dictionary of names and configurations of the features present in the TFRecord file. Typically obtained using the helper functions dali.tfrecord.FixedLenFeature and dali.tfrecord.VarLenFeature, which are equivalent to TensorFlow’s tf.FixedLenFeature and tf.VarLenFeature, respectively.
- index_path (str or list of str) – List of paths to index files (one index file for every TFRecord file). Index files may be obtained from TFRecord files using the tfrecord2idx script distributed with DALI.
- path (str or list of str) – List of paths to TFRecord files.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
 
- 
class nvidia.dali.ops.Uniform(**kwargs)
- Produce tensor filled with uniformly distributed random numbers. - Parameters: - range (float or list of float, optional, default = [-1.0, 1.0]) – Range of produced random numbers. 
- 
class nvidia.dali.ops.WarpAffine(**kwargs)
- Apply an affine transformation to the image. - Parameters: - matrix (float or list of float) – Matrix of the transform (dst -> src). Given a list of values (M11, M12, M13, M21, M22, M23), this operation produces a new image using the formula dst(x,y) = src(M11 * x + M12 * y + M13, M21 * x + M22 * y + M23). It is equivalent to OpenCV’s warpAffine operation with the WARP_INVERSE_MAP flag set.
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) – Whether to apply this augmentation to the input image. - 0 - do not apply this transformation
- 1 - apply this transformation
 
- use_image_center (bool, optional, default = False) – Whether to use image center as the center of transformation. When this is True coordinates are calculated from the center of the image.
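The dst -> src formula for the matrix parameter means that every output pixel looks up its source location. The sketch below applies that formula with nearest-neighbour sampling on a single-channel image stored as a list of rows. The helper warp_affine, the same-size output, and the rounding rule are assumptions for illustration; use_image_center is not modelled.

```python
def warp_affine(src, matrix, fill_value=0.0):
    """Compute dst(x, y) = src(M11*x + M12*y + M13, M21*x + M22*y + M23)."""
    m11, m12, m13, m21, m22, m23 = matrix
    height, width = len(src), len(src[0])
    dst = []
    for y in range(height):
        row = []
        for x in range(width):
            # Map the destination pixel back to source coordinates.
            sx = int(round(m11 * x + m12 * y + m13))
            sy = int(round(m21 * x + m22 * y + m23))
            if 0 <= sx < width and 0 <= sy < height:
                row.append(src[sy][sx])
            else:
                row.append(fill_value)  # source falls outside the image
        dst.append(row)
    return dst


src = [[1, 2, 3],
       [4, 5, 6]]
identity = warp_affine(src, (1, 0, 0, 0, 1, 0))
shifted = warp_affine(src, (1, 0, 1, 0, 1, 0))  # dst(x, y) = src(x + 1, y)
```

Because the matrix maps destination to source, a positive M13 moves the image content to the left, not to the right.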
 
- 
class nvidia.dali.ops.Water(**kwargs)
- Perform a water augmentation (make image appear to be underwater). - Parameters: - ampl_x (float, optional, default = 10.0) – Amplitude of the wave in x direction.
- ampl_y (float, optional, default = 10.0) – Amplitude of the wave in y direction.
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- freq_x (float, optional, default = 0.049087) – Frequency of the wave in x direction.
- freq_y (float, optional, default = 0.049087) – Frequency of the wave in y direction.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) – Whether to apply this augmentation to the input image. - 0 - do not apply this transformation
- 1 - apply this transformation
 
- phase_x (float, optional, default = 0.0) – Phase of the wave in x direction.
- phase_y (float, optional, default = 0.0) – Phase of the wave in y direction.
 
- 
class nvidia.dali.ops.nvJPEGDecoder(**kwargs)
- Decode JPEG images using the nvJPEG library. Output of the decoder is on the GPU and uses HWC ordering. - Parameters: - output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image.
- use_batched_decode (bool, optional, default = False) – Use nvJPEG’s batched decoding API.