Supported operations
- class nvidia.dali.ops.Brightness(**kwargs)
Changes the brightness of an image.
Parameters:
- brightness (float or float tensor, optional, default = 1.0) –
Brightness change factor. Values >= 0 are accepted. For example:
- 0 - black image,
- 1 - no change
- 2 - double the brightness
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
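The factor values listed for Brightness (0 gives a black image, 1 leaves it unchanged, 2 doubles it) are consistent with a per-pixel multiplication. The following is a minimal pure-Python sketch of that reading, not DALI's implementation. It assumes 8-bit pixel values and clamping at 255, and `scale_brightness` is a hypothetical helper name.

```python
def scale_brightness(pixels, brightness=1.0):
    """Multiply every pixel value by the brightness factor.

    Assumption: values are 8-bit, so results are clamped to [0, 255].
    """
    if brightness < 0:
        raise ValueError("brightness must be >= 0")
    return [min(255, int(round(p * brightness))) for p in pixels]


row = [0, 64, 128, 200]
black = scale_brightness(row, 0.0)      # every value becomes 0
unchanged = scale_brightness(row, 1.0)  # same values as the input
doubled = scale_brightness(row, 2.0)    # doubled, then clamped at 255
```

The clamp is the only design choice here: without it, doubling bright pixels would exceed the 8-bit range.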
- class nvidia.dali.ops.Caffe2Reader(**kwargs)
Read sample data from a Caffe2 Lightning Memory-Mapped Database (LMDB).
Parameters:
- path (str) – Path to Caffe2 LMDB directory.
- additional_inputs (int, optional, default = 0) – Additional auxiliary data tensors provided for each sample.
- bbox (bool, optional, default = False) – Denotes if bounding-box information is present.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- label_type (int, optional, default = 0) –
Type of label stored in dataset.
- 0 = SINGLE_LABEL : single integer label for multi-class classification
- 1 = MULTI_LABEL_SPARSE : sparse active label indices for multi-label classification
- 2 = MULTI_LABEL_DENSE : dense label embedding vector for label embedding regression
- 3 = MULTI_LABEL_WEIGHTED_SPARSE : sparse active label indices with per-label weights for multi-label classification.
- num_labels (int, optional, default = 1) – Number of classes in dataset. Required when sparse labels are used.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
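The `num_shards` and `shard_id` parameters, shared by all readers on this page, split the dataset so that each GPU reads a different part. The sketch below shows one plausible partitioning scheme. It assumes contiguous, near-equal index ranges, which is an assumption about the behaviour, and `shard_range` is a hypothetical helper, not a DALI function.

```python
def shard_range(num_samples, num_shards=1, shard_id=0):
    """Return the [start, end) sample index range for one shard.

    Assumption: the dataset is split into contiguous, near-equal parts.
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    start = num_samples * shard_id // num_shards
    end = num_samples * (shard_id + 1) // num_shards
    return start, end


# Ten samples split across four readers, one per GPU.
ranges = [shard_range(10, num_shards=4, shard_id=i) for i in range(4)]
print(ranges)  # [(0, 2), (2, 5), (5, 7), (7, 10)]
```

Using integer arithmetic on the boundaries guarantees that the shards cover every sample exactly once, even when the dataset size is not divisible by `num_shards`.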
- class nvidia.dali.ops.CaffeReader(**kwargs)
Read (Image, label) pairs from a Caffe LMDB.
Parameters:
- path (str) – Path to Caffe LMDB directory.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
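The `initial_fill` parameter sets the size of the buffer used for shuffling. A common way to shuffle a stream with a bounded buffer is sketched below. This illustrates the general technique only. Whether the readers use exactly this scheme is an assumption, and `buffered_shuffle` is a hypothetical name.

```python
import random


def buffered_shuffle(samples, initial_fill=1024, seed=0):
    """Yield samples in pseudo-random order using a bounded buffer."""
    rng = random.Random(seed)
    buffer = []
    for sample in samples:
        if len(buffer) < initial_fill:
            # Fill the buffer before emitting anything.
            buffer.append(sample)
            continue
        # Emit a random buffered sample and put the new one in its slot.
        idx = rng.randrange(len(buffer))
        yield buffer[idx]
        buffer[idx] = sample
    # The input is exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(buffered_shuffle(range(100), initial_fill=8))
```

A larger buffer gives a more thorough shuffle at the cost of memory. With a buffer of size 1 the input order is preserved.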
- class nvidia.dali.ops.Cast(**kwargs)
Cast tensor to a different type.
Parameters:
- dtype (nvidia.dali.types.DALIDataType) – Output data type.
- class nvidia.dali.ops.CoinFlip(**kwargs)
Produce a tensor filled with 0s and 1s (the results of random coin flips), usable as an argument for select ops.
Parameters:
- probability (float, optional, default = 0.5) – Probability of returning 1.
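The output of CoinFlip is a per-sample 0/1 value, which is the form accepted by arguments documented on this page as "int or int tensor", such as `mirror` and `mask`. A minimal sketch of the sampling, with `coin_flip` and its `seed` argument as hypothetical names introduced for illustration:

```python
import random


def coin_flip(batch_size, probability=0.5, seed=None):
    """Return one 0/1 flag per sample; 1 is drawn with the given probability."""
    rng = random.Random(seed)
    return [1 if rng.random() < probability else 0 for _ in range(batch_size)]


# One flag per image in a batch of eight, e.g. to decide which images to flip.
mirror_flags = coin_flip(8, probability=0.5, seed=42)
```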
- class nvidia.dali.ops.Contrast(**kwargs)
Changes the color contrast of the image.
Parameters:
- contrast (float or float tensor, optional, default = 1.0) –
Contrast change factor. Values >= 0 are accepted. For example:
- 0 - gray image,
- 1 - no change
- 2 - double the contrast
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
- class nvidia.dali.ops.Copy(**kwargs)
Make a copy of the input tensor.
- class nvidia.dali.ops.CropMirrorNormalize(**kwargs)
Perform fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting. Normalization takes the input image and produces the output using the formula
output = (input - mean) / std
Parameters:
- crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c).
- mean (float or list of float) – Mean pixel values for image normalization.
- std (float or list of float) – Standard deviation values for image normalization.
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- mirror (int or int tensor, optional, default = 0) –
Mask for horizontal flip.
- 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
- output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.
- output_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NCHW) – Output tensor data layout
- pad_output (bool, optional, default = False) – Whether to pad the output to number of channels being multiple of 4.
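The `crop`, `crop_pos_x`, `crop_pos_y` and `mirror` parameters together select a window and optionally flip it. The sketch below shows one way these could interact on an image stored as a list of rows. It assumes that the crop position scales the free space (image size minus crop size), so that the default 0.5 gives a centered crop, and that a two-element `crop` is ordered (height, width). Both are assumptions, and `crop_and_mirror` is a hypothetical helper.

```python
def crop_and_mirror(image, crop, crop_pos_x=0.5, crop_pos_y=0.5, mirror=0):
    """Crop a window out of `image` (a list of rows), then optionally flip it.

    Assumptions: crop_pos_* scale the free space, so 0.0 is the top/left
    edge and 1.0 the bottom/right edge; a list `crop` is (height, width).
    """
    crop_h, crop_w = (crop, crop) if isinstance(crop, int) else crop
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop is larger than the image")
    top = int(crop_pos_y * (height - crop_h))
    left = int(crop_pos_x * (width - crop_w))
    rows = [row[left:left + crop_w] for row in image[top:top + crop_h]]
    if mirror:
        # Horizontal flip: reverse every row.
        rows = [row[::-1] for row in rows]
    return rows


# 4x4 image whose pixel value encodes its position (row * 4 + column).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
print(crop_and_mirror(image, 2))            # [[5, 6], [9, 10]]
print(crop_and_mirror(image, 2, mirror=1))  # [[6, 5], [10, 9]]
```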
- class nvidia.dali.ops.DummyOp(**kwargs)
Dummy operator for testing.
Parameters:
- num_outputs (int, optional, default = 2) – Number of outputs.
- class nvidia.dali.ops.DumpImage(**kwargs)
Save images in batch to disk in PPM format. Useful for debugging.
Parameters:
- input_layout (nvidia.dali.types.DALITensorLayout, optional, default = DALITensorLayout.NHWC) – Layout of input images.
- suffix (str, optional, default = '') – Suffix to be added to output file names.
- class nvidia.dali.ops.ExternalSource(**kwargs)
Allows externally provided data to be passed as an input to the pipeline.
- class nvidia.dali.ops.FastResizeCropMirror(**kwargs)
Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping. Backprojects the desired crop through the resize operation to reduce the amount of work performed.
Parameters:
- crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c).
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- mirror (int or int tensor, optional, default = 0) –
Mask for horizontal flip.
- 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
- resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
- resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
- resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
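The rules for `resize_shorter`, `resize_x` and `resize_y` (mutual exclusivity, and keeping the aspect ratio when only one dimension is given) can be summarised as a small shape calculation. This is a sketch under stated assumptions: sizes are rounded to the nearest integer, and leaving all three at 0 keeps the original size. Neither is confirmed by the text above, and `resized_shape` is a hypothetical helper.

```python
def resized_shape(height, width, resize_shorter=0.0, resize_x=0.0, resize_y=0.0):
    """Return the (height, width) of the resized image.

    Assumptions: results are rounded to the nearest integer, and leaving
    every option at 0 returns the input size unchanged.
    """
    if resize_shorter and (resize_x or resize_y):
        raise ValueError("resize_shorter is mutually exclusive with resize_x/resize_y")
    if resize_shorter:
        scale = resize_shorter / min(height, width)
        return round(height * scale), round(width * scale)
    if resize_x and resize_y:
        return round(resize_y), round(resize_x)
    if resize_x:
        # resize_y left at 0: keep the aspect ratio.
        return round(height * resize_x / width), round(resize_x)
    if resize_y:
        # resize_x left at 0: keep the aspect ratio.
        return round(resize_y), round(width * resize_y / height)
    return height, width


print(resized_shape(480, 640, resize_shorter=256))  # (256, 341)
print(resized_shape(480, 640, resize_x=320))        # (240, 320)
```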
- class nvidia.dali.ops.FileReader(**kwargs)
Read (Image, label) pairs from a directory.
Parameters:
- file_root (str) – Path to a directory containing data files.
- file_list (str, optional, default = '') – Path to the file with a list of "file label" pairs (leave empty to traverse the file_root directory to obtain files and labels).
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
- class nvidia.dali.ops.HostDecoder(**kwargs)
Decode images on the host using OpenCV. When applicable, it will pass execution to faster, format-specific decoders (like libjpeg-turbo). Output of the decoder is in HWC ordering.
Parameters:
- output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image.
- class nvidia.dali.ops.Hue(**kwargs)
Changes the hue level of the image.
Parameters:
- hue (float or float tensor, optional, default = 0.0) – Hue change, in degrees.
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
- class nvidia.dali.ops.Jitter(**kwargs)
Perform a random jitter augmentation. The output image is produced by moving each pixel by a random amount bounded by half of the nDegree parameter (in both x and y dimensions).
Parameters:
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
- nDegree (int, optional, default = 2) – Each pixel is moved by a random amount in range [-nDegree/2, nDegree/2].
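To make the role of `nDegree` and `fill_value` concrete, here is a minimal sketch of a jitter on a single-channel image stored as a list of rows. It assumes integer offsets drawn uniformly from [-nDegree/2, nDegree/2] using integer division, and it fetches each output pixel from a randomly shifted source position. Both are simplifying assumptions, and `jitter` with its `seed` argument is a hypothetical helper, not the DALI operator.

```python
import random


def jitter(image, nDegree=2, fill_value=0.0, seed=None):
    """Displace every pixel by a random offset of at most nDegree // 2."""
    rng = random.Random(seed)
    half = nDegree // 2
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Random source position within the allowed displacement.
            sy = y + rng.randint(-half, half)
            sx = x + rng.randint(-half, half)
            inside = 0 <= sy < height and 0 <= sx < width
            # Positions outside the image are padded with fill_value.
            row.append(image[sy][sx] if inside else fill_value)
        out.append(row)
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
jittered = jitter(image, nDegree=2, fill_value=0.0, seed=0)
```

With `nDegree=0` every offset is zero and the image is returned unchanged.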
- class nvidia.dali.ops.MXNetReader(**kwargs)
Read sample data from an MXNet RecordIO file.
Parameters:
- index_path (str or list of str) – List (of length 1) containing the path to the index (.idx) file. It is generated by MXNet’s im2rec.py script together with the RecordIO file. It can also be generated using the rec2idx script distributed with DALI.
- path (str or list of str) – List of paths to RecordIO files.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
- class nvidia.dali.ops.NormalizePermute(**kwargs)
Perform fused normalization, format conversion from NHWC to NCHW and type casting. Normalization takes the input image and produces the output using the formula
output = (input - mean) / std
Parameters:
- height (int) – Height of the input image.
- mean (float or list of float) – Mean pixel values for image normalization.
- std (float or list of float) – Standard deviation values for image normalization.
- width (int) – Width of the input image.
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- output_dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) – Output data type.
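The normalization formula `output = (input - mean) / std` and the layout conversion can be illustrated on a single image. The sketch below works on nested lists, converting one sample from HWC to CHW (the batch dimension N is left out for brevity). A scalar `mean` or `std` is broadcast to all channels, which is an assumption based on the "float or list of float" type, and `normalize_permute` is a hypothetical helper.

```python
def normalize_permute(image_hwc, mean, std):
    """Normalize an [H][W][C] image and return it as [C][H][W] floats."""
    channels = len(image_hwc[0][0])
    # Assumption: a single float applies to every channel.
    if isinstance(mean, (int, float)):
        mean = [mean] * channels
    if isinstance(std, (int, float)):
        std = [std] * channels
    return [
        [[(pixel[c] - mean[c]) / std[c] for pixel in row] for row in image_hwc]
        for c in range(channels)
    ]


# One row, two pixels, three channels (HWC).
image = [[[0, 128, 255], [255, 128, 0]]]
print(normalize_permute(image, mean=128, std=64))
# [[[-2.0, 1.984375]], [[0.0, 0.0]], [[1.984375, -2.0]]]
```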
- class nvidia.dali.ops.RandomResizedCrop(**kwargs)
Perform a crop with randomly chosen area and aspect ratio, then resize it to given size.
Parameters:
- size (int or list of int) – Size of resized image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- num_attempts (int, optional, default = 10) – Maximum number of attempts used to choose random area and aspect ratio.
- random_area (float or list of float, optional, default = [0.08, 1.0]) – Range from which to choose random area factor A. Before resizing, the cropped image’s area will be equal to A * original image’s area.
- random_aspect_ratio (float or list of float, optional, default = [0.75, 1.333333]) – Range from which to choose random aspect ratio.
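The interplay of `random_area`, `random_aspect_ratio` and `num_attempts` can be sketched as a rejection-sampling loop: draw an area factor and an aspect ratio, derive the crop size, and retry if the crop does not fit. The sketch makes several assumptions not stated above: both values are drawn uniformly, the aspect ratio is width / height, and the whole image is used when every attempt fails. `sample_crop` and its `seed` argument are hypothetical names.

```python
import math
import random


def sample_crop(height, width, random_area=(0.08, 1.0),
                random_aspect_ratio=(0.75, 1.333333), num_attempts=10, seed=None):
    """Return (top, left, crop_height, crop_width) of a random crop window."""
    rng = random.Random(seed)
    for _ in range(num_attempts):
        # Crop area = A * original area, with A drawn from random_area.
        area = rng.uniform(*random_area) * height * width
        ratio = rng.uniform(*random_aspect_ratio)  # assumed to be width / height
        crop_w = int(round(math.sqrt(area * ratio)))
        crop_h = int(round(math.sqrt(area / ratio)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # Assumed fallback when no attempt produced a valid window.
    return 0, 0, height, width


top, left, crop_h, crop_w = sample_crop(480, 640, seed=0)
```

The selected window would then be resized to `size` using the interpolation given by `interp_type`.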
- class nvidia.dali.ops.Resize(**kwargs)
Resize images.
Parameters:
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
- resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
- resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
- save_attrs (bool, optional, default = False) – Save reshape attributes for testing.
- class nvidia.dali.ops.ResizeCropMirror(**kwargs)
Perform a fused resize, crop, mirror operation. Handles both fixed and random resizing and cropping.
Parameters:
- crop (int or list of int) – Size of the cropped image. If only a single value c is provided, the resulting crop will be square with size (c,c).
- crop_pos_x (float or float tensor, optional, default = 0.5) – Horizontal position of the crop in image coordinates (0.0 - 1.0).
- crop_pos_y (float or float tensor, optional, default = 0.5) – Vertical position of the crop in image coordinates (0.0 - 1.0).
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.
- mirror (int or int tensor, optional, default = 0) –
Mask for horizontal flip.
- 0 - do not perform horizontal flip for this image
- 1 - perform horizontal flip for this image.
- resize_shorter (float or float tensor, optional, default = 0.0) – The length of the shorter dimension of the resized image. This option is mutually exclusive with resize_x and resize_y. The op will keep the aspect ratio of the original image.
- resize_x (float or float tensor, optional, default = 0.0) – The length of the X dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_y is left at 0, then the op will keep the aspect ratio of the original image.
- resize_y (float or float tensor, optional, default = 0.0) – The length of the Y dimension of the resized image. This option is mutually exclusive with resize_shorter. If the resize_x is left at 0, then the op will keep the aspect ratio of the original image.
- class nvidia.dali.ops.Rotate(**kwargs)
Rotate the image.
Parameters:
- angle (float or float tensor) – Rotation angle.
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
- class nvidia.dali.ops.Saturation(**kwargs)
Changes saturation level of the image.
Parameters:
- image_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image
- saturation (float or float tensor, optional, default = 1.0) –
Saturation change factor. Values >= 0 are supported. For example:
- 0 - completely desaturated image
- 1 - no change to image’s saturation
- class nvidia.dali.ops.Sphere(**kwargs)
Perform a sphere augmentation.
Parameters:
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
- class nvidia.dali.ops.TFRecordReader(path, index_path, features, **kwargs)
Read sample data from a TensorFlow TFRecord file.
Parameters:
- features (dict of (string, nvidia.dali.tfrecord.Feature)) – Dictionary of names and configuration of features existing in the TFRecord file. Typically obtained using the helper functions dali.tfrecord.FixedLenFeature and dali.tfrecord.VarLenFeature, which are equivalent to TensorFlow’s tf.FixedLenFeature and tf.VarLenFeature, respectively.
- index_path (str or list of str) – List of paths to index files (1 index file for every TFRecord file). Index files may be obtained from TFRecord files using tfrecord2idx script distributed with DALI.
- path (str or list of str) – List of paths to TFRecord files.
- initial_fill (int, optional, default = 1024) – Size of the buffer used for shuffling.
- num_shards (int, optional, default = 1) – Partition the data into this many parts (used for multiGPU training).
- random_shuffle (bool, optional, default = False) – Whether to randomly shuffle data.
- shard_id (int, optional, default = 0) – Id of the part to read.
- tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
- class nvidia.dali.ops.Uniform(**kwargs)
Produce tensor filled with uniformly distributed random numbers.
Parameters:
- range (float or list of float, optional, default = [-1.0, 1.0]) – Range of produced random numbers.
- class nvidia.dali.ops.WarpAffine(**kwargs)
Apply an affine transformation to the image.
Parameters:
- matrix (float or list of float) –
Matrix of the transform (dst -> src). Given a list of values (M11, M12, M13, M21, M22, M23), this operation will produce a new image using the formula
dst(x,y) = src(M11 * x + M12 * y + M13, M21 * x + M22 * y + M23)
It is equivalent to OpenCV’s warpAffine operation with the WARP_INVERSE_MAP flag set.
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
- use_image_center (bool, optional, default = False) – Whether to use image center as the center of transformation. When this is True, coordinates are calculated from the center of the image.
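The dst -> src formula for WarpAffine means that every output pixel looks up its value in the source image, rather than source pixels being pushed to new positions. A minimal sketch of that inverse mapping for a single-channel image follows. It assumes nearest-neighbor sampling (matching the INTERP_NN default), an output of the same size as the input, and `use_image_center` left at False. `warp_affine` is a hypothetical helper written for illustration.

```python
def warp_affine(src, matrix, fill_value=0.0):
    """Apply dst(x, y) = src(M11*x + M12*y + M13, M21*x + M22*y + M23)."""
    m11, m12, m13, m21, m22, m23 = matrix
    height, width = len(src), len(src[0])
    dst = []
    for y in range(height):
        row = []
        for x in range(width):
            # Map the destination pixel back into source coordinates.
            sx = int(round(m11 * x + m12 * y + m13))
            sy = int(round(m21 * x + m22 * y + m23))
            inside = 0 <= sx < width and 0 <= sy < height
            # Source positions outside the image are padded with fill_value.
            row.append(src[sy][sx] if inside else fill_value)
        dst.append(row)
    return dst


src = [[1, 2, 3], [4, 5, 6]]
print(warp_affine(src, (1, 0, 0, 0, 1, 0)))  # identity: [[1, 2, 3], [4, 5, 6]]
print(warp_affine(src, (1, 0, 1, 0, 1, 0)))  # [[2, 3, 0.0], [5, 6, 0.0]]
```

Because the matrix maps destination to source, a positive M13 shifts the image content to the left, as the second call shows.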
- class nvidia.dali.ops.Water(**kwargs)
Perform a water augmentation (make image appear to be underwater).
Parameters:
- ampl_x (float, optional, default = 10.0) – Amplitude of the wave in x direction.
- ampl_y (float, optional, default = 10.0) – Amplitude of the wave in y direction.
- fill_value (float, optional, default = 0.0) – Color value used for padding pixels.
- freq_x (float, optional, default = 0.049087) – Frequency of the wave in x direction.
- freq_y (float, optional, default = 0.049087) – Frequency of the wave in y direction.
- interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_NN) – Type of interpolation used.
- mask (int or int tensor, optional, default = 1) –
Whether to apply this augmentation to the input image.
- 0 - do not apply this transformation
- 1 - apply this transformation
- phase_x (float, optional, default = 0.0) – Phase of the wave in x direction.
- phase_y (float, optional, default = 0.0) – Phase of the wave in y direction.
- class nvidia.dali.ops.nvJPEGDecoder(**kwargs)
Decode JPEG images using the nvJPEG library. Output of the decoder is on the GPU and uses HWC ordering.
Parameters:
- output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of output image.
- use_batched_decode (bool, optional, default = False) – Use nvJPEG’s batched decoding API.