TrivialAugment

TrivialAugment, as described in https://arxiv.org/abs/2103.10158, is an automatic augmentation scheme that is parameter-free: it can be used without searching for optimal meta-parameters.

Each sample is processed with just one randomly selected augmentation. The magnitude bin for every augmentation is randomly selected.
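
The sampling rule above can be sketched in plain Python. This is only an illustration of the idea, not DALI's implementation: the helper trivial_augment_sample and the toy operations brighten, darken and identity are hypothetical names, and the sketch assumes both the operation and the magnitude bin are drawn uniformly.

```python
import random


def brighten(pixels, magnitude_bin, num_bins):
    # Toy operation: add up to 100 to every value, scaled by the bin index.
    delta = round(100 * magnitude_bin / (num_bins - 1))
    return [min(255, p + delta) for p in pixels]


def darken(pixels, magnitude_bin, num_bins):
    # Toy operation: subtract up to 100 from every value, scaled by the bin index.
    delta = round(100 * magnitude_bin / (num_bins - 1))
    return [max(0, p - delta) for p in pixels]


def identity(pixels, magnitude_bin, num_bins):
    # Toy operation that ignores the magnitude.
    return list(pixels)


def trivial_augment_sample(sample, augmentations, num_magnitude_bins=31, rng=None):
    """Apply exactly one randomly chosen operation at one randomly chosen bin."""
    rng = rng or random.Random()
    op = rng.choice(augmentations)
    # The bin index falls in [0, num_magnitude_bins - 1].
    magnitude_bin = rng.randrange(num_magnitude_bins)
    return op(sample, magnitude_bin, num_magnitude_bins), op.__name__, magnitude_bin


if __name__ == "__main__":
    rng = random.Random(0)
    suite = [brighten, darken, identity]
    for _ in range(3):
        print(trivial_augment_sample([10, 128, 250], suite, rng=rng))
```

Every call draws a fresh operation and bin, so different samples in a batch generally receive different transformations.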

To use TrivialAugment, import and call trivial_augment_wide() inside the pipeline definition, for example:

from nvidia.dali import pipeline_def, fn, types
from nvidia.dali.auto_aug import trivial_augment

@pipeline_def(enable_conditionals=True)
def training_pipe(data_dir, image_size):

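    # Read the encoded JPEG files and their labels.
    jpegs, labels = fn.readers.file(file_root=data_dir, ...)
    # Read the image shapes from the encoded data, without decoding it.
    shapes = fn.peek_image_shape(jpegs)
    images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)

    # With the shapes provided, the translation is relative to the image size.
    augmented_images = trivial_augment.trivial_augment_wide(images, shape=shapes)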

    resized_images = fn.resize(augmented_images, size=[image_size, image_size])

    return resized_images, labels

trivial_augment_wide() uses a standard set of augmentations, as described in the paper. To use a custom version of TrivialAugment, see the TrivialAugment API section.

Warning

You need to define the pipeline with the @pipeline_def decorator and set enable_conditionals to True to use automatic augmentations.

TrivialAugment API

The standard set of augmentations (TrivialAugment Wide) can be used by invoking trivial_augment_wide() inside the pipeline definition.

A TrivialAugment policy is a list of augmentations. To obtain the list for TrivialAugment Wide, use get_trivial_augment_wide_suite().

To use a custom list of augmentations, pass it as the first argument to apply_trivial_augment(), invoked inside the pipeline definition.

nvidia.dali.auto_aug.trivial_augment.trivial_augment_wide(data, num_magnitude_bins=31, shape=None, fill_value=128, interp_type=None, max_translate_abs=None, max_translate_rel=None, seed=None, excluded=None)

Applies TrivialAugment Wide (https://arxiv.org/abs/2103.10158) augmentation scheme to the provided batch of samples.

Parameters:
  • data (DataNode) – A batch of samples to be processed. The supported samples are images of HWC layout and videos of FHWC layout, the supported data type is uint8.

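  • num_magnitude_bins (int, optional) – The number of bins to divide the magnitude ranges into.

  • shape (DataNode, optional) – A batch of shapes of the samples. If specified, the magnitude of the translation operations depends on the image/frame shape and is bounded by max_translate_rel * shape. Otherwise, the magnitude is bounded by max_translate_abs for every sample.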

  • fill_value (int, optional) – A value to be used as padding for images/frames transformed with warp_affine ops (translation, shear and rotate). If None is specified, the images/frames are padded with the border value repeated (clamped).

  • interp_type (DALIInterpType, optional) – Interpolation method used by the warp_affine ops (translation, shear and rotate). Supported values are types.INTERP_LINEAR (default) and types.INTERP_NN.

  • max_translate_abs (int or (int, int), optional) – Only valid when the shape argument is not provided. Specifies the maximal shift (in pixels) in the translation augmentations. If a tuple is specified, the first component limits the height, the second the width. Defaults to 32, which means the maximal magnitude shifts the image by 32 pixels.

  • max_translate_rel (float or (float, float), optional) – Only valid when the shape argument is provided. Specifies the maximal shift as a fraction of the image shape in the translation augmentations. If a tuple is specified, the first component limits the height, the second the width. Defaults to 1, which means the maximal magnitude shifts the image entirely out of the canvas.

  • seed (int, optional) – Seed to be used to randomly sample operations (and to negate magnitudes).

  • excluded (List[str], optional) – A list of names of the operations to be excluded from the default suite of augmentations. If, instead of just limiting the set of operations, you need to include some custom operations or fine-tuned versions of the existing ones, you can use apply_trivial_augment() directly, which accepts a list of augmentations.

Returns:

A batch of transformed samples.

Return type:

DataNode
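
To make the roles of num_magnitude_bins and seed more concrete, the sketch below maps a bin index to a magnitude and optionally flips its sign. It is a plain-Python illustration under stated assumptions, not the library's code: the helper magnitude_for_bin and its mag_range and randomly_negate parameters are hypothetical, and the bins are assumed to be evenly spaced over the magnitude range.

```python
import random


def magnitude_for_bin(bin_idx, num_magnitude_bins, mag_range,
                      randomly_negate=False, rng=None):
    """Map a bin index to a magnitude, assuming evenly spaced bins."""
    if not 0 <= bin_idx < num_magnitude_bins:
        raise ValueError("bin_idx must be in [0, num_magnitude_bins - 1]")
    lo, hi = mag_range
    # With a single bin there is no spacing; the lower bound is used.
    step = (hi - lo) / (num_magnitude_bins - 1) if num_magnitude_bins > 1 else 0.0
    magnitude = lo + bin_idx * step
    if randomly_negate:
        rng = rng or random.Random()
        # Flip the sign with probability 0.5 (assumption for this sketch).
        if rng.random() < 0.5:
            magnitude = -magnitude
    return magnitude


if __name__ == "__main__":
    # Hypothetical rotation range of 0 to 30 degrees split into 31 bins.
    for bin_idx in (0, 15, 30):
        print(bin_idx, magnitude_for_bin(bin_idx, 31, (0, 30)))
```

In this sketch a fixed seed for the random generator makes the sign flips reproducible, which mirrors the purpose of the seed argument.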

nvidia.dali.auto_aug.trivial_augment.apply_trivial_augment(augmentations, data, num_magnitude_bins=31, seed=None, **kwargs)

Applies the list of augmentations in TrivialAugment (https://arxiv.org/abs/2103.10158) fashion. Each sample is processed with a transformation randomly selected from the augmentations list. The magnitude bin for every transformation is randomly selected from [0, num_magnitude_bins - 1].

Parameters:
  • augmentations (List[core._Augmentation]) – List of augmentations to be sampled and applied in TrivialAugment fashion.

  • data (DataNode) – A batch of samples to be processed.

  • num_magnitude_bins (int, optional) – The number of bins to divide the magnitude ranges into.

  • seed (int, optional) – Seed to be used to randomly sample operations (and to negate magnitudes).

  • kwargs – Any extra parameters to be passed when calling augmentations. The signature of each augmentation is checked for any extra arguments, and if the name of an argument matches one from the kwargs, the value is passed as an argument. For example, some augmentations from the default TrivialAugment suite accept shape, fill_value and interp_type.

Returns:

A batch of transformed samples.

Return type:

DataNode
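
The kwargs matching described above can be sketched with the standard inspect module. This is an assumption about the general mechanism rather than DALI's actual code: the helper call_with_matching_kwargs and the toy augmentations are hypothetical and return tuples instead of processing images.

```python
import inspect


def call_with_matching_kwargs(augmentation, data, magnitude, **kwargs):
    """Forward only those kwargs that the augmentation's signature names."""
    params = inspect.signature(augmentation).parameters
    accepted = {name: value for name, value in kwargs.items() if name in params}
    return augmentation(data, magnitude, **accepted)


# Toy augmentations with different optional arguments.
def toy_rotate(data, angle, fill_value=128, interp_type=None):
    return ("rotate", data, angle, fill_value, interp_type)


def toy_translate(data, offset, shape=None, fill_value=128):
    return ("translate", data, offset, shape, fill_value)


def toy_invert(data, _):
    return ("invert", data)


if __name__ == "__main__":
    extra = {"shape": (480, 640), "fill_value": 0, "interp_type": "linear"}
    for aug in (toy_rotate, toy_translate, toy_invert):
        # Each call receives only the extra arguments it declares.
        print(call_with_matching_kwargs(aug, "img", 1.0, **extra))
```

Because unmatched names are dropped silently in this sketch, a single set of extra arguments can serve a mixed list of augmentations.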

nvidia.dali.auto_aug.trivial_augment.get_trivial_augment_wide_suite(use_shape=False, max_translate_abs=None, max_translate_rel=None)

Creates a list of 14 augmentations referred to as the wide augmentation space in the TrivialAugment paper (https://arxiv.org/abs/2103.10158).

Parameters:
  • use_shape (bool) – If true, the translation offset is computed as a percentage of the image/frame shape. Useful if the samples processed with TrivialAugment have different shapes. If false, the offset range is bounded by a constant (max_translate_abs).

  • max_translate_abs (int or (int, int), optional) – Only valid with use_shape=False, specifies the maximal shift (in pixels) in the translation augmentations. If a tuple is specified, the first component limits height, the second the width. Defaults to 32.

  • max_translate_rel (float or (float, float), optional) – Only valid with use_shape=True, specifies the maximal shift as a fraction of image/frame shape in the translation augmentations. If a tuple is specified, the first component limits height, the second the width. Defaults to 1.
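
The interplay between the absolute and relative translation limits can be sketched as follows. This plain-Python helper is hypothetical (max_translation and shape_hw are not DALI names); it only restates the rules above, using the documented defaults of 32 pixels and a fraction of 1.

```python
def max_translation(shape_hw=None, max_translate_abs=None, max_translate_rel=None):
    """Return the maximal (vertical, horizontal) shift in pixels."""

    def as_pair(value):
        # A scalar limits both dimensions; a pair is (height, width).
        if isinstance(value, (tuple, list)):
            return tuple(value)
        return (value, value)

    if shape_hw is None:
        # No shape available: the bound is a constant number of pixels.
        limit = 32 if max_translate_abs is None else max_translate_abs
        abs_h, abs_w = as_pair(limit)
        return float(abs_h), float(abs_w)

    # Shape available: the bound is a fraction of the sample's height and width.
    limit = 1.0 if max_translate_rel is None else max_translate_rel
    rel_h, rel_w = as_pair(limit)
    height, width = shape_hw
    return rel_h * height, rel_w * width


if __name__ == "__main__":
    print(max_translation())
    print(max_translation(shape_hw=(480, 640)))
    print(max_translation(shape_hw=(480, 640), max_translate_rel=(0.5, 0.25)))
```

With the default relative limit of 1, the largest shift equals the full height or width, which corresponds to moving the image entirely out of the canvas.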