nvidia.dali.fn.normalize

nvidia.dali.fn.normalize(__input, /, *, axes=[], axis_names='', batch=False, bytes_per_sample_hint=[0], ddof=0, dtype=DALIDataType.FLOAT, epsilon=0.0, mean=None, preserve=False, scale=1.0, seed=-1, shift=0.0, stddev=None, device=None, name=None)

Normalizes the input by removing the mean and dividing by the standard deviation.

The mean and standard deviation can be calculated internally for the specified subset of axes or can be externally provided as the mean and stddev arguments.

The normalization is done following the formula:

out = scale * (in - mean) / stddev + shift

The formula assumes that out and in are equally shaped tensors, but mean and stddev might be either tensors of same shape, scalars, or a mix of these.

Note

The expression follows the numpy broadcasting rules.

Sizes of the non-scalar mean and stddev must have an extent of 1, if given axis is reduced, or match the corresponding extent of the input. A dimension is considered reduced if it is listed in axes or axis_names. If neither the axes nor the axis_names argument is present, the set of reduced axes is inferred by comparing the input shape to the shape of the mean/stddev arguments, but the set of reduced axes must be the same for all tensors in the batch.

Here are some examples of valid argument combinations:

  1. Per-sample normalization of dimensions 0 and 2:

    axes = 0,2                                        # optional
    input.shape = [ [480, 640, 3], [1080, 1920, 4] ]
    batch = False
    mean.shape =  [ [1, 640, 1], [1, 1920, 1] ]
    stddev = (not supplied)
    

With these shapes, batch normalization is not possible, because the non-reduced dimension has a different extent across samples.

  1. Batch normalization of dimensions 0 and 1:

    axes = 0,1                                        # optional
    input.shape = [ [480, 640, 3], [1080, 1920, 3] ]
    batch = True
    mean = (scalar)
    stddev.shape =  [ [1, 1, 3] ] ]
    

For color images, this example normalizes the 3 color channels separately, but across all samples in the batch.

This operator allows sequence inputs and supports volumetric data.

Supported backends
  • ‘cpu’

  • ‘gpu’

Parameters:

__input (TensorList) – Input to the operator.

Keyword Arguments:
  • axes (int or list of int, optional, default = []) –

    Indices of dimensions along which the input is normalized.

    By default, all axes are used, and the axes can also be specified by name. See axis_names for more information.

  • axis_names (layout str, optional, default = ‘’) –

    Names of the axes in the input.

    Axis indices are taken from the input layout, and this argument cannot be used with axes.

  • batch (bool, optional, default = False) –

    If set to True, the mean and standard deviation are calculated across tensors in the batch.

    This argument also requires that the input sample shapes in the non-reduced axes match.

  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • ddof (int, optional, default = 0) –

    Delta Degrees of Freedom for Bessel’s correction.

    The variance is estimated by using the following formula:

    sum(Xi - mean)**2 / (N - ddof).
    

    This argument is ignored when an externally supplied standard deviation is used.

  • dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) –

    Output data type.

    When using integral types, use shift and scale to improve the usage of the output type’s dynamic range. If dtype is an integral type, out of range values are clamped, and non-integer values are rounded to nearest integer.

  • epsilon (float, optional, default = 0.0) – A value that is added to the variance to avoid division by small numbers.

  • mean (float or TensorList of float, optional) –

    Mean value to be subtracted from the data.

    The value can be a scalar or a batch of tensors with the same dimensionality as the input. The extent in each dimension must match the value of the input or be equal to 1. If the extent is 1, the value will be broadcast in this dimension. If the value is not specified, the mean is calculated from the input. A non-scalar mean cannot be used when batch argument is set to True.

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • scale (float, optional, default = 1.0) –

    The scaling factor applied to the output.

    This argument is useful for integral output types.

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • shift (float, optional, default = 0.0) –

    The value to which the mean will map in the output.

    This argument is useful for unsigned output types.

  • stddev (float or TensorList of float, optional) –

    Standard deviation value to scale the data.

    See mean argument for more information about shape constraints. If a value is not specified, the standard deviation is calculated from the input. A non-scalar stddev cannot be used when batch argument is set to True.