Normalization#

Normalizes an input tensor on a set of axes. The general normalization equation is:

\(Y = \frac{X - Mean(X, axes)}{\sqrt{Var(X, axes) + epsilon}} * S + B\)

Where:

X is the input tensor

Y is the output tensor

Mean(X, axes) is the mean of the input across the set of provided axes

Var(X, axes) is the variance of the input across the set of provided axes

S is the scale tensor

B is the bias tensor
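As a quick check of the equation, here is a minimal NumPy sketch (the shapes and the reduction axis are chosen purely for illustration; this is not the TensorRT implementation):

```python
import numpy as np

np.random.seed(0)  # seeded for reproducibility
x = np.random.rand(2, 3, 4).astype(np.float32)  # X
s = np.ones((1, 1, 4), dtype=np.float32)        # scale S
b = np.zeros((1, 1, 4), dtype=np.float32)       # bias B
epsilon = 1e-5
axes = (2,)  # normalize over the last axis

mean = np.mean(x, axis=axes, keepdims=True)  # Mean(X, axes)
var = np.var(x, axis=axes, keepdims=True)    # Var(X, axes)
y = (x - mean) / np.sqrt(var + epsilon) * s + b

# With S = 1 and B = 0, each normalized slice has roughly
# zero mean and unit variance.
```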

Attributes#

epsilon The epsilon value used in normalization to avoid division by 0.

axes The axes to perform normalization on.

num_groups The number of groups for the normalization. If num_groups != 1, the input channels are split into num_groups groups before normalization is performed.

compute_precision The precision in which the normalization computation will be performed.
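The axes attribute is a bitmask in which bit i selects axis i, as the Examples section below shows (`axes = 1 << 2 | 1 << 3` selects axes 2 and 3). A small helper (a hypothetical name, for illustration only) builds the mask from a list of axis indices:

```python
def axes_to_mask(axes):
    """Build an axes bitmask, where bit i selects axis i."""
    mask = 0
    for axis in axes:
        mask |= 1 << axis
    return mask

# Normalizing over the spatial axes (2, 3) of an NCHW tensor:
mask = axes_to_mask([2, 3])  # 0b1100 == 12
```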

Inputs#

input: tensor of type T1

scale: tensor of type T1

bias: tensor of type T1

Outputs#

output: tensor of type T1

Data Types#

T1: float32, float16, bfloat16

Shape Information#

The expected input shapes differ depending on the type of normalization operation.

InstanceNormalization#

Input shape should be of \([N, C, H, W, ...]\) for image input, or \([N, C, D1, ...]\) for non-image input.

Scale and Bias shape should be of \([1, C, 1, 1, ..., 1]\), where C is the number of channels of the input tensor.

Output shape should be of \([N, C, H, W, ...]\) or \([N, C, D1, ...]\).
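These shapes can be illustrated with a NumPy sketch of instance normalization, where statistics are computed over the spatial axes independently per sample and per channel (the concrete dimensions are arbitrary, and this is not the TensorRT kernel):

```python
import numpy as np

np.random.seed(0)
N, C, H, W = 2, 3, 4, 4
x = np.random.rand(N, C, H, W).astype(np.float32)
scale = np.ones((1, C, 1, 1), dtype=np.float32)  # [1, C, 1, 1]
bias = np.zeros((1, C, 1, 1), dtype=np.float32)  # [1, C, 1, 1]

# Reduce over the spatial axes only, so each (n, c) slice is
# normalized with its own statistics.
mean = x.mean(axis=(2, 3), keepdims=True)
var = x.var(axis=(2, 3), keepdims=True)
y = scale * (x - mean) / np.sqrt(var + 1e-5) + bias
```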

GroupNormalization#

Input shape should be of \([N, C, H, W, ...]\) for image input, or \([N, C, D1, ...]\) for non-image input.

Scale and Bias shape should be of \([1, G, 1, 1, ..., 1]\), where G is the number of groups.

Output shape should be of \([N, C, H, W, ...]\) or \([N, C, D1, ...]\).
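A NumPy sketch of how group normalization is commonly computed (the channel split and the per-group broadcast of scale and bias are assumptions for illustration, not the exact TensorRT implementation):

```python
import numpy as np

np.random.seed(0)
N, C, G, H, W = 2, 6, 3, 4, 4  # C must be divisible by G
x = np.random.rand(N, C, H, W).astype(np.float32)
scale = np.ones((1, G, 1, 1), dtype=np.float32)  # [1, G, 1, 1]
bias = np.zeros((1, G, 1, 1), dtype=np.float32)  # [1, G, 1, 1]

# Split the channels into G groups and normalize each group with
# its own statistics.
xg = x.reshape(N, G, C // G, H, W)
mean = xg.mean(axis=(2, 3, 4), keepdims=True)
var = xg.var(axis=(2, 3, 4), keepdims=True)
yg = (xg - mean) / np.sqrt(var + 1e-5)

# The per-group scale/bias broadcast across the channels of each group.
y = (yg * scale.reshape(1, G, 1, 1, 1)
     + bias.reshape(1, G, 1, 1, 1)).reshape(N, C, H, W)
```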

LayerNormalization#

Input shape should be of \([D0, D1, D2, D3, ..., DN]\).

Scale and Bias shape depend on the provided axes.

For example, for axes \([2, 3, ... N]\), Scale and Bias shape should be of \([1, 1, D2, D3, ..., DN]\).

For axes \([3, ... N]\), Scale and Bias shape should be of \([1, 1, 1, D3, ..., DN]\).

Output shape should be of \([D0, D1, D2, D3, ..., DN]\).
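For instance, layer normalization over axes \([2, 3]\) of a 4-D tensor can be sketched in NumPy as follows (the dimensions are illustrative):

```python
import numpy as np

np.random.seed(0)
D0, D1, D2, D3 = 2, 3, 4, 5
x = np.random.rand(D0, D1, D2, D3).astype(np.float32)

# For axes [2, 3], scale and bias have shape [1, 1, D2, D3].
scale = np.ones((1, 1, D2, D3), dtype=np.float32)
bias = np.zeros((1, 1, D2, D3), dtype=np.float32)

mean = x.mean(axis=(2, 3), keepdims=True)
var = x.var(axis=(2, 3), keepdims=True)
y = scale * (x - mean) / np.sqrt(var + 1e-5) + bias
```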

Examples#

import numpy as np

def convert_axes_to_tuple(axes, num_dims):
    converted_axes = []
    for i in range(num_dims):
        ref = 1 << i
        if axes & ref == ref:
            converted_axes.append(i)
    return tuple(converted_axes)

def normalization_reference(x, s, bias, axes, epsilon=1e-5):  # type: ignore
    num_dims = len(x.shape)
    converted_axes = convert_axes_to_tuple(axes, num_dims)
    mean = np.mean(x, axis=converted_axes, keepdims=True)
    var = np.var(x, axis=converted_axes, keepdims=True)
    return s * (x - mean) / np.sqrt(var + epsilon) + bias

input = network.add_input("input1", dtype=trt.float32, shape=(2, 3, 2, 2))
scale = network.add_input("scale", dtype=trt.float32, shape=(1, 3, 1, 1))
bias = network.add_input("bias", dtype=trt.float32, shape=(1, 3, 1, 1))

axes = 1 << 2 | 1 << 3
layer = network.add_normalization(input, scale, bias, axes)
network.mark_output(layer.get_output(0))

input_vals = np.random.rand(2, 3, 2, 2)
scale_vals = np.array([1.0, 2.0, 3.0]).reshape((1, 3, 1, 1))
bias_vals = np.array([-3.0, -2.0, -1.0]).reshape((1, 3, 1, 1))

inputs[input.name] = input_vals
inputs[scale.name] = scale_vals
inputs[bias.name] = bias_vals

outputs[layer.get_output(0).name] = layer.get_output(0).shape

ref = normalization_reference(input_vals, scale_vals, bias_vals, axes)

expected[layer.get_output(0).name] = ref

C++ API#

For more information about the C++ INormalization operator, refer to the C++ INormalization documentation.

Python API#

For more information about the Python INormalization operator, refer to the Python INormalization documentation.