Normalization

Normalizes an input tensor across a set of axes. The general normalization equation is:

\(Y = \frac{X - \mathrm{Mean}(X, \mathrm{axes})}{\sqrt{\mathrm{Var}(X, \mathrm{axes}) + \epsilon}} \cdot S + B\)

Where:

X is the input tensor

Y is the output tensor

Mean(X, axes) is the mean of the input across the set of provided axes

Var(X, axes) is the variance of the input across the set of provided axes

S is the scale tensor

B is the bias tensor
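For a concrete feel for the equation, the short NumPy snippet below (illustrative only, not part of the operator's API) normalizes a one-dimensional tensor with \(S = 1\), \(B = 0\), and an epsilon of zero:

import numpy as np

x = np.array([1.0, 2.0, 3.0])
mean = x.mean()                       # 2.0
var = x.var()                         # 2/3
y = (x - mean) / np.sqrt(var + 0.0)   # scale S = 1, bias B = 0
print(y)                              # approx. [-1.2247, 0.0, 1.2247]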

Attributes

epsilon: The epsilon value used in the normalization computation to avoid division by zero.

axes: The set of axes over which normalization is performed; see the note below on how the axes are encoded in the Python API.

num_groups: The number of groups to split the channel dimension into. If num_groups != 1, the input channels are split into num_groups groups before normalization is performed.

compute_precision: The precision in which the normalization computation is performed.
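Note that in the Python example at the end of this page, axes is passed as a bitmask in which bit i selects axis i. A small helper (hypothetical, shown only to make the convention explicit) could build such a mask:

def axes_to_mask(axes):
    # Set bit i of the mask for every axis index i in `axes`.
    mask = 0
    for axis in axes:
        mask |= 1 << axis
    return mask

# Normalizing over axes 2 and 3 of an [N, C, H, W] tensor:
assert axes_to_mask([2, 3]) == (1 << 2) | (1 << 3)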

Inputs

input: tensor of type T1

scale: tensor of type T1

bias: tensor of type T1

Outputs

output: tensor of type T1

Data Types

T1: float32, float16, bfloat16

Shape Information

The expected input shapes differ depending on the type of normalization being performed.

InstanceNormalization

The input shape should be \([N, C, H, W, ...]\) for image input, or \([N, C, D_1, ...]\) for non-image input.

The scale and bias shapes should be \([1, C, 1, 1, ..., 1]\), where C is the number of channels of the input tensor. The Examples section at the end of this page builds exactly this configuration.

The output shape matches the input shape: \([N, C, H, W, ...]\) or \([N, C, D_1, ...]\).

GroupNormalization

The input shape should be \([N, C, H, W, ...]\) for image input, or \([N, C, D_1, ...]\) for non-image input.

The scale and bias shapes should be \([1, G, 1, 1, ..., 1]\), where G is the number of groups (the num_groups attribute); a sketch follows below.

The output shape matches the input shape: \([N, C, H, W, ...]\) or \([N, C, D_1, ...]\).
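A minimal sketch of a GroupNormalization setup, assuming a builder context like the one in the Examples section (network is an existing INetworkDefinition and trt is the tensorrt module); here 6 input channels are split into G = 2 groups, so scale and bias have shape \([1, 2, 1, 1]\):

# 6 channels split into 2 groups; scale and bias are per-group.
input = network.add_input("input1", dtype=trt.float32, shape=(2, 6, 2, 2))
scale = network.add_input("scale", dtype=trt.float32, shape=(1, 2, 1, 1))
bias = network.add_input("bias", dtype=trt.float32, shape=(1, 2, 1, 1))

axes = 1 << 2 | 1 << 3  # normalize over the spatial axes
layer = network.add_normalization(input, scale, bias, axes)
layer.num_groups = 2    # split the channel axis into 2 groups
network.mark_output(layer.get_output(0))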

LayerNormalization

The input shape should be \([D_0, D_1, D_2, D_3, ..., D_N]\).

The scale and bias shapes depend on the provided axes.

For example, for axes \([2, 3, ..., N]\), the scale and bias shapes should be \([1, 1, D_2, D_3, ..., D_N]\).

For axes \([3, ..., N]\), the scale and bias shapes should be \([1, 1, 1, D_3, ..., D_N]\).

The output shape matches the input shape: \([D_0, D_1, D_2, D_3, ..., D_N]\). A sketch follows below.
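A minimal LayerNormalization sketch under the same assumptions as the GroupNormalization sketch above, normalizing a \([2, 3, 2, 2]\) input over axes \([1, 2, 3]\), so scale and bias take shape \([1, 3, 2, 2]\):

# Normalize each sample over all of C, H, W (axes 1, 2, and 3).
input = network.add_input("input1", dtype=trt.float32, shape=(2, 3, 2, 2))
scale = network.add_input("scale", dtype=trt.float32, shape=(1, 3, 2, 2))
bias = network.add_input("bias", dtype=trt.float32, shape=(1, 3, 2, 2))

axes = 1 << 1 | 1 << 2 | 1 << 3
layer = network.add_normalization(input, scale, bias, axes)
network.mark_output(layer.get_output(0))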

Examples

import numpy as np
import tensorrt as trt

# `network` is assumed to be an existing INetworkDefinition; `inputs`,
# `outputs`, and `expected` are assumed to be dictionaries used by the
# surrounding harness to feed the network and check its results.
input = network.add_input("input1", dtype=trt.float32, shape=(2, 3, 2, 2))
scale = network.add_input("scale", dtype=trt.float32, shape=(1, 3, 1, 1))
bias = network.add_input("bias", dtype=trt.float32, shape=(1, 3, 1, 1))

# Normalize over axes 2 and 3 (the spatial dimensions). With scale and
# bias of shape (1, C, 1, 1), this is an InstanceNormalization.
axes = 1 << 2 | 1 << 3
layer = network.add_normalization(input, scale, bias, axes)
network.mark_output(layer.get_output(0))

# Random input plus per-channel scale and bias values.
input_vals = np.random.rand(2, 3, 2, 2)
scale_vals = np.array([1.0, 2.0, 3.0]).reshape((1, 3, 1, 1))
bias_vals = np.array([-3.0, -2.0, -1.0]).reshape((1, 3, 1, 1))

inputs[input.name] = input_vals
inputs[scale.name] = scale_vals
inputs[bias.name] = bias_vals

outputs[layer.get_output(0).name] = layer.get_output(0).shape

# Compare against a NumPy reference (see the sketch below).
ref = normalization_reference(input_vals, scale_vals, bias_vals, axes)

expected[layer.get_output(0).name] = ref
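The normalization_reference helper used above is not defined on this page. A minimal NumPy sketch consistent with the equation at the top of the page, assuming axes arrives as the same bitmask that was passed to add_normalization, might look like this:

def normalization_reference(x, scale, bias, axes_mask, epsilon=1e-5):
    # Decode the bitmask into a tuple of axis indices.
    axes = tuple(i for i in range(x.ndim) if axes_mask & (1 << i))
    # Y = (X - Mean) / sqrt(Var + epsilon) * S + B, with the reduced
    # statistics broadcast back against the input. The epsilon default
    # here is illustrative.
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + epsilon) * scale + bias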

C++ API

For more information about the C++ INormalization operator, refer to the C++ INormalization documentation.

Python API

For more information about the Python INormalization operator, refer to the Python INormalization documentation.