Reduce

Computes a reduction across dimensions of an input tensor into an output tensor.

Attributes

operation: The reduce operation, which can be one of:

SUM: Sums the elements.
PROD: Multiplies the elements.
MAX: Retrieves the maximum element.
MIN: Retrieves the minimum element.
AVG: Computes the average of the elements.

axes: The axes to reduce, represented as a bitmask in which bit \(i\) selects dimension \(i\). For example, when \(axes=6\) (binary 110), dimensions 1 and 2 are reduced.
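The bitmask encoding can be sketched in plain Python. The helper names `axes_to_mask` and `mask_to_axes` are hypothetical and are not part of the TensorRT API; they only illustrate how a set of dimension indices maps to the integer passed as axes.

```python
def axes_to_mask(axes):
    """Encode an iterable of dimension indices as a bitmask (bit i set = reduce dim i)."""
    mask = 0
    for axis in axes:
        if axis < 0:
            raise ValueError("dimension indices must be non-negative")
        mask |= 1 << axis
    return mask


def mask_to_axes(mask):
    """Decode a bitmask into the sorted list of dimension indices it selects."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


print(axes_to_mask([1, 2]))  # 6
print(mask_to_axes(6))       # [1, 2]
print(mask_to_axes(4))       # [2]
```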

keep_dims: Controls whether the output preserves the input’s rank by keeping each reduced dimension with a size of 1, or drops the reduced dimensions and lowers the tensor’s rank.
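As a rough reference for how the three attributes interact, the following NumPy sketch maps each operation to an assumed NumPy counterpart. The names `NUMPY_REDUCERS` and `reduce_reference` are hypothetical, and details such as result data types and accumulation precision may differ from what the layer actually does.

```python
import numpy as np

# Assumed NumPy counterparts of the five reduce operations.
NUMPY_REDUCERS = {
    "SUM": np.sum,
    "PROD": np.prod,
    "MAX": np.max,
    "MIN": np.min,
    "AVG": np.mean,
}


def reduce_reference(data, operation, axes_mask, keep_dims):
    """Reduce `data` over the dimensions whose bits are set in `axes_mask`."""
    data = np.asarray(data)
    # Bit i of the mask selects dimension i.
    axes = tuple(i for i in range(data.ndim) if (axes_mask >> i) & 1)
    return NUMPY_REDUCERS[operation](data, axis=axes, keepdims=keep_dims)


x = np.arange(24, dtype=np.float32).reshape(1, 2, 3, 4)
print(reduce_reference(x, "SUM", 6, keep_dims=True).shape)   # (1, 1, 1, 4)
print(reduce_reference(x, "SUM", 6, keep_dims=False).shape)  # (1, 4)
```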

Inputs

input: tensor of type T.

Outputs

output: tensor of type T.

Data Types

T: int8, int32, int64, float16, float32, bfloat16

Shape Information

input is a tensor with a shape of \([a_0,...,a_n], n \geq 1\)

When keep_dims is set to True, output is a tensor with a shape of \([b_0,...,b_n]\), where \(axes[i]\) denotes bit \(i\) of the axes bitmask and: \(b_i = \begin{cases}a_i, & axes[i] = 0 \\1, & axes[i] = 1 \\\end{cases}\)

When keep_dims is set to False, output is a tensor with a shape of \([b_0,...,b_m]\), and its shape is equivalent to the keep_dims = True case with the reduced unit dimensions removed.
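The shape rules above can be expressed as a small pure-Python function. This is a minimal sketch; `reduce_output_shape` is a hypothetical helper, not a TensorRT function.

```python
def reduce_output_shape(input_shape, axes_mask, keep_dims):
    """Return the output shape for a reduction over the dims set in `axes_mask`."""
    output = []
    for i, dim in enumerate(input_shape):
        reduced = (axes_mask >> i) & 1
        if not reduced:
            output.append(dim)   # b_i = a_i when bit i is 0
        elif keep_dims:
            output.append(1)     # b_i = 1 when bit i is 1 and the rank is kept
        # Otherwise the reduced dimension is dropped entirely.
    return tuple(output)


print(reduce_output_shape((1, 2, 2, 3), 4, keep_dims=True))   # (1, 2, 1, 3)
print(reduce_output_shape((1, 2, 2, 3), 6, keep_dims=False))  # (1, 3)
```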

DLA Support

DLA supports the following operation type:

MAX, where any combination of the CHW dimensions is reduced.

Note that the batch size for DLA is the product of all dimensions except the CHW dimensions.
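The batch-size rule can be illustrated with a short calculation. This sketch assumes that C, H, and W are the last three dimensions of the shape, which the text above does not state explicitly; `dla_batch_size` is a hypothetical helper.

```python
import math


def dla_batch_size(shape):
    """Product of all dimensions except the (assumed trailing) C, H, W dimensions."""
    if len(shape) < 3:
        raise ValueError("shape must contain at least the C, H, and W dimensions")
    # Assumption: the last three entries are C, H, W; everything before them is batch.
    return math.prod(shape[:-3])


print(dla_batch_size((1, 2, 2, 3)))     # 1
print(dla_batch_size((2, 4, 3, 8, 8)))  # 8
```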

Examples

Reduce

in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 2, 2, 3))
# axes=4 (binary 100) reduces dimension 2; keep_dims=True keeps it with size 1.
layer = network.add_reduce(in1, op=trt.ReduceOperation.MAX, axes=4, keep_dims=True)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0]],
            [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[0.0, 1.0, 2.0]],
            [[6.0, 7.0, 8.0]],
        ]
    ]
)

Reduce with keep_dims set to False

in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 2, 2, 3))
# axes=6 (binary 110) reduces dimensions 1 and 2; keep_dims=False removes them.
layer = network.add_reduce(in1, op=trt.ReduceOperation.PROD, axes=6, keep_dims=False)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0]],
            [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array([[0.0, -56.0, -80.0]])
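The expected values in both examples can be cross-checked with NumPy alone, without building a network. This is only a sanity check under the assumption that `np.max` and `np.prod` match the MAX and PROD operations for this input; it does not exercise TensorRT itself.

```python
import numpy as np

data = np.array(
    [
        [
            [[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0]],
            [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
        ]
    ]
)

# axes=4 (binary 100) selects dimension 2; keep_dims=True preserves the rank.
max_out = np.max(data, axis=2, keepdims=True)
# axes=6 (binary 110) selects dimensions 1 and 2; keep_dims=False drops them.
prod_out = np.prod(data, axis=(1, 2), keepdims=False)

print(max_out.shape)   # (1, 2, 1, 3)
print(prod_out.shape)  # (1, 3)

# Compare against the expected arrays listed in the two examples.
assert np.array_equal(max_out, [[[[0.0, 1.0, 2.0]], [[6.0, 7.0, 8.0]]]])
assert np.array_equal(prod_out, [[0.0, -56.0, -80.0]])
```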

C++ API

For more information about the C++ IReduceLayer operator, refer to the C++ IReduceLayer documentation.

Python API

For more information about the Python IReduceLayer operator, refer to the Python IReduceLayer documentation.