Reduce#

Computes a reduction across dimensions of an input tensor into an output tensor.

Attributes#

operation The reduce operation to apply, which can be one of:

  • SUM Sums the elements.

  • PROD Multiplies the elements.

  • MAX Retrieves the maximum element.

  • MIN Retrieves the minimum element.

  • AVG Computes the average of the elements.

axes The axes to reduce, represented as a bitmask in which bit \(i\) corresponds to dimension \(i\). For example, when \(axes=6\) (binary 110), dims 1 and 2 are reduced.

keep_dims Controls whether to preserve the original rank by keeping the reduced dimensions (with a dimension of 1), or to reduce the tensor’s rank.
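The axes bitmask described above can be built programmatically. The following is an illustrative sketch (the `axes_bitmask` helper is hypothetical, not part of the TensorRT API): setting bit \(i\) marks dimension \(i\) for reduction.

```python
def axes_bitmask(dims_to_reduce):
    """Build an axes bitmask: bit i set means dimension i is reduced.

    Hypothetical helper for illustration; not part of the TensorRT API.
    """
    mask = 0
    for d in dims_to_reduce:
        mask |= 1 << d
    return mask

print(axes_bitmask([1, 2]))  # 6 -> reduces dims 1 and 2
print(axes_bitmask([2]))     # 4 -> reduces dim 2
```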

Inputs#

input: tensor of type T.

Outputs#

output: tensor of type T.

Data Types#

T: int8, int32, int64, float16, float32, bfloat16

Shape Information#

input is a tensor with a shape of \([a_0,...,a_n], n \geq 1\)

when keep_dims is set to True, output is a tensor with a shape of \([b_0,...,b_n]\) where: \(b_i = \begin{cases}a_i, & axes[i] = 0 \\1, & axes[i] = 1 \\\end{cases}\)

when keep_dims is set to False, output is a tensor with a shape of \([b_0,...,b_m]\); its shape is equivalent to the keep_dims = True case with the reduced unit dimensions removed.
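The shape rules above mirror NumPy's `keepdims` behavior, which can be used to check them. A minimal sketch, assuming an input of shape \([1, 2, 2, 3]\) and \(axes=6\) (dims 1 and 2 reduced):

```python
import numpy as np

x = np.zeros((1, 2, 2, 3))
axes = 6  # bits 1 and 2 set -> reduce dims 1 and 2

# Decode the bitmask into a tuple of dimension indices.
dims = tuple(i for i in range(x.ndim) if axes & (1 << i))

print(np.sum(x, axis=dims, keepdims=True).shape)   # (1, 1, 1, 3): reduced dims kept as 1
print(np.sum(x, axis=dims, keepdims=False).shape)  # (1, 3): reduced dims removed
```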

DLA Support#

DLA supports the following operation type:

  • MAX, where any combination of the C, H, and W dimensions is reduced.

Note that the batch size for DLA is the product of all dimensions except the CHW dimensions.

Examples#

Reduce
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 2, 2, 3))
layer = network.add_reduce(in1, op=trt.ReduceOperation.MAX, axes=4, keep_dims=True)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0]],
            [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[0.0, 1.0, 2.0]],
            [[6.0, 7.0, 8.0]],
        ]
    ]
)
Reduce with keep dims set to false
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 2, 2, 3))
layer = network.add_reduce(in1, op=trt.ReduceOperation.PROD, axes=6, keep_dims=False)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0]],
            [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array([[0.0, -56.0, -80.0]])
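The expected output of the second example can be verified independently with NumPy, since \(axes=6\) with keep_dims = False corresponds to reducing dims 1 and 2 without `keepdims`:

```python
import numpy as np

data = np.array(
    [
        [
            [[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0]],
            [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
        ]
    ]
)

# axes=6 -> reduce dims 1 and 2; keep_dims=False -> no keepdims
result = np.prod(data, axis=(1, 2))
print(result)  # [[  0. -56. -80.]]
```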

C++ API#

For more information about the C++ IReduceLayer operator, refer to the C++ IReduceLayer documentation.

Python API#

For more information about the Python IReduceLayer operator, refer to the Python IReduceLayer documentation.