Pooling

Computes per-channel pooling of the input tensor by sliding a sampling window over its spatial dimensions, producing an output tensor. The supported sampling window shapes are 2-D and 3-D.

Attributes

pooling_type The pooling operation to perform. Can be one of:

  • MAX For each output element, return the maximum value found in its corresponding sampling window.

  • AVERAGE For each output element, return the average of the values in its corresponding sampling window.

  • MAX_AVERAGE_BLEND For each output element, return a weighted blend of the MAX and AVERAGE pooling results, where blend_factor is the blending factor: \(pooling(\text{MAX_AVERAGE_BLEND})=(1-\text{blend_factor}) \cdot pooling(\text{MAX}) + \text{blend_factor} \cdot pooling(\text{AVERAGE})\).

blend_factor The blending factor used when pooling_type is set to MAX_AVERAGE_BLEND.
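A minimal sketch of how these two attributes are set through the Python API, assuming the same implicit network and in1 objects used in the examples at the end of this section:

layer = network.add_pooling_nd(in1, trt.PoolingType.MAX_AVERAGE_BLEND, trt.tensorrt.DimsHW(3, 3))
layer.blend_factor = 0.25  # each output element = 0.75 * MAX result + 0.25 * AVERAGE result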

padding_mode The padding mode, which can be one of the following (a worked example follows the list):

\[\begin{split}\begin{gather} I = \text{dimensions of input image} \\ B = \text{pre-padding, before the image data; for deconvolution, pre-padding is set before output} \\ A = \text{post-padding, after the image data; for deconvolution, post-padding is set after output} \\ P = \text{delta between input and output} \\ S = \text{stride} \\ F = \text{filter} \\ O = \text{output} \\ D = \text{dilation; for pooling layers, always equals 1} \\ M = I + B + A \text{ (the data plus any padding)} \\ DK = 1 + D \cdot (F - 1) \\ \end{gather}\end{split}\]
  • EXPLICIT_ROUND_DOWN Use explicit padding, rounding the output size down.
    \(O = \lfloor\frac{M - DK}{S}\rfloor + 1\)
  • EXPLICIT_ROUND_UP Use explicit padding, rounding the output size up.
    \(O = \lceil\frac{M - DK}{S}\rceil + 1\)
  • SAME_UPPER Use SAME padding, with \(\text{pre-padding} \leq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + DK -I \\ B = \lfloor\frac{P}{2}\rfloor \\ A = P - B \end{gather}\)
  • SAME_LOWER Use SAME padding, with \(\text{pre-padding} \geq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + DK -I \\ A = \lfloor\frac{P}{2}\rfloor \\ B = P - A \end{gather}\)

average_count_excludes_padding When this parameter is set, padded values are excluded from the element count (the denominator) of the average pooling calculation.
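
For example (a small sketch using the top-left window of the padded average-pooling example at the end of this section), with pre- and post-padding of 1 a 3x3 window anchored at the corner covers only four real input values:

import numpy as np

# Only the four in-bounds values contribute data; the other five window
# positions fall on padding.
corner = np.array([[-10.0, -9.0], [-5.0, -4.0]])
print(corner.sum() / 4)  # -7.0      padding excluded from the count (flag set)
print(corner.sum() / 9)  # -3.11...  padding included in the count (flag not set)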

Inputs

input: tensor of type T

Outputs

output: tensor of type T

Data Types

T: int8, float16, float32

Shape Information

The input must be a tensor with rank \(r\geq3\).

The output tensor rank is the same as the input tensor rank. If the input’s shape is \([a_0,...,a_n]\), the stride is \(s\), the symmetric padding is \(p\), and the sampling window shape is \([r_0,..,r_m]\) where \(m=1\) (2-D window) or \(m=2\) (3-D window), then with the default EXPLICIT_ROUND_DOWN rounding:

\[\begin{split}b_i = \begin{cases} a_i &\mbox{if } i \in [0,n-m) \\ \lfloor\frac{a_i + 2 \cdot p_{m+i-n} - r_{m+i-n}}{s_{m+i-n}}\rfloor + 1 &\mbox{else} \\ \end{cases}\end{split}\]
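
As a sketch, applying this formula to the 2-D examples at the end of this section (input shape [1, 1, 5, 5], window [3, 3], stride [1, 1]):

# Shape formula applied with no padding, then with pre/post padding of 1.
a, r, s = [1, 1, 5, 5], [3, 3], [1, 1]
for p in ([0, 0], [1, 1]):
    b = a[:2] + [(a[2 + k] + 2 * p[k] - r[k]) // s[k] + 1 for k in range(2)]
    print(b)  # [1, 1, 3, 3], then [1, 1, 5, 5]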

Volume Limits

input and output can have up to \(2^{31}\) elements.

Supported Formats

As of TensorRT 10.5, only FP32/FP16 NCHW kernels are supported for 3D pooling.
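
A minimal sketch of building a 3-D pooling layer through the Python API; the 5-D input shape, window size, and stride used here are illustrative assumptions:

in3d = network.add_input("input3d", dtype=trt.float32, shape=(1, 1, 4, 4, 4))
layer3d = network.add_pooling_nd(in3d, trt.PoolingType.MAX, trt.Dims3(2, 2, 2))
layer3d.stride_nd = trt.Dims3(1, 1, 1)
network.mark_output(layer3d.get_output(0))  # output shape: (1, 1, 3, 3, 3)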

DLA Support

DLA FP16 and DLA INT8 are supported for 2D max pooling, and for 2D average pooling in the inclusive padding mode (that is, with average_count_excludes_padding disabled).
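
A hedged sketch of steering the network toward DLA with FP16 at build time, assuming a builder object created elsewhere:

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)             # DLA requires FP16 or INT8 precision
config.default_device_type = trt.DeviceType.DLA   # prefer DLA for layers that support it
config.DLA_core = 0                               # select the DLA core
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)     # fall back to GPU for unsupported layers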

Examples

Max Pooling
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 1, 5, 5))
layer = network.add_pooling_nd(in1, trt.PoolingType.MAX, trt.tensorrt.DimsHW(3, 3))
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-10.0, -9.0, -8.0, -7.0, -6.0],
                [-5.0, -4.0, -3.0, -2.0, -1.0],
                [0.0, 1.0, 2.0, 3.0, 4.0],
                [5.0, 6.0, 7.0, 8.0, 9.0],
                [10.0, 11.0, 12.0, 13.0, 14.0],
            ]
        ]
    ]
)
# Equivalently: inputs[in1.name] = np.reshape(np.arange(-10, 15, dtype=np.float32), newshape=(1, 1, 5, 5))

outputs[layer.get_output(0).name] = layer.get_output(0).shape

expected[layer.get_output(0).name] = np.array([[[[2.0, 3.0, 4.0], [7.0, 8.0, 9.0], [12.0, 13.0, 14.0]]]])

Average Pooling With Padding
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 1, 5, 5))
layer = network.add_pooling_nd(in1, trt.PoolingType.AVERAGE, trt.tensorrt.DimsHW(3, 3))
layer.post_padding = (1, 1)
layer.pre_padding = (1, 1)
layer.average_count_excludes_padding = True
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-10.0, -9.0, -8.0, -7.0, -6.0],
                [-5.0, -4.0, -3.0, -2.0, -1.0],
                [0.0, 1.0, 2.0, 3.0, 4.0],
                [5.0, 6.0, 7.0, 8.0, 9.0],
                [10.0, 11.0, 12.0, 13.0, 14.0],
            ]
        ]
    ]
)
# Equivalently: inputs[in1.name] = np.reshape(np.arange(-10, 15, dtype=np.float32), newshape=(1, 1, 5, 5))

outputs[layer.get_output(0).name] = layer.get_output(0).shape

expected[layer.get_output(0).name] = np.array(
    [
        [
            [
                [-7.0, -6.5, -5.5, -4.5, -4.0],
                [-4.5, -4.0, -3.0, -2.0, -1.5],
                [0.5, 1.0, 2.0, 3.0, 3.5],
                [5.5, 6.0, 7.0, 8.0, 8.5],
                [8.0, 8.5, 9.5, 10.5, 11.0],
            ]
        ]
    ]
)

C++ API

For more information about the C++ IPoolingLayer operator, refer to the C++ IPoolingLayer documentation.

Python API

For more information about the Python IPoolingLayer operator, refer to the Python IPoolingLayer documentation.