Pooling

Computes per-channel pooling by sliding a sampling window over the input tensor and writing one value per window position to the output tensor. The sampling window must be 2-D or 3-D.

Attributes

pooling_type Pooling operation can be one of:

  • MAX For each output element, return the maximum value found in its corresponding sampling window.

  • AVERAGE For each output element, return the average of the values in its corresponding sampling window.

  • MAX_AVERAGE_BLEND For each output element, return a weighted sum of the MAX and AVERAGE pooling results, where blend_factor is the weight of the AVERAGE result: \(pooling(\text{MAX_AVERAGE_BLEND})=(1-\text{blend_factor}) \cdot pooling(\text{MAX}) + \text{blend_factor} \cdot pooling(\text{AVERAGE})\).

blend_factor The weight given to the AVERAGE result when pooling_type is MAX_AVERAGE_BLEND. It has no effect for the other pooling types.
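The blend formula can be illustrated on a single sampling window. This is a minimal NumPy sketch, not the TensorRT implementation; the helper name `blend_pool` and the sample window are assumptions made for illustration.

```python
import numpy as np


def blend_pool(window, blend_factor):
    """Blend MAX and AVERAGE pooling over one sampling window (hypothetical helper)."""
    window = np.asarray(window, dtype=np.float32)
    max_val = window.max()   # pooling(MAX)
    avg_val = window.mean()  # pooling(AVERAGE)
    return (1.0 - blend_factor) * max_val + blend_factor * avg_val


window = [[1.0, 2.0], [3.0, 6.0]]  # max is 6.0, average is 3.0
print(blend_pool(window, 0.0))     # reduces to MAX pooling
print(blend_pool(window, 1.0))     # reduces to AVERAGE pooling
print(blend_pool(window, 0.25))    # 0.75 * max + 0.25 * average
```

A blend_factor of 0 reproduces MAX pooling and a blend_factor of 1 reproduces AVERAGE pooling, so the attribute interpolates between the two.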

window_size The sampling window shape.

stride Controls the stride used for the sampling window.

padding Controls the padding used for the input tensor.

padding_mode Controls how padding is applied and how the output size is rounded. The padding mode can be one of the following, using this shorthand:

\[\begin{split}I &= \text{dimensions of input image} \\ B &= \text{pre-padding, before the image data (for deconvolution, pre-padding is set before the output)} \\ A &= \text{post-padding, after the image data (for deconvolution, post-padding is set after the output)} \\ P &= \text{delta between input and output} \\ S &= \text{stride} \\ F &= \text{filter} \\ O &= \text{output} \\ D &= \text{dilation} \\ M &= I + B + A \text{ (the data plus any padding)} \\ DK &= 1 + D \cdot (F - 1)\end{split}\]
  • EXPLICIT_ROUND_DOWN Use explicit padding, rounding the output size down.
    \(O = \lfloor\frac{M - F}{S}\rfloor + 1\)
  • EXPLICIT_ROUND_UP Use explicit padding, rounding the output size up.
    \(O = \lceil\frac{M - F}{S}\rceil + 1\)
  • SAME_UPPER Use SAME padding, with \(\text{pre-padding} \leq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + F -I \\ B = \lfloor\frac{P}{2}\rfloor \\ A = P - B \end{gather}\)
  • SAME_LOWER Use SAME padding, with \(\text{pre-padding} \geq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + F -I \\ A = \lfloor\frac{P}{2}\rfloor \\ B = P - A \end{gather}\)
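  • CAFFE_ROUND_DOWN Use CAFFE padding, rounding the output size down. It uses the pre-padding value. The output size is reduced by one when the last window would start beyond the input plus pre-padding.
    \(\begin{gather}O' = \lfloor\frac{M-F}{S}\rfloor + 1 \\ O = \begin{cases} O' - 1 &\mbox{if } (O'-1) \cdot S \geq I + B \\ O' &\mbox{else} \end{cases}\end{gather}\)
  • CAFFE_ROUND_UP Use CAFFE padding, rounding the output size up. It uses the pre-padding value. The output size is reduced by one when the last window would start beyond the input plus pre-padding.
    \(\begin{gather}O' = \lceil\frac{M-F}{S}\rceil + 1 \\ O = \begin{cases} O' - 1 &\mbox{if } (O'-1) \cdot S \geq I + B \\ O' &\mbox{else} \end{cases}\end{gather}\)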
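The EXPLICIT and SAME formulas can be checked with a few lines of Python. This is a sketch for a single spatial axis under the shorthand defined in this section; the function names `ceil_div` and `pooled_size` are hypothetical, and the CAFFE variants are left out.

```python
def ceil_div(x, y):
    """Integer ceiling division for a positive divisor y."""
    return -(-x // y)


def pooled_size(i, f, s, b=0, a=0, mode="EXPLICIT_ROUND_DOWN"):
    """Return (O, B, A) for one axis: output size, pre-padding, post-padding.

    i: input size, f: window size, s: stride, b/a: explicit pre/post padding.
    """
    if mode in ("SAME_UPPER", "SAME_LOWER"):
        # SAME modes derive the padding from the input size, stride, and window.
        o = ceil_div(i, s)
        p = ((i - 1) // s) * s + f - i
        if mode == "SAME_UPPER":
            b = p // 2  # smaller half goes before the data
            a = p - b
        else:
            a = p // 2  # smaller half goes after the data
            b = p - a
        return o, b, a
    m = i + b + a  # the data plus any padding
    if mode == "EXPLICIT_ROUND_DOWN":
        return (m - f) // s + 1, b, a
    if mode == "EXPLICIT_ROUND_UP":
        return ceil_div(m - f, s) + 1, b, a
    raise ValueError(f"unsupported mode in this sketch: {mode}")


print(pooled_size(6, 3, 2, mode="EXPLICIT_ROUND_DOWN"))
print(pooled_size(6, 3, 2, mode="EXPLICIT_ROUND_UP"))
print(pooled_size(6, 3, 2, mode="SAME_UPPER"))
print(pooled_size(6, 3, 2, mode="SAME_LOWER"))
```

With an input of size 6, a window of 3, and a stride of 2, the two EXPLICIT modes differ by one output element, and the two SAME modes differ only in which side receives the single padding element.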

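average_count_excludes_padding When set to true, average pooling divides each window's sum by the number of window elements that overlap the unpadded input, so padded elements do not count toward the average. When false, the divisor also counts the padded elements.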
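The effect of this attribute is easiest to see on a corner window. The sketch below assumes zero padding and uses a hand-written 3x3 window; the variable names are hypothetical and the values are chosen to match the top-left window of a 5x5 input holding -10 through 14 with one element of padding on every side.

```python
import numpy as np

# Top-left 3x3 window: the first row and first column are padding.
window = np.array(
    [
        [0.0, 0.0, 0.0],
        [0.0, -10.0, -9.0],
        [0.0, -5.0, -4.0],
    ],
    dtype=np.float32,
)
# True where the window element comes from the real (unpadded) input.
is_input = np.array(
    [
        [False, False, False],
        [False, True, True],
        [False, True, True],
    ]
)

avg_excluding = window[is_input].mean()  # divisor is 4 (real elements only)
avg_including = window.mean()            # divisor is 9 (whole window)
print(avg_excluding, avg_including)
```

Both averages use the same sum of -28; only the divisor changes, which is why the setting matters only for windows that overlap the padding.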

Inputs

input: tensor of type T

Outputs

output: tensor of type T

Data Types

T: int8, float16, float32

Shape Information

The input tensor must have rank \(r\geq3\).

The output tensor has the same rank as the input tensor. If the input’s shape is \([a_0,...,a_{n-1}]\), the stride is \(s\), the padding is \(p\) (applied as both pre-padding and post-padding), and the sampling window shape is \([r_0,...,r_{m-1}]\) where \(m=2\) or \(m=3\), then with the EXPLICIT_ROUND_DOWN padding mode the output shape \([b_0,...,b_{n-1}]\) is:

\[\begin{split}b_i = \begin{cases} a_i &\mbox{if } i \in [0,n-m) \\ \lfloor\frac{a_i + 2 \cdot p_{m+i-n} - r_{m+i-n}}{s_{m+i-n}}\rfloor + 1 &\mbox{else} \\ \end{cases}\end{split}\]
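As a quick check of the shape rule, the sketch below computes an output shape in plain Python. It assumes symmetric padding and round-down behavior, consistent with \(O = \lfloor\frac{M - F}{S}\rfloor + 1\) from the padding_mode section; the function name `pooling_output_shape` is hypothetical.

```python
def pooling_output_shape(input_shape, window, stride, padding):
    """Output shape for pooling over the trailing len(window) dimensions."""
    n, m = len(input_shape), len(window)
    out = list(input_shape[: n - m])  # leading dimensions are unchanged
    for a, r, s, p in zip(input_shape[n - m:], window, stride, padding):
        out.append((a + 2 * p - r) // s + 1)
    return tuple(out)


# 2-D window on an NCHW tensor, without and with padding.
print(pooling_output_shape((1, 1, 5, 5), (3, 3), (1, 1), (0, 0)))
print(pooling_output_shape((1, 1, 5, 5), (3, 3), (1, 1), (1, 1)))
# 3-D window on an NCDHW tensor.
print(pooling_output_shape((2, 4, 8, 8, 8), (2, 2, 2), (2, 2, 2), (0, 0, 0)))
```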

Examples

Pooling
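import numpy as np
import tensorrt as trt

# `network`, `inputs`, `outputs`, and `expected` are assumed to be provided by the surrounding test harness.
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 1, 5, 5))
# 3x3 max pooling with the default stride and no padding
layer = network.add_pooling_nd(in1, trt.PoolingType.MAX, trt.DimsHW(3, 3))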
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-10.0, -9.0, -8.0, -7.0, -6.0],
                [-5.0, -4.0, -3.0, -2.0, -1.0],
                [0.0, 1.0, 2.0, 3.0, 4.0],
                [5.0, 6.0, 7.0, 8.0, 9.0],
                [10.0, 11.0, 12.0, 13.0, 14.0],
            ]
        ]
    ]
)
# The same values can be built more compactly (as float32) with:
# np.arange(-10, 15, dtype=np.float32).reshape(1, 1, 5, 5)

outputs[layer.get_output(0).name] = layer.get_output(0).shape

expected[layer.get_output(0).name] = np.array([[[[2.0, 3.0, 4.0], [7.0, 8.0, 9.0], [12.0, 13.0, 14.0]]]])

Average pooling with padding
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 1, 5, 5))
layer = network.add_pooling_nd(in1, trt.PoolingType.AVERAGE, trt.DimsHW(3, 3))
layer.post_padding = (1, 1)
layer.pre_padding = (1, 1)
layer.average_count_excludes_padding = True
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-10.0, -9.0, -8.0, -7.0, -6.0],
                [-5.0, -4.0, -3.0, -2.0, -1.0],
                [0.0, 1.0, 2.0, 3.0, 4.0],
                [5.0, 6.0, 7.0, 8.0, 9.0],
                [10.0, 11.0, 12.0, 13.0, 14.0],
            ]
        ]
    ]
)
# The same values can be built more compactly (as float32) with:
# np.arange(-10, 15, dtype=np.float32).reshape(1, 1, 5, 5)

outputs[layer.get_output(0).name] = layer.get_output(0).shape

expected[layer.get_output(0).name] = np.array(
    [
        [
            [
                [-7.0, -6.5, -5.5, -4.5, -4.0],
                [-4.5, -4.0, -3.0, -2.0, -1.5],
                [0.5, 1.0, 2.0, 3.0, 3.5],
                [5.5, 6.0, 7.0, 8.0, 8.5],
                [8.0, 8.5, 9.5, 10.5, 11.0],
            ]
        ]
    ]
)
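The expected values in both examples can be reproduced without TensorRT. The following NumPy reference is a sketch that assumes a stride of 1 and zero padding; it works on the 5x5 spatial plane only and is not how TensorRT computes the result.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(-10, 15, dtype=np.float32).reshape(5, 5)

# Max pooling example: 3x3 window, no padding -> 3x3 output.
max_out = sliding_window_view(x, (3, 3)).max(axis=(2, 3))

# Average pooling example: 3x3 window, padding of 1 on every side -> 5x5 output.
padded = np.pad(x, 1)                # zero padding
mask = np.pad(np.ones_like(x), 1)    # 1 where an element is real input
sums = sliding_window_view(padded, (3, 3)).sum(axis=(2, 3))
counts = sliding_window_view(mask, (3, 3)).sum(axis=(2, 3))
avg_out = sums / counts              # divisor excludes padded elements

print(max_out)
print(avg_out)
```

Dividing by `counts` rather than by the window size of 9 corresponds to setting average_count_excludes_padding to True in the second example.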

C++ API

For more information about the C++ IPoolingLayer operator, refer to the C++ IPoolingLayer documentation.

Python API

For more information about the Python IPoolingLayer operator, refer to the Python IPoolingLayer documentation.