Convolution

Computes a convolution on an input tensor and adds an optional bias to produce an output tensor.

Attributes

kernel_size An array of 2 or 3 elements, describing the size of the convolution kernel in each spatial dimension. The size of the array (2 or 3) determines whether the convolution is 2D or 3D.

num_output_maps The number of output maps for the convolution.

stride The stride of the convolution. The default is \((1, 1)\).

padding The padding of the convolution. The default is \((0, 0)\).

pre_padding The pre-padding. The default is \((0, 0)\).

post_padding The post-padding. The default is \((0, 0)\).

padding_mode The padding mode. It can be one of the following (a worked sketch follows the list):

\[\begin{split}I = \text{dimensions of input image.} \\ B = \text{pre-padding, before the image data. For deconvolution, pre-padding is set before output.} \\ A = \text{post-padding, after the image data. For deconvolution, post-padding is set after output.} \\ P = \text{delta between input and output} \\ S = \text{stride} \\ F = \text{filter} \\ O = \text{output} \\ D = \text{dilation} \\ M = I + B + A \text{; the data plus any padding} \\ DK = 1 + D \cdot (F - 1) \\\end{split}\]
  • EXPLICIT_ROUND_DOWN Use explicit padding, rounding the output size down.
    \(O = \lfloor\frac{M - DK}{S}\rfloor + 1\)
  • EXPLICIT_ROUND_UP Use explicit padding, rounding the output size up.
    \(O = \lceil\frac{M - DK}{S}\rceil + 1\)
  • SAME_UPPER Use SAME padding, with \(\text{pre-padding} \leq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + DK - I \\ B = \lfloor\frac{P}{2}\rfloor \\ A = P - B \end{gather}\)
  • SAME_LOWER Use SAME padding, with \(\text{pre-padding} \geq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + DK - I \\ A = \lfloor\frac{P}{2}\rfloor \\ B = P - A \end{gather}\)
  • CAFFE_ROUND_DOWN Use CAFFE padding, rounding the output size down. It uses the pre-padding value.
    \(O = \lfloor\frac{I + 2B - DK}{S}\rfloor + 1\)
  • CAFFE_ROUND_UP Use CAFFE padding, rounding the output size up. It uses the pre-padding value.
    \(O = \lceil\frac{I + 2B - DK}{S}\rceil + 1\)
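As a quick illustration of these formulas, the following standalone Python sketch computes the output size of one spatial dimension for each mode. The helper conv_output_size is illustrative only and is not part of the TensorRT API.

import math

def conv_output_size(I, F, S=1, D=1, B=0, A=0, mode="EXPLICIT_ROUND_DOWN"):
    """Illustrative output-size calculation for one spatial dimension."""
    DK = 1 + D * (F - 1)  # dilated kernel size
    M = I + B + A         # the data plus any padding
    if mode == "EXPLICIT_ROUND_DOWN":
        return (M - DK) // S + 1
    if mode == "EXPLICIT_ROUND_UP":
        return math.ceil((M - DK) / S) + 1
    if mode in ("SAME_UPPER", "SAME_LOWER"):
        return math.ceil(I / S)  # padding is chosen so that O = ceil(I / S)
    if mode == "CAFFE_ROUND_DOWN":
        return (I + 2 * B - DK) // S + 1
    if mode == "CAFFE_ROUND_UP":
        return math.ceil((I + 2 * B - DK) / S) + 1
    raise ValueError(mode)

# A 5x5 input with a 3x3 kernel, stride 1, and no padding gives a 3x3 output,
# matching the 2D example below.
assert conv_output_size(I=5, F=3) == 3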

num_groups The number of groups for a convolution.

kernel The kernel weights for the convolution.

bias The bias weights for the convolution.

dilation The dilation for a convolution. The default is \((1, 1)\).

kernel_size_nd The multi-dimension kernel size of the convolution.

stride_nd The multi-dimension stride of the convolution. The default is \((1, \cdots, 1)\).

padding_nd The multi-dimension padding of the convolution. The default is \((0,\cdots,0)\).

dilation_nd The multi-dimension dilation for the convolution. The default is \((1,\cdots,1)\).
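These attributes are exposed as properties of IConvolutionLayer in the Python API. The following is a minimal sketch of setting them after adding the layer; it is not part of the examples below and assumes an existing INetworkDefinition named network, as the examples in this section do.

import numpy as np
import tensorrt as trt

in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 1, 8, 8))
w = np.ones((4, 1, 3, 3), dtype=np.float32)  # 4 output maps, 1 input channel, 3x3 kernel
b = np.zeros((4,), dtype=np.float32)
layer = network.add_convolution_nd(in1, 4, kernel_shape=(3, 3), kernel=trt.Weights(w), bias=trt.Weights(b))
layer.stride_nd = (2, 2)                         # stride_nd
layer.padding_nd = (1, 1)                        # padding_nd
layer.dilation_nd = (1, 1)                       # dilation_nd
layer.num_groups = 1                             # num_groups
layer.padding_mode = trt.PaddingMode.SAME_UPPER  # padding_mode takes precedence over explicit padding
network.mark_output(layer.get_output(0))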

Inputs

input: tensor of type T

Outputs

output: tensor of type T

Data Types

T: int8, float16, float32

Shape Information

input is a tensor of shape \([A_0,\cdots,A_n]\), \(n\geq3\).

W (the kernel weights) is a tensor of shape \([k, W_0,\cdots,W_m]\), \(m\in\{2,3\}\).

bias is a tensor of shape \([k]\).

output is a tensor of shape \([A_0, m, B_{n-m},\cdots, B_n]\), where:

\[ \begin{align}\begin{aligned}m: \text{num_output_maps}\\d: \text{dilation}\\k: \text{kernel_size}\\p: \text{padding}\\s: \text{stride}\\B_i=\left\lfloor\frac{A_i + 2 p_{i-n+m} - t_{i-n+m}}{s_{i-n+m}}\right\rfloor + 1\\t_{i-n+m} = 1 + d_{i-n+m} \cdot (k_{i-n+m} - 1)\end{aligned}\end{align} \]
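For instance, applying this formula to the 2D example below (a \(5\times5\) input, \(3\times3\) kernel, and default stride, padding, and dilation):

\(t = 1 + 1 \cdot (3 - 1) = 3, \qquad B = \lfloor\frac{5 + 2 \cdot 0 - 3}{1}\rfloor + 1 = 3\)

so each output spatial dimension is 3, and with num_output_maps = 3 the output shape is \([1, 3, 3, 3]\).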

Examples

Convolution 2d
input_shape = [1, 1, 5, 5]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
num_filter = 3
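# Weights for 3 output maps, 1 input channel, and a 3x3 kernel (27 values).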
w = np.array(
    [
        [
            [0.3, -0.8, 1.0],
            [0.5, -0.5, 0.0],
            [0.4, -0.2, 0.9],
        ],
        [
            [0.4, -0.7, 0.8],
            [0.3, -0.2, 1.0],
            [0.3, 0.2, 0.3],
        ],
        [
            [0.1, -0.2, 0.3],
            [0.1, -0.2, 0.3],
            [0.1, -0.2, 0.9],
        ],
    ],
    np.float32,
)
layer = network.add_convolution_nd(in1, num_filter, kernel_shape=(3, 3), kernel=trt.Weights(w))
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-3.0, -2.0, -1.0, -2.0, -1.0],
                [10.0, -25.0, 0.0, -2.0, -1.0],
                [1.0, 2.0, -2.0, -2.0, -1.0],
                [10.0, -25.0, 0.0, -2.0, -1.0],
                [-3.0, -2.0, -1.0, -2.0, -1.0],
            ]
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[15.4, -14.9, 0.0], [31.5, -19.3, 0.1], [12.5, -14.7, 0.1]],
            [[7.5, -11.6, -1.7], [17.4, -20.7, -1.3], [3.8, -10.3, -1.8]],
            [[3.7, -4.9, -0.6], [11.1, -7.4, -0.5], [4.3, -4.9, -0.6]],
        ]
    ]
)
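The expected values above can be reproduced outside of TensorRT with a plain NumPy cross-correlation (deep-learning convolutions do not flip the kernel). This standalone check is illustrative only and is not part of the original example.

import numpy as np

w = np.array(
    [
        [[0.3, -0.8, 1.0], [0.5, -0.5, 0.0], [0.4, -0.2, 0.9]],
        [[0.4, -0.7, 0.8], [0.3, -0.2, 1.0], [0.3, 0.2, 0.3]],
        [[0.1, -0.2, 0.3], [0.1, -0.2, 0.3], [0.1, -0.2, 0.9]],
    ],
    np.float32,
)
x = np.array(
    [
        [-3.0, -2.0, -1.0, -2.0, -1.0],
        [10.0, -25.0, 0.0, -2.0, -1.0],
        [1.0, 2.0, -2.0, -2.0, -1.0],
        [10.0, -25.0, 0.0, -2.0, -1.0],
        [-3.0, -2.0, -1.0, -2.0, -1.0],
    ],
    np.float32,
)

out = np.zeros((3, 3, 3), np.float32)
for f in range(3):          # one output map per filter
    for i in range(3):      # output rows
        for j in range(3):  # output columns
            out[f, i, j] = np.sum(x[i:i + 3, j:j + 3] * w[f])

print(np.round(out, 1))     # matches the expected tensor above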
Convolution 3d
input_shape = [1, 1, 3, 3, 3]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
num_filter = 1
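# Weights for 1 output map, 1 input channel, and a 1x1x1 kernel (a single value).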
w = np.array(
    [
        [
            [0.3],
        ]
    ],
    np.float32,
)
layer = network.add_convolution_nd(in1, num_filter, kernel_shape=(1, 1, 1), kernel=trt.Weights(w))
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [
                    [0.3, -0.8, 1.0],
                    [0.5, -0.5, 0.0],
                    [0.4, -0.2, 0.9],
                ],
                [
                    [0.4, -0.7, 0.8],
                    [0.3, -0.2, 1.0],
                    [0.3, 0.2, 0.3],
                ],
                [
                    [0.1, -0.2, 0.3],
                    [0.1, -0.2, 0.3],
                    [0.1, -0.2, 0.9],
                ],
            ]
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [
                [[0.09, -0.24000001, 0.3], [0.15, -0.15, 0.0], [0.12, -0.06, 0.27]],
                [[0.12, -0.21000001, 0.24000001], [0.09, -0.06, 0.3], [0.09, 0.06, 0.09]],
                [[0.03, -0.06, 0.09], [0.03, -0.06, 0.09], [0.03, -0.06, 0.27]],
            ]
        ]
    ]
)

C++ API

For more information about the C++ IConvolutionLayer operator, refer to the C++ IConvolutionLayer documentation.

Python API

For more information about the Python IConvolutionLayer operator, refer to the Python IConvolutionLayer documentation.