Convolution

Computes a convolution on an input tensor and adds an optional bias to produce an output tensor.

Attributes

num_output_maps The number of output maps for the convolution.

pre_padding The pre-padding, applied at the start of each spatial dimension of the input. The default is \((0, 0)\).

post_padding The post-padding, applied at the end of each spatial dimension of the input. The default is \((0, 0)\).

padding_mode The padding mode. It can be one of the following (a worked sketch of the output-size arithmetic follows the list):

\[\begin{split}I = \text{dimensions of input image} \\ B = \text{pre-padding, before the image data. For deconvolution, pre-padding is set before output.} \\ A = \text{post-padding, after the image data. For deconvolution, post-padding is set after output.} \\ P = \text{delta between input and output} \\ S = \text{stride} \\ F = \text{filter} \\ O = \text{output} \\ D = \text{dilation} \\ M = I + B + A\text{, the data plus any padding} \\ DK = 1 + D \cdot (F - 1) \\\end{split}\]
  • EXPLICIT_ROUND_DOWN Use explicit padding, rounding the output size down.
    \(O = \lfloor\frac{M - F}{S}\rfloor + 1\)
  • EXPLICIT_ROUND_UP Use explicit padding, rounding the output size up.
    \(O = \lceil\frac{M - F}{S}\rceil + 1\)
  • SAME_UPPER Use SAME padding, with \(\text{pre-padding} \leq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + F -I \\ B = \lfloor\frac{P}{2}\rfloor \\ A = P - B \end{gather}\)
  • SAME_LOWER Use SAME padding, with \(\text{pre-padding} \geq \text{post-padding}\).
    \(\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + F -I \\ A = \lfloor\frac{P}{2}\rfloor \\ B = P - A \end{gather}\)
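
The formulas above can be checked numerically. The following is a minimal plain-Python sketch of the output-size arithmetic for a single spatial dimension; the function names are illustrative and are not part of the TensorRT API.

import math

def dilated_kernel(F, D):
    # DK = 1 + D * (F - 1), the effective extent of a dilated filter
    return 1 + D * (F - 1)

def explicit_output(I, B, A, F, S, round_up=False):
    # M = I + B + A; O = floor((M - F) / S) + 1 for EXPLICIT_ROUND_DOWN,
    # or O = ceil((M - F) / S) + 1 for EXPLICIT_ROUND_UP.
    M = I + B + A
    q = (M - F) / S
    return (math.ceil(q) if round_up else math.floor(q)) + 1

def same_output(I, F, S, lower=False):
    # O = ceil(I / S); P = floor((I - 1) / S) * S + F - I.
    # SAME_UPPER places the extra element of padding after the data, SAME_LOWER before it.
    O = math.ceil(I / S)
    P = (I - 1) // S * S + F - I
    half = P // 2
    B, A = (P - half, half) if lower else (half, P - half)
    return O, B, A

print(dilated_kernel(F=3, D=2))                                 # 5
print(explicit_output(I=5, B=0, A=0, F=3, S=3))                 # 1
print(explicit_output(I=5, B=0, A=0, F=3, S=3, round_up=True))  # 2
print(same_output(I=5, F=3, S=3))                               # (2, 0, 1)
print(same_output(I=5, F=3, S=3, lower=True))                   # (2, 1, 0)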

num_groups The number of groups for a convolution.

kernel The kernel weights for the convolution.

bias The bias weights for the convolution.

kernel_size_nd The multi-dimension kernel size of the convolution.

stride_nd The multi-dimension stride of the convolution. The default is \((1, \cdots, 1)\).

padding_nd The multi-dimension padding of the convolution. The default is \((0,\cdots,0)\).

dilation_nd The multi-dimension dilation for the convolution. The default is \((1,\cdots,1)\).
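
These attributes map onto properties of the layer object returned by add_convolution_nd in the Python API. The following is a hedged sketch that assumes a network created from a TensorRT builder, as in the Examples section; the shapes and weight values are only illustrative.

import numpy as np
import tensorrt as trt

# Illustrative 2D convolution: 4 input channels, 8 output maps, 3x3 kernel.
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 4, 28, 28))
w = np.ones((8, 4, 3, 3), dtype=np.float32)  # placeholder kernel weights
b = np.zeros((8,), dtype=np.float32)         # placeholder bias weights

layer = network.add_convolution_nd(
    in1, num_output_maps=8, kernel_shape=(3, 3), kernel=trt.Weights(w), bias=trt.Weights(b)
)
layer.stride_nd = (2, 2)    # stride_nd
layer.padding_nd = (1, 1)   # padding_nd (symmetric); pre_padding / post_padding allow asymmetric padding
layer.dilation_nd = (1, 1)  # dilation_nd
layer.num_groups = 1        # num_groups; a grouped convolution expects C / num_groups kernel channels per filter
layer.padding_mode = trt.PaddingMode.EXPLICIT_ROUND_DOWN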

Inputs

input: tensor of type T

Outputs

output: tensor of type T

Data Types

T: int8, float16, float32, bfloat16

Shape Information

input is a tensor of shape \([A_0,\cdots,A_n]\), \(n\geq3\).

kernel is a tensor of shape \([k, W_0,\cdots,W_m]\), \(m\in\{2,3\}\), where \(k\) is the number of output maps.

bias is a tensor of shape \([k]\).

output is a tensor of shape \([A_0, k, B_{n-m+1},\cdots, B_n]\), where:

\[ \begin{align}\begin{aligned}k: \text{num_output_maps}\\d: \text{dilation}\\r: \text{kernel_size}\\p: \text{padding}\\s: \text{stride}\\t_{i-n+m-1} = 1 + d_{i-n+m-1} \cdot (r_{i-n+m-1} - 1)\\B_i=\left\lfloor\frac{A_i + 2 p_{i-n+m-1} - t_{i-n+m-1}}{s_{i-n+m-1}}\right\rfloor + 1,\quad i = n-m+1,\cdots,n\end{aligned}\end{align} \]
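
For symmetric padding, this shape rule can be written as a small helper. The sketch below is plain Python rather than a TensorRT API; the function name is illustrative, and the printed shape matches the 2D example later on this page.

def conv_output_shape(input_shape, num_output_maps, kernel_size, stride, padding, dilation):
    m = len(kernel_size)  # number of spatial dimensions (2 or 3)
    out_spatial = []
    for a, r, s, p, d in zip(input_shape[-m:], kernel_size, stride, padding, dilation):
        t = 1 + d * (r - 1)  # dilated kernel extent
        out_spatial.append((a + 2 * p - t) // s + 1)
    return [input_shape[0], num_output_maps, *out_spatial]

print(conv_output_shape([1, 1, 5, 5], 3, (3, 3), (1, 1), (0, 0), (1, 1)))  # [1, 3, 3, 3]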

DLA Support

DLA FP16 and DLA INT8 are supported for 2D convolutions.
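
As a hedged illustration, a network containing this layer can be targeted at DLA with FP16 through the builder configuration; the calls below are standard TensorRT Python API, but exact placement depends on the network and platform, and builder and network are assumed to exist.

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)          # DLA requires FP16 or INT8 precision
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)  # let unsupported layers fall back to the GPU
config.default_device_type = trt.DeviceType.DLA
config.DLA_core = 0
engine_bytes = builder.build_serialized_network(network, config)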

Examples
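
The snippets below assume that numpy has been imported as np, tensorrt as trt, that network was created from a TensorRT builder, and that inputs, outputs, and expected are dictionaries used to supply the input data and record the expected output shape and values.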

Convolution 2D
input_shape = [1, 1, 5, 5]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
num_filter = 3
w = np.array(
    [
        [
            [0.3, -0.8, 1.0],
            [0.5, -0.5, 0.0],
            [0.4, -0.2, 0.9],
        ],
        [
            [0.4, -0.7, 0.8],
            [0.3, -0.2, 1.0],
            [0.3, 0.2, 0.3],
        ],
        [
            [0.1, -0.2, 0.3],
            [0.1, -0.2, 0.3],
            [0.1, -0.2, 0.9],
        ],
    ],
    np.float32,
)
layer = network.add_convolution_nd(in1, num_filter, kernel_shape=(3, 3), kernel=trt.Weights(w))
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-3.0, -2.0, -1.0, -2.0, -1.0],
                [10.0, -25.0, 0.0, -2.0, -1.0],
                [1.0, 2.0, -2.0, -2.0, -1.0],
                [10.0, -25.0, 0.0, -2.0, -1.0],
                [-3.0, -2.0, -1.0, -2.0, -1.0],
            ]
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[15.4, -14.9, 0.0], [31.5, -19.3, 0.1], [12.5, -14.7, 0.1]],
            [[7.5, -11.6, -1.7], [17.4, -20.7, -1.3], [3.8, -10.3, -1.8]],
            [[3.7, -4.9, -0.6], [11.1, -7.4, -0.5], [4.3, -4.9, -0.6]],
        ]
    ]
)
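
As a sanity check, the top-left element of the first output map is the dot product of the first 3x3 filter with the top-left 3x3 input patch: \(0.3\cdot(-3) + (-0.8)\cdot(-2) + 1.0\cdot(-1) + 0.5\cdot 10 + (-0.5)\cdot(-25) + 0.0\cdot 0 + 0.4\cdot 1 + (-0.2)\cdot 2 + 0.9\cdot(-2) = 15.4\), matching the expected output above.
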
Convolution 3D
input_shape = [1, 1, 3, 3, 3]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
num_filter = 1
w = np.array(
    [
        [
            [0.3],
        ]
    ],
    np.float32,
)
layer = network.add_convolution_nd(in1, num_filter, kernel_shape=(1, 1, 1), kernel=trt.Weights(w))
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [
                    [0.3, -0.8, 1.0],
                    [0.5, -0.5, 0.0],
                    [0.4, -0.2, 0.9],
                ],
                [
                    [0.4, -0.7, 0.8],
                    [0.3, -0.2, 1.0],
                    [0.3, 0.2, 0.3],
                ],
                [
                    [0.1, -0.2, 0.3],
                    [0.1, -0.2, 0.3],
                    [0.1, -0.2, 0.9],
                ],
            ]
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [
                [[0.09, -0.24000001, 0.3], [0.15, -0.15, 0.0], [0.12, -0.06, 0.27]],
                [[0.12, -0.21000001, 0.24000001], [0.09, -0.06, 0.3], [0.09, 0.06, 0.09]],
                [[0.03, -0.06, 0.09], [0.03, -0.06, 0.09], [0.03, -0.06, 0.27]],
            ]
        ]
    ]
)
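
Because the kernel here is a single \(1\times1\times1\) weight with value 0.3, the expected output is simply the input scaled by 0.3; small deviations such as -0.24000001 are float32 rounding.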

C++ API

For more information about the C++ IConvolutionLayer operator, refer to the C++ IConvolutionLayer documentation.

Python API

For more information about the Python IConvolutionLayer operator, refer to the Python IConvolutionLayer documentation.