# Deconvolution

Computes a 2D or 3D deconvolution of an input tensor into an output tensor.

Note

This layer is also known as ConvTranspose.

## Attributes

kernel_size An array of 2 or 3 elements describing the size of the deconvolution kernel in each spatial dimension. The size of the array (2 or 3) determines whether the deconvolution is 2D or 3D.

padding_mode The padding mode. The padding mode can be one of the following:

$\begin{split}I = \text{dimensions of input image.} \\ B = \text{pre-padding, before the image data. For deconvolution, pre-padding is set before the output.} \\ A = \text{post-padding, after the image data. For deconvolution, post-padding is set after the output.} \\ P = \text{delta between input and output} \\ S = \text{stride} \\ F = \text{filter} \\ O = \text{output} \\ D = \text{dilation} \\ M = I + B + A = \text{the data plus any padding} \\ DK = 1 + D \cdot (F - 1)\end{split}$
• EXPLICIT_ROUND_DOWN Use explicit padding, rounding the output size down.
$$O = \lfloor\frac{M - DK}{S}\rfloor + 1$$
• EXPLICIT_ROUND_UP Use explicit padding, rounding the output size up.
$$O = \lceil\frac{M - DK}{S}\rceil + 1$$
• SAME_UPPER Use SAME padding, with $$\text{pre-padding} \leq \text{post-padding}$$.
$$\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + DK - I \\ B = \lfloor\frac{P}{2}\rfloor \\ A = P - B \end{gather}$$
• SAME_LOWER Use SAME padding, with $$\text{pre-padding} \geq \text{post-padding}$$.
$$\begin{gather}O = \lceil\frac{I}{S}\rceil \\ P = \lfloor\frac{I-1}{S}\rfloor \cdot S + DK - I \\ A = \lfloor\frac{P}{2}\rfloor \\ B = P - A \end{gather}$$
• CAFFE_ROUND_DOWN Use CAFFE padding, rounding the output size down. It uses the pre-padding value.
$$O = \lfloor\frac{I + 2B - DK}{S}\rfloor + 1$$
• CAFFE_ROUND_UP Use CAFFE padding, rounding the output size up. It uses the pre-padding value.
$$O = \lceil\frac{I + 2B - DK}{S}\rceil + 1$$
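The SAME_UPPER relations above can be checked with a short script. The helper below is a hypothetical illustration (it is not part of the TensorRT API) that applies the SAME_UPPER formulas, using the dilated kernel size DK = 1 + D · (F − 1) from the definitions:

```python
import math

def same_upper_padding(I, S, F, D=1):
    # Hypothetical helper, not a TensorRT API: evaluates the SAME_UPPER
    # formulas above for one spatial dimension.
    DK = 1 + D * (F - 1)                # dilated kernel size
    O = math.ceil(I / S)                # output size
    P = ((I - 1) // S) * S + DK - I     # total padding delta
    B = P // 2                          # pre-padding (smaller half)
    A = P - B                           # post-padding
    return O, B, A

# A 5-wide dimension with stride 2 and a 3-wide kernel needs 1 pixel of
# padding on each side to produce ceil(5 / 2) = 3 outputs.
print(same_upper_padding(5, 2, 3))  # (3, 1, 1)
```

SAME_LOWER only swaps which side receives the larger half when P is odd (B = P − A instead of A = P − B).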

pre_padding The amount of pre-padding to use for each spatial dimension.

post_padding The amount of post-padding to use for each spatial dimension.

stride The stride to use for each spatial dimension.

dilation The dilation factor for each spatial dimension.

num_output_maps The number of output feature maps in the layer’s output.

num_groups The number of groups that the deconvolution is split into. When the int8 data type is used, the number of channels per group must be a multiple of 4 for both the input and the output.

kernel_weights Pointer to the layer’s kernel weights.

bias_weights Pointer to the layer’s bias weights.

## Inputs

input: tensor of type T.

## Outputs

output: tensor of type T.

## Data Types

T: int8, float16, float32

## Shape Information

input is a tensor with a shape of $$[a_0,...,a_n]$$

kernel_size is an array $$[k_0,...,k_{m-1}], m \in \{2, 3\}$$

output is a tensor with a shape of $$[b_0,...,b_n]$$, where:

$\begin{split}s_j - \text{stride at spatial dimension j}\\ k_j - \text{kernel at spatial dimension j}\\ d_j - \text{dilation at spatial dimension j}\\ p_j^{pre} - \text{pre padding at spatial dimension j}\\ p_j^{post} - \text{post padding at spatial dimension j}\\\end{split}$
$\begin{split}b_i = \begin{cases} a_i, & 0 \leq i < n-m \\ (a_i−1) \cdot s_j + 1 + d_j \cdot (k_j - 1) − p_j^{pre} - p_j^{post}, & n-m \leq i < n, & j=i-(n-m) \end{cases}\end{split}$
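As a quick sanity check, the shape rule above can be sketched in a few lines of Python. This is a hypothetical helper for illustration, not part of the TensorRT API:

```python
def deconv_output_shape(input_shape, kernel_size, stride=None, dilation=None,
                        pre_padding=None, post_padding=None):
    # Hypothetical helper, not a TensorRT API: applies the deconvolution
    # shape formula. Leading (batch/channel) dimensions pass through.
    m = len(kernel_size)
    n = len(input_shape)
    stride = stride or [1] * m
    dilation = dilation or [1] * m
    pre_padding = pre_padding or [0] * m
    post_padding = post_padding or [0] * m
    shape = list(input_shape[: n - m])
    for j in range(m):
        a = input_shape[n - m + j]
        # b = (a - 1) * s + 1 + d * (k - 1) - pre - post
        shape.append((a - 1) * stride[j] + 1 + dilation[j] * (kernel_size[j] - 1)
                     - pre_padding[j] - post_padding[j])
    return shape

# A (1, 1, 3, 3) input with a 3x3 kernel, stride 1, and no padding
# produces a (1, 1, 5, 5) output, as in the example below.
print(deconv_output_shape((1, 1, 3, 3), (3, 3)))  # [1, 1, 5, 5]
```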

## Examples

Deconvolution

```python
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 1, 3, 3))
layer = network.add_deconvolution_nd(
    in1, num_output_maps=1, kernel_shape=(3, 3), kernel=np.ones(shape=(1, 1, 3, 3), dtype=np.float32)
)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array([[[[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0], [2.0, 5.0, 6.0]]]])

outputs[layer.get_output(0).name] = layer.get_output(0).shape

expected[layer.get_output(0).name] = np.array(
    [
        [
            [
                [-3.0, -5.0, -6.0, -3.0, -1.0],
                [-3.0, -4.0, -3.0, 0.0, 1.0],
                [-1.0, 3.0, 10.0, 11.0, 7.0],
                [2.0, 8.0, 16.0, 14.0, 8.0],
                [2.0, 7.0, 13.0, 11.0, 6.0],
            ]
        ]
    ]
)
```
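With stride 1 and no padding, deconvolution with a single kernel reduces to a "full" correlation, so the expected values above can be reproduced with plain NumPy. The snippet below is an illustrative check, independent of TensorRT:

```python
import numpy as np

x = np.array([[-3.0, -2.0, -1.0], [0.0, 1.0, 2.0], [2.0, 5.0, 6.0]])
k = 3  # 3x3 kernel of ones, as in the example above

# Zero-pad by k - 1 on every side, then slide a k-by-k window over all
# positions; with a kernel of ones, each output pixel is simply the sum
# of the inputs it overlaps.
padded = np.pad(x, k - 1)
out = np.empty((x.shape[0] + k - 1, x.shape[1] + k - 1))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        out[i, j] = padded[i : i + k, j : j + k].sum()

print(out)  # matches the 5x5 expected tensor above
```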