Layers¶
PaddingMode¶

tensorrt.PaddingMode¶
Enumerates types of padding available in convolution, deconvolution and pooling layers.
Padding mode takes precedence if both padding_mode and pre_padding are set.
EXPLICIT* corresponds to explicit padding.
SAME* implicitly calculates padding such that the output dimensions are the same as the input dimensions. For convolution and pooling, output dimensions are determined by ceil(input dimensions / stride).
CAFFE* corresponds to symmetric padding.
Members:
SAME_UPPER : Use SAME padding, with pre_padding <= post_padding
SAME_LOWER : Use SAME padding, with pre_padding >= post_padding
EXPLICIT_ROUND_DOWN : Use explicit padding, rounding the output size down
EXPLICIT_ROUND_UP : Use explicit padding, rounding the output size up
CAFFE_ROUND_UP : Use CAFFE padding, rounding the output size up
CAFFE_ROUND_DOWN : Use CAFFE padding, rounding the output size down
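The SAME_UPPER and SAME_LOWER members can be illustrated with a small calculation for one spatial dimension. This is a minimal sketch, assuming the common convention that the total padding is whatever is needed to reach an output size of ceil(input / stride) and that any odd element goes to the end (SAME_UPPER) or the start (SAME_LOWER). The function name same_padding is hypothetical and is not part of the tensorrt module.

```python
import math


def same_padding(input_size, kernel_size, stride=1, dilation=1, upper=True):
    """Return (pre_padding, post_padding) for one spatial dimension.

    Hypothetical helper, not a TensorRT API.
    """
    # Output size under SAME padding: ceil(input / stride).
    output_size = math.ceil(input_size / stride)
    # Extent covered by the kernel once dilation is taken into account.
    effective_kernel = (kernel_size - 1) * dilation + 1
    # Total padding needed so that output_size windows fit over the input.
    total = max((output_size - 1) * stride + effective_kernel - input_size, 0)
    small, large = total // 2, total - total // 2
    # SAME_UPPER: pre <= post. SAME_LOWER: pre >= post.
    return (small, large) if upper else (large, small)
```

For an even total the two modes agree; they differ only when the total padding is odd.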
IConvolutionLayer¶

class tensorrt.IConvolutionLayer¶
A convolution layer in an INetworkDefinition.
This layer performs a correlation operation between a 3-dimensional filter and a 4-dimensional tensor to produce another 4-dimensional tensor.
An optional bias argument is supported, which adds a per-channel constant to each value in the output.
Variables:
 kernel_size – DimsHW: The HW kernel size of the convolution.
 num_output_maps – int: The number of output maps for the convolution.
 stride – DimsHW: The stride of the convolution. Default: (1, 1)
 padding – DimsHW: The padding of the convolution. The input will be zero-padded by this number of elements in the height and width directions. If the padding is asymmetric, this value corresponds to the pre-padding. Default: (0, 0)
 pre_padding – DimsHW: The pre-padding. The start of the input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
 post_padding – DimsHW: The post-padding. The end of the input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
 padding_mode – PaddingMode: The padding mode. Padding mode takes precedence if both IConvolutionLayer.padding_mode and either IConvolutionLayer.pre_padding or IConvolutionLayer.post_padding are set.
 num_groups – int: The number of groups for a convolution. The input tensor channels are divided into this many groups, and a convolution is executed for each group, using a filter per group. The results of the group convolutions are concatenated to form the output. Note: When using groups in int8 mode, the size of the groups (i.e. the channel count divided by the group count) must be a multiple of 4 for both input and output. Default: 1
 kernel – Weights: The kernel weights for the convolution. The weights are specified as a contiguous array in GKCRS order, where G is the number of groups, K the number of output feature maps, C the number of input channels, and R and S are the height and width of the filter.
 bias – Weights: The bias weights for the convolution. Bias is optional. To omit bias, set this to an empty Weights object. The bias is applied per-channel, so the number of weights (if nonzero) must be equal to the number of output feature maps.
 dilation – DimsHW: The dilation for a convolution. Default: (1, 1)
 kernel_size_nd – Dims: The multi-dimension kernel size of the convolution.
 stride_nd – Dims: The multi-dimension stride of the convolution. Default: (1, …, 1)
 padding_nd – Dims: The multi-dimension padding of the convolution. The input will be zero-padded by this number of elements in each dimension. If the padding is asymmetric, this value corresponds to the pre-padding. Default: (0, …, 0)
 dilation_nd – Dims: The multi-dimension dilation for the convolution. Default: (1, …, 1)
IFullyConnectedLayer¶

class tensorrt.IFullyConnectedLayer¶
A fully connected layer in an INetworkDefinition.
This layer expects an input tensor of three or more non-batch dimensions. The input is automatically reshaped into an MxV tensor X, where V is the product of the last three dimensions and M is the product of the remaining dimensions (where the product over 0 dimensions is defined as 1). For example:
 If the input tensor has shape {C, H, W}, then the tensor is reshaped into {1, C*H*W} .
 If the input tensor has shape {P, C, H, W}, then the tensor is reshaped into {P, C*H*W} .
The layer then performs:
\(Y := matmul(X, W^T) + bias\)
where X is the MxV tensor defined above, W is the KxV weight tensor of the layer, and bias is a row vector of size K that is broadcast to MxK. K is the number of output channels, configurable via IFullyConnectedLayer.num_output_channels. If bias is not specified, it is implicitly 0.
The MxK result Y is then reshaped such that the last three dimensions are {K, 1, 1} and the remaining dimensions match the dimensions of the input tensor. For example:
 If the input tensor has shape {C, H, W}, then the output tensor will have shape {K, 1, 1} .
 If the input tensor has shape {P, C, H, W}, then the output tensor will have shape {P, K, 1, 1} .
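The reshaping rules above can be checked with a few lines of shape arithmetic. This is only a sketch of the bookkeeping described in the text; the function name fully_connected_shapes is hypothetical and no TensorRT call is involved.

```python
from math import prod


def fully_connected_shapes(input_shape, num_output_channels):
    """Return ((M, V), output_shape) for a non-batch input shape.

    Hypothetical helper, not a TensorRT API.
    """
    if len(input_shape) < 3:
        raise ValueError("expected at least three non-batch dimensions")
    leading, last_three = tuple(input_shape[:-3]), tuple(input_shape[-3:])
    m = prod(leading)      # product over zero dimensions is 1
    v = prod(last_three)   # V = product of the last three dimensions
    # The last three dimensions of the output become {K, 1, 1}.
    output_shape = leading + (num_output_channels, 1, 1)
    return (m, v), output_shape
```

With K = 10, an input of shape (3, 4, 5) gives X of shape (1, 60) and an output of shape (10, 1, 1), matching the {C, H, W} example above.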
Variables:
IActivationLayer¶

tensorrt.ActivationType¶
The type of activation to perform.
Members:
RELU : Rectified Linear activation
HARD_SIGMOID : Hard sigmoid activation: f(x) = max(0, min(1, alpha * x + beta))
THRESHOLDED_RELU : Thresholded Relu activation: f(x) = x if x > alpha, f(x) = 0 if x <= alpha
TANH : Hyperbolic Tangent activation
LEAKY_RELU : Leaky Relu activation: f(x) = x if x >= 0, f(x) = alpha * x if x < 0
SCALED_TANH : Scaled Tanh activation: f(x) = alpha * tanh(beta * x)
CLIP : Clip activation: f(x) = max(alpha, min(beta, x))
SOFTPLUS : Softplus activation: f(x) = alpha * log(exp(beta * x) + 1)
SIGMOID : Sigmoid activation
SELU : Selu activation: f(x) = beta * x if x > 0, f(x) = beta * (alpha * exp(x) - alpha) if x <= 0
ELU : Elu activation: f(x) = x if x >= 0, f(x) = alpha * (exp(x) - 1) if x < 0
SOFTSIGN : Softsign activation: f(x) = x / (1 + abs(x))
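The parametric formulas in the member list translate directly into scalar reference functions. The sketch below is plain Python written from the formulas as listed, not TensorRT code; the function names are chosen here for illustration.

```python
import math


def hard_sigmoid(x, alpha, beta):
    return max(0.0, min(1.0, alpha * x + beta))


def thresholded_relu(x, alpha):
    return x if x > alpha else 0.0


def leaky_relu(x, alpha):
    return x if x >= 0 else alpha * x


def scaled_tanh(x, alpha, beta):
    return alpha * math.tanh(beta * x)


def clip(x, alpha, beta):
    # alpha is the lower bound, beta the upper bound.
    return max(alpha, min(beta, x))


def softplus(x, alpha, beta):
    return alpha * math.log(math.exp(beta * x) + 1)


def selu(x, alpha, beta):
    return beta * x if x > 0 else beta * (alpha * math.exp(x) - alpha)


def elu(x, alpha):
    return x if x >= 0 else alpha * (math.exp(x) - 1)


def softsign(x):
    return x / (1 + abs(x))
```

Functions like these can serve as a reference when comparing layer output against expected values for a chosen alpha and beta.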

class tensorrt.IActivationLayer¶
An Activation layer in an INetworkDefinition. This layer applies a per-element activation function to its input. The output has the same shape as the input.
Variables:
 type – ActivationType: The type of activation to be performed.
 alpha – float: The alpha parameter that is used by some parametric activations (LEAKY_RELU, ELU, SELU, SOFTPLUS, CLIP, HARD_SIGMOID, SCALED_TANH). Other activations ignore this parameter.
 beta – float: The beta parameter that is used by some parametric activations (SELU, SOFTPLUS, CLIP, HARD_SIGMOID, SCALED_TANH). Other activations ignore this parameter.
IPoolingLayer¶

tensorrt.PoolingType¶
The type of pooling to perform in a pooling layer.
Members:
AVERAGE : Average over elements. If the tensor is padded, the count includes the padding
MAX : Maximum over elements
MAX_AVERAGE_BLEND : Blending between the max pooling and average pooling: (1-blendFactor)*maxPool + blendFactor*avgPool

class tensorrt.IPoolingLayer¶
A Pooling layer in an INetworkDefinition. The layer applies a reduction operation within a window over the input.
Variables:
 type – PoolingType: The type of pooling to be performed.
 window_size – DimsHW: The window size for pooling.
 stride – DimsHW: The stride for pooling. Default: (1, 1)
 padding – DimsHW: The padding for pooling. Default: (0, 0)
 pre_padding – DimsHW: The pre-padding. The start of the input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
 post_padding – DimsHW: The post-padding. The end of the input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
 padding_mode – PaddingMode: The padding mode. Padding mode takes precedence if both IPoolingLayer.padding_mode and either IPoolingLayer.pre_padding or IPoolingLayer.post_padding are set.
 blend_factor – float: The blending factor for the max_average_blend mode: \(max_average_blendPool = (1-blendFactor)*maxPool + blendFactor*avgPool\). blend_factor is a user value in [0,1] with the default value of 0.0. This value only applies for the PoolingType.MAX_AVERAGE_BLEND mode.
 average_count_excludes_padding – bool: Whether average pooling uses as a denominator the overlap area between the window and the unpadded input. If this is not set, the denominator is the overlap between the pooling window and the padded input. Default: True
 window_size_nd – Dims: The multi-dimension window size for pooling.
 stride_nd – Dims: The multi-dimension stride for pooling. Default: (1, …, 1)
 padding_nd – Dims: The multi-dimension padding for pooling. Default: (0, …, 0)
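The effect of average_count_excludes_padding and blend_factor is easiest to see in one dimension. The following is a rough sketch, assuming symmetric zero padding that is smaller than the window; avg_pool_1d and blend are hypothetical helper names, not TensorRT functions.

```python
def avg_pool_1d(values, window, stride, padding, exclude_padding):
    """Average pooling over a zero-padded 1-D input (assumes padding < window)."""
    padded = [0.0] * padding + [float(v) for v in values] + [0.0] * padding
    first_real, last_real = padding, padding + len(values)
    out = []
    for start in range(0, len(padded) - window + 1, stride):
        end = start + window
        # Number of window positions that overlap the unpadded input.
        overlap = max(0, min(end, last_real) - max(start, first_real))
        denominator = overlap if exclude_padding else window
        out.append(sum(padded[start:end]) / denominator)
    return out


def blend(max_pool, avg_pool, blend_factor):
    """MAX_AVERAGE_BLEND formula: (1-blendFactor)*maxPool + blendFactor*avgPool."""
    return (1 - blend_factor) * max_pool + blend_factor * avg_pool
```

For the input [1, 2, 3, 4] with window 2, stride 2 and padding 1, the edge windows average to 0.5 and 2.0 when padding is counted, and to 1.0 and 4.0 when it is excluded.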
ILRNLayer¶

class tensorrt.ILRNLayer¶
An LRN layer in an INetworkDefinition. The output size is the same as the input size.
Variables:
 window_size – int: The LRN window size. The window size must be odd and in the range of [1, 15].
 alpha – float: The LRN alpha value. The valid range is [-1e20, 1e20].
 beta – float: The LRN beta value. The valid range is [0.01, 1e5f].
 k – float: The LRN K value. The valid range is [1e-5, 1e10].
IScaleLayer¶

tensorrt.ScaleMode¶
Controls how scale is applied in a Scale layer.
Members:
ELEMENTWISE : Elementwise coefficients.
UNIFORM : Identical coefficients across all elements of the tensor.
CHANNEL : Per-channel coefficients. The channel dimension is assumed to be the third to last dimension.

class tensorrt.IScaleLayer¶
A Scale layer in an INetworkDefinition.
This layer applies a per-element computation to its input:
\(output = (input * scale + shift) ^ power\)
The coefficients can be applied on a per-tensor, per-channel, or per-element basis.
Note If the number of weights is 0, then a default value is used for shift, power, and scale. The default shift is 0, the default power is 1, and the default scale is 1.
The output size is the same as the input size.
Note The input tensor for this layer is required to have a minimum of 3 dimensions.
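The scale computation can be sketched in plain Python for the UNIFORM and CHANNEL modes. This is an illustrative reference only, with hypothetical function names; the defaults mirror the ones stated above (scale 1, shift 0, power 1), and each channel is represented as a flat list of its elements for brevity.

```python
def scale_uniform(values, scale=1.0, shift=0.0, power=1.0):
    """UNIFORM mode: the same coefficients for every element."""
    return [(v * scale + shift) ** power for v in values]


def scale_channel(channels, scales, shifts, powers):
    """CHANNEL mode: channels[c] holds the elements of channel c."""
    return [
        [(v * s + b) ** p for v in plane]
        for plane, s, b, p in zip(channels, scales, shifts, powers)
    ]
```

With the default coefficients the input passes through unchanged, which matches the behaviour described for the case where no weights are supplied.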
Variables:
ISoftMaxLayer¶

class tensorrt.ISoftMaxLayer¶
A Softmax layer in an INetworkDefinition.
This layer applies a per-channel softmax to its input.
The output size is the same as the input size.
Variables: axes – int
The axes along which softmax is computed. Currently, only one axis can be set. The axis is specified by setting the bit corresponding to the axis, after excluding the batch dimension, to 1. Let’s say we have an NCHW tensor as input (three non-batch dimensions). Bit 0 corresponds to the C dimension boolean. Bit 1 corresponds to the H dimension boolean. Bit 2 corresponds to the W dimension boolean. For example, to perform softmax on axis R of a NPQRCHW input, set bit 2. By default, softmax is performed on the axis which is the number of non-batch axes minus three. It is 0 if there are fewer than 3 non-batch axes. For example, if the input is NCHW, the default axis is C. If the input is NHW, then the default axis is H.
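The bitmask encoding and the default-axis rule described above amount to two one-line computations. The helper names below are hypothetical and exist only to illustrate the arithmetic.

```python
def softmax_axes_bitmask(axis):
    """Bitmask with only the bit for the given non-batch axis set."""
    return 1 << axis


def default_softmax_axis(num_non_batch_dims):
    """Number of non-batch axes minus three, or 0 if there are fewer than 3."""
    return max(0, num_non_batch_dims - 3)
```

For an NPQRCHW input the non-batch axes are P, Q, R, C, H, W, so axis R is index 2 and the mask is 4, while the default axis is index 3 (C).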
IConcatenationLayer¶

class tensorrt.IConcatenationLayer¶
A concatenation layer in an INetworkDefinition.
The output channel size is the sum of the channel sizes of the inputs. The other output sizes are the same as the other input sizes, which must all match.
Variables: axis – int
The axis along which concatenation occurs. 0 is the major axis (excluding the batch dimension). The default is the number of non-batch axes in the tensor minus three (e.g. for an NCHW input it would be 0), or 0 if there are fewer than 3 non-batch axes.
IDeconvolutionLayer¶

class tensorrt.IDeconvolutionLayer¶
A deconvolution layer in an INetworkDefinition.
Variables:
 kernel_size – DimsHW: The HW kernel size of the deconvolution.
 num_output_maps – int: The number of output feature maps for the deconvolution.
 stride – DimsHW: The stride of the deconvolution. Default: (1, 1)
 padding – DimsHW: The padding of the deconvolution. The input will be zero-padded by this number of elements in the height and width directions. Padding is symmetric. Default: (0, 0)
 pre_padding – DimsHW: The pre-padding. The start of the input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
 post_padding – DimsHW: The post-padding. The end of the input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
 padding_mode – PaddingMode: The padding mode. Padding mode takes precedence if both IDeconvolutionLayer.padding_mode and either IDeconvolutionLayer.pre_padding or IDeconvolutionLayer.post_padding are set.
 num_groups – int: The number of groups for a deconvolution. The input tensor channels are divided into this many groups, and a deconvolution is executed for each group, using a filter per group. The results of the group convolutions are concatenated to form the output. Note: When using groups in int8 mode, the size of the groups (i.e. the channel count divided by the group count) must be a multiple of 4 for both input and output. Default: 1
 kernel – Weights: The kernel weights for the deconvolution. The weights are specified as a contiguous array in CKRS order, where C is the number of input channels, K the number of output feature maps, and R and S are the height and width of the filter.
 bias – Weights: The bias weights for the deconvolution. Bias is optional. To omit bias, set this to an empty Weights object. The bias is applied per-feature-map, so the number of weights (if nonzero) must be equal to the number of output feature maps.
 kernel_size_nd – Dims: The multi-dimension kernel size of the deconvolution.
 stride_nd – Dims: The multi-dimension stride of the deconvolution. Default: (1, …, 1)
 padding_nd – Dims: The multi-dimension padding of the deconvolution. The input will be zero-padded by this number of elements in each dimension. Padding is symmetric. Default: (0, …, 0)
IElementWiseLayer¶

tensorrt.ElementWiseOperation¶
The binary operations that may be performed by an ElementWise layer.
Members:
EQUAL : Check if two elements are equal
DIV : Divide the first element by the second
SUB : Subtract the second element from the first
POW : The first element to the power of the second element
LESS : Check if element in first tensor is less than corresponding element in second tensor
OR : Logical OR of two elements
MIN : Min of the two elements
FLOOR_DIV : Floor division of the first element by the second
GREATER : Check if element in first tensor is greater than corresponding element in second tensor
XOR : Logical XOR of two elements
MAX : Max of the two elements
AND : Logical AND of two elements
PROD : Product of the two elements
SUM : Sum of the two elements

class tensorrt.IElementWiseLayer¶
An elementwise layer in an INetworkDefinition.
This layer applies a per-element binary operation between corresponding elements of two tensors.
The input dimensions of the two input tensors must be equal, and the output tensor is the same size as each input.
Variables: op – ElementWiseOperation
The binary operation for the layer.
IGatherLayer¶

class tensorrt.IGatherLayer¶
A gather layer in an INetworkDefinition.
Variables:
 axis – int: The non-batch dimension axis to gather on. The axis must be less than the number of non-batch dimensions in the data input.
 num_elementwise_dims – int: The number of leading dimensions of the indices tensor to be handled elementwise. Must be 0 if there is an implicit batch dimension. It can be 0 or 1 if there is not an implicit batch dimension.
RNN Layers¶

tensorrt.RNNOperation¶
The RNN operations that may be performed by an RNN layer.
Equation definitions
In the equations below, we use the following naming convention:
t := current time step
i := input gate
o := output gate
f := forget gate
z := update gate
r := reset gate
c := cell gate
h := hidden gate
g[t] denotes the output of gate g at timestep t, e.g. f[t] is the output of the forget gate f.
X[t] := input tensor for timestep t
C[t] := cell state for timestep t
H[t] := hidden state for timestep t
W[g] := W (input) parameter weight matrix for gate g
R[g] := U (recurrent) parameter weight matrix for gate g
Wb[g] := W (input) parameter bias vector for gate g
Rb[g] := U (recurrent) parameter bias vector for gate g
Unless otherwise specified, all operations apply pointwise to elements of each operand tensor.
ReLU(X) := max(X, 0)
tanh(X) := hyperbolic tangent of X
sigmoid(X) := 1 / (1 + exp(-X))
exp(X) := e^X
A.B denotes matrix multiplication of A and B.
A*B denotes pointwise multiplication of A and B.
Equations
Depending on the value of RNNOperation chosen, each sub-layer of the RNN layer will perform one of the following operations:
RELU
\(H[t] := ReLU(W[i].X[t] + R[i].H[t-1] + Wb[i] + Rb[i])\)
TANH
\(H[t] := tanh(W[i].X[t] + R[i].H[t-1] + Wb[i] + Rb[i])\)
LSTM
\(i[t] := sigmoid(W[i].X[t] + R[i].H[t-1] + Wb[i] + Rb[i])\)
\(f[t] := sigmoid(W[f].X[t] + R[f].H[t-1] + Wb[f] + Rb[f])\)
\(o[t] := sigmoid(W[o].X[t] + R[o].H[t-1] + Wb[o] + Rb[o])\)
\(c[t] := tanh(W[c].X[t] + R[c].H[t-1] + Wb[c] + Rb[c])\)
\(C[t] := f[t]*C[t-1] + i[t]*c[t]\)
\(H[t] := o[t]*tanh(C[t])\)
GRU
\(z[t] := sigmoid(W[z].X[t] + R[z].H[t-1] + Wb[z] + Rb[z])\)
\(r[t] := sigmoid(W[r].X[t] + R[r].H[t-1] + Wb[r] + Rb[r])\)
\(h[t] := tanh(W[h].X[t] + r[t]*(R[h].H[t-1] + Rb[h]) + Wb[h])\)
\(H[t] := (1 - z[t])*h[t] + z[t]*H[t-1]\)
Members:
TANH : Single-gate RNN w/ TANH activation
LSTM : Four-gate LSTM network w/o peephole connections
RELU : Single-gate RNN w/ ReLU activation
GRU : Three-gate network consisting of Gated Recurrent Units
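The LSTM equations can be written out as a single time step in plain Python. This is a minimal reference sketch of the equations as given, using nested lists for matrices and dictionaries keyed by the gate letters i, f, o, c. The names lstm_step, gate and matvec are hypothetical and are not part of the tensorrt module.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """A.B for a row-major matrix and a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def gate(activation, W, R, Wb, Rb, x, h_prev):
    """activation(W.X[t] + R.H[t-1] + Wb + Rb), applied pointwise."""
    wx, rh = matvec(W, x), matvec(R, h_prev)
    return [activation(a + b + c + d) for a, b, c, d in zip(wx, rh, Wb, Rb)]


def lstm_step(x, h_prev, c_prev, W, R, Wb, Rb):
    """One LSTM time step; W, R, Wb, Rb are dicts keyed by 'i', 'f', 'o', 'c'."""
    i = gate(sigmoid, W["i"], R["i"], Wb["i"], Rb["i"], x, h_prev)
    f = gate(sigmoid, W["f"], R["f"], Wb["f"], Rb["f"], x, h_prev)
    o = gate(sigmoid, W["o"], R["o"], Wb["o"], Rb["o"], x, h_prev)
    c = gate(math.tanh, W["c"], R["c"], Wb["c"], Rb["c"], x, h_prev)
    # C[t] := f[t]*C[t-1] + i[t]*c[t]
    cell = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c)]
    # H[t] := o[t]*tanh(C[t])
    hidden = [ot * math.tanh(ct) for ot, ct in zip(o, cell)]
    return hidden, cell
```

With all weights and biases set to zero, every sigmoid gate evaluates to 0.5 and the cell gate to 0, so the cell state is simply halved at each step.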

tensorrt.RNNDirection¶
The RNN direction that may be performed by an RNN layer.
Members:
BIDIRECTION : Network iterates from first to last (and vice versa) and outputs concatenated
UNIDIRECTION : Network iterates from first input to last input

tensorrt.RNNInputMode¶
The RNN input modes that may occur with an RNN layer.
If the RNN is configured with RNNInputMode.LINEAR, then for each gate g in the first layer of the RNN, the input vector X[t] (length E) is left-multiplied by the gate’s corresponding weight matrix W[g] (dimensions HxE) as usual, before being used to compute the gate output as described by RNNOperation.
If the RNN is configured with RNNInputMode.SKIP, then this initial matrix multiplication is “skipped” and W[g] is conceptually an identity matrix. In this case, the input vector X[t] must have length H (the size of the hidden state).
Members:
LINEAR : Perform the normal matrix multiplication in the first recurrent layer
SKIP : No operation is performed on the first recurrent layer
IRNNLayer¶
Deprecated since version 4.0.

class tensorrt.IRNNLayer¶
An RNN layer in an INetworkDefinition.
This layer applies an RNN operation on the inputs.
Deprecated: This interface is superseded by IRNNv2Layer.
Variables:
 num_layers – int: The number of layers in the RNN.
 hidden_size – int: The size of the hidden layers.
 max_seq_length – int: The sequence length. This is the maximum number of input tensors that the RNN can process at once.
 op – RNNOperation: The operation of the RNN layer.
 input_mode – RNNInputMode: The input mode of the RNN layer.
 direction – RNNDirection: The direction of the RNN layer. The direction determines if the RNN is run as unidirectional (left to right) or bidirectional (left to right and right to left). In the RNNDirection.BIDIRECTION case the output is concatenated together, resulting in an output size of 2 x hidden_size.
 weights – Weights: The weight parameters for the RNN. For more information, see IRNNLayer::setWeights().
 bias – Weights: The bias parameter vector for the RNN layer. For more information, see IRNNLayer::setBias().
 data_length – int: The length of the data being processed by the RNN for use in computing other values.
 hidden_state – ITensor: The initial hidden state of the RNN with the provided hidden ITensor. The layout for hidden is a linear layout of a 3D matrix: C - The number of layers in the RNN, it must match num_layers. H - The number of mini-batches for each time sequence. W - The size of the per-layer hidden states, it must match hidden_size. The amount of space required is doubled if direction is RNNDirection.BIDIRECTION, with the bidirectional states coming after the unidirectional states. If not specified, then the initial hidden state is set to zero.
 cell_state – ITensor: The initial cell state of the RNN with the provided cell ITensor. The layout for cell is a linear layout of a 3D matrix: C - The number of layers in the RNN, it must match num_layers. H - The number of mini-batches for each time sequence. W - The size of the per-layer hidden states, it must match hidden_size. The amount of space required is doubled if direction is RNNDirection.BIDIRECTION, with the bidirectional states coming after the unidirectional states. If not specified, then the initial cell state is set to zero. The cell state only affects LSTM RNNs.
IRNNv2Layer¶

tensorrt.RNNGateType¶
Identifies an individual gate within an RNN cell.
Members:
INPUT : Input Gate
CELL : Cell Gate
FORGET : Forget Gate
UPDATE : Update Gate
RESET : Reset Gate
HIDDEN : Hidden Gate
OUTPUT : Output Gate

class tensorrt.IRNNv2Layer¶
An RNN layer in an INetworkDefinition, version 2.
Variables:
 num_layers – int: The layer count of the RNN.
 hidden_size – int: The hidden size of the RNN.
 max_seq_length – int: The maximum sequence length of the RNN.
 data_length – int: The length of the data being processed by the RNN for use in computing other values.
 seq_lengths – ITensor: Individual sequence lengths in the batch with the ITensor provided. The seq_lengths ITensor should be a {N1, …, Np} tensor, where N1..Np are the index dimensions of the input tensor to the RNN. If seq_lengths is not specified, then the RNN layer assumes all sequences are size max_seq_length. All sequence lengths in seq_lengths should be in the range [1, max_seq_length]. Zero-length sequences are not supported. This tensor must be of type int32.
 op – RNNOperation: The operation of the RNN layer.
 input_mode – int: The input mode of the RNN layer.
 direction – int: The direction of the RNN layer.
 hidden_state – ITensor: The initial hidden state of the RNN with the provided hidden_state ITensor. The hidden_state ITensor should have the dimensions {N1, …, Np, L, H}, where: N1..Np are the index dimensions specified by the input tensor; L is the number of layers in the RNN, equal to num_layers; H is the hidden state for each layer, equal to hidden_size if direction is RNNDirection.UNIDIRECTION, and 2 x hidden_size otherwise.
 cell_state – ITensor: The initial cell state of the LSTM with the provided cell_state ITensor. The cell_state ITensor should have the dimensions {N1, …, Np, L, H}, where: N1..Np are the index dimensions specified by the input tensor; L is the number of layers in the RNN, equal to num_layers; H is the hidden state for each layer, equal to hidden_size if direction is RNNDirection.UNIDIRECTION, and 2 x hidden_size otherwise. It is an error to set this on an RNN layer that is not configured with RNNOperation.LSTM.

get_bias_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool) → array¶
Get the bias parameters for an individual gate in the RNN.
Parameters:  layer_index – The index of the layer that contains this gate.
 gate – The name of the gate within the RNN layer.
 is_w – True if the bias parameters are for the input bias Wb[g] and false if they are for the recurrent input bias Rb[g].
Returns: The bias parameters.

get_weights_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool) → array¶
Get the weight parameters for an individual gate in the RNN.
Parameters:  layer_index – The index of the layer that contains this gate.
 gate – The name of the gate within the RNN layer.
 is_w – True if the weight parameters are for the input matrix W[g] and false if they are for the recurrent input matrix R[g].
Returns: The weight parameters.

set_bias_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool, bias: tensorrt.tensorrt.Weights) → None¶
Set the bias parameters for an individual gate in the RNN.
Parameters:  layer_index – The index of the layer that contains this gate. Refer to IRNNLayer.weights for a description of the layer index.
 gate – The name of the gate within the RNN layer. The gate name must correspond to one of the gates used by this layer’s RNNOperation.
 is_w – True if the bias parameters are for the input bias Wb[g] and false if they are for the recurrent input bias Rb[g]. See RNNOperation for equations showing how these bias vectors are used in the RNN gate.
 bias – The weight structure holding the bias parameters, which should be an array of size hidden_size.

set_weights_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool, weights: tensorrt.tensorrt.Weights) → None¶
Set the weight parameters for an individual gate in the RNN.
Parameters:  layer_index – The index of the layer that contains this gate. Refer to IRNNLayer.weights for a description of the layer index.
 gate – The name of the gate within the RNN layer. The gate name must correspond to one of the gates used by this layer’s RNNOperation.
 is_w – True if the weight parameters are for the input matrix W[g] and false if they are for the recurrent input matrix R[g]. See RNNOperation for equations showing how these matrices are used in the RNN gate.
 weights – The weight structure holding the weight parameters, which are stored as a row-major 2D matrix. Refer to IRNNLayer.weights for documentation on the expected dimensions of this matrix.
IPluginLayer¶

class tensorrt.IPluginLayer¶
A plugin layer in an INetworkDefinition.
Variables: plugin – IPlugin
The plugin for the layer.
IPluginV2Layer¶

class tensorrt.IPluginV2Layer¶
A plugin layer in an INetworkDefinition.
Variables: plugin – IPluginV2
The plugin for the layer.
IUnaryLayer¶

tensorrt.UnaryOperation¶
The unary operations that may be performed by a Unary layer.
Members:
ABS : Absolute value
SINH : Hyperbolic sine
SQRT : Square root
ERF : Gauss error function
RECIP : Reciprocal
COSH : Hyperbolic cosine
SIN : Sine
ACOSH : Inverse hyperbolic cosine
FLOOR : Floor
ASIN : Inverse sine
NOT : Not
ATAN : Inverse tangent
CEIL : Ceiling
COS : Cosine
EXP : Exponentiation
ASINH : Inverse hyperbolic sine
ACOS : Inverse cosine
TAN : Tangent
ATANH : Inverse hyperbolic tangent
NEG : Negation
LOG : Log (base e)

class tensorrt.IUnaryLayer¶
A unary layer in an INetworkDefinition.
Variables: op – UnaryOperation
The unary operation for the layer.
IReduceLayer¶

tensorrt.ReduceOperation¶
The reduce operations that may be performed by a Reduce layer.
Members:
PROD :
AVG :
MAX :
MIN :
SUM :

class tensorrt.IReduceLayer¶
A reduce layer in an INetworkDefinition.
Variables:
 op – ReduceOperation: The reduce operation for the layer.
 axes – int: The axes over which to reduce.
 keep_dims – bool: Specifies whether or not to keep the reduced dimensions for the layer.
IPaddingLayer¶

class tensorrt.IPaddingLayer¶
A padding layer in an INetworkDefinition.
Variables:
 pre_padding – DimsHW: The padding that is applied at the start of the tensor. Negative padding results in trimming the edge by the specified amount.
 post_padding – DimsHW: The padding that is applied at the end of the tensor. Negative padding results in trimming the edge by the specified amount.
 pre_padding_nd – Dims: The padding that is applied at the start of the tensor. Negative padding results in trimming the edge by the specified amount. Only 2 dimensions currently supported.
 post_padding_nd – Dims: The padding that is applied at the end of the tensor. Negative padding results in trimming the edge by the specified amount. Only 2 dimensions currently supported.
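The rule that negative padding trims the edge can be shown on a one-dimensional sequence. This is a simplified sketch: the real layer pads two spatial dimensions, the fill value of zero is an assumption made here, and pad_1d is a hypothetical helper name.

```python
def pad_1d(values, pre_padding, post_padding, fill=0):
    """Pad (positive amounts) or trim (negative amounts) each end of a sequence."""
    out = list(values)
    if pre_padding >= 0:
        out = [fill] * pre_padding + out
    else:
        out = out[-pre_padding:]   # trim |pre_padding| elements from the start
    if post_padding >= 0:
        out = out + [fill] * post_padding
    else:
        out = out[:post_padding]   # trim |post_padding| elements from the end
    return out
```

Positive and negative amounts can be mixed, for example trimming one element at the start while appending one at the end.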
IParametricReLULayer¶

class tensorrt.IParametricReLULayer¶
A parametric ReLU layer in an INetworkDefinition.
This layer applies a parametric ReLU activation to an input tensor (first input), with slopes taken from a slopes tensor (second input). This can be viewed as a leaky ReLU operation where the negative slope differs from element to element (and can in fact be learned).
The slopes tensor must be unidirectional broadcastable to the input tensor: the rank of the two tensors must be the same, and all dimensions of the slopes tensor must either equal the input tensor or be 1. The output tensor has the same shape as the input tensor.
ISelectLayer¶

class tensorrt.ISelectLayer¶
A select layer in an INetworkDefinition.
This layer implements an elementwise ternary conditional operation. Wherever condition is True, elements are taken from the first input, and wherever condition is False, elements are taken from the second input.
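On flat sequences, the ternary selection described above reduces to a one-line comprehension. The function below is a plain-Python illustration with a hypothetical name, not the layer's implementation.

```python
def select(condition, then_values, else_values):
    """Take then_values[i] where condition[i] is True, else else_values[i]."""
    return [t if c else e for c, t, e in zip(condition, then_values, else_values)]
```

For example, the condition [True, False, True] applied to [1, 2, 3] and [10, 20, 30] picks the first, the second, and the first input in turn.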
IShuffleLayer¶

class tensorrt.Permutation(*args, **kwargs)¶
The elements of the permutation. The permutation is applied as outputDimensionIndex = permutation[inputDimensionIndex], so to permute from CHW order to HWC order, the required permutation is [1, 2, 0], and to permute from HWC to CHW, the required permutation is [2, 0, 1].
It supports iteration and indexing and is implicitly convertible to/from Python iterables (like tuple or list). Therefore, you can use those classes in place of Permutation.
Overloaded function.
 __init__(self: tensorrt.tensorrt.Permutation) -> None
 __init__(self: tensorrt.tensorrt.Permutation, arg0: List[int]) -> None

class tensorrt.IShuffleLayer¶
A shuffle layer in an INetworkDefinition.
This class shuffles data by applying in sequence: a transpose operation, a reshape operation and a second transpose operation. The dimension types of the output are those of the reshape dimension.
Variables:
 first_transpose – Permutation: The permutation applied by the first transpose operation. Default: Identity Permutation
 reshape_dims – Dims: The reshaped dimensions. Two special values can be used as dimensions. Value 0 copies the corresponding dimension from input. This special value can be used more than once in the dimensions. If the number of reshape dimensions is less than input, 0s are resolved by aligning the most significant dimensions of input. Value -1 infers that particular dimension by looking at input and the rest of the reshape dimensions. Note that only a maximum of one dimension is permitted to be specified as -1. The product of the new dimensions must be equal to the product of the old.
 second_transpose – Permutation: The permutation applied by the second transpose operation. Default: Identity Permutation
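The permutation examples and the special reshape values can be checked with shape-only arithmetic. The sketch below follows the CHW/HWC examples given for Permutation (output dimension i takes input dimension permutation[i]) and the 0 and -1 rules for reshape_dims. Both helper names are hypothetical, and nonzero dimensions are assumed.

```python
from math import prod


def permute_dims(dims, permutation):
    """Output dimension i takes input dimension permutation[i]."""
    return tuple(dims[p] for p in permutation)


def resolve_reshape(input_dims, reshape_dims):
    """Resolve the special values 0 (copy) and -1 (infer) in reshape_dims."""
    # 0 copies the input dimension at the same, most-significant-aligned position.
    dims = [input_dims[i] if d == 0 else d for i, d in enumerate(reshape_dims)]
    if dims.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    if -1 in dims:
        known = prod(d for d in dims if d != -1)
        dims[dims.index(-1)] = prod(input_dims) // known
    if prod(dims) != prod(input_dims):
        raise ValueError("reshape must preserve the number of elements")
    return tuple(dims)
```

A CHW shape of (3, 4, 5) with permutation [1, 2, 0] becomes the HWC shape (4, 5, 3), and reshape dimensions (0, -1) applied to (2, 3, 4) resolve to (2, 12).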
ISliceLayer¶

class tensorrt.ISliceLayer¶
A slice layer in an INetworkDefinition.
Variables:
set_input(self: tensorrt.tensorrt.ISliceLayer, index: int, tensor: tensorrt.tensorrt.ITensor) → None¶
Sets the input tensor for the given index. The index must be 0 for a static slice layer. A static slice layer is converted to a dynamic slice layer by calling setInput with an index > 0. A dynamic slice layer cannot be converted back to a static slice layer.
For a dynamic slice layer, the values 0-3 are valid. If an index > 0 is specified, all values between index 0 and that index must be dynamic tensors. The values larger than index can use static dimensions. For example, if an index of two is specified, the stride tensor can be set via setStride, but the start tensor must be specified via setInput as both size and start are converted to dynamic tensors. The indices in the dynamic case are as follows:
Index  Description
0  Data or Shape tensor to be sliced.
1  The start tensor to begin slicing, N-dimensional for Data, and 1-D for Shape.
2  The size tensor of the resulting slice, N-dimensional for Data, and 1-D for Shape.
3  The stride of the slicing operation, N-dimensional for Data, and 1-D for Shape.
If this function is called with a value greater than 0, then the function getNbInputs() changes from returning 1 to index + 1. When converting from static to dynamic slice layer, all unset tensors, between 1 and index + 1, are initialized to nullptr. It is an error to attempt to build a network that has any nullptr inputs.
Parameters:  index – The index of the input tensor.
 tensor – The input tensor.
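The bookkeeping described for set_input can be modelled with a small stand-in class. This is a toy model under stated assumptions, not the TensorRT class: `SliceInputs` and its methods are hypothetical, strings stand in for tensors, and `None` plays the role of nullptr.

```python
class SliceInputs:
    """Toy model of the slice layer's input slots (not a TensorRT class)."""

    def __init__(self, data):
        # A static slice layer starts with the data tensor only.
        self.inputs = [data]

    def set_input(self, index, tensor):
        if not 0 <= index <= 3:
            raise IndexError("valid indices are 0-3")
        # Setting an index > 0 grows the list to index + 1 slots;
        # slots that were never set stay None.
        while len(self.inputs) <= index:
            self.inputs.append(None)
        self.inputs[index] = tensor

    @property
    def num_inputs(self):
        return len(self.inputs)

    def is_dynamic(self):
        return len(self.inputs) > 1

    def buildable(self):
        # A network with any unset input cannot be built.
        return all(t is not None for t in self.inputs)

layer = SliceInputs("data")
layer.set_input(2, "size")
print(layer.num_inputs, layer.buildable())
```

In this model, setting index 2 leaves slot 1 (start) unset, so the layer is not buildable until start is also supplied.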

IShapeLayer¶

class
tensorrt.
IShapeLayer
¶ A shape layer in an
INetworkDefinition
. Used for getting the shape of a tensor. This class sets the output to a one-dimensional tensor with the dimensions of the input tensor. For example, if the input is a four-dimensional tensor (of any type) with dimensions [2,3,5,7], the output tensor is a one-dimensional Int32 tensor of length 4 containing the sequence 2, 3, 5, 7.
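The same idea can be shown on nested Python lists. `shape_of` is a hypothetical helper that assumes the nested list is rectangular; it only illustrates what the output of a shape layer contains.

```python
def shape_of(nested):
    """Return the dimensions of a rectangular nested list (illustrative helper)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        # Descend into the first element; stop at an empty list.
        nested = nested[0] if nested else None
    return dims

# A four-dimensional "tensor" with dimensions [2, 3, 5, 7].
tensor = [[[[0] * 7 for _ in range(5)] for _ in range(3)] for _ in range(2)]
print(shape_of(tensor))
```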
ITopKLayer¶

tensorrt.
TopKOperation
¶ The operations that may be performed by a TopK layer
Members:
MAX : Maximum of the elements
MIN : Minimum of the elements

class
tensorrt.
ITopKLayer
¶ A TopK layer in an
INetworkDefinition
.Variables:  op –
TopKOperation
The operation for the layer.  k –
int
The k value for the layer. Currently only values up to 25 are supported.  axes –
int
The axes along which to reduce.
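As a rough illustration of the two operations, the sketch below selects the k largest or smallest elements of a one-dimensional list and returns both values and indices. `top_k` is a hypothetical helper built on the standard `heapq` module; it handles a single axis only and is not how TensorRT implements the layer.

```python
import heapq

def top_k(values, k, op="MAX"):
    """Return (values, indices) of the k extreme elements (illustrative helper)."""
    if op not in ("MAX", "MIN"):
        raise ValueError("op must be 'MAX' or 'MIN'")
    pick = heapq.nlargest if op == "MAX" else heapq.nsmallest
    # Pair each value with its index so the indices survive the selection.
    chosen = pick(k, enumerate(values), key=lambda pair: pair[1])
    return [v for _, v in chosen], [i for i, _ in chosen]

data = [3, 1, 4, 5, 9, 2, 6]
print(top_k(data, 3, "MAX"))
print(top_k(data, 2, "MIN"))
```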
IMatrixMultiplyLayer¶

tensorrt.
MatrixOperation
¶ The matrix operations that may be performed by a Matrix layer
Members:
NONE :
VECTOR : Treat operand as collection of vectors
TRANSPOSE : Transpose each matrix

class
tensorrt.
IMatrixMultiplyLayer
¶ A matrix multiply layer in an
INetworkDefinition
. Let A be op(getInput(0)) and B be op(getInput(1)) where op(x) denotes the corresponding MatrixOperation.
When A and B are matrices or vectors, computes the inner product A * B:
matrix * matrix -> matrix
matrix * vector -> vector
vector * matrix -> vector
vector * vector -> scalar
Inputs of higher rank are treated as collections of matrices or vectors. The output will be a corresponding collection of matrices, vectors, or scalars.
Variables:  op0 –
MatrixOperation
How to treat the first input.  op1 –
MatrixOperation
How to treat the second input.
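The four cases can be sketched with nested lists. `matrix_multiply` is a hypothetical helper, and the operation names are passed as plain strings rather than MatrixOperation members. It assumes a VECTOR operand is a flat list, treated as a row when it is the first input and as a column when it is the second; inputs of higher rank are not covered.

```python
def _transpose(m):
    return [list(col) for col in zip(*m)]

def _as_matrix(x, op, first):
    """Turn an operand into a 2-D list according to its operation."""
    if op == "VECTOR":
        return [list(x)] if first else [[v] for v in x]
    if op == "TRANSPOSE":
        return _transpose(x)
    if op == "NONE":
        return [list(row) for row in x]
    raise ValueError("unknown operation: " + str(op))

def matrix_multiply(x0, op0, x1, op1):
    a = _as_matrix(x0, op0, first=True)
    b = _as_matrix(x1, op1, first=False)
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    cols = _transpose(b)
    out = [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]
    # Drop the dimensions that were added for vector operands.
    if op0 == "VECTOR" and op1 == "VECTOR":
        return out[0][0]
    if op0 == "VECTOR":
        return out[0]
    if op1 == "VECTOR":
        return [row[0] for row in out]
    return out

m = [[1, 2], [3, 4]]
print(matrix_multiply(m, "NONE", [[5, 6], [7, 8]], "NONE"))
```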
IRaggedSoftMaxLayer¶

class
tensorrt.
IRaggedSoftMaxLayer
¶ A ragged softmax layer in an
INetworkDefinition
.This layer takes a ZxS input tensor and an additional Zx1 bounds tensor holding the lengths of the Z sequences.
This layer computes a softmax across each of the Z sequences.
The output tensor is of the same size as the input tensor.
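A minimal sketch of the computation, in plain Python. `ragged_softmax` is a hypothetical helper; it takes the bounds as a flat list of lengths, and it assumes that positions beyond each sequence length are filled with zeros, which this section does not state.

```python
import math

def ragged_softmax(rows, lengths):
    """Softmax over the first lengths[z] entries of each row (illustrative)."""
    out = []
    for row, n in zip(rows, lengths):
        valid = row[:n]
        # Subtract the maximum for numerical stability.
        peak = max(valid) if valid else 0.0
        exps = [math.exp(v - peak) for v in valid]
        total = sum(exps)
        # Assumption: entries past the sequence length are set to 0.0.
        out.append([e / total for e in exps] + [0.0] * (len(row) - n))
    return out

result = ragged_softmax([[0.0, 0.0, 5.0], [1.0, 2.0, 3.0]], [2, 3])
print(result[0])
```

The output keeps the Z x S layout of the input, matching the statement that the output tensor has the same size as the input tensor.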
IIdentityLayer¶

class
tensorrt.
IIdentityLayer
¶ A layer that represents the identity function.
If tensor precision is explicitly specified, it can be used to transform from one precision to another.
IConstantLayer¶

class
tensorrt.
IConstantLayer
¶ A constant layer in an
INetworkDefinition
.Variables:
IResizeLayer¶

tensorrt.
ResizeMode
¶ Various modes of resize in the resize layer.
Members:
NEAREST : 1D, 2D, and 3D nearest neighbor resizing.
LINEAR : Can handle linear, bilinear, trilinear resizing.

class
tensorrt.
IResizeLayer
¶ A resize layer in an
INetworkDefinition
. Resize layer can be used for resizing an N-D tensor.
Resize layer currently supports the following configurations:
 ResizeMode.NEAREST: resizes innermost m dimensions of N-D, where 0 < m <= min(3, N) and N > 0.
 ResizeMode.LINEAR: resizes innermost m dimensions of N-D, where 0 < m <= min(3, N) and N > 0.
Default resize mode is ResizeMode.NEAREST.
Resize layer provides two ways to resize tensor dimensions:
 Set output dimensions directly. It can be done for static as well as dynamic resize layer. Static resize layer requires output dimensions to be known at build time. Dynamic resize layer requires output dimensions to be set as one of the input tensors.
 Set scales for resize. Each output dimension is calculated as floor(input dimension * scale). Only static resize layer allows setting scales where the scales are known at build time.
Variables:  shape –
Dims
The output dimensions. Must have the same number of dimensions as the input.  scales –
List[float]
List of resize scales.  resize_mode –
ResizeMode
Resize mode can be Linear or Nearest.  align_corners –
bool
If True, the centers of the 4 corner pixels of both input and output tensors are aligned. Default: False.
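The scale rule, floor(input dimension * scale), can be checked with a few lines of Python. `resized_dims` is a hypothetical helper and assumes one scale per input dimension.

```python
import math

def resized_dims(input_dims, scales):
    """Apply floor(input dimension * scale) to every dimension (illustrative)."""
    if len(scales) != len(input_dims):
        raise ValueError("one scale per input dimension is required")
    return [math.floor(d * s) for d, s in zip(input_dims, scales)]

print(resized_dims([1, 3, 10, 10], [1.0, 1.0, 2.0, 1.5]))
```

Because of the floor, a 5-element dimension scaled by 0.5 yields 2 elements, not 3.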

set_input
(self: tensorrt.tensorrt.IResizeLayer, index: int, tensor: tensorrt.tensorrt.ITensor) → None¶ Sets the input tensor for the given index.
If index == 1 and num_inputs == 1, and there is no implicit batch dimension, the input is appended and num_inputs changes to 2. Once this additional input is set, the resize layer works in dynamic mode: the output dimensions are taken from the second input tensor, overriding the dimensions supplied by shape.
Parameters:  index – The index of the input tensor.
 tensor – The input tensor.
ILoop¶

class
tensorrt.
ILoop
¶ Helper for creating a recurrent subgraph.
Variables: name – The name of the loop. The name is used in error diagnostics. 
add_iterator
(self: tensorrt.tensorrt.ILoop, tensor: tensorrt.tensorrt.ITensor, axis: int = 0, reverse: bool = False) → tensorrt.tensorrt.IIteratorLayer¶ Return layer that subscripts tensor by loop iteration.
For reverse=false, this is equivalent to add_gather(tensor, I, 0) where I is a scalar tensor containing the loop iteration number. For reverse=true, this is equivalent to add_gather(tensor, M-1-I, 0) where M is the trip count computed from TripLimits of kind COUNT.
Parameters:  tensor – The tensor to iterate over.
 axis – The axis along which to iterate.
 reverse – Whether to iterate in the reverse direction.
Returns: The IIteratorLayer, or None if it could not be created.
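The indexing rule can be illustrated with a generator over a Python list. `iterate` is a hypothetical helper that covers axis 0 only: iteration I reads element I in the forward direction and element M-1-I in the reverse direction, where M is the trip count.

```python
def iterate(tensor, trip_count, reverse=False):
    """Yield the slice read at each loop iteration (axis 0 only, illustrative)."""
    for i in range(trip_count):
        # Forward: index I. Reverse: index M-1-I, with M the trip count.
        yield tensor[trip_count - 1 - i] if reverse else tensor[i]

print(list(iterate([10, 20, 30], 3)))
print(list(iterate([10, 20, 30], 3, reverse=True)))
```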

add_loop_output
(self: tensorrt.tensorrt.ILoop, tensor: tensorrt.tensorrt.ITensor, kind: tensorrt.tensorrt.LoopOutput, axis: int = 0) → tensorrt.tensorrt.ILoopOutputLayer¶ Make an output for this loop, based on the given tensor.
If kind is CONCATENATE or REVERSE, a second input specifying the concatenation dimension must be added via method ILoopOutputLayer.set_input().
Parameters:  kind – The kind of loop output. See LoopOutput.
 axis – The axis for concatenation (if using kind of CONCATENATE or REVERSE).
Returns: The added ILoopOutputLayer, or None if it could not be created.

add_recurrence
(self: tensorrt.tensorrt.ILoop, initial_value: tensorrt.tensorrt.ITensor) → tensorrt.tensorrt.IRecurrenceLayer¶ Create a recurrence layer for this loop with initial_value as its first input.
Parameters: initial_value – The initial value of the recurrence layer.
Returns: The added IRecurrenceLayer, or None if it could not be created.

add_trip_limit
(self: tensorrt.tensorrt.ILoop, tensor: tensorrt.tensorrt.ITensor, kind: tensorrt.tensorrt.TripLimit) → tensorrt.tensorrt.ITripLimitLayer¶ Add a trip-count limiter, based on the given tensor.
There may be at most one COUNT and one WHILE limiter for a loop. When both trip limits exist, the loop exits when the count is reached or condition is falsified. It is an error to not add at least one trip limiter.
For WHILE, the input tensor must be the output of a subgraph that contains only layers that are not ITripLimitLayer, IIteratorLayer or ILoopOutputLayer. Any IRecurrenceLayers in the subgraph must belong to the same loop as the ITripLimitLayer. A trivial example of this rule is that the input to the WHILE is the output of an IRecurrenceLayer for the same loop.
Parameters:  tensor – The input tensor. Must be available before the loop starts.
 kind – The kind of trip limit. See TripLimit.
Returns: The added ITripLimitLayer, or None if it could not be created.
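How the two limiters combine can be sketched as an ordinary Python loop. `run_loop` is a hypothetical helper, not TensorRT code: `step` stands in for the loop body feeding a recurrence, and the sketch assumes the WHILE condition is evaluated on the current recurrence value before each iteration.

```python
def run_loop(initial, step, count=None, while_cond=None):
    """Run step() until a trip limit stops the loop (illustrative model)."""
    if count is None and while_cond is None:
        raise ValueError("at least one trip limiter is required")
    value, trips = initial, 0
    while True:
        # COUNT limiter: stop once the trip count is reached.
        if count is not None and trips >= count:
            break
        # WHILE limiter: stop once the condition is false.
        if while_cond is not None and not while_cond(value):
            break
        value = step(value)
        trips += 1
    return value, trips

print(run_loop(1, lambda v: v * 2, count=10))
print(run_loop(1, lambda v: v * 2, count=10, while_cond=lambda v: v < 100))
```

With both limiters present, the loop in the second call stops after 7 trips because the condition fails before the count of 10 is reached.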

ILoopBoundaryLayer¶
ITripLimitLayer¶

tensorrt.
TripLimit
¶ Describes kinds of trip limits.
Members:
COUNT : Tensor is a scalar of type kINT32 that contains the trip count.
WHILE : Tensor is a scalar of type BOOL. Loop terminates when value is false.
IRecurrenceLayer¶

class
tensorrt.
IRecurrenceLayer
¶ 
set_input
(self: tensorrt.tensorrt.IRecurrenceLayer, index: int, tensor: tensorrt.tensorrt.ITensor) → None¶ Set the first or second input. If index==1 and the number of inputs is one, the input is appended. The first input specifies the initial output value, and must come from outside the loop. The second input specifies the next output value, and must come from inside the loop. The two inputs must have the same dimensions.
Parameters:  index – The index of the input to set.
 tensor – The input tensor.

IIteratorLayer¶

class
tensorrt.
IIteratorLayer
¶ Variables:  axis – The axis to iterate over.
 reverse – For reverse=false, the layer is equivalent to add_gather(tensor, I, 0) where I is a scalar tensor containing the loop iteration number. For reverse=true, the layer is equivalent to add_gather(tensor, M-1-I, 0) where M is the trip count computed from TripLimits of kind COUNT. The default is reverse=false.
ILoopOutputLayer¶

tensorrt.
LoopOutput
¶ Describes kinds of loop outputs.
Members:
LAST_VALUE : Output value is value of tensor for last iteration.
CONCATENATE : Output value is concatenation of values of tensor for each iteration, in forward order.
REVERSE : Output value is concatenation of values of tensor for each iteration, in reverse order.

class
tensorrt.
ILoopOutputLayer
¶ An ILoopOutputLayer is the sole way to get output from a loop.
The first input tensor must be defined inside the loop; the output tensor is outside the loop. The second input tensor, if present, must be defined outside the loop.
If kind is LAST_VALUE, a single input must be provided.
If kind is CONCATENATE or REVERSE, a second input must be provided. The second input must be a scalar “shape tensor”, defined before the loop commences, that specifies the concatenation length of the output.
The output tensor has j more dimensions than the input tensor, where
 j == 0 if kind is LAST_VALUE
 j == 1 if kind is CONCATENATE or REVERSE.
Variables:  axis – The concatenation axis. Ignored if kind is LAST_VALUE. For example, if the input tensor has dimensions [b,c,d], and kind is CONCATENATE, the output has four dimensions. Let a be the value of the second input. axis=0 causes the output to have dimensions [a,b,c,d]. axis=1 causes the output to have dimensions [b,a,c,d]. axis=2 causes the output to have dimensions [b,c,a,d]. axis=3 causes the output to have dimensions [b,c,d,a]. The default axis is 0.
 kind – The kind of loop output. See LoopOutput.

set_input
(self: tensorrt.tensorrt.ILoopOutputLayer, index: int, tensor: tensorrt.tensorrt.ITensor) → None¶ Like ILayer.set_input(), but additionally works if index==1 and num_inputs==1, in which case num_inputs changes to 2.
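The effect of kind and axis on the output dimensions can be sketched as follows. `loop_output_dims` is a hypothetical helper; kinds are passed as plain strings, and `length` stands in for the value of the second input.

```python
def loop_output_dims(input_dims, kind, length=None, axis=0):
    """Compute the loop output dimensions for a given kind (illustrative)."""
    if kind == "LAST_VALUE":
        # j == 0: same dimensions as the input; axis is ignored.
        return list(input_dims)
    if kind not in ("CONCATENATE", "REVERSE"):
        raise ValueError("unknown kind: " + str(kind))
    if length is None:
        raise ValueError("a concatenation length (second input) is required")
    if not 0 <= axis <= len(input_dims):
        raise ValueError("axis out of range")
    # j == 1: insert the concatenation length at the requested axis.
    out = list(input_dims)
    out.insert(axis, length)
    return out

print(loop_output_dims([3, 4, 5], "CONCATENATE", length=7, axis=2))
```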
IFillLayer¶

tensorrt.
FillOperation
¶ The tensor fill operations that may be performed by a Fill layer.
Members:
LINSPACE : Generate evenly spaced numbers over a specified interval
RANDOM_UNIFORM : Generate a tensor with random values drawn from a uniform distribution

class
tensorrt.
IFillLayer
¶ A fill layer in an
INetworkDefinition
.
set_input
(self: tensorrt.tensorrt.IFillLayer, index: int, tensor: tensorrt.tensorrt.ITensor) → None¶ Replace an input of this layer with a specific tensor.
Index  Description for kLINSPACE
0  Shape tensor, represents the output tensor’s dimensions.
1  Start, a scalar, represents the start value.
2  Delta, a 1D tensor whose length equals the shape tensor’s nbDims, represents the delta value for each dimension.
Index  Description for kRANDOM_UNIFORM
0  Shape tensor, represents the output tensor’s dimensions.
1  Minimum, a scalar, represents the minimum random value.
2  Maximum, a scalar, represents the maximum random value.
Parameters:  index – the index of the input to modify.
 tensor – the input tensor.
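The two fill operations can be approximated in plain Python. Both helpers are hypothetical and return a flat, row-major list. The LINSPACE sketch assumes the value at index (i0, i1, ...) is start plus the sum of i_k * delta[k], which is an interpretation of the per-dimension delta and is not stated in this section.

```python
import itertools
import random
from math import prod

def linspace_fill(shape, start, delta):
    """Row-major values: start + sum(index[k] * delta[k]) (assumed formula)."""
    if len(delta) != len(shape):
        raise ValueError("delta needs one entry per output dimension")
    indices = itertools.product(*(range(n) for n in shape))
    return [start + sum(i * d for i, d in zip(idx, delta)) for idx in indices]

def random_uniform_fill(shape, minimum, maximum, seed=None):
    """Row-major values drawn uniformly between minimum and maximum."""
    rng = random.Random(seed)
    return [rng.uniform(minimum, maximum) for _ in range(prod(shape))]

print(linspace_fill([2, 3], 0, [10, 1]))
```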
