Layers

IConvolutionLayer

class tensorrt.IConvolutionLayer

A convolution layer in an INetworkDefinition .

This layer performs a correlation operation between a 3-dimensional filter and a 4-dimensional tensor to produce another 4-dimensional tensor.

An optional bias argument is supported, which adds a per-channel constant to each value in the output.

Variables:
  • kernel_size (DimsHW) – The HW kernel size of the convolution.
  • num_output_maps (int) – The number of output maps for the convolution.
  • stride (DimsHW) – The stride of the convolution. Default: (1, 1)
  • padding (DimsHW) – The padding of the convolution. The input will be zero-padded by this number of elements in the height and width directions. Padding is symmetric. Default: (0, 0)
  • num_groups (int) – The number of groups for a convolution. The input tensor channels are divided into this many groups, and a convolution is executed for each group, using a filter per group. The results of the group convolutions are concatenated to form the output. Note: When using groups in int8 mode, the size of the groups (i.e. the channel count divided by the group count) must be a multiple of 4 for both input and output. Default: 1
  • kernel (Weights) – The kernel weights for the convolution. The weights are specified as a contiguous array in GKCRS order, where G is the number of groups, K the number of output feature maps, C the number of input channels, and R and S are the height and width of the filter.
  • bias (Weights) – The bias weights for the convolution. Bias is optional. To omit bias, set this to an empty Weights object. The bias is applied per-channel, so the number of weights (if non-zero) must be equal to the number of output feature maps.
  • dilation (DimsHW) – The dilation for a convolution. Default: (1, 1)
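
The int8 note on num_groups can be checked before a network is built. The following is a minimal sketch in plain Python; the helper name int8_group_sizes_ok is hypothetical and not part of the TensorRT API, and it additionally assumes that both channel counts must divide evenly by the group count.

```python
def int8_group_sizes_ok(in_channels, out_channels, num_groups):
    """Return True if both per-group channel counts are multiples of 4."""
    # Assumption: channels must split evenly into groups for the check to apply.
    if in_channels % num_groups or out_channels % num_groups:
        return False
    in_per_group = in_channels // num_groups
    out_per_group = out_channels // num_groups
    return in_per_group % 4 == 0 and out_per_group % 4 == 0


print(int8_group_sizes_ok(32, 64, 8))   # group sizes are 4 and 8
print(int8_group_sizes_ok(32, 64, 16))  # group sizes are 2 and 4
```

The second call fails the check because the input group size of 2 is not a multiple of 4.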

IFullyConnectedLayer

class tensorrt.IFullyConnectedLayer

A fully connected layer in an INetworkDefinition .

This layer expects an input tensor of three or more non-batch dimensions. The input is automatically reshaped into an MxV tensor X, where V is a product of the last three dimensions and M is a product of the remaining dimensions (where the product over 0 dimensions is defined as 1). For example:

  • If the input tensor has shape {C, H, W}, then the tensor is reshaped into {1, C*H*W} .
  • If the input tensor has shape {P, C, H, W}, then the tensor is reshaped into {P, C*H*W} .

The layer then performs:

\(Y := matmul(X, W^T) + bias\)

Where X is the MxV tensor defined above, W is the KxV weight tensor of the layer, and bias is a row vector of size K that is broadcast to MxK. K is the number of output channels and is configurable via IFullyConnectedLayer.num_output_channels. If bias is not specified, it is implicitly 0.

The MxK result Y is then reshaped such that the last three dimensions are {K, 1, 1} and the remaining dimensions match the dimensions of the input tensor. For example:

  • If the input tensor has shape {C, H, W}, then the output tensor will have shape {K, 1, 1} .
  • If the input tensor has shape {P, C, H, W}, then the output tensor will have shape {P, K, 1, 1} .
Variables:
  • num_output_channels (int) – The number of output channels K from the fully connected layer.
  • kernel (Weights) – The kernel weights, given as a KxC matrix in row-major order.
  • bias (Weights) – The bias weights. Bias is optional. To omit bias, set this to an empty Weights object.
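
The reshape and matrix multiply described for this layer can be traced with a small pure-Python sketch. The function fully_connected and its argument layout are illustrative assumptions, not TensorRT calls; the input is passed as a flat row-major list together with its non-batch shape.

```python
from math import prod


def fully_connected(x_flat, in_shape, weights, bias=None):
    """Apply Y = X . W^T + bias to a flattened input.

    x_flat   -- input values in row-major order
    in_shape -- non-batch input shape, e.g. (C, H, W) or (P, C, H, W)
    weights  -- K rows of V values each (the KxV matrix W)
    bias     -- optional list of K values; treated as 0 when omitted
    """
    v = prod(in_shape[-3:])   # product of the last three dimensions
    m = prod(in_shape[:-3])   # the product over zero dimensions is 1
    k = len(weights)
    if bias is None:
        bias = [0.0] * k
    rows = [x_flat[i * v:(i + 1) * v] for i in range(m)]  # X as M rows of V
    y = [[sum(a * b for a, b in zip(row, w)) + bias[j]
          for j, w in enumerate(weights)]
         for row in rows]
    out_shape = tuple(in_shape[:-3]) + (k, 1, 1)
    return y, out_shape


# Input of shape {C, H, W} = (1, 2, 2) with K = 2 output channels.
y, shape = fully_connected([1, 2, 3, 4], (1, 2, 2),
                           [[1, 0, 0, 0], [1, 1, 1, 1]], bias=[0.5, -1])
print(y, shape)
```

For this input the MxK result has a single row, and the reported output shape is {K, 1, 1} as described above.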

IActivationLayer

tensorrt.ActivationType

The type of activation to perform.

Members:

SIGMOID : Sigmoid activation

RELU : Rectified Linear activation

TANH : Hyperbolic Tangent activation

class tensorrt.IActivationLayer

An Activation layer in an INetworkDefinition . This layer applies a per-element activation function to its input. The output has the same shape as the input.

Variables: type (ActivationType) – The type of activation to be performed.

IPoolingLayer

tensorrt.PoolingType

The type of pooling to perform in a pooling layer.

Members:

MAX_AVERAGE_BLEND : Blending between the max pooling and average pooling: (1-blendFactor)*maxPool + blendFactor*avgPool

MAX : Maximum over elements

AVERAGE : Average over elements. If the tensor is padded, the count includes the padding

class tensorrt.IPoolingLayer

A Pooling layer in an INetworkDefinition . The layer applies a reduction operation within a window over the input.

Variables:
  • type (PoolingType) – The type of pooling to be performed.
  • window_size (DimsHW) – The window size for pooling.
  • stride (DimsHW) – The stride for pooling. Default: 1
  • padding (DimsHW) – The padding for pooling. Default: 1
  • blend_factor (float) – The blending factor for the max_average_blend mode: \(max_average_blendPool = (1-blendFactor)*maxPool + blendFactor*avgPool\). blend_factor is a user value in [0,1] with the default value of 0.0. This value only applies for the PoolingType.MAX_AVERAGE_BLEND mode.
  • average_count_excludes_padding (bool) – Whether average pooling uses as a denominator the overlap area between the window and the unpadded input. If this is not set, the denominator is the overlap between the pooling window and the padded input. Default: True
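
The effect of average_count_excludes_padding and blend_factor on a single pooling window can be sketched as follows. Both helper functions are hypothetical. The sketch assumes that padded positions contribute zero to the sum and that the window lies fully inside the padded input, so its overlap with the padded input equals the window size.

```python
def window_average(inside_values, window_size, average_count_excludes_padding=True):
    """Average one pooling window.

    inside_values -- the window elements that overlap the unpadded input
    window_size   -- total number of elements in the window, padding included
    """
    if average_count_excludes_padding:
        denominator = len(inside_values)   # overlap with the unpadded input
    else:
        denominator = window_size          # overlap with the padded input
    return sum(inside_values) / denominator


def max_average_blend(max_pool, avg_pool, blend_factor):
    """Blend max and average pooling results; blend_factor must be in [0, 1]."""
    if not 0.0 <= blend_factor <= 1.0:
        raise ValueError("blend_factor must be in [0, 1]")
    return (1 - blend_factor) * max_pool + blend_factor * avg_pool


# A 2x2 corner window where one row and one column fall in the padding:
# only one of the four window elements overlaps the real input.
corner = [8.0]
print(window_average(corner, 4, True))    # denominator is 1
print(window_average(corner, 4, False))   # denominator is 4
print(max_average_blend(8.0, 2.0, 0.25))
```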

ILRNLayer

class tensorrt.ILRNLayer

An LRN layer in an INetworkDefinition. The output size is the same as the input size.

Variables:
  • window_size (int) – The LRN window size. The window size must be odd and in the range of [1, 15].
  • alpha (float) – The LRN alpha value. The valid range is [-1e20, 1e20].
  • beta (float) – The LRN beta value. The valid range is [0.01, 1e5].
  • k (float) – The LRN K value. The valid range is [1e-5, 1e10].

IScaleLayer

tensorrt.ScaleMode

Controls how scale is applied in a Scale layer.

Members:

CHANNEL : Per-channel coefficients. The channel dimension is assumed to be the third to last dimension.

ELEMENTWISE : Elementwise coefficients.

UNIFORM : Identical coefficients across all elements of the tensor.

class tensorrt.IScaleLayer

A Scale layer in an INetworkDefinition .

This layer applies a per-element computation to its input:

\(output = (input * scale + shift) ^ power\)

The coefficients can be applied on a per-tensor, per-channel, or per-element basis.

Note If the number of weights is 0, then a default value is used for shift, power, and scale. The default shift is 0, the default power is 1, and the default scale is 1.

The output size is the same as the input size.

Note The input tensor for this layer is required to have a minimum of 3 dimensions.
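
The per-element computation and its default coefficients can be sketched for the per-channel case. The function scale_channel is a hypothetical plain-Python illustration in which each channel is a flat list of values; it does not call TensorRT.

```python
def scale_channel(data, scale=None, shift=None, power=None):
    """Per-channel scale: data is a list of channels, each a flat list."""
    n = len(data)
    # Empty weights fall back to the documented defaults.
    scale = scale or [1.0] * n
    shift = shift or [0.0] * n
    power = power or [1.0] * n
    return [[(x * scale[c] + shift[c]) ** power[c] for x in channel]
            for c, channel in enumerate(data)]


data = [[1.0, 2.0], [3.0, 4.0]]   # two channels, two elements each
print(scale_channel(data, scale=[2.0, 1.0], shift=[0.0, 1.0], power=[1.0, 2.0]))
print(scale_channel(data))        # defaults leave the input unchanged
```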

Variables:

ISoftMaxLayer

class tensorrt.ISoftMaxLayer

A Softmax layer in an INetworkDefinition .

This layer applies a per-channel softmax to its input.

The output size is the same as the input size.

Variables: axes (int) – The axes along which softmax is computed. Currently, only one axis can be set. The axis is specified by setting the bit corresponding to the axis, after excluding the batch dimension, to 1. For an NCHW input tensor (three non-batch dimensions), bit 0 corresponds to the C dimension, bit 1 to the H dimension, and bit 2 to the W dimension. For example, to perform softmax on axis R of an NPQRCHW input, set bit 2. By default, softmax is performed on the axis whose index is the number of non-batch axes minus three, or 0 if there are fewer than 3 non-batch axes. For example, if the input is NCHW, the default axis is C. If the input is NHW, the default axis is H.
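
The bit encoding of axes and the default axis rule can be written out in a few lines. The helper names below are hypothetical; the sketch only computes the integer values and does not call TensorRT.

```python
def softmax_axes_mask(axis):
    """Bit mask selecting a single non-batch axis for softmax."""
    return 1 << axis


def default_softmax_axis(num_non_batch_axes):
    """Default axis: non-batch axis count minus three, or 0 if fewer than 3."""
    return max(num_non_batch_axes - 3, 0)


# NCHW input: the non-batch axes are C (0), H (1), W (2).
print(softmax_axes_mask(0))  # mask for C
print(softmax_axes_mask(2))  # mask for W
# NPQRCHW input: R is non-batch axis 2, and the default axis is 6 - 3 = 3 (C).
print(softmax_axes_mask(2), default_softmax_axis(6))
```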

IConcatenationLayer

class tensorrt.IConcatenationLayer

A concatenation layer in an INetworkDefinition .

The output channel size is the sum of the channel sizes of the inputs. The other output sizes are the same as the other input sizes, which must all match.

Variables: axis (int) – The axis along which concatenation occurs. 0 is the major axis (excluding the batch dimension). The default is the number of non-batch axes in the tensor minus three (e.g. for an NCHW input it would be 0), or 0 if there are fewer than 3 non-batch axes.
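
The output shape rule can be sketched as a shape calculation over non-batch dimensions. The function concat_shape is hypothetical and generalizes the statement about channel sizes to an arbitrary axis, which is an assumption about how the axis variable behaves.

```python
def concat_shape(shapes, axis=None):
    """Output shape when concatenating non-batch shapes along `axis`."""
    rank = len(shapes[0])
    if axis is None:
        axis = max(rank - 3, 0)   # documented default
    for shape in shapes:
        if len(shape) != rank:
            raise ValueError("inputs must have the same number of dimensions")
        for d in range(rank):
            if d != axis and shape[d] != shapes[0][d]:
                raise ValueError("non-concatenated dimensions must match")
    out = list(shapes[0])
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)


print(concat_shape([(3, 8, 8), (5, 8, 8)]))          # channel axis by default
print(concat_shape([(3, 8, 8), (3, 8, 2)], axis=2))  # along W
```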

IDeconvolutionLayer

class tensorrt.IDeconvolutionLayer

A deconvolution layer in an INetworkDefinition .

Variables:
  • kernel_size (DimsHW) – The HW kernel size of the deconvolution.
  • num_output_maps (int) – The number of output feature maps for the deconvolution.
  • stride (DimsHW) – The stride of the deconvolution. Default: (1, 1)
  • padding (DimsHW) – The padding of the deconvolution. The input will be zero-padded by this number of elements in the height and width directions. Padding is symmetric. Default: (0, 0)
  • num_groups (int) – The number of groups for a deconvolution. The input tensor channels are divided into this many groups, and a deconvolution is executed for each group, using a filter per group. The results of the group deconvolutions are concatenated to form the output. Note: When using groups in int8 mode, the size of the groups (i.e. the channel count divided by the group count) must be a multiple of 4 for both input and output. Default: 1
  • kernel (Weights) – The kernel weights for the deconvolution. The weights are specified as a contiguous array in CKRS order, where C is the number of input channels, K the number of output feature maps, and R and S are the height and width of the filter.
  • bias (Weights) – The bias weights for the deconvolution. Bias is optional. To omit bias, set this to an empty Weights object. The bias is applied per-feature-map, so the number of weights (if non-zero) must be equal to the number of output feature maps.

IElementWiseLayer

tensorrt.ElementWiseOperation

The binary operations that may be performed by an ElementWise layer.

Members:

SUM : Sum of the two elements

POW : The first element to the power of the second element

MIN : Min of the two elements

SUB : Subtract the second element from the first

DIV : Divide the first element by the second

PROD : Product of the two elements

MAX : Max of the two elements

class tensorrt.IElementWiseLayer

An elementwise layer in an INetworkDefinition.

This layer applies a per-element binary operation between corresponding elements of two tensors.

The input dimensions of the two input tensors must be equal, and the output tensor is the same size as each input.

Variables: op (ElementWiseOperation) – The binary operation for the layer.

IGatherLayer

class tensorrt.IGatherLayer

A gather layer in an INetworkDefinition .

Variables: axis (int) – The non-batch dimension axis to gather on. The axis must be less than the number of non-batch dimensions in the data input.

RNN Layers

tensorrt.RNNOperation

The RNN operations that may be performed by an RNN layer.

Equation definitions

In the equations below, we use the following naming convention:

t := current time step
i := input gate
o := output gate
f := forget gate
z := update gate
r := reset gate
c := cell gate
h := hidden gate
g[t] denotes the output of gate g at timestep t, e.g. f[t] is the output of the forget gate f at timestep t.
X[t] := input tensor for timestep t
C[t] := cell state for timestep t
H[t] := hidden state for timestep t
W[g] := W (input) parameter weight matrix for gate g
R[g] := U (recurrent) parameter weight matrix for gate g
Wb[g] := W (input) parameter bias vector for gate g
Rb[g] := U (recurrent) parameter bias vector for gate g

Unless otherwise specified, all operations apply pointwise to elements of each operand tensor.

ReLU(X) := max(X, 0)
tanh(X) := hyperbolic tangent of X
sigmoid(X) := 1 / (1 + exp(-X))
exp(X) := e^X
A.B denotes matrix multiplication of A and B .
A*B denotes pointwise multiplication of A and B .

Equations

Depending on the value of RNNOperation chosen, each sub-layer of the RNN layer will perform one of the following operations:

RELU

\(H[t] := ReLU(W[i].X[t] + R[i].H[t-1] + Wb[i] + Rb[i])\)

TANH

\(H[t] := tanh(W[i].X[t] + R[i].H[t-1] + Wb[i] + Rb[i])\)

LSTM

\(i[t] := sigmoid(W[i].X[t] + R[i].H[t-1] + Wb[i] + Rb[i])\)
\(f[t] := sigmoid(W[f].X[t] + R[f].H[t-1] + Wb[f] + Rb[f])\)
\(o[t] := sigmoid(W[o].X[t] + R[o].H[t-1] + Wb[o] + Rb[o])\)
\(c[t] := tanh(W[c].X[t] + R[c].H[t-1] + Wb[c] + Rb[c])\)
\(C[t] := f[t]*C[t-1] + i[t]*c[t]\)
\(H[t] := o[t]*tanh(C[t])\)

GRU

\(z[t] := sigmoid(W[z].X[t] + R[z].H[t-1] + Wb[z] + Rb[z])\)
\(r[t] := sigmoid(W[r].X[t] + R[r].H[t-1] + Wb[r] + Rb[r])\)
\(h[t] := tanh(W[h].X[t] + r[t]*(R[h].H[t-1] + Rb[h]) + Wb[h])\)
\(H[t] := (1 - z[t])*h[t] + z[t]*H[t-1]\)

Members:

RELU : Single gate RNN w/ ReLU activation

LSTM : Four-gate LSTM network w/o peephole connections

TANH : Single gate RNN w/ TANH activation

GRU : Three-gate network consisting of Gated Recurrent Units
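
The LSTM equations above can be traced for a single timestep with plain Python lists. This is a minimal sketch: lstm_step, gate, and matvec are hypothetical helper names, and params is an assumed dictionary mapping each gate letter to its (W, R, Wb, Rb) parameters. It illustrates the equations only and is not how TensorRT executes the layer.

```python
from math import exp, tanh


def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))


def matvec(m, v):
    """Matrix-vector product, the `.` operator in the equations."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gate(act, W, R, Wb, Rb, x, h_prev):
    """act(W.X[t] + R.H[t-1] + Wb + Rb), applied elementwise."""
    wx, rh = matvec(W, x), matvec(R, h_prev)
    return [act(a + b + c + d) for a, b, c, d in zip(wx, rh, Wb, Rb)]


def lstm_step(params, x, h_prev, c_prev):
    """One LSTM timestep; params maps gate name -> (W, R, Wb, Rb)."""
    i = gate(sigmoid, *params["i"], x, h_prev)
    f = gate(sigmoid, *params["f"], x, h_prev)
    o = gate(sigmoid, *params["o"], x, h_prev)
    c = gate(tanh, *params["c"], x, h_prev)
    # C[t] := f[t]*C[t-1] + i[t]*c[t], then H[t] := o[t]*tanh(C[t])
    c_new = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c)]
    h_new = [ot * tanh(cn) for ot, cn in zip(o, c_new)]
    return h_new, c_new


# Hidden size 1, input size 1, all weights and biases set to zero.
zero = ([[0.0]], [[0.0]], [0.0], [0.0])
params = {name: zero for name in "ifoc"}
h, c = lstm_step(params, x=[1.0], h_prev=[0.0], c_prev=[2.0])
print(h, c)
```

With all parameters at zero, the three sigmoid gates output 0.5 and the cell gate outputs 0, so the new cell state is half of the previous one.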

tensorrt.RNNDirection

The RNN direction that may be performed by an RNN layer.

Members:

UNIDIRECTION : Network iterates from first input to last input

BIDIRECTION : Network iterates from first to last (and vice versa) and the outputs are concatenated

tensorrt.RNNInputMode

The RNN input modes that may occur with an RNN layer.

If the RNN is configured with RNNInputMode.LINEAR , then for each gate g in the first layer of the RNN, the input vector X[t] (length E) is left-multiplied by the gate’s corresponding weight matrix W[g] (dimensions HxE) as usual, before being used to compute the gate output as described by RNNOperation .

If the RNN is configured with RNNInputMode.SKIP , then this initial matrix multiplication is “skipped” and W[g] is conceptually an identity matrix. In this case, the input vector X[t] must have length H (the size of the hidden state).

Members:

LINEAR : Perform the normal matrix multiplication in the first recurrent layer

SKIP : No operation is performed on the first recurrent layer

IRNNLayer

Deprecated since version 4.0.

class tensorrt.IRNNLayer

An RNN layer in an INetworkDefinition .

This layer applies an RNN operation on the inputs.

Deprecated This interface is superseded by IRNNv2Layer.

Variables:
  • num_layers (int) – The number of layers in the RNN.
  • hidden_size (int) – The size of the hidden layers.
  • max_seq_length (int) – The sequence length. This is the maximum number of input tensors that the RNN can process at once.
  • op (RNNOperation) – The operation of the RNN layer.
  • input_mode (RNNInputMode) – The input mode of the RNN layer.
  • direction (RNNDirection) – The direction of the RNN layer. The direction determines whether the RNN is run as unidirectional (left to right) or bidirectional (left to right and right to left). In the RNNDirection.BIDIRECTION case the outputs are concatenated, resulting in an output size of 2x hidden_size.
  • weights (Weights) – The weight parameters for the RNN. For more information, see IRNNLayer::setWeights().
  • bias (Weights) – The bias parameter vector for the RNN layer. For more information, see IRNNLayer::setBias().
  • data_length (int) – The length of the data being processed by the RNN for use in computing other values.
  • hidden_state (ITensor) – The initial hidden state of the RNN. The layout of hidden_state is a linear layout of a 3D matrix, where C is the number of layers in the RNN and must match num_layers, H is the number of mini-batches for each time sequence, and W is the size of the per-layer hidden states and must match hidden_size. The amount of space required is doubled if direction is RNNDirection.BIDIRECTION, with the bidirectional states coming after the unidirectional states. If not specified, the initial hidden state is set to zero.
  • cell_state (ITensor) – The initial cell state of the RNN. The layout of cell_state is a linear layout of a 3D matrix, where C is the number of layers in the RNN and must match num_layers, H is the number of mini-batches for each time sequence, and W is the size of the per-layer hidden states and must match hidden_size. The amount of space required is doubled if direction is RNNDirection.BIDIRECTION, with the bidirectional states coming after the unidirectional states. If not specified, the initial cell state is set to zero. The cell state only affects LSTM RNNs.

IRNNv2Layer

tensorrt.RNNGateType

Identifies an individual gate within an RNN cell. See RNNOperation for the equations that show how each gate is used.

Members:

UPDATE : Update Gate

HIDDEN : Hidden Gate

OUTPUT : Output Gate

FORGET : Forget Gate

INPUT : Input Gate

CELL : Cell Gate

RESET : Reset Gate

class tensorrt.IRNNv2Layer

An RNN layer in an INetworkDefinition, version 2.

Variables:
  • num_layers (int) – The layer count of the RNN.
  • hidden_size (int) – The hidden size of the RNN.
  • max_seq_length (int) – The maximum sequence length of the RNN.
  • data_length (int) – The length of the data being processed by the RNN.
  • seq_lengths (ITensor) – Individual sequence lengths in the batch. The seq_lengths ITensor should be a {N1, …, Np} tensor, where N1..Np are the index dimensions of the input tensor to the RNN. If seq_lengths is not specified, then the RNN layer assumes all sequences are size max_seq_length. All sequence lengths in seq_lengths should be in the range [1, max_seq_length]. Zero-length sequences are not supported. This tensor must be of type int32.
  • op (RNNOperation) – The operation of the RNN layer.
  • input_mode (int) – The input mode of the RNN layer.
  • direction (int) – The direction of the RNN layer.
  • hidden_state (ITensor) – The initial hidden state of the RNN. The hidden_state ITensor should have the dimensions {N1, …, Np, L, H}, where N1..Np are the index dimensions specified by the input tensor, L is the number of layers in the RNN (equal to num_layers), and H is the hidden state size for each layer (equal to hidden_size if direction is RNNDirection.UNIDIRECTION, and 2x hidden_size otherwise).
  • cell_state (ITensor) – The initial cell state of the LSTM. The cell_state ITensor should have the dimensions {N1, …, Np, L, H}, where N1..Np are the index dimensions specified by the input tensor, L is the number of layers in the RNN (equal to num_layers), and H is the hidden state size for each layer (equal to hidden_size if direction is RNNDirection.UNIDIRECTION, and 2x hidden_size otherwise). It is an error to set this on an RNN layer that is not configured with RNNOperation.LSTM.
get_bias_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool) → array

Get the bias parameters for an individual gate in the RNN.

Parameters:
  • layer_index – The index of the layer that contains this gate.
  • gate – The name of the gate within the RNN layer.
  • is_w – True if the bias parameters are for the input bias Wb[g] and false if they are for the recurrent input bias Rb[g].
Returns:

The bias parameters.

get_weights_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool) → array

Get the weight parameters for an individual gate in the RNN.

Parameters:
  • layer_index – The index of the layer that contains this gate.
  • gate – The name of the gate within the RNN layer.
  • is_w – True if the weight parameters are for the input matrix W[g] and false if they are for the recurrent input matrix R[g].
Returns:

The weight parameters.

set_bias_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool, bias: tensorrt.tensorrt.Weights) → None

Set the bias parameters for an individual gate in the RNN.

Parameters:
  • layer_index – The index of the layer that contains this gate. Refer to IRNNLayer.weights for a description of the layer index.
  • gate – The name of the gate within the RNN layer. The gate name must correspond to one of the gates used by this layer’s RNNOperation .
  • is_w – True if the bias parameters are for the input bias Wb[g] and false if they are for the recurrent input bias Rb[g]. See RNNOperation for equations showing how these bias vectors are used in the RNN gate.
  • bias – The weight structure holding the bias parameters, which should be an array of size hidden_size .
set_weights_for_gate(self: tensorrt.tensorrt.IRNNv2Layer, layer_index: int, gate: tensorrt.tensorrt.RNNGateType, is_w: bool, weights: tensorrt.tensorrt.Weights) → None

Set the weight parameters for an individual gate in the RNN.

Parameters:
  • layer_index – The index of the layer that contains this gate. Refer to IRNNLayer.weights for a description of the layer index.
  • gate – The name of the gate within the RNN layer. The gate name must correspond to one of the gates used by this layer’s RNNOperation .
  • is_w – True if the weight parameters are for the input matrix W[g] and false if they are for the recurrent input matrix R[g]. See RNNOperation for equations showing how these matrices are used in the RNN gate.
  • weights – The weight structure holding the weight parameters, which are stored as a row-major 2D matrix. Refer to IRNNLayer.weights for documentation on the expected dimensions of this matrix.
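
Because weights and biases are set one gate at a time, it can help to enumerate every (layer_index, gate, is_w) combination that an operation uses. The sketch below reads the gate names off the RNNOperation equations and uses plain strings instead of RNNGateType members. The table GATES_BY_OPERATION and the function gate_parameter_slots are hypothetical, and the sketch assumes a unidirectional RNN in RNNInputMode.LINEAR so that every layer needs both input and recurrent parameters.

```python
# Gate names per RNNOperation, read off the equations in the RNNOperation section.
GATES_BY_OPERATION = {
    "RELU": ["INPUT"],
    "TANH": ["INPUT"],
    "LSTM": ["INPUT", "FORGET", "OUTPUT", "CELL"],
    "GRU": ["UPDATE", "RESET", "HIDDEN"],
}


def gate_parameter_slots(op, num_layers):
    """List every (layer_index, gate, is_w) combination that needs parameters."""
    return [(layer, gate, is_w)
            for layer in range(num_layers)
            for gate in GATES_BY_OPERATION[op]
            for is_w in (True, False)]


slots = gate_parameter_slots("GRU", num_layers=2)
print(len(slots))   # 2 layers x 3 gates x 2 (input and recurrent)
print(slots[0])
```

Each listed slot would correspond to one set_weights_for_gate call and one set_bias_for_gate call.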

IOutputDimensionsFormula

class tensorrt.IOutputDimensionsFormula

Application-implemented interface to compute layer output sizes.

compute(self: tensorrt.tensorrt.IOutputDimensionsFormula, input_shape: tensorrt.tensorrt.DimsHW, kernel_shape: tensorrt.tensorrt.DimsHW, stride: tensorrt.tensorrt.DimsHW, padding: tensorrt.tensorrt.DimsHW, dilation: tensorrt.tensorrt.DimsHW, layer_name: str) → tensorrt.tensorrt.DimsHW

Application-implemented interface to compute the HW output dimensions of a layer from the layer input and parameters.

Parameters:
  • input_shape – The input shape of the layer.
  • kernel_shape – The kernel shape (or window size, for a pooling layer) parameter of the layer operation.
  • stride – The stride parameter for the layer.
  • padding – The padding parameter of the layer.
  • dilation – The dilation parameter of the layer (only applicable to convolutions).
  • layer_name – The name of the layer.

Returns:

The output size of the layer.
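
As an illustration of what a compute implementation might return, the sketch below applies the common floor-based convolution output formula. It is an assumption for illustration, not necessarily the formula TensorRT uses by default, and it takes plain (H, W) tuples where a real implementation would receive DimsHW objects.

```python
def compute(input_shape, kernel_shape, stride, padding, dilation, layer_name):
    """Return (H, W) output dimensions using floor division."""
    out = []
    for i, k, s, p, d in zip(input_shape, kernel_shape, stride, padding, dilation):
        effective_kernel = d * (k - 1) + 1   # kernel extent after dilation
        out.append((i + 2 * p - effective_kernel) // s + 1)
    return tuple(out)


print(compute((224, 224), (3, 3), (2, 2), (1, 1), (1, 1), "conv1"))
print(compute((32, 32), (3, 3), (1, 1), (0, 0), (2, 2), "dilated"))
```

The layer_name argument is unused here; an application could use it to select a different formula for specific layers.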

IPluginLayer

class tensorrt.IPluginLayer

A plugin layer in an INetworkDefinition .

Variables: plugin (IPlugin) – The plugin for the layer.

IUnaryLayer

tensorrt.UnaryOperation

The unary operations that may be performed by a Unary layer.

Members:

LOG : Log (base e)

EXP : Exponentiation

RECIP : Reciprocal

SQRT : Square root

ABS : Absolute value

NEG : Negation

class tensorrt.IUnaryLayer

A unary layer in an INetworkDefinition .

Variables: op (UnaryOperation) – The unary operation for the layer.

IReduceLayer

tensorrt.ReduceOperation

The reduce operations that may be performed by a Reduce layer

Members:

SUM :

AVG :

MIN :

PROD :

MAX :

class tensorrt.IReduceLayer

A reduce layer in an INetworkDefinition .

Variables:
  • op (ReduceOperation) – The reduce operation for the layer.
  • axes (int) – The axes over which to reduce.
  • keep_dims (bool) – Specifies whether or not to keep the reduced dimensions for the layer.

IPaddingLayer

class tensorrt.IPaddingLayer

A padding layer in an INetworkDefinition.

Variables:
  • pre_padding (DimsHW) – The padding that is applied at the start of the tensor. Negative padding results in trimming the edge by the specified amount.
  • post_padding (DimsHW) – The padding that is applied at the end of the tensor. Negative padding results in trimming the edge by the specified amount.
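
The combined effect of pre_padding and post_padding on the output height and width can be sketched with plain tuples. The function padded_hw is hypothetical, and the error raised for over-trimming is an assumption of this sketch rather than documented TensorRT behavior.

```python
def padded_hw(input_hw, pre_padding, post_padding):
    """Output (H, W) after applying pre- and post-padding; negatives trim."""
    out = tuple(size + pre + post
                for size, pre, post in zip(input_hw, pre_padding, post_padding))
    if any(size < 0 for size in out):
        raise ValueError("trimming removes more elements than the input has")
    return out


print(padded_hw((5, 5), (1, 1), (2, 2)))    # grow both dimensions by 3
print(padded_hw((5, 5), (-1, 0), (0, -2)))  # trim one row at the start, two columns at the end
```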

IShuffleLayer

class tensorrt.Permutation(*args, **kwargs)

The elements of the permutation. The permutation is applied as outputDimensionIndex = permutation[inputDimensionIndex], so to permute from CHW order to HWC order, the required permutation is [1, 2, 0], and to permute from HWC to CHW, the required permutation is [2, 0, 1].

It supports iteration and indexing and is implicitly convertible to/from Python iterables (like tuple or list ). Therefore, you can use those classes in place of Permutation .
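
The two worked examples above are reproduced by the following sketch, in which output dimension i takes the input dimension at index permutation[i]. This reading is chosen because it matches both examples; the function permute is hypothetical and operates on plain lists.

```python
def permute(dims, permutation):
    """Output dimension i takes input dimension permutation[i]."""
    return [dims[p] for p in permutation]


print(permute(["C", "H", "W"], [1, 2, 0]))  # CHW -> HWC
print(permute(["H", "W", "C"], [2, 0, 1]))  # HWC -> CHW
```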

Overloaded function.

  1. __init__(self: tensorrt.tensorrt.Permutation) -> None
  2. __init__(self: tensorrt.tensorrt.Permutation, arg0: List[int]) -> None
class tensorrt.IShuffleLayer

A shuffle layer in an INetworkDefinition .

This class shuffles data by applying in sequence: a transpose operation, a reshape operation and a second transpose operation. The dimension types of the output are those of the reshape dimension.

Variables:
  • first_transpose (Permutation) – The permutation applied by the first transpose operation. Default: Identity Permutation
  • reshape_dims (Permutation) – The reshaped dimensions. Two special values can be used as dimensions. Value 0 copies the corresponding dimension from the input. This special value can be used more than once in the dimensions. If the number of reshape dimensions is less than that of the input, 0s are resolved by aligning the most significant dimensions of the input. Value -1 infers that particular dimension by looking at the input and the rest of the reshape dimensions. Note that at most one dimension may be specified as -1. The product of the new dimensions must be equal to the product of the old.
  • second_transpose (Permutation) – The permutation applied by the second transpose operation. Default: Identity Permutation
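
The handling of the special values 0 and -1 in reshape_dims can be sketched as a shape resolver. The function resolve_reshape is a hypothetical illustration on plain tuples and follows the rules stated above; it is not the TensorRT implementation.

```python
from math import prod


def resolve_reshape(input_dims, reshape_dims):
    """Resolve the special values 0 and -1 in reshape_dims."""
    if list(reshape_dims).count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    # 0 copies the input dimension at the same (most significant first) position.
    out = [input_dims[i] if d == 0 else d for i, d in enumerate(reshape_dims)]
    if -1 in out:
        known = prod(d for d in out if d != -1)
        total = prod(input_dims)
        if known == 0 or total % known:
            raise ValueError("cannot infer the -1 dimension")
        out[out.index(-1)] = total // known
    if prod(out) != prod(input_dims):
        raise ValueError("reshape must preserve the number of elements")
    return tuple(out)


print(resolve_reshape((3, 4, 5), (0, -1)))     # copy the first dimension, infer the rest
print(resolve_reshape((3, 4, 5), (0, 2, -1)))
```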

ITopKLayer

tensorrt.TopKOperation

The operations that may be performed by a TopK layer

Members:

MIN : Minimum of the elements

MAX : Maximum of the elements

class tensorrt.ITopKLayer

A TopK layer in an INetworkDefinition .

Variables:
  • op (TopKOperation) – The operation for the layer.
  • k (int) – The k value for the layer. Currently only values up to 25 are supported.
  • axes (int) – The axes along which to reduce.

IMatrixMultiplyLayer

class tensorrt.IMatrixMultiplyLayer

A matrix multiply layer in an INetworkDefinition .

Let A be input 0 and B be input 1. Tensors A and B must have equal rank, which must be at least 2.

When A and B are matrices, computes op(A) * op(B), where:

op(x)=x if transpose == false
op(x)=transpose(x) if transpose == true

Transposition is of the last two dimensions. Inputs of higher rank are treated as collections of matrices.

For a dimension that is not one of the last two dimensions: If the dimension is 1 for one of the tensors but not the other tensor, the former tensor is broadcast along that dimension to match the dimension of the latter tensor.

Variables:
  • transpose0 (bool) – Whether the first input is transposed.
  • transpose1 (bool) – Whether the second input is transposed.
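
The rank, transpose, and broadcast rules above determine the output shape. A minimal sketch of that shape calculation follows; matmul_shape is a hypothetical helper working on plain tuples, and the explicit error checks are assumptions of the sketch.

```python
def matmul_shape(a_shape, b_shape, transpose0=False, transpose1=False):
    """Output shape of op(A) * op(B) with broadcasting over leading dimensions."""
    if len(a_shape) != len(b_shape) or len(a_shape) < 2:
        raise ValueError("inputs must have equal rank of at least 2")
    a, b = list(a_shape), list(b_shape)
    if transpose0:
        a[-2], a[-1] = a[-1], a[-2]   # transpose the last two dimensions
    if transpose1:
        b[-2], b[-1] = b[-1], b[-2]
    if a[-1] != b[-2]:
        raise ValueError("inner dimensions do not match")
    lead = []
    for da, db in zip(a[:-2], b[:-2]):
        if da != db and 1 not in (da, db):
            raise ValueError("leading dimensions cannot be broadcast")
        lead.append(max(da, db))
    return tuple(lead) + (a[-2], b[-1])


print(matmul_shape((4, 2, 3), (1, 3, 5)))             # B is broadcast over 4 matrices
print(matmul_shape((2, 3), (2, 5), transpose0=True))  # transpose(A) * B
```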

IRaggedSoftMaxLayer

class tensorrt.IRaggedSoftMaxLayer

A ragged softmax layer in an INetworkDefinition .

This layer takes a ZxS input tensor and an additional Zx1 bounds tensor holding the lengths of the Z sequences.

This layer computes a softmax across each of the Z sequences.

The output tensor is of the same size as the input tensor.
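
A ragged softmax over Z sequences can be sketched in plain Python. The function ragged_softmax is hypothetical, takes the bounds as a flat list of Z lengths, and assumes that output positions past each sequence length are written as 0, which the description above does not specify.

```python
from math import exp


def ragged_softmax(rows, lengths):
    """Softmax over the first lengths[z] elements of each row z."""
    out = []
    for row, n in zip(rows, lengths):
        peak = max(row[:n])                   # subtract the max for stability
        exps = [exp(x - peak) for x in row[:n]]
        total = sum(exps)
        # Assumption: positions past the sequence length are written as 0.
        out.append([e / total for e in exps] + [0.0] * (len(row) - n))
    return out


result = ragged_softmax([[1.0, 1.0, 9.0], [0.0, 0.0, 0.0]], lengths=[2, 3])
print(result)
```

In the first row only the first two elements take part, so the large third value has no influence on the result.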

IConstantLayer

class tensorrt.IConstantLayer

A constant layer in an INetworkDefinition .

Variables:
  • weights (Weights) – The weights for the layer.
  • shape (Dims) – The shape of the layer.