INetworkDefinition¶

class tensorrt.INetworkDefinition¶

Represents a TensorRT network from which the Builder can build an Engine.

Variables:
- pooling_output_dimensions_formula – IOutputDimensionsFormula
  The formula for computing the pooling output dimensions. If set to None, the default formula is used. The default formula in each dimension is \((inputDim + padding * 2 - kernelSize) / stride + 1\).
- convolution_output_dimensions_formula – IOutputDimensionsFormula
  Deprecated: does not currently work reliably and will be removed in a future release. The formula for computing the convolution output dimensions. If set to None, the default formula is used. The default formula in each dimension is \((inputDim + padding * 2 - kernelSize) / stride + 1\).
- deconvolution_output_dimensions_formula – IOutputDimensionsFormula
  Deprecated: does not currently work reliably and will be removed in a future release. The formula for computing the deconvolution output dimensions. If None is passed, the default formula is used. The default formula in each dimension is \((inputDim - 1) * stride + kernelSize - 2 * padding\).
- num_layers – int
  The number of layers in the network.
- num_inputs – int
  The number of inputs of the network.
- num_outputs – int
  The number of outputs of the network.
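The default output-dimension formulas above can be checked numerically with a small sketch (plain Python; integer division reflects that output dimensions are integral, and the function names are illustrative, not part of the TensorRT API):

```python
def conv_pool_out_dim(input_dim, padding, kernel_size, stride):
    # Default formula for convolution/pooling output size per dimension:
    # (inputDim + padding * 2 - kernelSize) / stride + 1
    return (input_dim + padding * 2 - kernel_size) // stride + 1

def deconv_out_dim(input_dim, padding, kernel_size, stride):
    # Default formula for deconvolution output size per dimension:
    # (inputDim - 1) * stride + kernelSize - 2 * padding
    return (input_dim - 1) * stride + kernel_size - 2 * padding
```

For example, a 224-wide input with a 7-wide kernel, stride 2, and padding 3 produces an output width of (224 + 6 - 7) // 2 + 1 = 112.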
add_activation(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, type: tensorrt.tensorrt.ActivationType) → tensorrt.tensorrt.IActivationLayer¶

Add an activation layer to the network. See IActivationLayer for more information.

Parameters:
- input – The input tensor to the layer.
- type – The type of activation function to apply.

Returns: The new activation layer, or None if it could not be created.
add_concatenation(self: tensorrt.tensorrt.INetworkDefinition, inputs: List[tensorrt.tensorrt.ITensor]) → tensorrt.tensorrt.IConcatenationLayer¶

Add a concatenation layer to the network. Note that all tensors must have the same dimensions except for the channel dimension. See IConcatenationLayer for more information.

Parameters:
- inputs – The input tensors to the layer.

Returns: The new concatenation layer, or None if it could not be created.
add_constant(self: tensorrt.tensorrt.INetworkDefinition, shape: tensorrt.tensorrt.Dims, weights: tensorrt.tensorrt.Weights) → tensorrt.tensorrt.IConstantLayer¶

Add a constant layer to the network. See IConstantLayer for more information.

Parameters:
- shape – The shape of the constant.
- weights – The constant value, represented as weights.

Returns: The new constant layer, or None if it could not be created.
add_convolution(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, num_output_maps: int, kernel_shape: tensorrt.tensorrt.DimsHW, kernel: tensorrt.tensorrt.Weights, bias: tensorrt.tensorrt.Weights) → tensorrt.tensorrt.IConvolutionLayer¶

Add a convolution layer to the network. See IConvolutionLayer for more information.

Parameters:
- input – The input tensor to the convolution.
- num_output_maps – The number of output feature maps for the convolution.
- kernel_shape – The dimensions of the convolution kernel.
- kernel – The kernel weights for the convolution.
- bias – The optional bias weights for the convolution.

Returns: The new convolution layer, or None if it could not be created.
add_deconvolution(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, num_output_maps: int, kernel_shape: tensorrt.tensorrt.DimsHW, kernel: tensorrt.tensorrt.Weights, bias: tensorrt.tensorrt.Weights) → tensorrt.tensorrt.IDeconvolutionLayer¶

Add a deconvolution layer to the network. See IDeconvolutionLayer for more information.

Parameters:
- input – The input tensor to the layer.
- num_output_maps – The number of output feature maps.
- kernel_shape – The dimensions of the deconvolution kernel.
- kernel – The kernel weights for the deconvolution.
- bias – The optional bias weights for the deconvolution.

Returns: The new deconvolution layer, or None if it could not be created.
add_elementwise(self: tensorrt.tensorrt.INetworkDefinition, input1: tensorrt.tensorrt.ITensor, input2: tensorrt.tensorrt.ITensor, op: tensorrt.tensorrt.ElementWiseOperation) → tensorrt.tensorrt.IElementWiseLayer¶

Add an elementwise layer to the network. See IElementWiseLayer for more information.

Parameters:
- input1 – The first input tensor to the layer.
- input2 – The second input tensor to the layer.
- op – The binary operation that the layer applies.

The input tensors must have the same number of dimensions. For each dimension, their lengths must match, or one of them must be one. In the latter case, the tensor is broadcast along that axis.

The output tensor has the same number of dimensions as the inputs. For each dimension, its length is the maximum of the lengths of the corresponding input dimension.

Returns: The new elementwise layer, or None if it could not be created.
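The dimension-compatibility and broadcast rule described above can be sketched in plain Python (illustrative only; TensorRT performs this validation internally when the network is built):

```python
def elementwise_output_shape(shape1, shape2):
    # The input tensors must have the same number of dimensions.
    if len(shape1) != len(shape2):
        raise ValueError("inputs must have the same number of dimensions")
    out = []
    for a, b in zip(shape1, shape2):
        # Lengths must match, or one of them must be 1 (broadcast axis).
        if a != b and a != 1 and b != 1:
            raise ValueError("incompatible lengths %d and %d" % (a, b))
        # Each output length is the maximum of the corresponding inputs.
        out.append(max(a, b))
    return tuple(out)
```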
add_fully_connected(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, num_outputs: int, kernel: tensorrt.tensorrt.Weights, bias: tensorrt.tensorrt.Weights) → tensorrt.tensorrt.IFullyConnectedLayer¶

Add a fully connected layer to the network. See IFullyConnectedLayer for more information.

Parameters:
- input – The input tensor to the layer.
- num_outputs – The number of outputs of the layer.
- kernel – The kernel weights for the fully connected layer.
- bias – The optional bias weights for the fully connected layer.

Returns: The new fully connected layer, or None if it could not be created.
add_gather(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, indices: tensorrt.tensorrt.ITensor, axis: int) → tensorrt.tensorrt.IGatherLayer¶

Add a gather layer to the network. See IGatherLayer for more information.

Parameters:
- input – The tensor to gather values from.
- indices – The tensor to get indices from to populate the output tensor.
- axis – The non-batch dimension axis in the data tensor to gather on.

Returns: The new gather layer, or None if it could not be created.
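Gather semantics can be illustrated with a minimal sketch over nested Python lists, restricted to axis 0 (the TensorRT layer generalizes this to arbitrary tensors and axes; this helper is illustrative, not part of the API):

```python
def gather_axis0(data, indices):
    # output[i] = data[indices[i]]: each index selects a slice of `data`
    # along the gather axis (axis 0 here).
    return [data[i] for i in indices]
```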
add_input(self: tensorrt.tensorrt.INetworkDefinition, name: str, dtype: tensorrt.tensorrt.DataType, shape: tensorrt.tensorrt.Dims) → tensorrt.tensorrt.ITensor¶

Add an input tensor to the network.

Parameters:
- name – The name of the tensor.
- dtype – The data type of the tensor. Currently, trt.int8 is not supported for inputs.
- shape – The dimensions of the tensor. The total volume must be less than 2^30 elements.

Returns: The newly added tensor.
add_lrn(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, window: int, alpha: float, beta: float, k: float) → tensorrt.tensorrt.ILRNLayer¶

Add an LRN layer to the network. See ILRNLayer for more information.

Parameters:
- input – The input tensor to the layer.
- window – The size of the window.
- alpha – The alpha value for the LRN computation.
- beta – The beta value for the LRN computation.
- k – The k value for the LRN computation.

Returns: The new LRN layer, or None if it could not be created.
add_matrix_multiply(self: tensorrt.tensorrt.INetworkDefinition, input0: tensorrt.tensorrt.ITensor, transpose0: bool, input1: tensorrt.tensorrt.ITensor, transpose1: bool) → tensorrt.tensorrt.IMatrixMultiplyLayer¶

Add a matrix multiply layer to the network. See IMatrixMultiplyLayer for more information.

Parameters:
- input0 – The first input tensor (commonly A).
- transpose0 – If true, op(input0) = transpose(input0); otherwise op(input0) = input0.
- input1 – The second input tensor (commonly B).
- transpose1 – If true, op(input1) = transpose(input1); otherwise op(input1) = input1.

Returns: The new matrix multiply layer, or None if it could not be created.
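The op() semantics of the transpose flags can be sketched with a plain-Python matrix multiply over 2D lists (illustrative only, not the TensorRT implementation):

```python
def matmul(a, b, transpose0=False, transpose1=False):
    # op(x) = transpose(x) when the corresponding flag is set, else x.
    if transpose0:
        a = [list(col) for col in zip(*a)]
    if transpose1:
        b = [list(col) for col in zip(*b)]
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner, "inner dimensions must match after op()"
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]
```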
add_padding(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, pre_padding: tensorrt.tensorrt.DimsHW, post_padding: tensorrt.tensorrt.DimsHW) → tensorrt.tensorrt.IPaddingLayer¶

Add a padding layer to the network. See IPaddingLayer for more information.

Parameters:
- input – The input tensor to the layer.
- pre_padding – The padding to apply to the start of the tensor.
- post_padding – The padding to apply to the end of the tensor.

Returns: The new padding layer, or None if it could not be created.
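The effect of pre- and post-padding on the output shape is simple to state: each padded dimension grows by the two padding amounts. A sketch (illustrative, not part of the API):

```python
def padded_hw(input_hw, pre_padding, post_padding):
    # Each output dimension is pre + input + post.
    return tuple(d + pre + post
                 for d, pre, post in zip(input_hw, pre_padding, post_padding))
```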
add_plugin(self: tensorrt.tensorrt.INetworkDefinition, inputs: List[tensorrt.tensorrt.ITensor], plugin: tensorrt.tensorrt.IPlugin) → tensorrt.tensorrt.IPluginLayer¶

Add a plugin layer to the network. See IPlugin for more information.

Parameters:
- inputs – The input tensors to the layer.
- plugin – The layer plugin.

Returns: The new plugin layer, or None if it could not be created.
add_plugin_ext(self: tensorrt.tensorrt.INetworkDefinition, inputs: List[tensorrt.tensorrt.ITensor], plugin: tensorrt.tensorrt.IPluginExt) → tensorrt.tensorrt.IPluginLayer¶

Add a plugin layer to the network using an IPluginExt interface. See IPluginExt for more information.

Parameters:
- inputs – The input tensors to the layer.
- plugin – The layer plugin.

Returns: The new plugin layer, or None if it could not be created.
add_pooling(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, type: tensorrt.tensorrt.PoolingType, window_size: tensorrt.tensorrt.DimsHW) → tensorrt.tensorrt.IPoolingLayer¶

Add a pooling layer to the network. See IPoolingLayer for more information.

Parameters:
- input – The input tensor to the layer.
- type – The type of pooling to apply.
- window_size – The size of the pooling window.

Returns: The new pooling layer, or None if it could not be created.
add_ragged_softmax(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, bounds: tensorrt.tensorrt.ITensor) → tensorrt.tensorrt.IRaggedSoftMaxLayer¶

Add a ragged softmax layer to the network. See IRaggedSoftMaxLayer for more information.

Parameters:
- input – The ZxS input tensor.
- bounds – The Zx1 bounds tensor.

Returns: The new ragged softmax layer, or None if it could not be created.
add_reduce(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, op: tensorrt.tensorrt.ReduceOperation, axes: int, keep_dims: bool) → tensorrt.tensorrt.IReduceLayer¶

Add a reduce layer to the network. See IReduceLayer for more information.

Parameters:
- input – The input tensor to the layer.
- op – The reduction operation to perform.
- axes – The reduction dimensions, as a bitmask. Bit 0 of the uint32_t corresponds to non-batch dimension 0, and so on. If a bit is set, the corresponding dimension is reduced. For example, for an NCHW input tensor (three non-batch dimensions), bit 0 corresponds to the C dimension, bit 1 to the H dimension, and bit 2 to the W dimension. Note that reduction is not permitted over the batch dimension.
- keep_dims – Whether to keep the reduced dimensions (with length 1) in the output of the layer.

Returns: The new reduce layer, or None if it could not be created.
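The axes bitmask and keep_dims behavior can be sketched over the non-batch shape of the input (illustrative only; here `shape` lists the non-batch dimensions, e.g. (C, H, W)):

```python
def reduced_shape(shape, axes, keep_dims):
    # Bit i of `axes` set -> non-batch dimension i is reduced.
    out = []
    for i, d in enumerate(shape):
        if axes & (1 << i):
            if keep_dims:
                out.append(1)  # reduced dimension kept with length 1
        else:
            out.append(d)      # untouched dimension
    return tuple(out)
```

For an NCHW input, axes = 0b101 reduces C and W, leaving (1, H, 1) with keep_dims=True.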
add_rnn(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, layer_count: int, hidden_size: int, max_seq_length: int, op: tensorrt.tensorrt.RNNOperation, mode: tensorrt.tensorrt.RNNInputMode, direction: tensorrt.tensorrt.RNNDirection, weights: tensorrt.tensorrt.Weights, bias: tensorrt.tensorrt.Weights) → tensorrt.tensorrt.IRNNLayer¶

Add a layer_count deep RNN layer to the network with a sequence length of max_seq_length and hidden_size internal state per layer. See IRNNLayer for more information.

Deprecated: IRNNLayer is superseded by IRNNv2Layer. Use add_rnn_v2() instead.

Parameters:
- input – The input tensor to the layer.
- layer_count – The number of layers in the RNN.
- hidden_size – The size of the internal hidden state for each layer.
- max_seq_length – The maximum length of the time sequence.
- op – The type of RNN to execute.
- mode – The input mode for the RNN.
- direction – The direction to run the RNN.
- weights – The weights for the weight matrix parameters of the RNN.
- bias – The weights for the bias vector parameters of the RNN.

The input tensors must be of type float32 or float16. See IRNNLayer for details on the required input format for weights and bias.

The layout for the input tensor should be {1, S_max, N, E}, where:
- S_max is the maximum allowed sequence length (number of RNN iterations)
- N is the batch size
- E specifies the embedding length (unless RNNInputMode.SKIP is set, in which case it should match hidden_size)

The first output tensor is the output of the final RNN layer across all timesteps, with dimensions {S_max, N, H}:
- S_max is the maximum allowed sequence length (number of RNN iterations)
- N is the batch size
- H is the output hidden state size (equal to hidden_size, or 2x hidden_size)

The second tensor is the final hidden state of the RNN across all layers, and if the RNN is an LSTM (i.e. op is RNNOperation.LSTM), the third tensor is the final cell state of the RNN across all layers. Both the second and third output tensors have dimensions {L, N, H}:
- L is equal to layer_count if the direction is RNNDirection.UNIDIRECTION, and 2 * layer_count if the direction is RNNDirection.BIDIRECTION. In the bidirectional case, layer l's final forward hidden state is stored in L = 2*l, and its final backward hidden state is stored in L = 2*l + 1.
- N is the batch size
- H is hidden_size

Note that in bidirectional RNNs, the full "hidden state" for a layer l is the concatenation of its forward hidden state and its backward hidden state, and its size is 2*H.

Returns: The new RNN layer, or None if it could not be created.
add_rnn_v2(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, layer_count: int, hidden_size: int, max_seq_length: int, op: tensorrt.tensorrt.RNNOperation) → tensorrt.tensorrt.IRNNv2Layer¶

Add an RNNv2 layer to the network. See IRNNv2Layer for more information.

Adds a layer_count deep RNN layer to the network with hidden_size internal states that can take a batch with fixed or variable sequence lengths.

Parameters:
- input – The input tensor to the layer (see below).
- layer_count – The number of layers in the RNN.
- hidden_size – The size of the internal hidden state for each layer.
- max_seq_length – The maximum sequence length for the input.
- op – The type of RNN to execute.

By default, the layer is configured with RNNDirection.UNIDIRECTION and RNNInputMode.LINEAR. To change these settings, set IRNNv2Layer.direction and IRNNv2Layer.input_mode.

Weights and biases for the added layer should be set using IRNNv2Layer.set_weights_for_gate() and IRNNv2Layer.set_bias_for_gate() prior to building an engine using this network.

The input tensors must be of type float32 or float16. The layout of the weights is row major and must be the same data type as the input tensor. weights contains 8 matrices and bias contains 8 vectors. See IRNNv2Layer.set_weights_for_gate() and IRNNv2Layer.set_bias_for_gate() for details on the required input format for weights and bias.

The input ITensor should contain zero or more index dimensions {N1, ..., Np}, followed by two dimensions, defined as follows:
- S_max is the maximum allowed sequence length (number of RNN iterations)
- E specifies the embedding length (unless RNNInputMode.SKIP is set, in which case it should match IRNNv2Layer.hidden_size)

By default, all sequences in the input are assumed to be of length max_seq_length. To provide explicit sequence lengths for each input sequence in the batch, set IRNNv2Layer.seq_lengths.

The RNN layer outputs up to three tensors.

The first output tensor is the output of the final RNN layer across all timesteps, with dimensions {N1, ..., Np, S_max, H}:
- N1..Np are the index dimensions specified by the input tensor
- S_max is the maximum allowed sequence length (number of RNN iterations)
- H is the output hidden state size (equal to IRNNv2Layer.hidden_size, or 2x IRNNv2Layer.hidden_size)

The second tensor is the final hidden state of the RNN across all layers, and if the RNN is an LSTM (i.e. IRNNv2Layer.op is RNNOperation.LSTM), the third tensor is the final cell state of the RNN across all layers. Both the second and third output tensors have dimensions {N1, ..., Np, L, H}:
- N1..Np are the index dimensions specified by the input tensor
- L is the number of layers in the RNN, equal to IRNNv2Layer.num_layers
- H is the hidden state for each layer, equal to IRNNv2Layer.hidden_size if the direction is RNNDirection.UNIDIRECTION, and 2x IRNNv2Layer.hidden_size otherwise

Returns: The new RNNv2 layer, or None if it could not be created.
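The output shapes described above can be summarized in a small sketch (a direct reading of the shape rules in this entry, not an API call; `index_dims` is the tuple {N1, ..., Np}):

```python
def rnn_v2_output_shapes(index_dims, max_seq_length, num_layers,
                         hidden_size, bidirectional, is_lstm):
    # H doubles for a bidirectional RNN.
    h = 2 * hidden_size if bidirectional else hidden_size
    outputs = [tuple(index_dims) + (max_seq_length, h)]  # all timesteps
    state = tuple(index_dims) + (num_layers, h)          # final hidden state
    outputs.append(state)
    if is_lstm:
        outputs.append(state)                            # final cell state
    return outputs
```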
add_scale(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, mode: tensorrt.tensorrt.ScaleMode, shift: tensorrt.tensorrt.Weights = Weights(), scale: tensorrt.tensorrt.Weights = Weights(), power: tensorrt.tensorrt.Weights = Weights()) → tensorrt.tensorrt.IScaleLayer¶

Add a scale layer to the network. See IScaleLayer for more information.

Parameters:
- input – The input tensor to the layer. This tensor is required to have a minimum of 3 dimensions.
- mode – The scaling mode.
- shift – The shift value.
- scale – The scale value.
- power – The power value.

If the weights are available, the number of weights depends on the ScaleMode. For UNIFORM, the number of weights is 1. For CHANNEL, the number of weights equals the channel dimension. For ELEMENTWISE, the number of weights equals the volume of the input.

Returns: The new scale layer, or None if it could not be created.
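The weight-count rule per ScaleMode can be sketched as follows (plain strings stand in for the trt.ScaleMode enum values; `input_shape` lists the non-batch dimensions, e.g. (C, H, W)):

```python
def scale_weight_count(mode, input_shape):
    if mode == "UNIFORM":
        return 1                  # one value applied everywhere
    if mode == "CHANNEL":
        return input_shape[0]     # one value per channel
    if mode == "ELEMENTWISE":
        n = 1
        for d in input_shape:
            n *= d
        return n                  # one value per element (input volume)
    raise ValueError("unknown mode: %s" % mode)
```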
add_shuffle(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor) → tensorrt.tensorrt.IShuffleLayer¶

Add a shuffle layer to the network. See IShuffleLayer for more information.

Parameters:
- input – The input tensor to the layer.

Returns: The new shuffle layer, or None if it could not be created.
add_softmax(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor) → tensorrt.tensorrt.ISoftMaxLayer¶

Add a softmax layer to the network. See ISoftMaxLayer for more information.

Parameters:
- input – The input tensor to the layer.

Returns: The new softmax layer, or None if it could not be created.
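For reference, softmax along a dimension follows the standard definition; a one-dimensional sketch (illustrative, not the TensorRT implementation):

```python
import math

def softmax(values):
    # Subtract the max before exponentiating for numerical stability;
    # the result is positive and sums to 1.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```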
add_topk(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, op: tensorrt.tensorrt.TopKOperation, k: int, axes: int) → tensorrt.tensorrt.ITopKLayer¶

Add a TopK layer to the network. See ITopKLayer for more information.

The TopK layer has two outputs of the same dimensions. The first contains data values, the second contains index positions for the values. Output values are sorted, largest first for operation TopKOperation.MAX and smallest first for operation TopKOperation.MIN.

Currently only values of K up to 1024 are supported.

Parameters:
- input – The input tensor to the layer.
- op – The operation to perform.
- k – The number of elements to keep.
- axes – The reduction dimensions, as a bitmask. Bit 0 of the uint32_t corresponds to non-batch dimension 0, and so on. If a bit is set, the corresponding dimension is reduced. For example, for an NCHW input tensor (three non-batch dimensions), bit 0 corresponds to the C dimension, bit 1 to the H dimension, and bit 2 to the W dimension. Note that TopK reduction is currently only permitted over one dimension.

Returns: The new TopK layer, or None if it could not be created.
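The two outputs and the per-operation sort order can be sketched for a one-dimensional input (strings stand in for the TopKOperation enum values; illustrative only):

```python
def topk(values, op, k):
    # MAX: largest values first; MIN: smallest values first.
    # Returns the kept values and their original index positions.
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=(op == "MAX"))[:k]
    return [values[i] for i in order], order
```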
add_unary(self: tensorrt.tensorrt.INetworkDefinition, input: tensorrt.tensorrt.ITensor, op: tensorrt.tensorrt.UnaryOperation) → tensorrt.tensorrt.IUnaryLayer¶

Add a unary layer to the network. See IUnaryLayer for more information.

Parameters:
- input – The input tensor to the layer.
- op – The operation to apply.

Returns: The new unary layer, or None if it could not be created.
get_input(self: tensorrt.tensorrt.INetworkDefinition, index: int) → tensorrt.tensorrt.ITensor¶

Get the input tensor specified by the given index.

Parameters:
- index – The index of the input tensor.

Returns: The tensor, or None if it is out of range.
get_layer(self: tensorrt.tensorrt.INetworkDefinition, index: int) → tensorrt.tensorrt.ILayer¶

Get the layer specified by the given index.

Parameters:
- index – The index of the layer.

Returns: The layer, or None if it is out of range.
get_output(self: tensorrt.tensorrt.INetworkDefinition, index: int) → tensorrt.tensorrt.ITensor¶

Get the output tensor specified by the given index.

Parameters:
- index – The index of the output tensor.

Returns: The tensor, or None if it is out of range.
mark_output(self: tensorrt.tensorrt.INetworkDefinition, tensor: tensorrt.tensorrt.ITensor) → None¶

Mark a tensor as a network output.

Parameters:
- tensor – The tensor to mark as an output.