Layers¶
For more information, including examples, refer to the TensorRT Operator’s Reference documentation.
Note that layer weight properties may be represented as NumPy arrays or Weights objects depending on whether the underlying datatype is supported by NumPy. Explicitly construct a Weights object from the property if you want a consistent type:
conv_layer = network.add_convolution_nd(...)
bias_weights = trt.Weights(conv_layer.bias)
PaddingMode¶
- tensorrt.PaddingMode¶
- Enumerates types of padding available in convolution, deconvolution and pooling layers.
- Padding mode takes precedence if both padding_mode and pre_padding are set. EXPLICIT* corresponds to explicit padding. SAME* implicitly calculates padding such that the output dimensions are the same as the input dimensions. For convolution and pooling, output dimensions are determined by ceil(input dimensions / stride). CAFFE* corresponds to symmetric padding.
 - Members: - EXPLICIT_ROUND_DOWN : Use explicit padding, rounding the output size down - EXPLICIT_ROUND_UP : Use explicit padding, rounding the output size up - SAME_UPPER : Use SAME padding, with pre_padding <= post_padding - SAME_LOWER : Use SAME padding, with pre_padding >= post_padding
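The padding modes can be illustrated with a small pure-Python sketch for a single spatial axis. It does not call TensorRT: the helper names `explicit_output_size` and `same_padding` are hypothetical, and the arithmetic is the conventional sliding-window formula, assumed here to match the description above (ceil(input / stride) for SAME, with the extra element placed after the input for SAME_UPPER and before it for SAME_LOWER).

```python
import math


def explicit_output_size(in_size, kernel, stride, pre_pad, post_pad,
                         round_up=False, dilation=1):
    """Output length along one axis for explicit padding (assumed formula)."""
    effective_kernel = dilation * (kernel - 1) + 1
    span = in_size + pre_pad + post_pad - effective_kernel
    # EXPLICIT_ROUND_DOWN floors the number of steps, EXPLICIT_ROUND_UP ceils it.
    steps = math.ceil(span / stride) if round_up else span // stride
    return steps + 1


def same_padding(in_size, kernel, stride, upper=True, dilation=1):
    """Return (output size, pre_padding, post_padding) for SAME padding."""
    out_size = math.ceil(in_size / stride)
    effective_kernel = dilation * (kernel - 1) + 1
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
    small, large = total // 2, total - total // 2
    # SAME_UPPER: pre_padding <= post_padding; SAME_LOWER: pre_padding >= post_padding.
    pre, post = (small, large) if upper else (large, small)
    return out_size, pre, post
```

With an input of length 5, a kernel of 2, and a stride of 2, the total padding is odd, so the two SAME variants differ only in which side receives the extra element.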
ICastLayer¶
- class tensorrt.ICastLayer¶
- A layer that represents the cast function. This layer casts each element of a given input tensor to a specified data type and returns an output tensor of the same shape in the converted type. Conversions between all types except FP8 are supported. Variables:
- to_type – DataType The specified data type of the output tensor.
 
IConvolutionLayer¶
- class tensorrt.IConvolutionLayer¶
- A convolution layer in an INetworkDefinition. This layer performs a correlation operation between a 3- or 4-dimensional filter and a 4- or 5-dimensional tensor to produce another 4- or 5-dimensional tensor. An optional bias argument is supported, which adds a per-channel constant to each value in the output. Variables:
- num_output_maps – int The number of output maps for the convolution.
- pre_padding – DimsHW The pre-padding. The start of input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
- post_padding – DimsHW The post-padding. The end of input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
- padding_mode – PaddingMode The padding mode. Padding mode takes precedence if both IConvolutionLayer.padding_mode and either IConvolutionLayer.pre_padding or IConvolutionLayer.post_padding are set.
- num_groups – int The number of groups for a convolution. The input tensor channels are divided into this many groups, and a convolution is executed for each group, using a filter per group. The results of the group convolutions are concatenated to form the output. Note: When using groups in int8 mode, the size of the groups (i.e. the channel count divided by the group count) must be a multiple of 4 for both input and output. Default: 1.
- kernel – Weights The kernel weights for the convolution. The weights are specified as a contiguous array in GKCRS order, where G is the number of groups, K the number of output feature maps, C the number of input channels, and R and S are the height and width of the filter.
- bias – Weights The bias weights for the convolution. Bias is optional. To omit bias, set this to an empty Weights object. The bias is applied per-channel, so the number of weights (if non-zero) must be equal to the number of output feature maps.
- kernel_size_nd – Dims The multi-dimension kernel size of the convolution.
- stride_nd – Dims The multi-dimension stride of the convolution. Default: (1, …, 1)
- padding_nd – Dims The multi-dimension padding of the convolution. The input will be zero-padded by this number of elements in each dimension. If the padding is asymmetric, this value corresponds to the pre-padding. Default: (0, …, 0)
- dilation_nd – Dims The multi-dimension dilation for the convolution. Default: (1, …, 1)
 
 
IGridSampleLayer¶
- tensorrt.InterpolationMode¶
- Various modes of interpolation, used in resize and grid_sample layers. - Members: - NEAREST : 1D, 2D, and 3D nearest neighbor interpolation. - LINEAR : Supports linear, bilinear, trilinear interpolation. - CUBIC : Supports bicubic interpolation. 
- tensorrt.SampleMode¶
- Controls how ISliceLayer and IGridSampleLayer handle out-of-bounds coordinates. - Members: - STRICT_BOUNDS : Fail with error when the coordinates are out of bounds. - WRAP : Coordinates wrap around periodically. - CLAMP : Out-of-bounds indices are clamped to bounds. - FILL : Use fill input value when coordinates are out of bounds. - REFLECT : Coordinates reflect.
- class tensorrt.IGridSampleLayer¶
- A grid sample layer in an INetworkDefinition. This layer uses an input tensor and a grid tensor to produce an interpolated output tensor. The input and grid tensors must be shape tensors of rank 4. The only supported SampleMode values are trt.SampleMode.CLAMP, trt.SampleMode.FILL, and trt.SampleMode.REFLECT. Variables:
- interpolation_mode – InterpolationMode The interpolation type to use. Defaults to LINEAR.
- align_corners – bool The align mode to use. Defaults to False.
- sample_mode – SampleMode The sample mode to use. Defaults to FILL.
 
 
IActivationLayer¶
- tensorrt.ActivationType¶
- The type of activation to perform. - Members: - RELU : Rectified Linear activation - SIGMOID : Sigmoid activation - TANH : Hyperbolic Tangent activation - LEAKY_RELU : Leaky Relu activation: f(x) = x if x >= 0, f(x) = alpha * x if x < 0 - ELU : Elu activation: f(x) = x if x >= 0, f(x) = alpha * (exp(x) - 1) if x < 0 - SELU : Selu activation: f(x) = beta * x if x > 0, f(x) = beta * (alpha * exp(x) - alpha) if x <= 0 - SOFTSIGN : Softsign activation: f(x) = x / (1 + abs(x)) - SOFTPLUS : Softplus activation: f(x) = alpha * log(exp(beta * x) + 1) - CLIP : Clip activation: f(x) = max(alpha, min(beta, x)) - HARD_SIGMOID : Hard sigmoid activation: f(x) = max(0, min(1, alpha * x + beta)) - SCALED_TANH : Scaled Tanh activation: f(x) = alpha * tanh(beta * x) - THRESHOLDED_RELU : Thresholded Relu activation: f(x) = x if x > alpha, f(x) = 0 if x <= alpha - GELU_ERF : GELU erf activation: 0.5 * x * (1 + erf(sqrt(0.5) * x)) - GELU_TANH : GELU tanh activation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (0.044715F * pow(x, 3) + x))) 
- class tensorrt.IActivationLayer¶
- An Activation layer in an INetworkDefinition. This layer applies a per-element activation function to its input. The output has the same shape as the input. Variables:
- type – ActivationType The type of activation to be performed.
- alpha – float The alpha parameter that is used by some parametric activations (LEAKY_RELU, ELU, SELU, SOFTPLUS, CLIP, HARD_SIGMOID, SCALED_TANH, THRESHOLDED_RELU). Other activations ignore this parameter.
- beta – float The beta parameter that is used by some parametric activations (SELU, SOFTPLUS, CLIP, HARD_SIGMOID, SCALED_TANH). Other activations ignore this parameter.
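To make the role of alpha and beta concrete, the following sketch transcribes several of the ActivationType formulas listed above into plain Python scalars. It is an illustration of the formulas only, not of how TensorRT evaluates them; the function names are hypothetical.

```python
import math


def leaky_relu(x, alpha):
    # f(x) = x if x >= 0, alpha * x otherwise
    return x if x >= 0 else alpha * x


def elu(x, alpha):
    # f(x) = x if x >= 0, alpha * (exp(x) - 1) otherwise
    return x if x >= 0 else alpha * (math.exp(x) - 1)


def clip(x, alpha, beta):
    # f(x) = max(alpha, min(beta, x)): alpha is the lower bound, beta the upper
    return max(alpha, min(beta, x))


def hard_sigmoid(x, alpha, beta):
    # f(x) = max(0, min(1, alpha * x + beta))
    return max(0.0, min(1.0, alpha * x + beta))


def scaled_tanh(x, alpha, beta):
    # f(x) = alpha * tanh(beta * x)
    return alpha * math.tanh(beta * x)


def thresholded_relu(x, alpha):
    # f(x) = x if x > alpha, 0 otherwise
    return x if x > alpha else 0.0


def gelu_erf(x):
    # 0.5 * x * (1 + erf(sqrt(0.5) * x)); uses neither alpha nor beta
    return 0.5 * x * (1 + math.erf(math.sqrt(0.5) * x))
```

Note that for CLIP the two parameters act as bounds rather than slopes, so alpha should not exceed beta.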
 
 
IPoolingLayer¶
- tensorrt.PoolingType¶
- The type of pooling to perform in a pooling layer. - Members: - MAX : Maximum over elements - AVERAGE : Average over elements. If the tensor is padded, the count includes the padding - MAX_AVERAGE_BLEND : Blending between the max pooling and average pooling: (1-blendFactor)*maxPool + blendFactor*avgPool 
- class tensorrt.IPoolingLayer¶
- A Pooling layer in an INetworkDefinition. The layer applies a reduction operation within a window over the input. Variables:
- type – PoolingType The type of pooling to be performed.
- pre_padding – DimsHW The pre-padding. The start of input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
- post_padding – DimsHW The post-padding. The end of input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
- padding_mode – PaddingMode The padding mode. Padding mode takes precedence if both IPoolingLayer.padding_mode and either IPoolingLayer.pre_padding or IPoolingLayer.post_padding are set.
- blend_factor – float The blending factor for the max_average_blend mode: \(max\_average\_blendPool = (1-blendFactor)*maxPool + blendFactor*avgPool\). blend_factor is a user value in [0,1] with the default value of 0.0. This value only applies for the PoolingType.MAX_AVERAGE_BLEND mode.
- average_count_excludes_padding – bool Whether average pooling uses as a denominator the overlap area between the window and the unpadded input. If this is not set, the denominator is the overlap between the pooling window and the padded input. Default: True
- window_size_nd – Dims The multi-dimension window size for pooling.
- stride_nd – Dims The multi-dimension stride for pooling. Default: (1, …, 1)
- padding_nd – Dims The multi-dimension padding for pooling. Default: (0, …, 0)
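The blend formula for MAX_AVERAGE_BLEND can be checked on a single pooling window. This is a minimal sketch in plain Python; `max_average_blend` is a hypothetical helper, and padding is ignored so that the average is taken over exactly the values passed in.

```python
def max_average_blend(window, blend_factor):
    """(1 - blendFactor) * maxPool + blendFactor * avgPool for one window."""
    if not 0.0 <= blend_factor <= 1.0:
        raise ValueError("blend_factor must lie in [0, 1]")
    max_pool = max(window)
    avg_pool = sum(window) / len(window)
    return (1 - blend_factor) * max_pool + blend_factor * avg_pool
```

A blend factor of 0.0 (the default) reduces to max pooling, and 1.0 reduces to average pooling.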
 
 
ILRNLayer¶
- class tensorrt.ILRNLayer¶
- An LRN layer in an INetworkDefinition. The output size is the same as the input size. Variables:
- window_size – int The LRN window size. The window size must be odd and in the range of [1, 15].
- alpha – float The LRN alpha value. The valid range is [-1e20, 1e20].
- beta – float The LRN beta value. The valid range is [0.01, 1e5].
- k – float The LRN K value. The valid range is [1e-5, 1e10].
 
 
IScaleLayer¶
- tensorrt.ScaleMode¶
- Controls how scale is applied in a Scale layer. - Members: - UNIFORM : Identical coefficients across all elements of the tensor. - CHANNEL : Per-channel coefficients. The channel dimension is assumed to be the third to last dimension. - ELEMENTWISE : Elementwise coefficients. 
- class tensorrt.IScaleLayer¶
- A Scale layer in an INetworkDefinition. This layer applies a per-element computation to its input: \(output = (input * scale + shift) ^ {power}\). The coefficients can be applied on a per-tensor, per-channel, or per-element basis. Note: If the number of weights is 0, then a default value is used for shift, power, and scale. The default shift is 0, the default power is 1, and the default scale is 1. The output size is the same as the input size. Note: The input tensor for this layer is required to have a minimum of 3 dimensions.
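As a quick check of the formula, here is a sketch of the UNIFORM case, where one scale, shift, and power apply to every element. The function name `scale_uniform` is hypothetical, and the defaults mirror the ones stated above for empty weights.

```python
def scale_uniform(values, scale=1.0, shift=0.0, power=1.0):
    """Apply output = (input * scale + shift) ** power to each element."""
    return [(v * scale + shift) ** power for v in values]
```

With the defaults the function is the identity, which matches the behavior described for layers whose weights are empty.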
ISoftMaxLayer¶
- class tensorrt.ISoftMaxLayer¶
- A Softmax layer in an INetworkDefinition. This layer applies a per-channel softmax to its input. The output size is the same as the input size. Variables:
- axes – int The axis along which softmax is computed. Currently, only one axis can be set.
 - The axis is specified as a bit mask, by setting the bit corresponding to the axis to 1. For example, consider an NCHW tensor as input (three non-batch dimensions): bit 0 corresponds to the N dimension, bit 1 to the C dimension, bit 2 to the H dimension, and bit 3 to the W dimension. By default, softmax is performed on the axis which is the number of axes minus three. It is 0 if there are fewer than 3 axes. For example, if the input is NCHW, the default axis is C. If the input is NHW, then the default axis is N. As a further example, to perform softmax on axis R of an NPQRCHW input, set bit 3.
 - The following constraints must be satisfied to execute this layer on DLA:
- Axis must be one of the channel or spatial dimensions.
- There are two classes of supported input sizes:
 - Non-axis, non-batch dimensions are all 1 and the axis dimension is at most 8192. This is the recommended case for using softmax since it is the most accurate.
 - At least one non-axis, non-batch dimension is greater than 1 and the axis dimension is at most 1024. Note that in this case, there may be some approximation error as the axis dimension size approaches the upper bound. See the TensorRT Developer Guide for more details on the approximation error.
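The bit-mask convention for axes can be sketched without TensorRT. The helpers below are hypothetical names used only for illustration; they encode "bit i selects axis i" and the default of "number of axes minus three, or 0".

```python
def axis_to_mask(axis):
    """Bit mask with only the bit for `axis` set."""
    return 1 << axis


def mask_to_axes(mask):
    """List of axis indices whose bits are set in `mask`."""
    return [i for i in range(mask.bit_length()) if mask >> i & 1]


def default_softmax_axis(rank):
    """Number of axes minus three, or 0 when there are fewer than 3 axes."""
    return max(rank - 3, 0)
```

For the NPQRCHW example, axis R sits at index 3, so the mask is `1 << 3`.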
 
 
IConcatenationLayer¶
- class tensorrt.IConcatenationLayer¶
- A concatenation layer in an INetworkDefinition. The output channel size is the sum of the channel sizes of the inputs. The other output sizes are the same as the other input sizes, which must all match. Variables:
- axis – int The axis along which concatenation occurs. The default axis is the number of tensor dimensions minus three, or zero if the tensor has fewer than three dimensions. For example, for a tensor with dimensions NCHW, it is C.
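The shape rule for concatenation can be written as a small shape-only helper. This sketch operates on tuples of dimensions rather than tensors; `concat_shape` is a hypothetical name, and the default axis follows the rule stated above.

```python
def concat_shape(shapes, axis=None):
    """Output shape of concatenating tensors with the given shapes."""
    rank = len(shapes[0])
    if axis is None:
        # Default: number of dimensions minus three, or zero.
        axis = max(rank - 3, 0)
    for shape in shapes:
        if len(shape) != rank:
            raise ValueError("all inputs must have the same rank")
        for i, (a, b) in enumerate(zip(shape, shapes[0])):
            if i != axis and a != b:
                raise ValueError("non-concatenation dimensions must match")
    out = list(shapes[0])
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)
```

For two NCHW inputs with 3 and 5 channels, the default axis is C and the output has 8 channels.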
 
IDeconvolutionLayer¶
- class tensorrt.IDeconvolutionLayer¶
- A deconvolution layer in an INetworkDefinition. Variables:
- num_output_maps – int The number of output feature maps for the deconvolution.
- pre_padding – DimsHW The pre-padding. The start of input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
- post_padding – DimsHW The post-padding. The end of input will be zero-padded by this number of elements in the height and width directions. Default: (0, 0)
- padding_mode – PaddingMode The padding mode. Padding mode takes precedence if both IDeconvolutionLayer.padding_mode and either IDeconvolutionLayer.pre_padding or IDeconvolutionLayer.post_padding are set.
- num_groups – int The number of groups for a deconvolution. The input tensor channels are divided into this many groups, and a deconvolution is executed for each group, using a filter per group. The results of the group deconvolutions are concatenated to form the output. Note: When using groups in int8 mode, the size of the groups (i.e. the channel count divided by the group count) must be a multiple of 4 for both input and output. Default: 1
- kernel – Weights The kernel weights for the deconvolution. The weights are specified as a contiguous array in CKRS order, where C is the number of input channels, K the number of output feature maps, and R and S are the height and width of the filter.
- bias – Weights The bias weights for the deconvolution. Bias is optional. To omit bias, set this to an empty Weights object. The bias is applied per-feature-map, so the number of weights (if non-zero) must be equal to the number of output feature maps.
- kernel_size_nd – Dims The multi-dimension kernel size of the deconvolution.
- stride_nd – Dims The multi-dimension stride of the deconvolution. Default: (1, …, 1)
- padding_nd – Dims The multi-dimension padding of the deconvolution. The input will be zero-padded by this number of elements in each dimension. Padding is symmetric. Default: (0, …, 0)
 
 
IElementWiseLayer¶
- tensorrt.ElementWiseOperation¶
- The binary operations that may be performed by an ElementWise layer. - Members: - SUM : Sum of the two elements - PROD : Product of the two elements - MAX : Max of the two elements - MIN : Min of the two elements - SUB : Subtract the second element from the first - DIV : Divide the first element by the second - POW : The first element to the power of the second element - FLOOR_DIV : Floor division of the first element by the second - AND : Logical AND of two elements - OR : Logical OR of two elements - XOR : Logical XOR of two elements - EQUAL : Check if two elements are equal - GREATER : Check if element in first tensor is greater than corresponding element in second tensor - LESS : Check if element in first tensor is less than corresponding element in second tensor 
- class tensorrt.IElementWiseLayer¶
- An elementwise layer in an INetworkDefinition. This layer applies a per-element binary operation between corresponding elements of two tensors. The input dimensions of the two input tensors must be equal, and the output tensor is the same size as each input. Variables:
- op – ElementWiseOperation The binary operation for the layer.
 
IGatherLayer¶
- class tensorrt.IGatherLayer¶
- A gather layer in an INetworkDefinition. Variables:
- axis – int The non-batch dimension axis to gather on. The axis must be less than the number of non-batch dimensions in the data input.
- num_elementwise_dims – int The number of leading dimensions of indices tensor to be handled elementwise. For GatherMode.DEFAULT, it can be 0 or 1. For GatherMode.ND, it can be between 0 and one less than rank(data). For GatherMode.ELEMENT, it must be 0.
- mode – GatherMode The gather mode.
 
 
IPluginV2Layer¶
- class tensorrt.IPluginV2Layer¶
- A plugin layer in an INetworkDefinition. Variables:
- plugin – IPluginV2 The plugin for the layer.
 
IPluginV3Layer¶
- class tensorrt.IPluginV3Layer¶
- A plugin layer in an INetworkDefinition. Variables:
- plugin – IPluginV3 The plugin for the layer.
 
IUnaryLayer¶
- tensorrt.UnaryOperation¶
- The unary operations that may be performed by a Unary layer. - Members: - EXP : Exponentiation - LOG : Log (base e) - SQRT : Square root - RECIP : Reciprocal - ABS : Absolute value - NEG : Negation - SIN : Sine - COS : Cosine - TAN : Tangent - SINH : Hyperbolic sine - COSH : Hyperbolic cosine - ASIN : Inverse sine - ACOS : Inverse cosine - ATAN : Inverse tangent - ASINH : Inverse hyperbolic sine - ACOSH : Inverse hyperbolic cosine - ATANH : Inverse hyperbolic tangent - CEIL : Ceiling - FLOOR : Floor - ERF : Gauss error function - NOT : Not - SIGN : Sign. If input > 0, output 1; if input < 0, output -1; if input == 0, output 0. - ROUND : Round to nearest even for floating-point data type. - ISINF : Return true if the input value equals +/- infinity for floating-point data type. - ISNAN : Return true if the input value equals NaN for floating-point data type. 
- class tensorrt.IUnaryLayer¶
- A unary layer in an INetworkDefinition. Variables:
- op – UnaryOperation The unary operation for the layer. When running this layer on DLA, only UnaryOperation.ABS is supported.
 
IReduceLayer¶
- tensorrt.ReduceOperation¶
- The reduce operations that may be performed by a Reduce layer - Members: - SUM : - PROD : - MAX : - MIN : - AVG : 
- class tensorrt.IReduceLayer¶
- A reduce layer in an INetworkDefinition. Variables:
- op – ReduceOperation The reduce operation for the layer.
- axes – int The axes over which to reduce.
- keep_dims – bool Specifies whether or not to keep the reduced dimensions for the layer.
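The effect of keep_dims on the output shape can be sketched as follows. This assumes that axes is a bit mask using the same convention described for ISoftMaxLayer (bit i selects axis i); that assumption is not stated in this entry. `reduce_shape` is a hypothetical shape-only helper.

```python
def reduce_shape(shape, axes_mask, keep_dims):
    """Shape after reducing the axes selected by `axes_mask` (assumed bit mask)."""
    out = []
    for i, dim in enumerate(shape):
        if axes_mask >> i & 1:
            # A reduced axis is kept with length 1 or dropped entirely.
            if keep_dims:
                out.append(1)
        else:
            out.append(dim)
    return tuple(out)
```

Reducing H and W of an NCHW tensor uses the mask `0b1100`; with keep_dims the rank is preserved, otherwise the two axes disappear.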
 
 
IPaddingLayer¶
- class tensorrt.IPaddingLayer¶
- A padding layer in an INetworkDefinition. Variables:
- pre_padding_nd – Dims The padding that is applied at the start of the tensor. Negative padding results in trimming the edge by the specified amount. Only 2 dimensions currently supported.
- post_padding_nd – Dims The padding that is applied at the end of the tensor. Negative padding results in trimming the edge by the specified amount. Only 2 dimensions currently supported.
 
 
IParametricReLULayer¶
- class tensorrt.IParametricReLULayer¶
- A parametric ReLU layer in an INetworkDefinition. This layer applies a parametric ReLU activation to an input tensor (first input), with slopes taken from a slopes tensor (second input). This can be viewed as a leaky ReLU operation where the negative slope differs from element to element (and can in fact be learned). The slopes tensor must be unidirectional broadcastable to the input tensor: the rank of the two tensors must be the same, and all dimensions of the slopes tensor must either equal the input tensor or be 1. The output tensor has the same shape as the input tensor.
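The broadcasting rule for the slopes tensor can be illustrated on rank-2 nested lists. This is a sketch only: `parametric_relu` is a hypothetical helper, it handles exactly two dimensions, and it uses the leaky ReLU formula given under ActivationType with a per-element slope.

```python
def parametric_relu(x, slopes):
    """Rank-2 parametric ReLU; each slopes dimension is either full size or 1."""
    slope_rows, slope_cols = len(slopes), len(slopes[0])
    out = []
    for i, row in enumerate(x):
        # A slopes dimension of length 1 is broadcast along that axis.
        slope_row = slopes[i if slope_rows > 1 else 0]
        out.append([
            v if v >= 0 else slope_row[j if slope_cols > 1 else 0] * v
            for j, v in enumerate(row)
        ])
    return out
```

A slopes tensor of shape (2, 1) therefore gives each row of a (2, 2) input its own negative slope.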
ISelectLayer¶
- class tensorrt.ISelectLayer¶
- A select layer in an INetworkDefinition. This layer implements an element-wise ternary conditional operation. Wherever condition is True, elements are taken from the first input, and wherever condition is False, elements are taken from the second input.
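The element-wise behavior corresponds to the following sketch on flat Python lists; `select` here is a hypothetical stand-in for the layer, not a TensorRT call.

```python
def select(condition, then_values, else_values):
    """Pick from then_values where condition is True, else from else_values."""
    return [t if c else e
            for c, t, e in zip(condition, then_values, else_values)]
```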
IShuffleLayer¶
- class tensorrt.Permutation(*args, **kwargs)¶
- The elements of the permutation. The permutation is applied as outputDimensionIndex = permutation[inputDimensionIndex], so to permute from CHW order to HWC order, the required permutation is [1, 2, 0], and to permute from HWC to CHW, the required permutation is [2, 0, 1]. It supports iteration and indexing and is implicitly convertible to/from Python iterables (like tuple or list). Therefore, you can use those classes in place of Permutation. Overloaded function.
- __init__(self: tensorrt.tensorrt.Permutation) -> None
- __init__(self: tensorrt.tensorrt.Permutation, arg0: List[int]) -> None
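The two worked permutations above can be reproduced with a one-line helper. This sketch reads the examples as "output axis i takes input axis permutation[i]", which is an interpretation inferred from the CHW/HWC examples; `permute` is a hypothetical name.

```python
def permute(axes, permutation):
    """Reorder axis labels so that output axis i is input axis permutation[i]."""
    return [axes[p] for p in permutation]
```

Because a plain list is accepted in place of Permutation, the same `[1, 2, 0]` list can be used directly wherever a permutation is expected.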
 
- class tensorrt.IShuffleLayer¶
- A shuffle layer in an INetworkDefinition. This class shuffles data by applying in sequence: a transpose operation, a reshape operation and a second transpose operation. The dimension types of the output are those of the reshape dimension. Variables:
- first_transpose – Permutation The permutation applied by the first transpose operation. Default: Identity Permutation
- reshape_dims – Dims The reshaped dimensions. Two special values can be used as dimensions. Value 0 copies the corresponding dimension from input. This special value can be used more than once in the dimensions. If the number of reshape dimensions is less than input, 0s are resolved by aligning the most significant dimensions of input. Value -1 infers that particular dimension by looking at input and the rest of the reshape dimensions. Note that only a maximum of one dimension is permitted to be specified as -1. The product of the new dimensions must be equal to the product of the old.
- second_transpose – Permutation The permutation applied by the second transpose operation. Default: Identity Permutation
- zero_is_placeholder – bool The meaning of 0 in reshape dimensions. If true, then a 0 in the reshape dimensions denotes copying the corresponding dimension from the first input tensor. If false, then a 0 in the reshape dimensions denotes a zero-length dimension.
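The rules for the special values 0 and -1 in reshape_dims can be sketched as a shape-only resolver. This is an illustration under the rules stated above, not TensorRT's implementation; `resolve_reshape` is a hypothetical helper and requires Python 3.8+ for `math.prod`.

```python
import math


def resolve_reshape(input_shape, reshape_dims, zero_is_placeholder=True):
    """Resolve 0 and -1 entries of reshape_dims against input_shape."""
    dims = list(reshape_dims)
    if zero_is_placeholder:
        # 0 copies the dimension at the same (most significant) position.
        dims = [input_shape[i] if d == 0 else d for i, d in enumerate(dims)]
    if dims.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    volume = math.prod(input_shape)
    if -1 in dims:
        known = math.prod(d for d in dims if d != -1)
        if known == 0 or volume % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        dims[dims.index(-1)] = volume // known
    if math.prod(dims) != volume:
        raise ValueError("reshape must preserve the number of elements")
    return tuple(dims)
```

For an input of shape (2, 3, 4, 5), the reshape dimensions (0, -1, 5) keep the leading 2 and infer 12 for the middle dimension.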
 
 - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) -> None¶
- Sets the input tensor for the given index. The index must be 0 for a static shuffle layer. A static shuffle layer is converted to a dynamic shuffle layer by calling set_input() with an index 1. A dynamic shuffle layer cannot be converted back to a static shuffle layer. For a dynamic shuffle layer, the values 0 and 1 are valid. The indices in the dynamic case are as follows (index, then description):
 - 0 – Data or Shape tensor to be shuffled.
 - 1 – The dimensions for the reshape operation, as a 1D int32 shape tensor.
 If this function is called with a value 1, then num_inputs changes from 1 to 2. Parameters:
- index – The index of the input tensor. 
- tensor – The input tensor. 
 
 
 
ISliceLayer¶
- class tensorrt.ISliceLayer¶
- A slice layer in an INetworkDefinition. The slice layer has two variants, static and dynamic. Static slice specifies the start, size, and stride dimensions at layer creation time via Dims and can use the get/set accessor functions of the ISliceLayer. Dynamic slice specifies one or more of start, size, stride, or axes as ITensors, by using ILayer.set_input() to add a second, third, fourth, or sixth input respectively. The corresponding Dims are used if an input is missing or null.
 An application can determine if the ISliceLayer has a dynamic output shape based on whether the size or axes input is present and non-null.
 The slice layer selects for each dimension a start location from within the input tensor, and copies elements to the output tensor using the specified stride across the input tensor. Start, size, and stride tensors must be 1-D integer-typed shape tensors if not specified via Dims.
 An example of using slice on a tensor:
   input = {{0, 2, 4}, {1, 3, 5}}
   start = {1, 0}
   size = {1, 2}
   stride = {1, 2}
   output = {{1, 5}}
 If axes is provided then starts, sizes, and strides must have the same length as axes and specify a subset of dimensions to slice. If axes is not provided, starts, sizes, and strides must be of the same length as the rank of the input tensor.
 An example of using slice on a tensor with axes specified:
   input = {{0, 2, 4}, {1, 3, 5}}
   start = {1}
   size = {2}
   stride = {1}
   axes = {1}
   output = {{2, 4}, {3, 5}}
 When the sampleMode is SampleMode.CLAMP or SampleMode.REFLECT, for each input dimension, if its size is 0 then the corresponding output dimension must be 0 too.
 When the sampleMode is SampleMode.FILL, the fifth input to the slice layer is used to determine the value to fill in out-of-bound indices. It is an error to specify the fifth input in any other sample mode.
 A slice layer can produce a shape tensor if the following conditions are met:
- start, size, and stride are build time constants, either as static Dims or as constant input tensors.
- axes, if provided, is a build time constant, either as static Dims or as a constant input tensor.
- The number of elements in the output tensor does not exceed 2 * Dims.MAX_DIMS.
 The input tensor is a shape tensor if the output is a shape tensor.
 The following constraints must be satisfied to execute this layer on DLA:
- start, size, and stride are build time constants, either as static Dims or as constant input tensors.
- axes, if provided, is a build time constant, either as static Dims or as a constant input tensor.
- sampleMode is SampleMode.STRICT_BOUNDS, SampleMode.WRAP, or SampleMode.FILL.
- Strides are 1 for all dimensions.
- Slicing is not performed on the first dimension.
- The input tensor has four dimensions.
- For SampleMode.FILL, the fill value input is a scalar output of an IConstantLayer with value 0 that is not consumed by any other layer.
 Variables:
- start – Dims The start offset.
- shape – Dims The output dimensions.
- stride – Dims The slicing stride.
- mode – SampleMode Controls how ISliceLayer handles out of bounds coordinates.
- axes – Dims The axes that starts, sizes, and strides correspond to.
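Both slice examples above can be reproduced with a short pure-Python sketch for rank-2 inputs. The helper `slice_2d` is hypothetical and performs no out-of-bounds handling, so it corresponds to in-bounds slicing only; axes not listed in `axes` are taken in full with stride 1.

```python
def slice_2d(data, start, size, stride, axes=None):
    """Slice a rank-2 nested list using start/size/stride semantics."""
    if axes is None:
        axes = [0, 1]
    # Defaults for axes that are not listed: whole extent, stride 1.
    full_start = [0, 0]
    full_size = [len(data), len(data[0])]
    full_stride = [1, 1]
    for k, axis in enumerate(axes):
        full_start[axis] = start[k]
        full_size[axis] = size[k]
        full_stride[axis] = stride[k]
    return [
        [
            data[full_start[0] + i * full_stride[0]]
                [full_start[1] + j * full_stride[1]]
            for j in range(full_size[1])
        ]
        for i in range(full_size[0])
    ]
```

Note that size counts output elements, not an end index: a size of 2 with stride 2 reads input positions start and start + 2.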
 
 - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) -> None¶
- Sets the input tensor for the given index. The index must be 0 or 4 for a static slice layer. A static slice layer is converted to a dynamic slice layer by calling set_input() with an index between 1 and 3. A dynamic slice layer cannot be converted back to a static slice layer. The indices are as follows (index, then description):
 - 0 – Data or Shape tensor to be sliced.
 - 1 – The start tensor to begin slicing, N-dimensional for Data, and 1-D for Shape.
 - 2 – The size tensor of the resulting slice, N-dimensional for Data, and 1-D for Shape.
 - 3 – The stride of the slicing operation, N-dimensional for Data, and 1-D for Shape.
 - 4 – Value for the SampleMode.FILL slice mode. Disallowed for other modes.
 - 5 – The axes tensor indicating the axes that starts, sizes, and strides correspond to. Must be a 1-D tensor.
 If this function is called with a value greater than 0, then num_inputs changes from 1 to index + 1. Parameters:
- index – The index of the input tensor. 
- tensor – The input tensor. 
 
 
 
IShapeLayer¶
- class tensorrt.IShapeLayer¶
- A shape layer in an INetworkDefinition. Used for getting the shape of a tensor. This class sets the output to a one-dimensional tensor with the dimensions of the input tensor. For example, if the input is a four-dimensional tensor (of any type) with dimensions [2,3,5,7], the output tensor is a one-dimensional int64 tensor of length 4 containing the sequence 2, 3, 5, 7.
ITopKLayer¶
- tensorrt.TopKOperation¶
- The operations that may be performed by a TopK layer - Members: - MAX : Maximum of the elements - MIN : Minimum of the elements 
- class tensorrt.ITopKLayer¶
- A TopK layer in an INetworkDefinition. Variables:
- op – TopKOperation The operation for the layer.
- k – int The k value for the layer. Currently only values up to 3840 are supported. Use the set_input() method with index 1 to pass in dynamic k as a tensor.
- axes – int The axes along which to reduce.
 
 - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) -> None¶
- Sets the input tensor for the given index. The index must be 0 or 1 for a TopK layer. The indices are as follows (index, then description):
 - 0 – Input data tensor.
 - 1 – A scalar Int32 tensor containing a positive value corresponding to the number of top elements to retrieve. Values larger than 3840 will result in a runtime error. If provided, this will override the static k value in calculations.
 - Parameters:
- index – The index of the input tensor. 
- tensor – The input tensor. 
 
 
 
IMatrixMultiplyLayer¶
- tensorrt.MatrixOperation¶
- The matrix operations that may be performed by a Matrix layer - Members: - NONE : - TRANSPOSE : Transpose each matrix - VECTOR : Treat operand as collection of vectors 
- class tensorrt.IMatrixMultiplyLayer¶
- A matrix multiply layer in an INetworkDefinition. Let A be op(getInput(0)) and B be op(getInput(1)) where op(x) denotes the corresponding MatrixOperation. When A and B are matrices or vectors, computes the inner product A * B:
 - matrix * matrix -> matrix
 - matrix * vector -> vector
 - vector * matrix -> vector
 - vector * vector -> scalar
 Inputs of higher rank are treated as collections of matrices or vectors. The output will be a corresponding collection of matrices, vectors, or scalars. Variables:
- op0 – MatrixOperation How to treat the first input.
- op1 – MatrixOperation How to treat the second input.
 
 
IRaggedSoftMaxLayer¶
- class tensorrt.IRaggedSoftMaxLayer¶
- A ragged softmax layer in an INetworkDefinition. This layer takes a ZxS input tensor and an additional Zx1 bounds tensor holding the lengths of the Z sequences. This layer computes a softmax across each of the Z sequences. The output tensor is of the same size as the input tensor.
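A ragged softmax over Z sequences can be sketched in plain Python. The helper name `ragged_softmax` is hypothetical, `bounds` is passed as a flat list of Z lengths rather than a Zx1 tensor, and writing 0.0 into the positions past each sequence length is an assumption made here so that the output keeps the ZxS shape.

```python
import math


def ragged_softmax(rows, bounds):
    """Softmax over the first bounds[z] entries of each row."""
    out = []
    for row, length in zip(rows, bounds):
        active = row[:length]
        # Subtract the maximum for numerical stability.
        peak = max(active) if active else 0.0
        exps = [math.exp(v - peak) for v in active]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Assumption: positions beyond the sequence length are set to 0.0.
        out.append(probs + [0.0] * (len(row) - length))
    return out
```

Values beyond a sequence's length do not influence its probabilities, however large they are.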
IIdentityLayer¶
- class tensorrt.IIdentityLayer¶
- A layer that represents the identity function. If tensor precision is explicitly specified, it can be used to transform from one precision to another. Other than conversions between the same type (float32 -> float32 for example), the only valid conversions are:
 - (float32 | float16 | int32 | bool) -> (float32 | float16 | int32 | bool)
 - (float32 | float16) -> uint8
 - uint8 -> (float32 | float16)
IConstantLayer¶
- class tensorrt.IConstantLayer¶
- A constant layer in an INetworkDefinition. Note: This layer does not support boolean and uint8 types.
IResizeLayer¶
- class tensorrt.IResizeLayer¶
- A resize layer in an INetworkDefinition. Resize layer can be used for resizing an N-D tensor. Resize layer currently supports the following configurations:
- InterpolationMode.NEAREST - resizes innermost m dimensions of N-D, where 0 < m <= min(3, N) and N > 0.
- InterpolationMode.LINEAR - resizes innermost m dimensions of N-D, where 0 < m <= min(3, N) and N > 0. 
- InterpolationMode.CUBIC - resizes innermost 2 dimensions of N-D, N >= 2. 
 - Default resize mode is InterpolationMode.NEAREST. - Resize layer provides two ways to resize tensor dimensions: - Set output dimensions directly. It can be done for static as well as dynamic resize layer.
- Static resize layer requires output dimensions to be known at build-time. Dynamic resize layer requires output dimensions to be set as one of the input tensors. 
 
- Set scales for resize. Each output dimension is calculated as floor(input dimension * scale).
- Only static resize layer allows setting scales where the scales are known at build-time. 
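The scale-based rule, floor(input dimension * scale), can be checked with a shape-only sketch. `resized_shape` is a hypothetical helper and assumes one scale per input dimension.

```python
import math


def resized_shape(input_shape, scales):
    """Output shape where each dimension is floor(input dimension * scale)."""
    if len(scales) != len(input_shape):
        raise ValueError("one scale is required per input dimension")
    return tuple(math.floor(d * s) for d, s in zip(input_shape, scales))
```

Fractional products are truncated downward, so a 5x5 image scaled by 1.5 becomes 7x7 rather than 8x8.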
 
 - If executing this layer on DLA, the following combinations of parameters are supported: - In NEAREST mode: - (ResizeCoordinateTransformation.ASYMMETRIC, ResizeSelector.FORMULA, ResizeRoundMode.FLOOR) 
- (ResizeCoordinateTransformation.HALF_PIXEL, ResizeSelector.FORMULA, ResizeRoundMode.HALF_DOWN) 
- (ResizeCoordinateTransformation.HALF_PIXEL, ResizeSelector.FORMULA, ResizeRoundMode.HALF_UP) 
 
- In LINEAR and CUBIC mode: - (ResizeCoordinateTransformation.HALF_PIXEL, ResizeSelector.FORMULA) 
- (ResizeCoordinateTransformation.HALF_PIXEL, ResizeSelector.UPPER) 
 
 - Variables:
- shape – - Dims The output dimensions. The number of output dimensions must equal the number of input dimensions.
- scales – - List[float] List of resize scales. If executing this layer on DLA, there are three restrictions: 1. - len(scales) has to be exactly 4. 2. The first two elements in scales need to be exactly 1 (for unchanged batch and channel dimensions). 3. The last two elements in scales, representing the scale values along height and width dimensions, respectively, need to be integer values in the range of [1, 32] for NEAREST mode and [1, 4] for LINEAR. Example of DLA-supported scales: [1, 1, 2, 2].
- resize_mode – - InterpolationMode Resize mode can be Linear, Cubic or Nearest.
- coordinate_transformation – - ResizeCoordinateTransformation Supported resize coordinate transformation modes are ALIGN_CORNERS, ASYMMETRIC and HALF_PIXEL.
- selector_for_single_pixel – - ResizeSelector Supported resize selector modes are FORMULA and UPPER.
- nearest_rounding – - ResizeRoundMode Supported resize round modes are HALF_UP, HALF_DOWN, FLOOR and CEIL.
- exclude_outside – - int If set to 1, the weight of sampling locations outside the input tensor will be set to 0, and the weights will be renormalized so that their sum is 1.0.
- cubic_coeff – - float The coefficient ‘a’ used in cubic interpolation.
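The three DLA restrictions on scales can be expressed as a pre-check before building. The sketch below is plain Python with a hypothetical helper name; modes are passed as strings for illustration, and only the NEAREST and LINEAR ranges quoted above are covered.

```python
def dla_scales_supported(scales, mode):
    """Check the three documented DLA restrictions on resize scales.

    mode is "NEAREST" or "LINEAR" (plain strings, used here for illustration).
    """
    limits = {"NEAREST": 32, "LINEAR": 4}
    if mode not in limits:
        raise ValueError("the documented ranges cover NEAREST and LINEAR only")
    if len(scales) != 4:
        return False  # restriction 1: exactly four scales
    if scales[0] != 1 or scales[1] != 1:
        return False  # restriction 2: batch and channel stay unchanged
    # restriction 3: integer height/width scales within the per-mode range
    return all(
        float(s).is_integer() and 1 <= s <= limits[mode] for s in scales[2:]
    )
```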
 
 - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None¶
- Sets the input tensor for the given index. - If index == 1 and num_inputs == 1, num_inputs changes to 2. Once this additional input is set, the resize layer works in dynamic mode: the output dimensions are taken from the second input tensor, overriding the dimensions supplied by shape. - Parameters:
- index – The index of the input tensor. 
- tensor – The input tensor. 
 
 
 
ILoop¶
- class tensorrt.ILoop¶
- Helper for creating a recurrent subgraph. - Variables:
- name – The name of the loop. The name is used in error diagnostics. 
 - add_iterator(self: tensorrt.tensorrt.ILoop, tensor: tensorrt.tensorrt.ITensor, axis: int = 0, reverse: bool = False) tensorrt.tensorrt.IIteratorLayer¶
- Return layer that subscripts tensor by loop iteration. - For reverse=false, this is equivalent to add_gather(tensor, I, 0) where I is a scalar tensor containing the loop iteration number. For reverse=true, this is equivalent to add_gather(tensor, M-1-I, 0) where M is the trip count computed from TripLimits of kind - COUNT.- Parameters:
- tensor – The tensor to iterate over. 
- axis – The axis along which to iterate. 
- reverse – Whether to iterate in the reverse direction. 
 
- Returns:
- The - IIteratorLayer, or- Noneif it could not be created.
 
 - add_loop_output(self: tensorrt.tensorrt.ILoop, tensor: tensorrt.tensorrt.ITensor, kind: tensorrt.tensorrt.LoopOutput, axis: int = 0) tensorrt.tensorrt.ILoopOutputLayer¶
- Make an output for this loop, based on the given tensor. - If - kindis- CONCATENATEor- REVERSE, a second input specifying the concatenation dimension must be added via method- ILoopOutputLayer.set_input().- Parameters:
- kind – The kind of loop output. See - LoopOutput
- axis – The axis for concatenation (if using - kindof- CONCATENATEor- REVERSE).
 
- Returns:
- The added - ILoopOutputLayer, or- Noneif it could not be created.
 
 - add_recurrence(self: tensorrt.tensorrt.ILoop, initial_value: tensorrt.tensorrt.ITensor) tensorrt.tensorrt.IRecurrenceLayer¶
- Create a recurrence layer for this loop with initial_value as its first input. - Parameters:
- initial_value – The initial value of the recurrence layer. 
- Returns:
- The added - IRecurrenceLayer, or- Noneif it could not be created.
 
 - add_trip_limit(self: tensorrt.tensorrt.ILoop, tensor: tensorrt.tensorrt.ITensor, kind: tensorrt.tensorrt.TripLimit) tensorrt.tensorrt.ITripLimitLayer¶
- Add a trip-count limiter, based on the given tensor. - There may be at most one - COUNTand one- WHILElimiter for a loop. When both trip limits exist, the loop exits when the count is reached or condition is falsified. It is an error to not add at least one trip limiter.- For - WHILE, the input tensor must be the output of a subgraph that contains only layers that are not- ITripLimitLayer,- IIteratorLayeror- ILoopOutputLayer. Any- IRecurrenceLayers in the subgraph must belong to the same loop as the- ITripLimitLayer. A trivial example of this rule is that the input to the- WHILEis the output of an- IRecurrenceLayerfor the same loop.- Parameters:
- tensor – The input tensor. Must be available before the loop starts. 
- kind – The kind of trip limit. See - TripLimit
 
- Returns:
- The added - ITripLimitLayer, or- Noneif it could not be created.
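To make the roles of the iterator, recurrence, trip limits and loop outputs concrete, here is a simplified plain-Python model of a loop with one iterator and one recurrence. It is a sketch, not TensorRT code: `run_loop` and its arguments are hypothetical, the WHILE condition is assumed to be checked before each iteration, and the reported outputs are those of the tensor produced by `step` inside the loop.

```python
def run_loop(tensor, initial_value, step, trip_count=None, keep_going=None,
             reverse=False):
    """Simplified model of a loop with one iterator and one recurrence.

    tensor        -- list subscripted per iteration (role of add_iterator)
    initial_value -- first input of the recurrence (role of add_recurrence)
    step          -- function (state, element) -> next recurrence value
    trip_count    -- COUNT trip limit, or None
    keep_going    -- WHILE trip limit as a function of the state, or None
    """
    if trip_count is None and keep_going is None:
        raise ValueError("a loop needs at least one trip limiter")
    if reverse and trip_count is None:
        raise ValueError("reverse iteration needs a COUNT trip limit (M)")
    state = initial_value
    history = []
    i = 0
    while trip_count is None or i < trip_count:
        # Assumption: the WHILE condition is evaluated before each iteration.
        if keep_going is not None and not keep_going(state):
            break
        index = trip_count - 1 - i if reverse else i  # M-1-I when reversed
        state = step(state, tensor[index])
        history.append(state)
        i += 1
    return {
        "LAST_VALUE": state,
        "CONCATENATE": history,
        "REVERSE": history[::-1],
    }
```

Running it with `step=lambda s, x: s + x` over `[1, 2, 3, 4]` and a count of 4 gives a running sum, which shows how the three output kinds differ.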
 
 
ILoopBoundaryLayer¶
ITripLimitLayer¶
- tensorrt.TripLimit¶
- Describes kinds of trip limits. - Members: - COUNT : Tensor is a scalar of type - int32that contains the trip count.- WHILE : Tensor is a scalar of type - bool. Loop terminates when its value is false.
IRecurrenceLayer¶
- class tensorrt.IRecurrenceLayer¶
- set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None¶
- Set the first or second input. If index==1 and the number of inputs is one, the input is appended. The first input specifies the initial output value, and must come from outside the loop. The second input specifies the next output value, and must come from inside the loop. The two inputs must have the same dimensions. - Parameters:
- index – The index of the input to set. 
- tensor – The input tensor. 
 
 
 
IIteratorLayer¶
- class tensorrt.IIteratorLayer¶
- Variables:
- axis – The axis to iterate over 
- reverse – For reverse=false, the layer is equivalent to add_gather(tensor, I, 0) where I is a scalar tensor containing the loop iteration number. For reverse=true, the layer is equivalent to add_gather(tensor, M-1-I, 0) where M is the trip count computed from TripLimits of kind - COUNT. The default is reverse=false.
 
 
ILoopOutputLayer¶
- tensorrt.LoopOutput¶
- Describes kinds of loop outputs. - Members: - LAST_VALUE : Output value is value of tensor for last iteration. - CONCATENATE : Output value is concatenation of values of tensor for each iteration, in forward order. - REVERSE : Output value is concatenation of values of tensor for each iteration, in reverse order. 
- class tensorrt.ILoopOutputLayer¶
- An - ILoopOutputLayer is the sole way to get output from a loop. - The first input tensor must be defined inside the loop; the output tensor is outside the loop. The second input tensor, if present, must be defined outside the loop. - If - kind is - LAST_VALUE, a single input must be provided. - If - kind is - CONCATENATE or - REVERSE, a second input must be provided. The second input must be a scalar “shape tensor”, defined before the loop commences, that specifies the concatenation length of the output. - The output tensor has j more dimensions than the input tensor, where j == 0 if - kind is - LAST_VALUE, and j == 1 if - kind is - CONCATENATE or - REVERSE. - Variables:
- axis – The concatenation axis. Ignored if - kind is - LAST_VALUE. For example, if the input tensor has dimensions [b,c,d], and - kind is - CONCATENATE, the output has four dimensions. Let a be the value of the second input. axis=0 causes the output to have dimensions [a,b,c,d]. axis=1 causes the output to have dimensions [b,a,c,d]. axis=2 causes the output to have dimensions [b,c,a,d]. axis=3 causes the output to have dimensions [b,c,d,a]. The default axis is 0.
- kind – The kind of loop output. See - LoopOutput
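The axis rule for loop outputs can be reproduced with a short shape helper. This is a plain-Python sketch; the function name is hypothetical and kinds are passed as strings for illustration.

```python
def loop_output_dims(input_dims, kind, axis=0, length=None):
    """Output dimensions of a loop output, following the axis rule above.

    length is the value of the second input (the concatenation length).
    """
    if kind == "LAST_VALUE":
        return list(input_dims)  # j == 0: same dimensions, axis is ignored
    if kind not in ("CONCATENATE", "REVERSE"):
        raise ValueError("unknown loop output kind")
    if length is None:
        raise ValueError("CONCATENATE and REVERSE need a second input")
    if not 0 <= axis <= len(input_dims):
        raise ValueError("axis out of range")
    # j == 1: insert the concatenation length at position axis
    return list(input_dims[:axis]) + [length] + list(input_dims[axis:])
```

Using symbolic names keeps the result easy to compare with the text: `loop_output_dims(["b", "c", "d"], "CONCATENATE", axis=1, length="a")` corresponds to [b,a,c,d].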
 
 - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None¶
- Like - ILayer.set_input(), but additionally works if index == 1 and - num_inputs == 1, in which case - num_inputs changes to 2.
 
IFillLayer¶
- tensorrt.FillOperation¶
- The tensor fill operations that may be performed by a Fill layer. - Members: - LINSPACE : Generate evenly spaced numbers over a specified interval - RANDOM_UNIFORM : Generate a tensor with random values drawn from a uniform distribution - RANDOM_NORMAL : Generate a tensor with random values drawn from a normal distribution 
- class tensorrt.IFillLayer¶
- A fill layer in an - INetworkDefinition.- The data type of the output tensor can be specified by - to_type. The supported output types for each fill operation are as follows (operation - to_type):
- kLINSPACE - int32, int64, float32
- kRANDOM_UNIFORM - float16, float32
- kRANDOM_NORMAL - float16, float32
 - Variables:
- to_type – - DataType The specified data type of the output tensor. Defaults to tensorrt.float32.
 - is_alpha_beta_int64(self: tensorrt.tensorrt.IFillLayer) bool¶
 - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None¶
- Replaces an input of this layer with a specific tensor. - Inputs for kLINSPACE (index - description):
- 0 - Shape tensor, represents the output tensor’s dimensions.
- 1 - Start, a scalar, represents the start value.
- 2 - Delta, a 1D tensor whose length equals the shape tensor’s nbDims, represents the delta value for each dimension.
 - Inputs for kRANDOM_UNIFORM (index - description):
- 0 - Shape tensor, represents the output tensor’s dimensions.
- 1 - Minimum, a scalar, represents the minimum random value.
- 2 - Maximum, a scalar, represents the maximum random value.
 - Inputs for kRANDOM_NORMAL (index - description):
- 0 - Shape tensor, represents the output tensor’s dimensions.
- 1 - Mean, a scalar, represents the mean of the normal distribution.
- 2 - Scale, a scalar, represents the standard deviation of the normal distribution.
 - Parameters:
- index – the index of the input to modify. 
- tensor – the input tensor. 
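As a rough illustration of how the three LINSPACE inputs combine, the sketch below builds the output as nested Python lists. It assumes each element equals the start value plus the sum of index times delta over all dimensions; that formula is an assumption of this sketch and is not stated in the text above. The function name is hypothetical.

```python
def linspace_fill(shape, start, delta):
    """Nested-list model of a LINSPACE fill.

    Assumption: element[i0, i1, ...] = start + i0*delta[0] + i1*delta[1] + ...
    """
    if len(delta) != len(shape):
        raise ValueError("delta needs one entry per output dimension")

    def build(prefix, dims):
        if not dims:
            return start + sum(i * d for i, d in zip(prefix, delta))
        return [build(prefix + (i,), dims[1:]) for i in range(dims[0])]

    return build((), tuple(shape))
```

In the 1D case this reduces to evenly spaced values: shape [4], start 1 and delta [2] give 1, 3, 5, 7.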
 
 
 
IQuantizeLayer¶
- class tensorrt.IQuantizeLayer¶
- A Quantize layer in an - INetworkDefinition.- This layer accepts a floating-point data input tensor, and uses the scale and zeroPt inputs to - quantize the data to an 8-bit signed integer according to: - \(output = clamp(round(input / scale) + zeroPt)\) - Rounding type is rounding-to-nearest ties-to-even (https://en.wikipedia.org/wiki/Rounding#Round_half_to_even). - Clamping is in the range [-128, 127]. - The first input (index 0) is the tensor to be quantized. The second (index 1) and third (index 2) are the scale and zero point respectively. Each of scale and zeroPt must be either a scalar, or a 1D tensor. - The zeroPt tensor is optional, and if not set, will be assumed to be zero. Its data type must be tensorrt.int8. zeroPt must only contain zero-valued coefficients, because only symmetric quantization is supported. The scale value must be either a scalar for per-tensor quantization, or a 1D tensor for per-axis quantization. The size of the 1-D scale tensor must match the size of the quantization axis. The size of the scale must match the size of the zeroPt. - The subgraph which terminates with the scale tensor must be a build-time constant. The same restrictions apply to the zeroPt. The output type, if constrained, must be constrained to tensorrt.int8 or tensorrt.fp8. The input type, if constrained, must be constrained to tensorrt.float32, tensorrt.float16 or tensorrt.bfloat16. The output size is the same as the input size. - IQuantizeLayer supports tensorrt.float32, tensorrt.float16 and tensorrt.bfloat16 precision and will default to tensorrt.float32 precision during instantiation. IQuantizeLayer supports tensorrt.int8, tensorrt.float8, tensorrt.int4 and tensorrt.fp4 output. - Variables:
- axis – - intThe axis along which quantization occurs. The quantization axis is in reference to the input tensor’s dimensions.
- to_type – - DataTypeThe specified data type of the output tensor. Must be tensorrt.int8 or tensorrt.float8.
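The int8 quantization formula can be tried out in plain Python, whose built-in `round()` also rounds ties to even. This is a reference sketch with a hypothetical function name, not the layer's implementation; it covers per-tensor (scalar scale) quantization only.

```python
def quantize_int8(values, scale, zero_pt=0):
    """Per-tensor model of output = clamp(round(input / scale) + zeroPt)."""
    out = []
    for x in values:
        q = round(x / scale) + zero_pt  # round() rounds ties to even
        out.append(max(-128, min(127, q)))  # clamp to [-128, 127]
    return out
```

Note how 0.5 and 2.5 round to the nearest even integers (0 and 2), and how large magnitudes saturate at the clamp limits.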
 
 
IDequantizeLayer¶
- class tensorrt.IDequantizeLayer¶
- A Dequantize layer in an - INetworkDefinition.- This layer accepts a signed 8-bit integer input tensor, and uses the configured scale and zeroPt inputs to dequantize the input according to: \(output = (input - zeroPt) * scale\) - The first input (index 0) is the tensor to be dequantized. The second (index 1) and third (index 2) are the scale and zero point respectively. Each of scale and zeroPt must be either a scalar, or a 1D tensor. - The zeroPt tensor is optional, and if not set, will be assumed to be zero. Its data type must be tensorrt.int8. zeroPt must only contain zero-valued coefficients, because only symmetric quantization is supported. The scale value must be either a scalar for per-tensor quantization, or a 1D tensor for per-axis quantization. The size of the 1-D scale tensor must match the size of the quantization axis. The size of the scale must match the size of the zeroPt. - The subgraph which terminates with the scale tensor must be a build-time constant. The same restrictions apply to the zeroPt. The input type, if constrained, must be constrained to tensorrt.int8 or tensorrt.fp8. The output type, if constrained, must be constrained to tensorrt.float32, tensorrt.float16 or tensorrt.bfloat16. The output size is the same as the input size. - IDequantizeLayer supports tensorrt.int8, tensorrt.float8, tensorrt.int4 and tensorrt.fp4 precision and will default to tensorrt.int8 precision during instantiation. IDequantizeLayer supports tensorrt.float32, tensorrt.float16 and tensorrt.bfloat16 output. - Variables:
- axis – - intThe axis along which dequantization occurs. The dequantization axis is in reference to the input tensor’s dimensions.
- to_type – - DataTypeThe specified data type of the output tensor. Must be tensorrt.float32 or tensorrt.float16.
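The dequantization formula, with a scalar scale (per-tensor) and with one scale per slice along axis 0 (per-axis), can be sketched in plain Python as follows. Both helper names are hypothetical, and the per-axis variant is limited to 2D nested lists with quantization axis 0 for brevity.

```python
def dequantize(values, scale, zero_pt=0):
    """Per-tensor model of output = (input - zeroPt) * scale."""
    return [(q - zero_pt) * scale for q in values]


def dequantize_per_axis0(rows, scales):
    """Per-axis model for a 2D input with quantization axis 0.

    One scale per row; the scale count must match the size of axis 0.
    """
    if len(rows) != len(scales):
        raise ValueError("scale size must match the quantization axis size")
    return [dequantize(row, s) for row, s in zip(rows, scales)]
```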
 
 
IDynamicQuantizeLayer¶
- class tensorrt.IDynamicQuantizeLayer¶
- A DynamicQuantize layer in an - INetworkDefinition.- This layer performs dynamic block quantization of its input tensor and outputs the quantized data and the computed block scale factors. The size of the blocked axis must be divisible by the block size. - The first input (index 0) is the tensor to be quantized. Its data type must be one of DataType::kFLOAT, DataType::kHALF, or DataType::kBF16. Currently only 2D and 3D inputs are supported. - The second input (index 1) is the double quantization scale factor. It is a scalar scale factor used to quantize the computed block scale factors. - Variables:
- axis – - intThe axis that is sliced into blocks. The axis must be the last dimension or the second to last dimension.
- block_size – - intThe number of elements that are quantized using a shared scale factor. Supports block sizes of 16 with NVFP4 quantization and 32 with MXFP8 quantization.
- output_type – - DataTypeThe data type of the quantized output tensor, must be either DataType::kFP4 (NVFP4 quantization) or DataType::kFP8 (MXFP8 quantization).
- scale_type – - DataTypeThe data type of the scale factor used for quantizing the input data, must be DataType::kFP8 (NVFP4 quantization) or DataType::kE8M0 (MXFP8 quantization).
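The constraints stated for this layer can be gathered into one pre-check. The sketch below is plain Python with hypothetical names; data types are written as short strings ("FP4", "FP8", "E8M0") for illustration, and axes are assumed to be non-negative.

```python
# (output_type, scale_type) -> required block size, per the variables above.
SUPPORTED_COMBINATIONS = {
    ("FP4", "FP8"): 16,   # NVFP4 quantization
    ("FP8", "E8M0"): 32,  # MXFP8 quantization
}


def dynamic_quantize_config_ok(input_dims, axis, block_size, output_type,
                               scale_type):
    """Check the documented DynamicQuantize constraints (non-negative axis)."""
    rank = len(input_dims)
    if rank not in (2, 3):
        return False  # only 2D and 3D inputs
    if axis not in (rank - 1, rank - 2):
        return False  # last or second-to-last dimension only
    if SUPPORTED_COMBINATIONS.get((output_type, scale_type)) != block_size:
        return False  # type pair and block size must match
    return input_dims[axis] % block_size == 0  # blocked axis divisible
```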
 
 
IScatterLayer¶
- class tensorrt.IScatterLayer¶
- A Scatter layer in an - INetworkDefinition. - Variables:
- axis – The axis to scatter on when using Scatter Element mode (ignored in ND mode).
- mode – - ScatterMode The operation mode of the scatter.
IIfConditional¶
- class tensorrt.IIfConditional¶
- Helper for constructing conditionally-executed subgraphs. - An If-conditional conditionally executes (lazy evaluation) part of the network according to the following pseudo-code: - If condition is true Then: output = trueSubgraph(trueInputs); Else: output = falseSubgraph(falseInputs); Emit output - Condition is a 0D boolean tensor (representing a scalar). trueSubgraph represents a network subgraph that is executed when condition is evaluated to True. falseSubgraph represents a network subgraph that is executed when condition is evaluated to False. - The following constraints apply to If-conditionals:
- Both the trueSubgraph and falseSubgraph must be defined.
- The number of output tensors in both subgraphs is the same.
- The type and shape of each output tensor from the true/false subgraphs are the same, except that the shapes are allowed to differ if the condition is a build-time constant.
 - add_input(self: tensorrt.tensorrt.IIfConditional, input: tensorrt.tensorrt.ITensor) tensorrt.tensorrt.IIfConditionalInputLayer¶
- Make an input for this if-conditional, based on the given tensor. - Parameters:
- input – An input to the conditional that can be used by either or both of the conditional’s subgraphs. 
 
 - add_output(self: tensorrt.tensorrt.IIfConditional, true_subgraph_output: tensorrt.tensorrt.ITensor, false_subgraph_output: tensorrt.tensorrt.ITensor) tensorrt.tensorrt.IIfConditionalOutputLayer¶
- Make an output for this if-conditional, based on the given tensors. - Each output layer of the if-conditional represents a single output of either the true-subgraph or the false-subgraph of the if-conditional, depending on which subgraph was executed. - Parameters:
- true_subgraph_output – The output of the subgraph executed when this conditional’s condition input evaluates to true. 
- false_subgraph_output – The output of the subgraph executed when this conditional’s condition input evaluates to false. 
 
- Returns:
- The - IIfConditionalOutputLayer, or- Noneif it could not be created.
 
 - set_condition(self: tensorrt.tensorrt.IIfConditional, condition: tensorrt.tensorrt.ITensor) tensorrt.tensorrt.IConditionLayer¶
- Set the condition tensor for this If-Conditional construct. - The - conditiontensor must be a 0D data tensor (scalar) with type- bool.- Parameters:
- condition – The condition tensor that will determine which subgraph to execute. 
- Returns:
- The - IConditionLayer, or- Noneif it could not be created.
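The lazy-evaluation behaviour of an if-conditional can be mimicked in plain Python by passing the two subgraphs as functions, so that only the selected one runs. This is an illustrative sketch; `if_conditional` is a hypothetical helper, not part of the TensorRT API.

```python
def if_conditional(condition, true_subgraph, false_subgraph, inputs):
    """Run exactly one of two subgraphs, chosen by a scalar boolean.

    true_subgraph and false_subgraph are functions returning a list of outputs.
    """
    if not isinstance(condition, bool):
        raise TypeError("condition must be a scalar boolean")
    # Lazy evaluation: the subgraph that is not selected is never called.
    branch = true_subgraph if condition else false_subgraph
    return branch(*inputs)
```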
 
 
IConditionLayer¶
- class tensorrt.IConditionLayer¶
- Describes the boolean condition of an if-conditional. 
IIfConditionalOutputLayer¶
- class tensorrt.IIfConditionalOutputLayer¶
- Describes kinds of if-conditional outputs. 
IIfConditionalInputLayer¶
- class tensorrt.IIfConditionalInputLayer¶
- Describes kinds of if-conditional inputs. 
IEinsumLayer¶
- class tensorrt.IEinsumLayer¶
- An Einsum layer in an - INetworkDefinition.- This layer implements a summation over the elements of the inputs along dimensions specified by the equation parameter, based on the Einstein summation convention. The layer can have one or more inputs of rank >= 0. All the inputs must be of the same data type. This layer supports all TensorRT data types except - bool. There is one output tensor of the same type as the input tensors. The shape of the output tensor is determined by the equation. - The equation specifies ASCII lower-case letters for each dimension in the inputs in the same order as the dimensions, separated by a comma for each input. The dimensions labeled with the same subscript must match or be broadcastable. Repeated subscript labels in one input take the diagonal. Repeating a label across multiple inputs means that those axes will be multiplied. Omitting a label from the output means values along those axes will be summed. In implicit mode, the indices which appear once in the expression will be part of the output in increasing alphabetical order. In explicit mode, the output can be controlled by specifying output subscript labels by adding an arrow (‘->’) followed by subscripts for the output. For example, “ij,jk->ik” is equivalent to “ij,jk”. Ellipsis (‘...’) can be used in place of subscripts to broadcast the dimensions. See the TensorRT Developer Guide for more details on equation syntax. - Many common operations can be expressed using the Einsum equation. For example:
- Matrix Transpose: ij->ji
- Sum: ij->
- Matrix-Matrix Multiplication: ik,kj->ij
- Dot Product: i,i->
- Matrix-Vector Multiplication: ik,k->i
- Batch Matrix Multiplication: ijk,ikl->ijl
- Batch Diagonal: ...ii->...i
 - Note that TensorRT does not support ellipsis or diagonal operations. - Variables:
- equation – - strThe Einsum equation of the layer. The equation is a comma-separated list of subscript labels, where each label refers to a dimension of the corresponding tensor.
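The implicit-mode rule (labels that appear exactly once, in increasing alphabetical order) can be derived mechanically from an equation string. The sketch below is plain Python with a hypothetical helper name; it handles lower-case labels only and ignores ellipsis.

```python
from collections import Counter


def einsum_output_labels(equation):
    """Return the output subscripts of an Einsum equation.

    Explicit mode: whatever follows '->'.
    Implicit mode: labels appearing exactly once, sorted alphabetically.
    """
    if "->" in equation:
        return equation.split("->", 1)[1]
    counts = Counter(ch for ch in equation if ch.isalpha())
    return "".join(sorted(label for label, n in counts.items() if n == 1))
```

This reproduces the equivalence quoted above: the implicit equation `ij,jk` yields the same output labels as `ij,jk->ik`.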
 
IAssertionLayer¶
- class tensorrt.IAssertionLayer¶
- An assertion layer in an - INetworkDefinition.- This layer implements assertions. The input must be a boolean shape tensor. If any element of it is - False, a build-time or run-time error occurs. Asserting equality of input dimensions may help the optimizer.- Variables:
- message – - stringMessage to print if the assertion fails.
 
IOneHotLayer¶
- class tensorrt.IOneHotLayer¶
- A OneHot layer in a network definition. - The OneHot layer has three input tensors: Indices, Values, and Depth, one output tensor, Output, and an axis attribute.
- indices – An Int32 tensor that determines which locations in Output to set as on_value.
- values – A two-element (rank=1) tensor that consists of [off_value, on_value].
- depth – An Int32 shape tensor of rank 0, which contains the depth (number of classes) of the one-hot encoding. The depth tensor must be a build-time constant, and its value should be positive.
- Output – A tensor with rank = rank(indices)+1, where the added dimension contains the one-hot encoding.
- axis – Specifies to which dimension of the output the one-hot encoding is added.
 - The data type of Output is equal to the Values data type. The output is computed by copying off_value to all output elements, then setting on_value at the locations specified by the indices tensor.
- When axis = 0: output[indices[i, j, k], i, j, k] = on_value for all i, j, k and off_value otherwise.
- When axis = -1: output[i, j, k, indices[i, j, k]] = on_value for all i, j, k and off_value otherwise. 
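For a 1D indices tensor, the two axis rules above reduce to choosing whether the one-hot dimension comes last or first. The following plain-Python sketch shows both cases; the helper name is hypothetical, and indices are assumed to lie in [0, depth).

```python
def one_hot_1d(indices, depth, off_value, on_value, axis=-1):
    """One-hot encode a 1D list of indices as nested lists.

    axis=-1: output[i][indices[i]] = on_value (shape [len(indices), depth])
    axis=0:  output[indices[i]][i] = on_value (shape [depth, len(indices)])
    Assumes every index lies in [0, depth).
    """
    n = len(indices)
    if axis == -1:
        out = [[off_value] * depth for _ in range(n)]
        for i, c in enumerate(indices):
            out[i][c] = on_value
    elif axis == 0:
        out = [[off_value] * n for _ in range(depth)]
        for i, c in enumerate(indices):
            out[c][i] = on_value
    else:
        raise ValueError("this sketch handles axis 0 and -1 only")
    return out
```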
INonZeroLayer¶
- class tensorrt.INonZeroLayer¶
- A NonZero layer in an - INetworkDefinition.- Computes the indices of the input tensor where the value is non-zero. The returned indices are in row-major order. - The output shape is always {D, C}, where D is the number of dimensions of the input and C is the number of non-zero values. 
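The {D, C} output layout can be illustrated for a 2D input (D = 2) with a short plain-Python sketch. The helper name is hypothetical; the input is scanned in row-major order so the returned indices follow that order as well.

```python
def nonzero_2d(matrix):
    """Indices of non-zero entries of a 2D nested list, laid out as {D, C}.

    Row 0 of the result holds the row indices and row 1 the column indices,
    so the result has D = 2 rows and C = (number of non-zero values) columns.
    """
    rows, cols = [], []
    for r, row in enumerate(matrix):  # row-major scan
        for c, value in enumerate(row):
            if value != 0:
                rows.append(r)
                cols.append(c)
    return [rows, cols]
```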
INMSLayer¶
- class tensorrt.INMSLayer¶
- A non-maximum suppression layer in an - INetworkDefinition.- Boxes: The input boxes tensor to the layer. This tensor contains the input bounding boxes. It is a linear tensor of type - float32 or - float16. It has shape [batchSize, numInputBoundingBoxes, numClasses, 4] if the boxes are per class, or [batchSize, numInputBoundingBoxes, 4] if the same boxes are to be used for each class. - Scores: The input scores tensor to the layer. This tensor contains the per-box scores. It is a linear tensor of the same type as the boxes tensor. It has shape [batchSize, numInputBoundingBoxes, numClasses]. - MaxOutputBoxesPerClass: The input maxOutputBoxesPerClass tensor to the layer. This tensor contains the maximum number of output boxes per batch item per class. It is a scalar (0D tensor) of type - int32. - IoUThreshold is the maximum IoU for selected boxes. It is a scalar (0D tensor) of type - float32 in the range [0.0, 1.0]. It is an optional input with default 0.0. Use - set_input() to add this optional tensor. - ScoreThreshold is the value that a box score must exceed in order to be selected. It is a scalar (0D tensor) of type - float32. It is an optional input with default 0.0. Use - set_input() to add this optional tensor. - The SelectedIndices output tensor contains the indices of the selected boxes. It is a linear tensor of type - int32. It has shape [NumOutputBoxes, 3]. Each row contains a (batchIndex, classIndex, boxIndex) tuple. The output boxes are sorted in order of increasing batchIndex and then in order of decreasing score within each batchIndex. For each batchIndex, the ordering of output boxes with the same score is unspecified. If MaxOutputBoxesPerClass is a constant input, the maximum number of output boxes is batchSize * numClasses * min(numInputBoundingBoxes, MaxOutputBoxesPerClass). Otherwise, the maximum number of output boxes is batchSize * numClasses * numInputBoundingBoxes. 
The maximum number of output boxes is used to determine the upper-bound on allocated memory for this output tensor.- The NumOutputBoxes output tensor contains the number of output boxes in selectedIndices. It is a scalar (0D tensor) of type - int32.- The NMS algorithm iterates through a set of bounding boxes and their confidence scores, in decreasing order of score. Boxes are selected if their score is above a given threshold, and their intersection-over-union (IoU) with previously selected boxes is less than or equal to a given threshold. This layer implements NMS per batch item and per class. - For each batch item, the ordering of candidate bounding boxes with the same score is unspecified. - Variables:
- bounding_box_format – - BoundingBoxFormatThe bounding box format used by the layer. Default is CORNER_PAIRS.
- topk_box_limit – - intThe maximum number of filtered boxes considered for selection. Default is 2000 for SM 5.3 and 6.2 devices, and 5000 otherwise. The TopK box limit must be less than or equal to {2000 for SM 5.3 and 6.2 devices, 5000 otherwise}.
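The selection rule described above (visit boxes by decreasing score, keep a box if its score exceeds the score threshold and its IoU with every previously kept box is at most the IoU threshold) can be sketched for a single batch item and class. This is a plain-Python reference sketch, not the layer's implementation; the function names are hypothetical, and boxes are assumed to be (x1, y1, x2, y2) corner pairs with x1 <= x2 and y1 <= y2.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms_single_class(boxes, scores, max_output_boxes,
                     iou_threshold=0.0, score_threshold=0.0):
    """Greedy NMS for one batch item and one class; returns box indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    selected = []
    for i in order:
        if len(selected) >= max_output_boxes:
            break
        if scores[i] <= score_threshold:
            continue  # the score must exceed the threshold
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in selected):
            selected.append(i)
    return selected


def max_output_box_count(batch_size, num_classes, num_boxes,
                         max_per_class=None):
    """Upper bound on output boxes; max_per_class=None models a non-constant input."""
    per_class = num_boxes if max_per_class is None else min(num_boxes, max_per_class)
    return batch_size * num_classes * per_class
```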
 
 - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None¶
- Sets the input tensor for the given index. The indices are as follows (index - description):
- 0 - The required Boxes tensor.
- 1 - The required Scores tensor.
- 2 - The required MaxOutputBoxesPerClass tensor.
- 3 - The optional IoUThreshold tensor.
- 4 - The optional ScoreThreshold tensor.
 - If this function is called for an index greater than or equal to - num_inputs, then afterwards - num_inputs returns index + 1, and any missing intervening inputs are set to null. Note that only optional inputs can be missing. - Parameters:
- index – The index of the input tensor. 
- tensor – The input tensor. 
 
 
 
IReverseSequenceLayer¶
- class tensorrt.IReverseSequenceLayer¶
- A ReverseSequence layer in an - INetworkDefinition.- This layer performs batch-wise reversal, which slices the input tensor along the axis - batch_axis. For the- i-th slice, the operation reverses the first- Nelements, specified by the corresponding- i-th value in- sequence_lens, along- sequence_axisand keeps the remaining elements unchanged. The output tensor will have the same shape as the input tensor.- Variables:
- batch_axis – - intThe batch axis. Default: 1.
- sequence_axis – - intThe sequence axis. Default: 0.
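For the default layout (sequence_axis = 0, batch_axis = 1), the batch-wise reversal can be written out in plain Python as below. The helper name is hypothetical, and the input is a nested list indexed as x[t][b].

```python
def reverse_sequence(x, sequence_lens):
    """Reverse the first sequence_lens[b] steps of every batch slice.

    x is indexed as x[t][b], i.e. sequence_axis=0 and batch_axis=1.
    Elements beyond sequence_lens[b] are left unchanged.
    """
    out = [list(row) for row in x]  # same shape as the input
    batch = len(x[0]) if x else 0
    for b in range(batch):
        n = sequence_lens[b]
        for t in range(n):
            out[t][b] = x[n - 1 - t][b]
    return out
```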
 
 
INormalizationLayer¶
- class tensorrt.INormalizationLayer¶
- A Normalization layer in an - INetworkDefinition.- The normalization layer performs the following operation:
- X - input tensor
- Y - output tensor
- S - scale tensor
- B - bias tensor
 - Y = (X - Mean(X, axes)) / Sqrt(Variance(X) + epsilon) * S + B - Where Mean(X, axes) is a reduction over a set of axes, and Variance(X) = Mean((X - Mean(X, axes)) ^ 2, axes). - Variables:
- epsilon – - floatThe epsilon value used for the normalization calculation. Default: 1e-5F.
- axes – - intThe reduction axes for the normalization calculation.
- num_groups – - intThe number of groups to split the channels into for the normalization calculation. Default: 1.
- compute_precision – - DataType The datatype used for the compute precision of this layer. By default, TensorRT runs the normalization computation in DataType.FLOAT even in mixed precision mode, regardless of any builder flags, to avoid overflow errors. ILayer.precision and ILayer.set_output_type can still be set to control the input and output types of this layer. Only DataType.FLOAT and DataType.HALF are valid for this member. Default: DataType.FLOAT.
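The normalization formula can be checked numerically on a 1D input, where the reduction runs over the single axis. This plain-Python sketch uses a hypothetical function name and per-element scale and bias lists; it is a reference model of the formula, not of the layer's numerics.

```python
import math


def normalize_1d(x, scale, bias, epsilon=1e-5):
    """Y = (X - Mean(X)) / Sqrt(Variance(X) + epsilon) * S + B for a 1D list."""
    mean = sum(x) / len(x)
    variance = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(variance + epsilon)
    return [(v - mean) * inv_std * s + b for v, s, b in zip(x, scale, bias)]
```

With unit scale and zero bias the output has a mean of approximately zero and a variance slightly below one, the gap being due to epsilon.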
 
 
ISqueezeLayer¶
- class tensorrt.ISqueezeLayer¶
- A Squeeze layer in an - INetworkDefinition.- This layer represents a squeeze operation, removing unit dimensions of the input tensor on a set of axes. - Axes must be resolvable to a constant Int32 or Int64 1D shape tensor. Values in axes must be unique and in the range of [-r, r-1], where r is the rank of the input tensor. For each axis value, the corresponding dimension in the input tensor must be one. - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None¶
- Sets the input tensor for the given index. The index must be 0 or 1 for a Squeeze layer. - The indices are as follows (index - description):
- 0 - Input data tensor.
- 1 - The axes to remove. Must be resolvable to a constant Int32 or Int64 1D shape tensor.
 - Parameters:
- index – The index of the input tensor. 
- tensor – The input tensor. 
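The rules on squeeze axes (unique, within [-r, r-1], and pointing at unit dimensions) can be captured in a small shape helper. This is a plain-Python sketch with a hypothetical name; it treats a negative axis and its non-negative equivalent as duplicates, which is an assumption of this sketch.

```python
def squeeze_dims(dims, axes):
    """Output dimensions of a squeeze over the given axes.

    Axes must be unique, lie in [-r, r-1], and refer to dimensions of size 1.
    """
    r = len(dims)
    removed = set()
    for a in axes:
        if not -r <= a <= r - 1:
            raise ValueError("axis out of range")
        a = a + r if a < 0 else a  # normalize negative axes
        if a in removed:
            raise ValueError("axes must be unique")
        if dims[a] != 1:
            raise ValueError("only unit dimensions can be removed")
        removed.add(a)
    return [d for i, d in enumerate(dims) if i not in removed]
```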
 
 
 
IUnsqueezeLayer¶
- class tensorrt.IUnsqueezeLayer¶
- An Unsqueeze layer in an - INetworkDefinition.- This layer represents an unsqueeze operation, which reshapes the input tensor by inserting unit-length dimensions at specified axes of the output. - Axes must be resolvable to a constant Int32 or Int64 shape tensor. Values in axes must be unique and in the range of [-r_final, r_final-1], where r_final is the sum of rank(input) and len(axes). - r_final must be less than Dims.MAX_DIMS. - set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None¶
- Sets the input tensor for the given index. The index must be 0 or 1 for an Unsqueeze layer. - The indices are as follows (index - description):
- 0 - Input data tensor.
- 1 - The axes to add. Must be resolvable to a constant Int32 or Int64 1D shape tensor.
 - Parameters:
- index – The index of the input tensor. 
- tensor – The input tensor. 
 
 
 
ICumulativeLayer¶
- class tensorrt.ICumulativeLayer¶
- A cumulative layer in an - INetworkDefinition.- This layer represents a cumulative operation across a tensor. - It computes successive reductions across an axis of a tensor. The output always has the same shape as the input. - If the reduction operation is summation, then this is also known as prefix-sum or cumulative sum. - The operation has forward vs. reverse variants, and inclusive vs. exclusive variants. - For example, let the input be a vector x of length n and the output be vector y. Then y[j] = sum(x[…]) where … denotes a sequence of indices from this list: - inclusive + forward: 0..j 
- inclusive + reverse: j..n-1 
- exclusive + forward: 0..j-1 
- exclusive + reverse: j+1..n-1 
 - For multidimensional tensors, the cumulative applies across a specified axis. For example, given a 2D input, a forward inclusive cumulative across axis 0 generates cumulative sums within each column. - Variables:
- op – - CumulativeOperationThe cumulative operation for the layer.
- exclusive – - boolSpecifies whether it is an exclusive cumulative or inclusive cumulative.
- reverse – - boolSpecifies whether the cumulative operation should be applied backward.
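The four index ranges listed above translate directly into a reference implementation for a 1D input with summation as the reduction. This is a plain-Python sketch with a hypothetical function name, written for clarity rather than efficiency.

```python
def cumulative_sum(x, exclusive=False, reverse=False):
    """1D cumulative sum covering the four documented variants."""
    n = len(x)
    y = []
    for j in range(n):
        if reverse:
            # exclusive: j+1..n-1, inclusive: j..n-1
            indices = range(j + 1, n) if exclusive else range(j, n)
        else:
            # exclusive: 0..j-1, inclusive: 0..j
            indices = range(0, j) if exclusive else range(0, j + 1)
        y.append(sum(x[i] for i in indices))
    return y
```

For x = [1, 2, 3, 4], the inclusive forward variant is the familiar prefix sum, while the exclusive variants shift the result by one position and start from zero.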