Layer Base Classes¶
ITensor¶
- tensorrt.TensorLocation¶
The physical location of the data.
Members:
DEVICE : Data is stored on the device.
HOST : Data is stored on the host.
- tensorrt.TensorFormat¶
Format of the input/output tensors.
This enum is used by both plugins and network I/O tensors.
For more information about data formats, see the topic “Data Format Description” located in the TensorRT Developer Guide (https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html).
Members:
- LINEAR :
Row major linear format.
For a tensor with dimensions {N, C, H, W}, the W axis always has unit stride, and the stride of every other axis is at least the product of the next dimension and the next stride. The strides are the same as for a C array with dimensions [N][C][H][W].
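The stride rule for LINEAR can be sketched in plain Python. This is only an illustration of the packed (minimum) case, where each stride equals the next dimension times the next stride. The helper names `linear_strides` and `linear_offset` are hypothetical and are not part of the TensorRT API.

```python
def linear_strides(dims):
    """Return packed row-major strides, in elements, for the given dimensions."""
    strides = [1] * len(dims)
    # Walk from the second-innermost axis outward: each stride is the next
    # axis's extent times the next axis's stride.
    for axis in range(len(dims) - 2, -1, -1):
        strides[axis] = dims[axis + 1] * strides[axis + 1]
    return tuple(strides)


def linear_offset(coords, dims):
    """Element offset of `coords` in a packed row-major buffer."""
    return sum(c * s for c, s in zip(coords, linear_strides(dims)))


dims = (2, 3, 4, 5)  # N, C, H, W
print(linear_strides(dims))                # (60, 20, 5, 1)
print(linear_offset((1, 2, 3, 4), dims))   # 119
```

The offset of the last coordinate is 119, one less than the 120 elements in the buffer, which matches a C array declared as [2][3][4][5].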
- CHW2 :
Two wide channel vectorized row major format.
This format is bound to FP16. It is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+1)/2][H][W][2], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/2][h][w][c%2].
- HWC8 :
Eight channel format where C is padded to a multiple of 8.
This format is bound to FP16 and BF16. It is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to the array with dimensions [N][H][W][(C+7)/8*8], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][h][w][c].
- CHW4 :
Four wide channel vectorized row major format. This format is bound to FP16 and INT8. It is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+3)/4][H][W][4], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/4][h][w][c%4].
- CHW16 :
Sixteen wide channel vectorized row major format.
This format is bound to FP16. It is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+15)/16][H][W][16], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/16][h][w][c%16].
- CHW32 :
Thirty-two wide channel vectorized row major format.
This format is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+31)/32][H][W][32], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/32][h][w][c%32].
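The channel-vectorized members above (CHW2, CHW4, CHW16, CHW32) follow one pattern that differs only in the vector width. The sketch below expresses that pattern in plain Python, assuming a 4-D tensor. The function names and the `vec` parameter are hypothetical illustration names, not TensorRT calls.

```python
def chw_vector_dims(dims, vec):
    """Array dimensions [N][(C+vec-1)/vec][H][W][vec] for a vectorized layout."""
    n, c, h, w = dims
    return (n, (c + vec - 1) // vec, h, w, vec)


def chw_vector_subscript(coord, vec):
    """Map tensor coordinates (n, c, h, w) to [n][c/vec][h][w][c%vec]."""
    n, c, h, w = coord
    return (n, c // vec, h, w, c % vec)


def chw_vector_offset(coord, dims, vec):
    """Element offset of `coord` in the equivalent C array."""
    offset = 0
    # Horner-style accumulation over (extent, index) pairs, outermost first.
    for extent, index in zip(chw_vector_dims(dims, vec),
                             chw_vector_subscript(coord, vec)):
        offset = offset * extent + index
    return offset


dims = (1, 6, 2, 3)  # N, C, H, W with C not a multiple of 4
print(chw_vector_dims(dims, 4))                   # (1, 2, 2, 3, 4)
print(chw_vector_subscript((0, 5, 1, 2), 4))      # (0, 1, 1, 2, 1)
print(chw_vector_offset((0, 5, 1, 2), dims, 4))   # 45
```

With C = 6 and a vector width of 4, the channel axis is split into two groups of four, so the buffer holds 48 elements even though only 36 carry tensor data.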
- DHWC8 :
Eight channel format where C is padded to a multiple of 8.
This format is bound to FP16 and BF16, and it is only available for dimensions >= 4.
For a tensor with dimensions {N, C, D, H, W}, the memory layout is equivalent to an array with dimensions [N][D][H][W][(C+7)/8*8], with the tensor coordinates (n, c, d, h, w) mapping to array subscript [n][d][h][w][c].
- CDHW32 :
Thirty-two wide channel vectorized row major format with 3 spatial dimensions.
This format is bound to FP16 and INT8. It is only available for dimensions >= 4.
For a tensor with dimensions {N, C, D, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+31)/32][D][H][W][32], with the tensor coordinates (n, c, d, h, w) mapping to array subscript [n][c/32][d][h][w][c%32].
- HWC :
Non-vectorized channel-last format. This format is bound to FP32, FP16, INT8, INT64 and BF16, and is only available for dimensions >= 3.
- DLA_LINEAR :
DLA planar format. Row major format. The stride for stepping along the H axis is rounded up to 64 bytes.
This format is bound to FP16/Int8 and is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][C][H][roundUp(W, 64/elementSize)] where elementSize is 2 for FP16 and 1 for Int8, with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c][h][w].
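The DLA_LINEAR padding rule can be checked with a small calculation. The sketch below is plain Python. `round_up`, `ELEMENT_SIZE`, and `dla_linear_dims` are hypothetical names, and the data types are represented as strings purely for illustration.

```python
ELEMENT_SIZE = {"FP16": 2, "INT8": 1}  # bytes per element, as stated above


def round_up(value, multiple):
    """Round `value` up to the nearest multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple


def dla_linear_dims(dims, dtype):
    """Array dimensions [N][C][H][roundUp(W, 64/elementSize)]."""
    n, c, h, w = dims
    elements_per_64_bytes = 64 // ELEMENT_SIZE[dtype]
    return (n, c, h, round_up(w, elements_per_64_bytes))


print(dla_linear_dims((1, 3, 8, 70), "FP16"))  # (1, 3, 8, 96)
print(dla_linear_dims((1, 3, 8, 70), "INT8"))  # (1, 3, 8, 128)
```

In both cases each row occupies a whole number of 64-byte units: 96 FP16 elements are 192 bytes, and 128 Int8 elements are 128 bytes.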
- DLA_HWC4 :
DLA image format. Channel-last format. C can only be 1, 3, or 4. If C == 3, it is rounded up to 4. The stride for stepping along the H axis is rounded up to 32 bytes.
This format is bound to FP16/Int8 and is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, where C’ is 1, 4, 4 when C is 1, 3, 4 respectively, the memory layout is equivalent to a C array with dimensions [N][H][roundUp(W, 32/C’/elementSize)][C’], where elementSize is 2 for FP16 and 1 for Int8, and C’ is the rounded C. The tensor coordinates (n, c, h, w) map to array subscript [n][h][w][c].
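The DLA_HWC4 dimensions can be worked through the same way. This is a plain-Python sketch of the formula above. `dla_hwc4_dims` and the string data-type keys are hypothetical names chosen for illustration.

```python
ELEMENT_SIZE = {"FP16": 2, "INT8": 1}  # bytes per element, as stated above


def round_up(value, multiple):
    """Round `value` up to the nearest multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple


def dla_hwc4_dims(dims, dtype):
    """Array dimensions [N][H][roundUp(W, 32/C'/elementSize)][C']."""
    n, c, h, w = dims
    if c not in (1, 3, 4):
        raise ValueError("DLA_HWC4 only allows C in {1, 3, 4}")
    c_rounded = 4 if c == 3 else c          # C' in the text above
    w_multiple = 32 // c_rounded // ELEMENT_SIZE[dtype]
    return (n, h, round_up(w, w_multiple), c_rounded)


print(dla_hwc4_dims((1, 3, 8, 10), "FP16"))  # (1, 8, 12, 4)
print(dla_hwc4_dims((1, 3, 8, 10), "INT8"))  # (1, 8, 16, 4)
print(dla_hwc4_dims((1, 1, 8, 10), "FP16"))  # (1, 8, 16, 1)
```

Each padded row is a multiple of 32 bytes. For example, 12 pixels of 4 FP16 channels occupy 96 bytes.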
- HWC16 :
Sixteen channel format where C is padded to a multiple of 16. This format is bound to FP16/INT8/FP8. It is only available for dimensions >= 3.
For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to the array with dimensions [N][H][W][(C+15)/16*16], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][h][w][c].
- DHWC :
Non-vectorized channel-last format. This format is bound to FP32. It is only available for dimensions >= 4.
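The padded channel-last members listed above (HWC8 and HWC16) both place C innermost and pad it to a multiple of the vector width. A minimal plain-Python sketch for the 4-D case follows. `hwc_padded_dims`, `hwc_padded_offset`, and the `pad` parameter are hypothetical names, not TensorRT calls.

```python
def hwc_padded_dims(dims, pad):
    """Array dimensions [N][H][W][(C+pad-1)/pad*pad] for a padded HWC layout."""
    n, c, h, w = dims
    return (n, h, w, (c + pad - 1) // pad * pad)


def hwc_padded_offset(coord, dims, pad):
    """Element offset of tensor coordinates (n, c, h, w), i.e. [n][h][w][c]."""
    n, c, h, w = coord
    _, height, width, c_padded = hwc_padded_dims(dims, pad)
    return ((n * height + h) * width + w) * c_padded + c


dims = (1, 3, 2, 2)  # N, C, H, W
print(hwc_padded_dims(dims, 8))                   # (1, 2, 2, 8)
print(hwc_padded_dims(dims, 16))                  # (1, 2, 2, 16)
print(hwc_padded_offset((0, 2, 1, 1), dims, 8))   # 26
```

With C = 3 padded to 8, every pixel occupies eight channel slots, so the last pixel starts at element 24 and its channel 2 sits at element 26.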
- class tensorrt.ITensor¶
A tensor in an INetworkDefinition.
- Variables:
name – str. The tensor name. For a network input, the name is assigned by the application. For tensors which are layer outputs, a default name is assigned consisting of the layer name followed by the index of the output in brackets. Each network input and output tensor must have a unique name.
shape – Dims. The shape of a tensor. For a network input the shape is assigned by the application. For a network output it is computed based on the layer parameters and the inputs to the layer. If a tensor size or a parameter is modified in the network, the shape of all dependent tensors will be recomputed. Setting the shape is only legal for network input tensors, since the shape of a layer output tensor is inferred from the layer inputs and parameters.
dtype – DataType. The data type of a tensor. When set, the type is left unchanged if the new type is invalid for the given tensor.
broadcast_across_batch – bool. [DEPRECATED] Deprecated in TensorRT 10.0. Always False since implicit batch dimension support has been removed.
location – TensorLocation. The storage location of a tensor.
is_network_input – bool. Whether the tensor is a network input.
is_network_output – bool. Whether the tensor is a network output.
dynamic_range – Tuple[float, float]. [DEPRECATED] Deprecated in TensorRT 10.1. Superseded by explicit quantization. A tuple containing the [minimum, maximum] of the dynamic range, or None if the range was not set.
is_shape – bool. Whether the tensor is a shape tensor.
allowed_formats – int32. The allowed set of TensorFormat candidates. This should be an integer consisting of one or more TensorFormat values, combined via bitwise OR after bit shifting. For example, 1 << int(TensorFormat.CHW4) | 1 << int(TensorFormat.CHW32).
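The allowed_formats bitmask described above can be built and decoded with ordinary integer operations. The sketch below runs without TensorRT by using a stand-in enum. The class `TensorFormat` here is a hypothetical substitute whose integer values simply follow the member order listed earlier and should be treated as an assumption. In real code the values come from `int(tensorrt.TensorFormat.X)`. The helpers `formats_to_mask` and `mask_to_formats` are likewise illustrative names.

```python
from enum import IntEnum


class TensorFormat(IntEnum):
    """Stand-in for tensorrt.TensorFormat; values are assumed for illustration."""
    LINEAR = 0
    CHW2 = 1
    HWC8 = 2
    CHW4 = 3
    CHW16 = 4
    CHW32 = 5


def formats_to_mask(formats):
    """Combine formats into one integer: bitwise OR of 1 << int(format)."""
    mask = 0
    for fmt in formats:
        mask |= 1 << int(fmt)
    return mask


def mask_to_formats(mask):
    """Recover the list of formats whose bit is set in `mask`."""
    return [fmt for fmt in TensorFormat if mask & (1 << int(fmt))]


mask = formats_to_mask([TensorFormat.CHW4, TensorFormat.CHW32])
print(mask)                                        # 40 with the stand-in values
print([fmt.name for fmt in mask_to_formats(mask)])  # ['CHW4', 'CHW32']
```

The resulting integer is what would be assigned to a tensor's allowed_formats attribute.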
- get_dimension_name(self: tensorrt.tensorrt.ITensor, index: int) str ¶
Get the name of an input dimension.
- Parameters:
index – index of the dimension.
- Returns:
name of the dimension, or None if the dimension is unnamed.
- reset_dynamic_range(self: tensorrt.tensorrt.ITensor) None ¶
[DEPRECATED] Deprecated in TensorRT 10.1. Superseded by explicit quantization. Undo the effect of setting the dynamic range.
- set_dimension_name(self: tensorrt.tensorrt.ITensor, index: int, name: str) None ¶
Name a dimension of an input tensor.
Associate a runtime dimension of an input tensor with a symbolic name. Dimensions with the same non-empty name must be equal at runtime. Knowing this equality for runtime dimensions may help the TensorRT optimizer. Both runtime and build-time dimensions can be named. If the function is called again, with the same index, it will overwrite the previous name. If None is passed as name, it will clear the name of the dimension.
For example, set_dimension_name(0, "n") associates the symbolic name "n" with the leading dimension.
- Parameters:
index – index of the dimension.
name – name of the dimension.
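The rule that dimensions sharing a non-empty name must be equal at runtime can be modeled in a few lines. The sketch below is a plain-Python model of that constraint only and makes no TensorRT calls. `resolve_named_dimensions` and the dictionary layout are hypothetical. With the real API, the names would be attached to input tensors through set_dimension_name.

```python
def resolve_named_dimensions(dim_names, shapes):
    """Check the same-name-implies-equal rule and return {name: extent}.

    dim_names: {tensor_name: {axis_index: dimension_name or None}}
    shapes:    {tensor_name: tuple of runtime extents}
    """
    resolved = {}
    for tensor_name, names in dim_names.items():
        for axis, name in names.items():
            if not name:  # None or empty string: the dimension is unnamed
                continue
            extent = shapes[tensor_name][axis]
            # The first occurrence records the extent; later ones must match it.
            if resolved.setdefault(name, extent) != extent:
                raise ValueError(
                    f"dimension '{name}' is {resolved[name]} elsewhere "
                    f"but {extent} on {tensor_name}[{axis}]"
                )
    return resolved


dim_names = {"images": {0: "n"}, "masks": {0: "n", 1: None}}
shapes = {"images": (8, 3, 224, 224), "masks": (8, 1, 224, 224)}
print(resolve_named_dimensions(dim_names, shapes))  # {'n': 8}
```

If the two leading dimensions differed at runtime, for example 8 and 4, the model raises ValueError, mirroring the equality that the shared name asserts.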
- set_dynamic_range(self: tensorrt.tensorrt.ITensor, min: float, max: float) bool ¶
[DEPRECATED] Deprecated in TensorRT 10.1. Superseded by explicit quantization. Set dynamic range for the tensor. NOTE: It is suggested to use tensor.dynamic_range = (min, max) instead.
- Parameters:
min – Minimum of the dynamic range.
max – Maximum of the dynamic range.
- Returns:
True if the range was set successfully, otherwise False.
ILayer¶
- tensorrt.LayerType¶
Type of Layer
Members:
CONVOLUTION : Convolution layer
GRID_SAMPLE : Grid sample layer
NMS : NMS layer
ACTIVATION : Activation layer
POOLING : Pooling layer
LRN : LRN layer
SCALE : Scale layer
SOFTMAX : Softmax layer
DECONVOLUTION : Deconvolution layer
CONCATENATION : Concatenation layer
ELEMENTWISE : Elementwise layer
PLUGIN : Plugin layer
UNARY : Unary layer
PADDING : Padding layer
SHUFFLE : Shuffle layer
REDUCE : Reduce layer
TOPK : TopK layer
GATHER : Gather layer
MATRIX_MULTIPLY : Matrix multiply layer
RAGGED_SOFTMAX : Ragged softmax layer
CONSTANT : Constant layer
IDENTITY : Identity layer
CAST : Cast layer
PLUGIN_V2 : PluginV2 layer
SLICE : Slice layer
SHAPE : Shape layer
PARAMETRIC_RELU : Parametric ReLU layer
RESIZE : Resize layer
TRIP_LIMIT : Loop Trip limit layer
RECURRENCE : Loop Recurrence layer
ITERATOR : Loop Iterator layer
LOOP_OUTPUT : Loop output layer
SELECT : Select layer
ASSERTION : Assertion layer
FILL : Fill layer
QUANTIZE : Quantize layer
DEQUANTIZE : Dequantize layer
CONDITION : If-conditional Condition layer
CONDITIONAL_INPUT : If-conditional input layer
CONDITIONAL_OUTPUT : If-conditional output layer
SCATTER : Scatter layer
EINSUM : Einsum layer
ONE_HOT : OneHot layer
NON_ZERO : NonZero layer
REVERSE_SEQUENCE : ReverseSequence layer
NORMALIZATION : Normalization layer
PLUGIN_V3 : PluginV3 layer
SQUEEZE : Squeeze layer
UNSQUEEZE : Unsqueeze layer
CUMULATIVE : Cumulative layer
DYNAMIC_QUANTIZE : DynamicQuantize layer
- class tensorrt.ILayer¶
Base class for all layer classes in an INetworkDefinition.
- Variables:
metadata – str. The per-layer metadata.
- get_input(self: tensorrt.tensorrt.ILayer, index: int) tensorrt.tensorrt.ITensor ¶
Get the layer input corresponding to the given index.
- Parameters:
index – The index of the input tensor.
- Returns:
The input tensor, or None if the index is out of range.
- get_output(self: tensorrt.tensorrt.ILayer, index: int) tensorrt.tensorrt.ITensor ¶
Get the layer output corresponding to the given index.
- Parameters:
index – The index of the output tensor.
- Returns:
The output tensor, or None if the index is out of range.
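Because get_input and get_output return None for an out-of-range index rather than raising, callers may want to turn that into an explicit error. The sketch below shows one way to do so. `require_output` is a hypothetical helper, and `FakeLayer` is a stand-in class that only mimics the documented get_output behavior so that the example runs without TensorRT.

```python
def require_output(layer, index):
    """Return layer.get_output(index), raising IndexError instead of yielding None."""
    tensor = layer.get_output(index)
    if tensor is None:
        raise IndexError(f"layer has no output at index {index}")
    return tensor


class FakeLayer:
    """Stand-in for an ILayer: returns None when the index is out of range."""

    def __init__(self, outputs):
        self._outputs = list(outputs)

    def get_output(self, index):
        if 0 <= index < len(self._outputs):
            return self._outputs[index]
        return None


layer = FakeLayer(["values", "indices"])
print(require_output(layer, 1))  # indices
try:
    require_output(layer, 2)
except IndexError as err:
    print(err)  # layer has no output at index 2
```

Any object exposing a compatible get_output method can be passed to the helper. The same pattern applies to get_input.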
- get_output_type(self: tensorrt.tensorrt.ILayer, index: int) tensorrt.tensorrt.DataType ¶
Get the output type of the layer.
- Parameters:
index – The index of the output tensor.
- Returns:
The output data type. Default: DataType.FLOAT.
- output_type_is_set(self: tensorrt.tensorrt.ILayer, index: int) bool ¶
Whether the output type has been set for this layer.
- Parameters:
index – The index of the output.
- Returns:
Whether the output type has been explicitly set.
- reset_output_type(self: tensorrt.tensorrt.ILayer, index: int) None ¶
Reset output type of this layer.
- Parameters:
index – The index of the output.
- reset_precision(self: tensorrt.tensorrt.ILayer) None ¶
Reset the computation precision of the layer.
- set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) None ¶
Set the layer input corresponding to the given index.
- Parameters:
index – The index of the input tensor.
tensor – The input tensor.
- set_output_type(self: tensorrt.tensorrt.ILayer, index: int, dtype: tensorrt.tensorrt.DataType) None ¶
Constrain the layer to generate output data with the given type. Note that this method cannot be used to set the data type of the second output tensor of the TopK layer, which is always int32.
- Parameters:
index – The index of the output tensor to set the type.
dtype – DataType of the output.
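The output-type methods above (set_output_type, output_type_is_set, reset_output_type, get_output_type) are typically used together. The following sketch shows one possible pattern: apply a type only when none has been set explicitly. `set_output_type_if_unset` is a hypothetical helper, `FakeLayer` is a stand-in that imitates the four documented methods, and data types are plain strings here purely for illustration.

```python
def set_output_type_if_unset(layer, index, dtype):
    """Set the output type only if none was set explicitly; return True if applied."""
    if layer.output_type_is_set(index):
        return False
    layer.set_output_type(index, dtype)
    return True


class FakeLayer:
    """Stand-in mimicking the documented output-type methods of ILayer."""

    DEFAULT = "FLOAT"  # mirrors the documented default, DataType.FLOAT

    def __init__(self):
        self._types = {}

    def set_output_type(self, index, dtype):
        self._types[index] = dtype

    def get_output_type(self, index):
        return self._types.get(index, self.DEFAULT)

    def output_type_is_set(self, index):
        return index in self._types

    def reset_output_type(self, index):
        self._types.pop(index, None)


layer = FakeLayer()
print(set_output_type_if_unset(layer, 0, "HALF"))  # True
print(set_output_type_if_unset(layer, 0, "INT8"))  # False
print(layer.get_output_type(0))                    # HALF
layer.reset_output_type(0)
print(layer.get_output_type(0))                    # FLOAT
```

The helper leaves an earlier explicit choice untouched, and resetting returns the stand-in to its default.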