Layer Base Classes

ITensor

tensorrt.TensorLocation

The physical location of the data.

Members:

DEVICE : Data is stored on the device.

HOST : Data is stored on the host.

tensorrt.TensorFormat

Format of the input/output tensors.

This enum is extended to be used by both plugins and reformat-free network I/O tensors.

For more information about data formats, see the topic “Data Format Description” located in the TensorRT Developer Guide (https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html).

Members:

DLA_LINEAR :

DLA planar format. Row major format. The stride for stepping along the H axis is rounded up to 64 bytes.

This format is bound to FP16/Int8 and is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][C][H][roundUp(W, 64/elementSize)] where elementSize is 2 for FP16 and 1 for Int8, with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c][h][w].
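As a rough illustration of the DLA_LINEAR layout, the following pure-Python sketch computes the element offset of a coordinate in the padded array. The helper names round_up and dla_linear_offset are hypothetical and are not part of the TensorRT API; the arithmetic simply follows the array dimensions given above.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple


def dla_linear_offset(dims, coord, element_size):
    """Element offset of (n, c, h, w) in the padded [N][C][H][W'] array.

    element_size is 2 for FP16 and 1 for Int8, as stated above.
    Hypothetical helper for illustration only.
    """
    _, C, H, W = dims
    n, c, h, w = coord
    # Each row is padded so that stepping along H moves a multiple of 64 bytes.
    padded_w = round_up(W, 64 // element_size)
    return ((n * C + c) * H + h) * padded_w + w


# A 1x3x4x20 tensor: rows are padded to 32 elements for FP16, 64 for Int8.
fp16_offset = dla_linear_offset((1, 3, 4, 20), (0, 1, 2, 3), element_size=2)
int8_offset = dla_linear_offset((1, 3, 4, 20), (0, 1, 2, 3), element_size=1)
```

The same coordinate lands at a different element offset for FP16 and Int8 because the row padding is expressed in bytes.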

CHW32 :

Thirty-two wide channel vectorized row major format.

This format is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+31)/32][H][W][32], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/32][h][w][c%32].
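The channel-vectorized mapping can be sketched in a few lines of Python. The function vectorized_offset below is a hypothetical helper, not a TensorRT call; it takes the vector width as a parameter, so under the formulas in this section the same code would also describe CHW2, CHW4, and CHW16 with a width of 2, 4, or 16.

```python
def vectorized_offset(dims, coord, vec):
    """Element offset of (n, c, h, w) in a [N][ceil(C/vec)][H][W][vec] array.

    Hypothetical helper for illustration only; vec is 32 for CHW32.
    """
    _, C, H, W = dims
    n, c, h, w = coord
    c_blocks = (C + vec - 1) // vec  # number of channel groups, i.e. (C+31)/32
    # Subscript [n][c/vec][h][w][c%vec], flattened in row-major order.
    return (((n * c_blocks + c // vec) * H + h) * W + w) * vec + c % vec


# Channel 33 of a 40-channel tensor lives in the second group, lane 1.
offset = vectorized_offset((1, 40, 2, 3), (0, 33, 1, 2), vec=32)
```

Note that 40 channels occupy two full groups of 32, so the buffer holds 64 channel slots per pixel and the trailing 24 are padding.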

LINEAR :

Row major linear format.

For a tensor with dimensions {N, C, H, W}, the W axis always has unit stride, and the stride of every other axis is at least the product of the next dimension times the next stride. The strides are the same as for a C array with dimensions [N][C][H][W].
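The stride rule for LINEAR can be checked with a short Python sketch. Both functions are hypothetical helpers written for this illustration: packed_strides computes the strides of a plain C array, and is_valid_linear tests the "unit stride on W, every other stride at least next dimension times next stride" condition described above.

```python
def packed_strides(dims):
    """Strides (in elements) of a densely packed row-major array."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = dims[i + 1] * strides[i + 1]
    return strides


def is_valid_linear(dims, strides):
    """Check the LINEAR stride rule: innermost stride is 1, and each outer
    stride is at least the next dimension times the next stride."""
    if strides[-1] != 1:
        return False
    return all(
        strides[i] >= dims[i + 1] * strides[i + 1]
        for i in range(len(dims) - 1)
    )


dims = [2, 3, 4, 5]
strides = packed_strides(dims)
```

Strides larger than the packed ones (for example, rows padded from 5 to 8 elements) still satisfy the rule, while strides that make rows overlap do not.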

HWC16 :

Sixteen channel format where C is padded to a multiple of 16. This format is bound to FP16. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to the array with dimensions [N][H][W][(C+15)/16*16], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][h][w][c].
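For the channel-last formats, the offset computation looks like the sketch below. channel_last_offset is a hypothetical helper, not a TensorRT function; the padding multiple is a parameter, so 16 corresponds to HWC16 and, following the formula given for it in this section, 8 would correspond to HWC8.

```python
def channel_last_offset(dims, coord, pad):
    """Element offset of (n, c, h, w) in a [N][H][W][padded C] array.

    Hypothetical helper for illustration only; pad is 16 for HWC16.
    """
    _, C, H, W = dims
    n, c, h, w = coord
    padded_c = (C + pad - 1) // pad * pad  # (C+15)/16*16 when pad == 16
    # Subscript [n][h][w][c], flattened in row-major order.
    return ((n * H + h) * W + w) * padded_c + c


# 20 channels are padded to 32, so each pixel occupies 32 elements.
offset = channel_last_offset((1, 20, 2, 3), (0, 5, 1, 2), pad=16)
```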

DLA_HWC4 :

DLA image format. Channel-last format. C can only be 1, 3, or 4. If C == 3, it is rounded up to 4. The stride for stepping along the H axis is rounded up to 32 bytes.

This format is bound to FP16/Int8 and is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, where C’ is the rounded C (1, 4, 4 when C is 1, 3, 4 respectively), the memory layout is equivalent to a C array with dimensions [N][H][roundUp(W, 32/C’/elementSize)][C’], where elementSize is 2 for FP16 and 1 for Int8. The tensor coordinates (n, c, h, w) map to array subscript [n][h][w][c].
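The padded array dimensions for DLA_HWC4 can be worked out as in the following sketch. The function dla_hwc4_dims and its round_up helper are hypothetical names used only for illustration; the logic mirrors the rounding of C and W described above.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple


def dla_hwc4_dims(dims, element_size):
    """Return the padded array dimensions [N][H][W'][C'] for DLA_HWC4.

    element_size is 2 for FP16 and 1 for Int8.
    Hypothetical helper for illustration only.
    """
    N, C, H, W = dims
    if C not in (1, 3, 4):
        raise ValueError("DLA_HWC4 requires C to be 1, 3, or 4")
    c_prime = 4 if C == 3 else C  # C == 3 is rounded up to 4
    # W is padded so that one row of W' * C' elements spans a multiple of 32 bytes.
    padded_w = round_up(W, 32 // c_prime // element_size)
    return (N, H, padded_w, c_prime)


rgb_fp16 = dla_hwc4_dims((1, 3, 10, 30), element_size=2)
gray_int8 = dla_hwc4_dims((1, 1, 10, 30), element_size=1)
```

In the FP16 case the row holds 32 pixels of 4 channels at 2 bytes each, which is 256 bytes and therefore a multiple of 32.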

CHW4 :

Four wide channel vectorized row major format. This format is bound to INT8. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+3)/4][H][W][4], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/4][h][w][c%4].

CHW16 :

Sixteen wide channel vectorized row major format.

This format is bound to FP16. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+15)/16][H][W][16], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/16][h][w][c%16].

DHWC8 :

Eight channel format where C is padded to a multiple of 8.

This format is bound to FP16, and it is only available for dimensions >= 4.

For a tensor with dimensions {N, C, D, H, W}, the memory layout is equivalent to an array with dimensions [N][D][H][W][(C+7)/8*8], with the tensor coordinates (n, c, d, h, w) mapping to array subscript [n][d][h][w][c].
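The five-dimensional case works the same way with an extra D axis. The sketch below uses two hypothetical helpers, dhwc8_layout and dhwc8_offset, to derive the padded array dimensions, the number of elements a buffer would need under that layout, and the offset of a coordinate.

```python
def dhwc8_layout(dims):
    """Padded array dimensions [N][D][H][W][(C+7)/8*8] for {N, C, D, H, W}."""
    N, C, D, H, W = dims
    padded_c = (C + 7) // 8 * 8
    return (N, D, H, W, padded_c)


def dhwc8_offset(dims, coord):
    """Element offset of (n, c, d, h, w), i.e. subscript [n][d][h][w][c].

    Hypothetical helper for illustration only.
    """
    _, D, H, W, padded_c = dhwc8_layout(dims)
    n, c, d, h, w = coord
    return (((n * D + d) * H + h) * W + w) * padded_c + c


dims = (1, 10, 2, 3, 4)
layout = dhwc8_layout(dims)

# Total element count implied by the padded layout.
num_elements = 1
for extent in layout:
    num_elements *= extent

offset = dhwc8_offset(dims, (0, 9, 1, 2, 3))
```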

HWC8 :

Eight channel format where C is padded to a multiple of 8.

This format is bound to FP16. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to the array with dimensions [N][H][W][(C+7)/8*8], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][h][w][c].

HWC :

Non-vectorized channel-last format. This format is bound to FP32 and is only available for dimensions >= 3.

CHW2 :

Two wide channel vectorized row major format.

This format is bound to FP16. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+1)/2][H][W][2], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/2][h][w][c%2].

CDHW32 :

Thirty-two wide channel vectorized row major format with 3 spatial dimensions.

This format is bound to FP16 and INT8. It is only available for dimensions >= 4.

For a tensor with dimensions {N, C, D, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+31)/32][D][H][W][32], with the tensor coordinates (n, c, d, h, w) mapping to array subscript [n][c/32][d][h][w][c%32].

class tensorrt.ITensor

A tensor in an INetworkDefinition.

Variables
  • name (str) – The tensor name. For a network input, the name is assigned by the application. For tensors which are layer outputs, a default name is assigned consisting of the layer name followed by the index of the output in brackets.

  • shape (Dims) – The shape of a tensor. For a network input, the shape is assigned by the application. For a network output, it is computed based on the layer parameters and the inputs to the layer. If a tensor size or a parameter is modified in the network, the shape of all dependent tensors will be recomputed. Setting the shape is only legal for network input tensors, since the shapes of layer output tensors are inferred from layer inputs and parameters.

  • dtype (DataType) – The data type of a tensor. If the assigned type is invalid for the given tensor, the type is left unchanged.

  • broadcast_across_batch (bool) – Whether to enable broadcast of the tensor across the batch. When a tensor is broadcast across a batch, it has the same value for every member in the batch. Memory is only allocated once for the single member. Setting this flag is only valid for network input tensors, since the flags of layer output tensors are inferred from layer inputs and parameters. If this state is modified for a tensor in the network, the states of all dependent tensors will be recomputed.

  • location (TensorLocation) – The storage location of a tensor.

  • is_network_input (bool) – Whether the tensor is a network input.

  • is_network_output (bool) – Whether the tensor is a network output.

  • dynamic_range (Tuple[float, float]) – A tuple containing the [minimum, maximum] of the dynamic range, or None if the range was not set.

  • is_shape (bool) – Whether the tensor is a shape tensor.

  • allowed_formats (int) – The allowed set of TensorFormat candidates. This should be an integer consisting of one or more TensorFormat values, combined via bitwise OR after bit shifting. For example, 1 << int(TensorFormat.CHW4) | 1 << int(TensorFormat.CHW32).
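The bit mask expected by allowed_formats can be assembled with a small helper, sketched here. To keep the example runnable without TensorRT, it uses FakeTensorFormat, a stand-in enum whose member values are assumptions chosen for illustration; formats_to_mask and mask_to_formats are likewise hypothetical helpers, not part of the API.

```python
from enum import IntEnum


class FakeTensorFormat(IntEnum):
    """Stand-in for tensorrt.TensorFormat; the values are illustrative only."""
    LINEAR = 0
    CHW4 = 3
    CHW32 = 5


def formats_to_mask(formats):
    """Combine formats into one integer: bitwise OR of 1 << int(format)."""
    mask = 0
    for fmt in formats:
        mask |= 1 << int(fmt)
    return mask


def mask_to_formats(mask, candidates):
    """Return the candidates whose bit is set in mask."""
    return [fmt for fmt in candidates if mask & (1 << int(fmt))]


mask = formats_to_mask([FakeTensorFormat.CHW4, FakeTensorFormat.CHW32])
decoded = mask_to_formats(mask, list(FakeTensorFormat))
```

With TensorRT installed, the same helper should work on the real enum members, and the result would be assigned as tensor.allowed_formats = mask.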

reset_dynamic_range(self: tensorrt.tensorrt.ITensor) → None

Undo the effect of setting the dynamic range.

set_dynamic_range(self: tensorrt.tensorrt.ITensor, min: float, max: float) → bool

Set dynamic range for the tensor. NOTE: It is suggested to use tensor.dynamic_range = (min, max) instead.

Parameters
  • min – Minimum of the dynamic range.

  • max – Maximum of the dynamic range.

Returns

True if the range was set successfully, otherwise False.
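A minimal sketch of the suggested attribute-style assignment follows. FakeTensor is a stand-in object that only mimics the name and dynamic_range attributes described above, and apply_dynamic_ranges is a hypothetical helper; neither is part of TensorRT.

```python
class FakeTensor:
    """Stand-in exposing only the ITensor attributes used in this sketch."""

    def __init__(self, name):
        self.name = name
        self.dynamic_range = None  # None means the range was not set


def apply_dynamic_ranges(tensors, ranges):
    """Assign (min, max) ranges by tensor name.

    Returns the names of tensors that still have no dynamic range.
    """
    missing = []
    for tensor in tensors:
        if tensor.name in ranges:
            low, high = ranges[tensor.name]
            tensor.dynamic_range = (low, high)  # preferred over set_dynamic_range
        elif tensor.dynamic_range is None:
            missing.append(tensor.name)
    return missing


tensors = [FakeTensor("input"), FakeTensor("conv1_out")]
missing = apply_dynamic_ranges(tensors, {"input": (-1.0, 1.0)})
```

Collecting the names without a range makes it easy to spot tensors that were overlooked before building the network.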

ILayer

tensorrt.LayerType

Type of Layer

Members:

TRIP_LIMIT : Loop Trip limit layer

IDENTITY : Identity layer

PARAMETRIC_RELU : Parametric ReLU layer

RECURRENCE : Loop Recurrence layer

ACTIVATION : Activation layer

SELECT : Select layer

POOLING : Pooling layer

GATHER : Gather layer

UNARY : Unary layer

LRN : LRN layer

RNN_V2 : RNNv2 layer

FULLY_CONNECTED : Fully connected layer

SHAPE : Shape layer

CONSTANT : Constant layer

ITERATOR : Loop Iterator layer

DECONVOLUTION : Deconvolution layer

PLUGIN : Plugin layer

TOPK : TopK layer

RESIZE : Resize layer

PADDING : Padding layer

DEQUANTIZE : Dequantize layer

SCALE : Scale layer

SHUFFLE : Shuffle layer

PLUGIN_V2 : PluginV2 layer

LOOP_OUTPUT : Loop output layer

MATRIX_MULTIPLY : Matrix multiply layer

QUANTIZE : Quantize layer

ELEMENTWISE : Elementwise layer

CONVOLUTION : Convolution layer

REDUCE : Reduce layer

CONCATENATION : Concatenation layer

SOFTMAX : Softmax layer

RAGGED_SOFTMAX : Ragged softmax layer

SLICE : Slice layer

FILL : Fill layer

class tensorrt.ILayer

Base class for all layer classes in an INetworkDefinition.

Variables
  • name (str) – The name of the layer.

  • type (LayerType) – The type of the layer.

  • num_inputs (int) – The number of inputs of the layer.

  • num_outputs (int) – The number of outputs of the layer.

  • precision (DataType) – The computation precision.

  • precision_is_set (bool) – Whether the precision is set or not.

get_input(self: tensorrt.tensorrt.ILayer, index: int) → tensorrt.tensorrt.ITensor

Get the layer input corresponding to the given index.

Parameters

index – The index of the input tensor.

Returns

The input tensor, or None if the index is out of range.

get_output(self: tensorrt.tensorrt.ILayer, index: int) → tensorrt.tensorrt.ITensor

Get the layer output corresponding to the given index.

Parameters

index – The index of the output tensor.

Returns

The output tensor, or None if the index is out of range.
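The following sketch shows one way to walk a layer's inputs and outputs with get_input and get_output while allowing for a None result. FakeTensor and FakeLayer are stand-ins that imitate only the members documented above so the example runs without TensorRT; layer_io_names is a hypothetical helper.

```python
class FakeTensor:
    """Stand-in exposing only the name attribute."""

    def __init__(self, name):
        self.name = name


class FakeLayer:
    """Stand-in imitating num_inputs, num_outputs, get_input, get_output."""

    def __init__(self, inputs, outputs):
        self._inputs = inputs
        self._outputs = outputs
        self.num_inputs = len(inputs)
        self.num_outputs = len(outputs)

    def get_input(self, index):
        # Mirror the documented behavior: None when the index is out of range.
        return self._inputs[index] if 0 <= index < self.num_inputs else None

    def get_output(self, index):
        return self._outputs[index] if 0 <= index < self.num_outputs else None


def layer_io_names(layer):
    """Return (input names, output names); a missing tensor is reported as None."""
    def name_of(tensor):
        return tensor.name if tensor is not None else None

    inputs = [name_of(layer.get_input(i)) for i in range(layer.num_inputs)]
    outputs = [name_of(layer.get_output(i)) for i in range(layer.num_outputs)]
    return inputs, outputs


layer = FakeLayer([FakeTensor("x"), None], [FakeTensor("y")])
names = layer_io_names(layer)
```

Because layer_io_names relies only on the documented attributes and methods, it should accept a real ILayer as well.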

get_output_type(self: tensorrt.tensorrt.ILayer, index: int) → tensorrt.tensorrt.DataType

Get the output type of the layer.

Parameters

index – The index of the output tensor.

Returns

The output precision. Default: DataType.FLOAT.

output_type_is_set(self: tensorrt.tensorrt.ILayer, index: int) → bool

Whether the output type has been set for this layer.

Parameters

index – The index of the output.

Returns

Whether the output type has been explicitly set.

reset_output_type(self: tensorrt.tensorrt.ILayer, index: int) → None

Reset output type of this layer.

Parameters

index – The index of the output.

reset_precision(self: tensorrt.tensorrt.ILayer) → None

Reset the computation precision of the layer.

set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) → None

Set the layer input corresponding to the given index.

Parameters
  • index – The index of the input tensor.

  • tensor – The input tensor.

set_output_type(self: tensorrt.tensorrt.ILayer, index: int, dtype: tensorrt.tensorrt.DataType) → None

Constrain the layer to generate output data with the given type. Note that this method cannot be used to set the data type of the second output tensor of the TopK layer. The data type of the second output tensor of the TopK layer is always Int32.

Parameters
  • index – The index of the output tensor to set the type.

  • dtype – DataType of the output.
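The TopK restriction can be respected with a small guard, sketched below. FakeLayer is a stand-in that records set_output_type calls, and constrain_output_types is a hypothetical helper; plain strings stand in for the LayerType and DataType values so the example runs without TensorRT.

```python
class FakeLayer:
    """Stand-in imitating type, num_outputs, and set_output_type."""

    def __init__(self, layer_type, num_outputs):
        self.type = layer_type
        self.num_outputs = num_outputs
        self.output_types = {}  # records what set_output_type was given

    def set_output_type(self, index, dtype):
        self.output_types[index] = dtype


def constrain_output_types(layer, dtype, topk_type):
    """Set dtype on every output, skipping the second output of a TopK layer.

    Returns the list of output indices that were constrained.
    """
    constrained = []
    for index in range(layer.num_outputs):
        if layer.type == topk_type and index == 1:
            continue  # the second TopK output is always Int32
        layer.set_output_type(index, dtype)
        constrained.append(index)
    return constrained


topk = FakeLayer("TOPK", num_outputs=2)
topk_indices = constrain_output_types(topk, "HALF", topk_type="TOPK")

conv = FakeLayer("CONVOLUTION", num_outputs=1)
conv_indices = constrain_output_types(conv, "HALF", topk_type="TOPK")
```

With real TensorRT objects, the string placeholders would presumably be replaced by values such as tensorrt.DataType.HALF and tensorrt.LayerType.TOPK.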