Data Format Descriptions

TensorRT supports different data formats. There are two aspects to consider: data type and layout.

Data Type Format

The data type is the representation of each value. Its size determines the range of values and the precision of the representation. The supported data types are:

  • FP32 (32-bit floating point or single precision)

  • FP16 (16-bit floating point or half precision)

  • BF16 (1-bit sign, 8-bit exponent, 7-bit mantissa)

  • FP8 (1-bit sign, 4-bit exponent, 3-bit mantissa)

  • FP4 (1-bit sign, 2-bit exponent, 1-bit mantissa)

  • E8M0 (0-bit sign, 8-bit exponent, 0-bit mantissa)

  • INT64 (64-bit integer)

  • INT32 (32-bit integer)

  • INT8 (8-bit integer)

  • UINT8 (unsigned 8-bit integer)

  • INT4 (4-bit integer)
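The sign/exponent/mantissa widths listed above can be read as bit fields of the stored value. The following is a minimal sketch, not TensorRT code: `split_fields` and `fp32_to_bf16_bits` are hypothetical helper names, and the BF16 conversion assumes simple truncation of the FP32 encoding (BF16 shares the FP32 sign and exponent layout, so its bit pattern is the upper 16 bits of the FP32 pattern), whereas real converters usually round to nearest.

```python
import struct

def split_fields(bits: int, exp_bits: int, man_bits: int, has_sign: bool = True):
    """Split a raw bit pattern into (sign, exponent, mantissa) integer fields."""
    mantissa = bits & ((1 << man_bits) - 1)
    exponent = (bits >> man_bits) & ((1 << exp_bits) - 1)
    sign = ((bits >> (man_bits + exp_bits)) & 1) if has_sign else 0
    return sign, exponent, mantissa

def fp32_to_bf16_bits(x: float) -> int:
    """Keep the upper 16 bits of the FP32 encoding (assumption: truncation, no rounding)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

bf16_one = fp32_to_bf16_bits(1.0)                     # 0x3F80
print(split_fields(bf16_one, 8, 7))                   # BF16 fields: (0, 127, 0)
print(split_fields(0b10111100, 4, 3))                 # FP8 fields:  (1, 7, 4)
print(split_fields(0xFF, 8, 0, has_sign=False))       # E8M0 fields: (0, 255, 0)
```

The helper only separates the raw fields; turning them into a numeric value additionally depends on each format's exponent bias and special-value rules, which this sketch does not model.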

Layout Format

The layout format determines the order in which values are stored. Typically, batch dimensions are the leftmost dimensions, and the remaining dimensions describe each data item; for images, C is the channel, H is the height, and W is the width. Ignoring batch dimensions, which always precede these, C, H, and W are typically ordered as CHW or HWC.

Table 9. TensorFormat Enum Quick Reference

Format Name         TensorFormat Enum  Description
------------------  -----------------  -----------
Linear (row-major)  kLINEAR            Default CHW ordering with no vectorization.
NC/2HW2             kCHW2              Channel pairs packed per HxW element (FP16, BF16).
NC/4HW4             kCHW4              4-channel vectors per HxW element (INT8).
NHWC8               kHWC8              8-channel vectors in HWC order (FP16, BF16).
NC/16HW16           kCHW16             16-channel vectors (DLA FP16).
NC/32HW32           kCHW32             32-channel vectors (FP32, FP16, INT8).
NDHWC8              kDHWC8             3D variant of kHWC8 (FP16, BF16).
NC/32DHW32          kCDHW32            3D variant of kCHW32 (FP16, INT8).
NHWC                kHWC               Channel-last without vectorization (FP32, UINT8).
DLA Linear          kDLA_LINEAR        DLA-specific row-major (FP16, INT8).
DLA HWC4            kDLA_HWC4          DLA-specific 4-channel vectors (FP16, INT8).
NHWC16              kHWC16             16-channel vectors in HWC order (FP16, INT8, FP8).
NDHWC               kDHWC              3D channel-last without vectorization (FP32).

For supported data type combinations per format, refer to the I/O Formats table.

The default CHW (kLINEAR) layout stores one HxW matrix per channel: all values of channel 0 are stored first, then all of channel 1, and so on.

CHW (kLINEAR) Layout
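The CHW ordering described above can be expressed as a flat-offset formula. This is an illustrative sketch for a single, densely packed data item (no batch dimension, no extra strides); `chw_offset` is a hypothetical helper name, not a TensorRT function.

```python
def chw_offset(c: int, h: int, w: int, H: int, W: int) -> int:
    """Flat element offset of (c, h, w) when each channel is a contiguous HxW plane."""
    return (c * H + h) * W + w

C, H, W = 2, 2, 3
# Label each element by its coordinates, written out in CHW storage order.
buffer = [(c, h, w) for c in range(C) for h in range(H) for w in range(W)]

# Looking up an offset returns the element with the matching coordinates.
assert buffer[chw_offset(1, 0, 2, H, W)] == (1, 0, 2)

# Channel 1 starts only after all H*W values of channel 0.
print(chw_offset(1, 0, 0, H, W))  # 6
```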

In HWC (kHWC), a single HxW matrix holds the data, and each entry is a C-tuple containing all channel values for that pixel. All channel values for one pixel are stored consecutively before moving to the next pixel.

HWC (kHWC) Layout
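As a rough illustration of the HWC ordering, the sketch below reorders a small flat CHW buffer into HWC. It assumes one densely packed data item with no batch dimension; `hwc_offset` and `chw_to_hwc` are hypothetical names used only for this example.

```python
def hwc_offset(h: int, w: int, c: int, W: int, C: int) -> int:
    """Flat element offset of (h, w, c) when each pixel stores its C values contiguously."""
    return (h * W + w) * C + c

def chw_to_hwc(chw: list, C: int, H: int, W: int) -> list:
    """Reorder a flat CHW buffer into a flat HWC buffer."""
    hwc = [None] * (C * H * W)
    for c in range(C):
        for h in range(H):
            for w in range(W):
                hwc[hwc_offset(h, w, c, W, C)] = chw[(c * H + h) * W + w]
    return hwc

C, H, W = 3, 1, 2
chw = ["r0", "r1", "g0", "g1", "b0", "b1"]  # channel planes: r, then g, then b
print(chw_to_hwc(chw, C, H, W))  # ['r0', 'g0', 'b0', 'r1', 'g1', 'b1']
```

In the output, the three channel values of pixel 0 come first, followed by the three values of pixel 1.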

TensorRT also defines formats that pack channel values into fixed-width groups per pixel, which lets kernels load multiple channels with one vector instruction. NC/2HW2 (kCHW2) and NHWC8 (kHWC8) below are two examples; the table above lists the full set.

In NC/2HW2 (kCHW2), channel values are packed into pairs per pixel; the tensor is stored as ceil(C/2) HxW matrices of two-channel pairs. If C is odd, the last pair contains one padded slot. Within a pair, consecutive channels have a stride of 1 element; between consecutive pairs, the stride is 2*H*W elements. For example, a 3-channel input is stored as two pair-planes: [C=0, C=1] and [C=2, pad].

NC/2HW2 (kCHW2) Layout - channel pairs packed per pixel
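The pair packing described above can be sketched as an offset formula plus a small packing routine. This is an illustration under stated assumptions, not TensorRT code: `chw2_offset` and `pack_chw2` are hypothetical names, the input is one densely packed CHW data item, and padded slots are filled with zero purely for display.

```python
def chw2_offset(c: int, h: int, w: int, H: int, W: int) -> int:
    """Offset in a kCHW2-style buffer: pair-plane, then pixel, then slot within the pair."""
    return ((c // 2) * H * W + h * W + w) * 2 + (c % 2)

def pack_chw2(chw: list, C: int, H: int, W: int, pad=0) -> list:
    """Pack a flat CHW buffer into ceil(C/2) planes of two-channel pairs."""
    pair_planes = (C + 1) // 2
    out = [pad] * (pair_planes * H * W * 2)  # pad value is an assumption of this sketch
    for c in range(C):
        for h in range(H):
            for w in range(W):
                out[chw2_offset(c, h, w, H, W)] = chw[(c * H + h) * W + w]
    return out

C, H, W = 3, 1, 2
chw = [10, 11, 20, 21, 30, 31]  # channel 0, channel 1, channel 2
print(pack_chw2(chw, C, H, W))  # [10, 20, 11, 21, 30, 0, 31, 0]

# Stride between the first slots of consecutive pair-planes is 2*H*W elements.
print(chw2_offset(2, 0, 0, H, W) - chw2_offset(0, 0, 0, H, W))  # 4
```

With three channels, the first plane interleaves channels 0 and 1, and the second plane holds channel 2 next to a padded slot at every pixel.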

In NHWC8 (kHWC8), each pixel of the HxW matrix stores its channel values packed into 8-tuples, and C is padded up to a multiple of 8. For example, a 3-channel input has one 8-tuple per pixel containing 3 real values plus 5 padded slots.

NHWC8 (kHWC8) Layout - 8-channel vectors per pixel
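The 8-channel packing can be sketched the same way. The example below is illustrative only: `hwc8_offset` and `pack_hwc8` are hypothetical names, the input is one densely packed CHW data item, and padded slots are set to zero purely for display.

```python
def padded_channels(C: int, vec: int = 8) -> int:
    """Round C up to the next multiple of the vector width."""
    return -(-C // vec) * vec

def hwc8_offset(h: int, w: int, c: int, W: int, C: int) -> int:
    """Offset in a kHWC8-style buffer: pixel first, then channel within the padded tuple."""
    return (h * W + w) * padded_channels(C) + c

def pack_hwc8(chw: list, C: int, H: int, W: int, pad=0) -> list:
    """Pack a flat CHW buffer into per-pixel channel tuples padded to a multiple of 8."""
    out = [pad] * (H * W * padded_channels(C))  # pad value is an assumption of this sketch
    for c in range(C):
        for h in range(H):
            for w in range(W):
                out[hwc8_offset(h, w, c, W, C)] = chw[(c * H + h) * W + w]
    return out

C, H, W = 3, 1, 2
chw = [10, 11, 20, 21, 30, 31]  # channel 0, channel 1, channel 2
packed = pack_hwc8(chw, C, H, W)
print(packed[:8])  # pixel 0: [10, 20, 30, 0, 0, 0, 0, 0]
print(packed[8:])  # pixel 1: [11, 21, 31, 0, 0, 0, 0, 0]
```

Each pixel occupies one 8-tuple here: three real channel values followed by five padded slots, matching the 3-channel example in the text.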

Other TensorFormat values follow similar packing rules to kCHW2 and kHWC8.

See also

Understanding Formats Printed in Logs

How to interpret tensor format and stride information in TensorRT log output.

Glossary

Definitions of TensorRT terms including tensor formats and precision types.