transformer_engine.h

Base classes and functions of Transformer Engine API.

Typedefs

typedef void *NVTETensor

TE Tensor type.

NVTETensor is a contiguous tensor type storing a pointer to data of a given shape and type. It does not own the memory it points to.

Enums

enum NVTEDType

TE datatype.

Values:

enumerator kNVTEByte

Byte

enumerator kNVTEInt32

32-bit integer

enumerator kNVTEInt64

64-bit integer

enumerator kNVTEFloat32

32-bit float

enumerator kNVTEFloat16

16-bit float (E5M10)

enumerator kNVTEBFloat16

16-bit bfloat (E8M7)

enumerator kNVTEFloat8E4M3

8-bit float (E4M3)

enumerator kNVTEFloat8E5M2

8-bit float (E5M2)

enumerator kNVTENumTypes

Number of supported types
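
The enumerators above imply a fixed element width for each type. The following is a minimal sketch of a helper that maps a data type to its size in bytes. The helper name dtype_size_bytes is hypothetical and not part of the library, and the enum is re-declared here as a stand-in so the snippet compiles on its own; in real code it comes from transformer_engine.h.

```cpp
#include <cstddef>

// Stand-in mirroring the enumerators documented above. In real code this
// enum comes from <transformer_engine.h>; the implicit numeric values used
// here are an assumption.
enum NVTEDType {
  kNVTEByte,
  kNVTEInt32,
  kNVTEInt64,
  kNVTEFloat32,
  kNVTEFloat16,
  kNVTEBFloat16,
  kNVTEFloat8E4M3,
  kNVTEFloat8E5M2,
  kNVTENumTypes
};

// Hypothetical helper (not a library function): bytes per element.
// Returns 0 for the kNVTENumTypes sentinel, which is not a real type.
constexpr std::size_t dtype_size_bytes(NVTEDType dtype) {
  switch (dtype) {
    case kNVTEByte:
    case kNVTEFloat8E4M3:
    case kNVTEFloat8E5M2:
      return 1;
    case kNVTEFloat16:
    case kNVTEBFloat16:
      return 2;
    case kNVTEInt32:
    case kNVTEFloat32:
      return 4;
    case kNVTEInt64:
      return 8;
    case kNVTENumTypes:
      break;
  }
  return 0;
}
```

A caller could multiply this value by the element count of a shape to size a buffer before wrapping it in a tensor.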

Functions

NVTETensor nvte_create_tensor(void *dptr, const NVTEShape shape, const NVTEDType dtype, float *amax_dptr, float *scale_dptr, float *scale_inv_dptr)

Create a new TE tensor.

Create a new TE tensor with a given shape, datatype and data. TE tensors are just wrappers on top of raw data and do not own memory.

Parameters:
  • dptr[in] Pointer to the tensor data.

  • shape[in] Shape of the tensor.

  • dtype[in] Data type of the tensor.

  • amax_dptr[in] Pointer to the AMAX value.

  • scale_dptr[in] Pointer to the scale value.

  • scale_inv_dptr[in] Pointer to the inverse of the scale value.

Returns:

A new TE tensor.

void nvte_destroy_tensor(NVTETensor tensor)

Destroy a TE tensor.

Since the TE tensor does not own memory, the underlying data is not freed during this operation.

Parameters:

tensor[in] Tensor to be destroyed.
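
The create/destroy pair above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the section marked "Stand-ins" re-declares the documented types and provides toy bodies for three functions so the snippet compiles without Transformer Engine. With the library installed, that section would be replaced by including transformer_engine.h. The function wrap_and_release is hypothetical, and a host buffer is used only for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- Stand-ins (assumptions) so the sketch compiles without the library.
// Signatures follow the documentation above; the bodies are toy code.
typedef void *NVTETensor;
enum NVTEDType {
  kNVTEByte, kNVTEInt32, kNVTEInt64, kNVTEFloat32, kNVTEFloat16,
  kNVTEBFloat16, kNVTEFloat8E4M3, kNVTEFloat8E5M2, kNVTENumTypes
};
struct NVTEShape {
  const std::size_t *data;
  std::size_t ndim;
};
struct StandInTensor {
  void *dptr;
  NVTEShape shape;
  NVTEDType dtype;
  float *amax;
  float *scale;
  float *scale_inv;
};
NVTETensor nvte_create_tensor(void *dptr, const NVTEShape shape,
                              const NVTEDType dtype, float *amax_dptr,
                              float *scale_dptr, float *scale_inv_dptr) {
  return new StandInTensor{dptr, shape, dtype,
                           amax_dptr, scale_dptr, scale_inv_dptr};
}
void nvte_destroy_tensor(NVTETensor tensor) {
  delete static_cast<StandInTensor *>(tensor);  // frees the wrapper only
}
void *nvte_tensor_data(const NVTETensor tensor) {
  return static_cast<StandInTensor *>(tensor)->dptr;
}
// ---- End of stand-ins.

// Hypothetical example: wrap a caller-owned buffer, query it, and destroy
// the wrapper. Returns true if the tensor reported the caller's pointer.
bool wrap_and_release(std::vector<float> &buffer) {
  const std::size_t dims[] = {buffer.size()};
  const NVTEShape shape{dims, 1};
  NVTETensor t = nvte_create_tensor(buffer.data(), shape, kNVTEFloat32,
                                    nullptr, nullptr, nullptr);
  assert(t != nullptr);
  const bool same_pointer = (nvte_tensor_data(t) == buffer.data());
  nvte_destroy_tensor(t);  // the buffer itself stays alive
  return same_pointer;
}
```

Because the tensor never owns the data, the caller remains responsible for keeping the buffer valid while the tensor exists and for freeing it afterwards.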

NVTEDType nvte_tensor_type(const NVTETensor tensor)

Get a tensor’s data type.

Parameters:

tensor[in] Tensor.

Returns:

The data type of the input tensor.

NVTEShape nvte_tensor_shape(const NVTETensor tensor)

Get a tensor’s data shape.

Parameters:

tensor[in] Tensor.

Returns:

The shape of the input tensor.

void *nvte_tensor_data(const NVTETensor tensor)

Get a raw pointer to the tensor’s data.

Parameters:

tensor[in] Tensor.

Returns:

A raw pointer to the tensor’s data.

float *nvte_tensor_amax(const NVTETensor tensor)

Get a pointer to the tensor’s amax data.

Parameters:

tensor[in] Tensor.

Returns:

A pointer to the tensor’s amax data.

float *nvte_tensor_scale(const NVTETensor tensor)

Get a pointer to the tensor’s scale data.

Parameters:

tensor[in] Tensor.

Returns:

A pointer to the tensor’s scale data.

float *nvte_tensor_scale_inv(const NVTETensor tensor)

Get a pointer to the tensor’s inverse of scale data.

Parameters:

tensor[in] Tensor.

Returns:

A pointer to the tensor’s inverse of scale data.

void nvte_tensor_pack_create(NVTETensorPack *pack)

Create tensors in NVTETensorPack.

void nvte_tensor_pack_destroy(NVTETensorPack *pack)

Destroy tensors in NVTETensorPack.

struct NVTEShape
#include <transformer_engine.h>

Shape of the tensor.

Public Members

const size_t *data

Shape data, of size ndim.

size_t ndim

Number of dimensions.
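
Since NVTEShape holds only a pointer and a count, it is a non-owning view of the dimension array. The sketch below shows one way to build such a view from a std::vector and to compute the element count. The helpers make_shape and num_elements are hypothetical, and the struct is re-declared as a stand-in with the two documented members so the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in with the two members documented above; in real code this
// struct comes from <transformer_engine.h>.
struct NVTEShape {
  const std::size_t *data;
  std::size_t ndim;
};

// Hypothetical helper: a non-owning view over a vector of dimensions.
// The vector must outlive the returned NVTEShape.
inline NVTEShape make_shape(const std::vector<std::size_t> &dims) {
  return NVTEShape{dims.data(), dims.size()};
}

// Hypothetical helper: product of all dimensions.
// An empty shape (ndim == 0) yields 1 by the usual empty-product rule.
inline std::size_t num_elements(const NVTEShape &shape) {
  assert(shape.ndim == 0 || shape.data != nullptr);
  std::size_t n = 1;
  for (std::size_t i = 0; i < shape.ndim; ++i) {
    n *= shape.data[i];
  }
  return n;
}
```

For dimensions {2, 3, 4}, num_elements(make_shape(dims)) evaluates to 24. Passing a temporary vector to make_shape would leave the view dangling, so the dimensions should be kept in a named variable.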

struct NVTETensorPack
#include <transformer_engine.h>

Pack of tensors, generally used for auxiliary outputs.

Public Members

NVTETensor tensors[MAX_SIZE]

Wrappers of tensors. They do not hold the associated memory.

size_t size = 0

Actual number of tensors in the pack, 0 <= size <= MAX_SIZE.

Public Static Attributes

static const int MAX_SIZE = 10

Max number of tensors in the pack. Assumed <= 10.
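
Only the first size entries of tensors are meaningful. The following sketch walks that valid range. The helper count_populated is hypothetical, and the struct is a stand-in that mirrors the documented members; in real code it comes from transformer_engine.h and its entries are managed by nvte_tensor_pack_create and nvte_tensor_pack_destroy.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins mirroring the documented declarations (assumptions).
typedef void *NVTETensor;
struct NVTETensorPack {
  static const int MAX_SIZE = 10;
  NVTETensor tensors[MAX_SIZE];
  std::size_t size = 0;
};

// Hypothetical helper: counts non-null entries among the first `size`
// slots. Entries at index >= size are never read.
inline std::size_t count_populated(const NVTETensorPack &pack) {
  assert(pack.size <= static_cast<std::size_t>(NVTETensorPack::MAX_SIZE));
  std::size_t populated = 0;
  for (std::size_t i = 0; i < pack.size; ++i) {
    if (pack.tensors[i] != nullptr) {
      ++populated;
    }
  }
  return populated;
}
```

Checking size against MAX_SIZE before iterating guards against reading past the fixed-length array.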

namespace transformer_engine

Namespace containing C++ API of Transformer Engine.

Enums

enum class DType

TE datatype.

Values:

enumerator kByte
enumerator kInt32
enumerator kInt64
enumerator kFloat32
enumerator kFloat16
enumerator kBFloat16
enumerator kFloat8E4M3
enumerator kFloat8E5M2
enumerator kNumTypes
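
DType lists the same types as NVTEDType. Assuming both enums declare their enumerators in the same order, as the two listings suggest, a value of one can be converted to the other with a static_cast. The sketch below re-declares both enums as stand-ins, and the helper to_c_dtype is hypothetical rather than a library function.

```cpp
// Stand-ins mirroring the two documented enums (assumptions); in real
// code both come from <transformer_engine.h>.
enum NVTEDType {
  kNVTEByte, kNVTEInt32, kNVTEInt64, kNVTEFloat32, kNVTEFloat16,
  kNVTEBFloat16, kNVTEFloat8E4M3, kNVTEFloat8E5M2, kNVTENumTypes
};

namespace transformer_engine {
enum class DType {
  kByte, kInt32, kInt64, kFloat32, kFloat16,
  kBFloat16, kFloat8E4M3, kFloat8E5M2, kNumTypes
};

// Hypothetical helper: relies on both enums sharing numeric values.
constexpr NVTEDType to_c_dtype(DType t) {
  return static_cast<NVTEDType>(t);
}
}  // namespace transformer_engine

// Compile-time guard for the same-order assumption.
static_assert(static_cast<int>(transformer_engine::DType::kNumTypes) ==
                  static_cast<int>(kNVTENumTypes),
              "DType and NVTEDType must have the same number of entries");
```

The static_assert only checks the enumerator count; a stricter guard would compare each pair of enumerators.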

struct TensorWrapper
#include <transformer_engine.h>

C++ wrapper for the NVTETensor type.

Public Functions

inline TensorWrapper(void *dptr, const NVTEShape &shape, const DType dtype, float *amax_dptr = nullptr, float *scale_dptr = nullptr, float *scale_inv_dptr = nullptr)

Constructs new TensorWrapper.

Create a new TE tensor with a given shape, datatype and data. TE tensors are just wrappers on top of raw data and do not own memory.

Parameters:
  • dptr[in] Pointer to the tensor data.

  • shape[in] Shape of the tensor.

  • dtype[in] Data type of the tensor.

  • amax_dptr[in] Pointer to the AMAX value.

  • scale_dptr[in] Pointer to the scale value.

  • scale_inv_dptr[in] Pointer to the inverse of the scale value.

inline TensorWrapper(void *dptr, const std::vector<size_t> &shape, const DType dtype, float *amax_dptr = nullptr, float *scale_dptr = nullptr, float *scale_inv_dptr = nullptr)

Constructs new TensorWrapper.

Create a new TE tensor with a given shape, datatype and data. TE tensors are just wrappers on top of raw data and do not own memory.

Parameters:
  • dptr[in] Pointer to the tensor data.

  • shape[in] Shape of the tensor.

  • dtype[in] Data type of the tensor.

  • amax_dptr[in] Pointer to the AMAX value.

  • scale_dptr[in] Pointer to the scale value.

  • scale_inv_dptr[in] Pointer to the inverse of the scale value.

inline TensorWrapper()

Constructs new empty TensorWrapper.

Create a new empty TE tensor which holds nothing.

inline ~TensorWrapper()

TensorWrapper destructor.

TensorWrapper &operator=(const TensorWrapper &other) = delete

TensorWrapper(const TensorWrapper &other) = delete

inline TensorWrapper(TensorWrapper &&other)

Constructs new TensorWrapper from existing TensorWrapper.

Pass an existing TE tensor to a new TensorWrapper.

Parameters:

other[inout] The source of the data.

inline TensorWrapper &operator=(TensorWrapper &&other)

Assign the data from existing TensorWrapper.

Transfer ownership of an existing TE tensor to this TensorWrapper.

Parameters:

other[inout] The source of the data.

inline NVTETensor data() const noexcept

Get an underlying NVTETensor.

Returns:

NVTETensor held by this TensorWrapper.

inline const NVTEShape shape() const noexcept

Get the shape of this TensorWrapper.

Returns:

Shape of this TensorWrapper.

inline DType dtype() const noexcept

Get the data type of this TensorWrapper.

Returns:

Data type of this TensorWrapper.

inline void *dptr() const noexcept

Get a raw pointer to the tensor’s data.

Returns:

A raw pointer to the tensor’s data.

inline float *amax() const noexcept

Get a pointer to the tensor’s amax data.

Returns:

A pointer to the tensor’s amax data.

inline float *scale() const noexcept

Get a pointer to the tensor’s scale data.

Returns:

A pointer to the tensor’s scale data.

inline float *scale_inv() const noexcept

Get a pointer to the tensor’s inverse of scale data.

Returns:

A pointer to the tensor’s inverse of scale data.

Private Members

NVTETensor tensor_ = nullptr

Wrapped NVTETensor.
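
TensorWrapper deletes its copy operations and provides move operations, so exactly one wrapper holds a given NVTETensor at a time. The sketch below illustrates that pattern with a hypothetical class, MoveOnlyHandle, which is not the library's TensorWrapper. It assumes that a moved-from wrapper is left holding nullptr, which the documentation above does not state explicitly.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a move-only handle wrapper. It does not
// create or destroy anything; it only tracks who holds the handle.
class MoveOnlyHandle {
 public:
  MoveOnlyHandle() = default;
  explicit MoveOnlyHandle(void *handle) : handle_(handle) {}

  // Copying is disabled, as in the documented TensorWrapper.
  MoveOnlyHandle(const MoveOnlyHandle &) = delete;
  MoveOnlyHandle &operator=(const MoveOnlyHandle &) = delete;

  // Moving takes the handle and empties the source (assumed behavior).
  MoveOnlyHandle(MoveOnlyHandle &&other) noexcept : handle_(other.handle_) {
    other.handle_ = nullptr;
  }
  MoveOnlyHandle &operator=(MoveOnlyHandle &&other) noexcept {
    if (this != &other) {
      // A real wrapper would release its previous handle here.
      handle_ = other.handle_;
      other.handle_ = nullptr;
    }
    return *this;
  }

  void *data() const noexcept { return handle_; }

 private:
  void *handle_ = nullptr;
};

static_assert(!std::is_copy_constructible<MoveOnlyHandle>::value,
              "copying is disabled");
static_assert(std::is_move_constructible<MoveOnlyHandle>::value,
              "moving is allowed");

// Usage: after the move, the source is empty and the target holds the handle.
inline bool demo_move() {
  int payload = 0;
  MoveOnlyHandle a(&payload);
  MoveOnlyHandle b(std::move(a));
  assert(a.data() == nullptr);
  return b.data() == &payload;
}
```

Disabling copies avoids two wrappers destroying the same underlying tensor, while moves still allow wrappers to be returned from functions or stored in containers.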