transformer_engine.h

Base classes and functions of Transformer Engine API.

Typedefs

typedef void *NVTETensor

TE Tensor type.

NVTETensor is a contiguous tensor type storing a pointer to data of a given shape and type. It does not own the memory it points to.

Enums

enum NVTEDType

TE datatype.

Values:

enumerator kNVTEByte

Byte

enumerator kNVTEInt32

32-bit integer

enumerator kNVTEInt64

64-bit integer

enumerator kNVTEFloat32

32-bit float

enumerator kNVTEFloat16

16-bit float (E5M10)

enumerator kNVTEBFloat16

16-bit bfloat (E8M7)

enumerator kNVTEFloat8E4M3

8-bit float (E4M3)

enumerator kNVTEFloat8E5M2

8-bit float (E5M2)

enumerator kNVTENumTypes

Number of supported types
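As an illustration of how the enumerators above relate to storage, the following sketch maps each type to its element size in bytes. The enum is a local mirror of the list above so the snippet compiles without the real header, and `dtype_size_bytes` is a hypothetical helper, not part of the documented API.

```cpp
#include <cassert>
#include <cstddef>

// Local mirror of the enumerators documented above (stand-in only).
// With the library available, include transformer_engine.h instead.
enum NVTEDType {
  kNVTEByte,
  kNVTEInt32,
  kNVTEInt64,
  kNVTEFloat32,
  kNVTEFloat16,
  kNVTEBFloat16,
  kNVTEFloat8E4M3,
  kNVTEFloat8E5M2,
  kNVTENumTypes
};

// Hypothetical helper: size of one element of the given type, in bytes.
// Returns 0 for kNVTENumTypes or any value that is not a real data type.
inline std::size_t dtype_size_bytes(NVTEDType t) {
  switch (t) {
    case kNVTEByte:       return 1;
    case kNVTEInt32:      return 4;
    case kNVTEInt64:      return 8;
    case kNVTEFloat32:    return 4;
    case kNVTEFloat16:    return 2;  // E5M10
    case kNVTEBFloat16:   return 2;  // E8M7
    case kNVTEFloat8E4M3: return 1;
    case kNVTEFloat8E5M2: return 1;
    default:              return 0;
  }
}
```

Multiplying this size by the number of elements in a shape gives the byte size of the buffer that a tensor of that type would point to.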

Functions

NVTETensor nvte_create_tensor(void *dptr, const NVTEShape shape, const NVTEDType dtype, float *amax_dptr, float *scale_dptr, float *scale_inv_dptr)

Create a new TE tensor.

Create a new TE tensor with a given shape, datatype and data. TE tensors are just wrappers on top of raw data and do not own memory.

Parameters
  • dptr[in] Pointer to the tensor data.

  • shape[in] Shape of the tensor.

  • dtype[in] Data type of the tensor.

  • amax_dptr[in] Pointer to the AMAX value.

  • scale_dptr[in] Pointer to the scale value.

  • scale_inv_dptr[in] Pointer to the inverse of the scale value.

Returns

A new TE tensor.
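The three float pointers describe FP8 scaling metadata. The sketch below shows one common convention for how such values relate, computed on host memory for illustration only: amax is the largest absolute value in the data, scale maps amax onto the largest representable FP8 value, and scale_inv is its reciprocal. The `Fp8Scaling` struct and `compute_scaling` function are hypothetical, and the way the library itself derives or updates these values is not described in this section and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical container for the three values the pointers refer to.
struct Fp8Scaling {
  float amax;       // maximum absolute value observed in the data
  float scale;      // factor applied before casting to FP8
  float scale_inv;  // reciprocal of scale, used when converting back
};

// Hypothetical helper. fp8_max is the largest finite value of the target
// format: 448 for E4M3 and 57344 for E5M2.
inline Fp8Scaling compute_scaling(const float *x, std::size_t n, float fp8_max) {
  float amax = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    amax = std::max(amax, std::fabs(x[i]));
  }
  // Fall back to a neutral scale when the data is all zeros.
  const float scale = (amax > 0.0f) ? fp8_max / amax : 1.0f;
  return Fp8Scaling{amax, scale, 1.0f / scale};
}
```

For tensors that are not FP8, these pointers carry no scaling information; the C++ wrapper documented later defaults all three to nullptr.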

void nvte_destroy_tensor(NVTETensor tensor)

Destroy a TE tensor.

Since the TE tensor does not own memory, the underlying data is not freed during this operation.

Parameters

tensor[in] Tensor to be destroyed.
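A minimal sketch of the create/destroy lifecycle follows. So that it compiles without the library, it defines simplified stand-ins with the same signatures as documented above. The stand-in bodies and the `StandInTensor` struct are assumptions made for illustration and are not the library's implementation. The example also uses host memory for brevity; real use would typically pass device pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- Stand-ins mirroring the documented declarations (illustration only) ----
typedef void *NVTETensor;
struct NVTEShape { const std::size_t *data; std::size_t ndim; };
enum NVTEDType { kNVTEByte, kNVTEInt32, kNVTEInt64, kNVTEFloat32, kNVTEFloat16,
                 kNVTEBFloat16, kNVTEFloat8E4M3, kNVTEFloat8E5M2, kNVTENumTypes };

// Hypothetical record behind the opaque handle; it stores pointers only.
struct StandInTensor {
  void *dptr;
  NVTEShape shape;
  NVTEDType dtype;
  float *amax;
  float *scale;
  float *scale_inv;
};

NVTETensor nvte_create_tensor(void *dptr, const NVTEShape shape, const NVTEDType dtype,
                              float *amax_dptr, float *scale_dptr, float *scale_inv_dptr) {
  return new StandInTensor{dptr, shape, dtype, amax_dptr, scale_dptr, scale_inv_dptr};
}

void nvte_destroy_tensor(NVTETensor tensor) {
  delete static_cast<StandInTensor *>(tensor);  // frees the wrapper, not dptr
}

void *nvte_tensor_data(const NVTETensor tensor) {
  return static_cast<StandInTensor *>(tensor)->dptr;
}
// ---- End of stand-ins ----

// Wrap a caller-owned buffer, destroy the wrapper, then read the buffer again.
inline float lifecycle_demo() {
  std::vector<float> buffer(6, 1.5f);    // caller-owned storage
  const std::size_t dims[2] = {2, 3};
  NVTEShape shape{dims, 2};
  NVTETensor t = nvte_create_tensor(buffer.data(), shape, kNVTEFloat32,
                                    nullptr, nullptr, nullptr);
  assert(nvte_tensor_data(t) == buffer.data());
  nvte_destroy_tensor(t);                // the buffer is untouched
  return buffer[5];                      // still valid to read
}
```

The point of the sketch is the ownership rule: the caller allocates and frees the data, and must keep it alive for as long as any tensor refers to it.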

NVTEDType nvte_tensor_type(const NVTETensor tensor)

Get a tensor’s data type.

Parameters

tensor[in] Tensor.

Returns

The data type of the input tensor.

NVTEShape nvte_tensor_shape(const NVTETensor tensor)

Get a tensor’s shape.

Parameters

tensor[in] Tensor.

Returns

The shape of the input tensor.

void *nvte_tensor_data(const NVTETensor tensor)

Get a raw pointer to the tensor’s data.

Parameters

tensor[in] Tensor.

Returns

A raw pointer to the tensor’s data.

float *nvte_tensor_amax(const NVTETensor tensor)

Get a pointer to the tensor’s amax data.

Parameters

tensor[in] Tensor.

Returns

A pointer to the tensor’s amax data.

float *nvte_tensor_scale(const NVTETensor tensor)

Get a pointer to the tensor’s scale data.

Parameters

tensor[in] Tensor.

Returns

A pointer to the tensor’s scale data.

float *nvte_tensor_scale_inv(const NVTETensor tensor)

Get a pointer to the tensor’s inverse of scale data.

Parameters

tensor[in] Tensor.

Returns

A pointer to the tensor’s inverse of scale data.

void nvte_tensor_pack_create(NVTETensorPack *pack)

Create tensors in NVTETensorPack.

void nvte_tensor_pack_destroy(NVTETensorPack *pack)

Destroy tensors in NVTETensorPack.

struct NVTEShape
#include <transformer_engine.h>

Shape of the tensor.

Public Members

const size_t *data

Shape data, of size ndim.

size_t ndim

Number of dimensions.
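Because NVTEShape stores only a pointer and a count, the dimension array must outlive the shape. The sketch below builds a shape from a std::vector and computes the element count. The struct is a local mirror of the members listed above, and `make_shape` and `num_elements` are hypothetical helpers, not part of the documented API. Treating a shape with ndim of 0 as having one element is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local mirror of the documented struct (stand-in only).
struct NVTEShape {
  const std::size_t *data;  // shape data, of size ndim
  std::size_t ndim;         // number of dimensions
};

// Hypothetical helper: view a vector as an NVTEShape.
// The vector must stay alive and unmodified while the shape is in use.
inline NVTEShape make_shape(const std::vector<std::size_t> &dims) {
  return NVTEShape{dims.data(), dims.size()};
}

// Hypothetical helper: product of all dimensions.
inline std::size_t num_elements(const NVTEShape &shape) {
  std::size_t n = 1;
  for (std::size_t i = 0; i < shape.ndim; ++i) {
    n *= shape.data[i];
  }
  return n;
}
```

For example, a vector holding {2, 3, 4} yields a shape with ndim equal to 3 and 24 elements.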

struct NVTETensorPack
#include <transformer_engine.h>

Pack of tensors, generally used for auxiliary outputs.

Public Members

NVTETensor tensors[MAX_SIZE]

Wrappers of tensors. They do not hold the associated memory.

size_t size = 0

Actual number of tensors in the pack, 0 <= size <= MAX_SIZE.

Public Static Attributes

static const int MAX_SIZE = 10

Max number of tensors in the pack. Assumed <= 10.
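The invariant 0 <= size <= MAX_SIZE can be illustrated with a bounds-checked append. The struct below is a local mirror of the members listed above, and `pack_append` is a hypothetical helper. How the library itself fills a pack is not described in this section.

```cpp
#include <cassert>
#include <cstddef>

typedef void *NVTETensor;  // as documented above

// Local mirror of the documented struct (stand-in only).
struct NVTETensorPack {
  static const int MAX_SIZE = 10;
  NVTETensor tensors[MAX_SIZE];
  std::size_t size = 0;
};

// Hypothetical helper: store a tensor handle in the next free slot.
// Returns false and leaves the pack unchanged when it is already full.
inline bool pack_append(NVTETensorPack &pack, NVTETensor t) {
  if (pack.size >= static_cast<std::size_t>(NVTETensorPack::MAX_SIZE)) {
    return false;
  }
  pack.tensors[pack.size++] = t;
  return true;
}
```

Only the first size entries of tensors are meaningful; code that reads a pack should iterate over that range rather than over all MAX_SIZE slots.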

namespace transformer_engine

Namespace containing C++ API of Transformer Engine.

Enums

enum class DType

TE datatype.

Values:

enumerator kByte
enumerator kInt32
enumerator kInt64
enumerator kFloat32
enumerator kFloat16
enumerator kBFloat16
enumerator kFloat8E4M3
enumerator kFloat8E5M2
enumerator kNumTypes

struct TensorWrapper
#include <transformer_engine.h>

C++ wrapper for the NVTETensor type.

Public Functions

inline TensorWrapper(void *dptr, const NVTEShape &shape, const DType dtype, float *amax_dptr = nullptr, float *scale_dptr = nullptr, float *scale_inv_dptr = nullptr)

Constructs new TensorWrapper.

Create a new TE tensor with a given shape, datatype and data. TE tensors are just wrappers on top of raw data and do not own memory.

Parameters
  • dptr[in] Pointer to the tensor data.

  • shape[in] Shape of the tensor.

  • dtype[in] Data type of the tensor.

  • amax_dptr[in] Pointer to the AMAX value.

  • scale_dptr[in] Pointer to the scale value.

  • scale_inv_dptr[in] Pointer to the inverse of the scale value.

inline TensorWrapper(void *dptr, const std::vector<size_t> &shape, const DType dtype, float *amax_dptr = nullptr, float *scale_dptr = nullptr, float *scale_inv_dptr = nullptr)

Constructs new TensorWrapper.

Create a new TE tensor with a given shape, datatype and data. TE tensors are just wrappers on top of raw data and do not own memory.

Parameters
  • dptr[in] Pointer to the tensor data.

  • shape[in] Shape of the tensor.

  • dtype[in] Data type of the tensor.

  • amax_dptr[in] Pointer to the AMAX value.

  • scale_dptr[in] Pointer to the scale value.

  • scale_inv_dptr[in] Pointer to the inverse of the scale value.

inline TensorWrapper()

Constructs new empty TensorWrapper.

Create a new empty TE tensor which holds nothing.

inline ~TensorWrapper()

TensorWrapper destructor.

TensorWrapper &operator=(const TensorWrapper &other) = delete
TensorWrapper(const TensorWrapper &other) = delete
inline TensorWrapper(TensorWrapper &&other)

Constructs new TensorWrapper from existing TensorWrapper.

Transfer an existing TE tensor from other to the new TensorWrapper.

Parameters

other[inout] The source of the data.

inline TensorWrapper &operator=(TensorWrapper &&other)

Assign the data from existing TensorWrapper.

Change ownership of an existing TE tensor.

Parameters

other[inout] The source of the data.
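The deleted copy operations and the move operations above follow the usual move-only handle pattern. The sketch below shows that pattern in isolation. `HandleSketch`, `fake_create`, `fake_destroy` and the live-handle counter are all hypothetical stand-ins written for this example; this is not the library's TensorWrapper implementation, and the state the real class leaves a moved-from object in is an assumption here.

```cpp
#include <cassert>
#include <utility>

typedef void *NVTETensor;  // as documented above

// Hypothetical stand-ins for handle creation and destruction.
static int g_live_handles = 0;
static NVTETensor fake_create() { ++g_live_handles; return new int(0); }
static void fake_destroy(NVTETensor t) {
  if (t != nullptr) { --g_live_handles; delete static_cast<int *>(t); }
}

// Move-only owner of a single handle.
class HandleSketch {
 public:
  HandleSketch() = default;
  explicit HandleSketch(NVTETensor t) : tensor_(t) {}
  ~HandleSketch() { fake_destroy(tensor_); }

  // Copying is disabled so that two objects never destroy the same handle.
  HandleSketch(const HandleSketch &) = delete;
  HandleSketch &operator=(const HandleSketch &) = delete;

  // Moving transfers the handle and leaves the source empty.
  HandleSketch(HandleSketch &&other) noexcept : tensor_(other.tensor_) {
    other.tensor_ = nullptr;
  }
  HandleSketch &operator=(HandleSketch &&other) noexcept {
    if (this != &other) {
      fake_destroy(tensor_);   // release whatever this object held
      tensor_ = other.tensor_;
      other.tensor_ = nullptr;
    }
    return *this;
  }

  NVTETensor data() const noexcept { return tensor_; }

 private:
  NVTETensor tensor_ = nullptr;
};

// Returns the number of live handles after all wrappers go out of scope.
static int move_demo() {
  {
    HandleSketch a(fake_create());
    HandleSketch b(std::move(a));  // move construction
    HandleSketch c;
    c = std::move(b);              // move assignment
    assert(a.data() == nullptr && b.data() == nullptr && c.data() != nullptr);
    assert(g_live_handles == 1);   // exactly one owner at any time
  }
  return g_live_handles;
}
```

With this design a handle is destroyed exactly once, by whichever object holds it last, which is why the other parameter is marked [inout]: the move modifies the source as well as the destination.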

inline NVTETensor data() const noexcept

Get an underlying NVTETensor.

Returns

NVTETensor held by this TensorWrapper.

inline const NVTEShape shape() const noexcept

Get the shape of this TensorWrapper.

Returns

Shape of this TensorWrapper.

inline DType dtype() const noexcept

Get the data type of this TensorWrapper.

Returns

Data type of this TensorWrapper.

inline void *dptr() const noexcept

Get a raw pointer to the tensor’s data.

Returns

A raw pointer to the tensor’s data.

inline float *amax() const noexcept

Get a pointer to the tensor’s amax data.

Returns

A pointer to the tensor’s amax data.

inline float *scale() const noexcept

Get a pointer to the tensor’s scale data.

Returns

A pointer to the tensor’s scale data.

inline float *scale_inv() const noexcept

Get a pointer to the tensor’s inverse of scale data.

Returns

A pointer to the tensor’s inverse of scale data.

Private Members

NVTETensor tensor_ = nullptr

Wrapped NVTETensor.