cast.h

Functions to cast to/from FP8.

Functions

void nvte_fp8_quantize(const NVTETensor input, const NVTETensor scale, NVTETensor output, NVTETensor amax, NVTETensor scale_inv, cudaStream_t stream)

Cast tensor to FP8.

Parameters

input – [in] Input tensor to be cast.
scale – [in] Scaling factor of the output tensor.
output – [out] Output FP8 tensor.
amax – [inout] AMAX (maximum absolute value) of the output tensor.
scale_inv – [out] Inverse of the output’s scaling factor.
stream – [in] CUDA stream used for the operation.
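
The roles of scale, amax, and scale_inv can be illustrated with a host-side reference sketch. This is not the library's implementation and calls no Transformer Engine API. It assumes a single per-tensor scale, assumes amax tracks the largest absolute value seen before scaling, and models the FP8 output as a float clamped to the E4M3 maximum of 448 without rounding to the FP8 grid. The names `QuantizeResult`, `reference_fp8_quantize`, and `kAssumedFp8Max` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Assumption: FP8 E4M3 format, whose largest finite magnitude is 448.
constexpr float kAssumedFp8Max = 448.0f;

// Hypothetical container for the three outputs of the quantize step.
struct QuantizeResult {
  std::vector<float> output;  // stand-in for the FP8 output tensor
  float amax;                 // updated absolute maximum
  float scale_inv;            // reciprocal of the scaling factor
};

// Hypothetical CPU model of the data flow; the real function runs on a CUDA stream.
QuantizeResult reference_fp8_quantize(const std::vector<float>& input,
                                      float scale, float amax) {
  assert(scale > 0.0f);
  QuantizeResult r{{}, amax, 1.0f / scale};
  r.output.reserve(input.size());
  for (float x : input) {
    // amax is in/out: keep the larger of the incoming value and |x|.
    r.amax = std::max(r.amax, std::fabs(x));
    // Scale, then saturate to the assumed representable range.
    const float scaled = x * scale;
    r.output.push_back(std::min(std::max(scaled, -kAssumedFp8Max), kAssumedFp8Max));
  }
  return r;
}
```

Under these assumptions, quantizing {1, -3, 500} with scale = 2 and an incoming amax of 0 yields {2, -6, 448}, amax = 500, and scale_inv = 0.5. The last element saturates because 1000 exceeds the assumed maximum.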

void nvte_fp8_dequantize(const NVTETensor input, const NVTETensor scale_inv, NVTETensor output, cudaStream_t stream)

Cast tensor from FP8.

Parameters

input – [in] Input FP8 tensor to be cast.
scale_inv – [in] Inverse of the input’s scaling factor.
output – [out] Output tensor.
stream – [in] CUDA stream used for the operation.
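
The cast from FP8 can be modeled on the host as a multiplication by scale_inv. The sketch below is an illustration under stated assumptions, not the library's kernel. It assumes a single per-tensor scale_inv and represents the FP8 input as plain floats that already hold the stored values. The name `reference_fp8_dequantize` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU model of the dequantize step; the real function runs on a CUDA stream.
// Assumption: each stored value is recovered as value * scale_inv.
std::vector<float> reference_fp8_dequantize(const std::vector<float>& input,
                                            float scale_inv) {
  assert(scale_inv > 0.0f);
  std::vector<float> output;
  output.reserve(input.size());
  for (float q : input) {
    output.push_back(q * scale_inv);  // undo the scaling applied at quantize time
  }
  return output;
}
```

With scale_inv = 0.5, stored values {2, -6, 448} map back to {1, -3, 224}. A value that saturated during quantization is not recovered, since the clamped magnitude is all that was stored.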