cast.h
Functions to cast to/from FP8.
Functions
void nvte_fp8_quantize(const NVTETensor input, const NVTETensor scale, NVTETensor output, NVTETensor amax, NVTETensor scale_inv, cudaStream_t stream)
Cast tensor to FP8.
Parameters
input – [in] Input tensor to be cast.
scale – [in] Scaling factor of the output tensor.
output – [out] Output FP8 tensor.
amax – [inout] AMAX value of the output tensor.
scale_inv – [out] Inverse of the output’s scaling factor.
stream – [in] CUDA stream used for the operation.
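The parameter list above does not spell out the arithmetic. As a rough host-side illustration, a scaled FP8 cast can be sketched as follows. This sketch assumes the E4M3 format (3 mantissa bits, largest finite value 448), assumes that amax is kept as a running maximum of the absolute input values, and assumes that scale_inv is simply 1/scale. The names `round_to_e4m3` and `fp8_quantize_ref` are hypothetical helpers for this example and are not part of the library; the real function operates on NVTETensor handles and runs on a CUDA stream.

```c
#include <assert.h>
#include <stddef.h>

/* Largest finite magnitude representable in FP8 E4M3. */
#define FP8_E4M3_MAX 448.0f

/* Hypothetical helper: round x to the nearest E4M3-representable value
   (ties to even), saturating at +/-448. The result is kept in a float
   so it is easy to inspect. */
static float round_to_e4m3(float x) {
    if (x != x) return x;                     /* propagate NaN */
    float a = x < 0.0f ? -x : x;
    if (a > FP8_E4M3_MAX) a = FP8_E4M3_MAX;   /* saturate */
    /* Spacing of representable values: 2^-9 for subnormals and the
       lowest normal binade, doubling with every higher binade. */
    float step = 1.0f / 512.0f;
    while (a >= 16.0f * step) step *= 2.0f;
    float q = a / step;                       /* 0 <= q < 16 */
    int k = (int)q;
    float rem = q - (float)k;
    if (rem > 0.5f || (rem == 0.5f && (k & 1))) ++k;
    float r = (float)k * step;
    return x < 0.0f ? -r : r;
}

/* Hypothetical reference for a scaled cast to FP8 (assumed semantics):
   output[i] = fp8(input[i] * scale), amax = max(amax, |input[i]|),
   scale_inv = 1 / scale. */
void fp8_quantize_ref(const float *input, size_t n, float scale,
                      float *output, float *amax, float *scale_inv) {
    assert(scale > 0.0f);
    float running = *amax;
    for (size_t i = 0; i < n; ++i) {
        float mag = input[i] < 0.0f ? -input[i] : input[i];
        if (mag > running) running = mag;
        output[i] = round_to_e4m3(input[i] * scale);
    }
    *amax = running;
    *scale_inv = 1.0f / scale;
}
```

Under these assumptions, a value whose scaled magnitude exceeds 448 saturates instead of overflowing, which is why the scaling factor is normally chosen from a previously observed amax.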
void nvte_fp8_dequantize(const NVTETensor input, const NVTETensor scale_inv, NVTETensor output, cudaStream_t stream)
Cast tensor from FP8.
Parameters
input – [in] Input tensor to be cast.
scale_inv – [in] Inverse of the input’s scaling factor.
output – [out] Output tensor.
stream – [in] CUDA stream used for the operation.
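The reverse direction can be illustrated the same way. The sketch below decodes raw E4M3 bytes (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities) and multiplies each decoded value by scale_inv. It assumes E4M3 input and float output; `e4m3_to_float` and `fp8_dequantize_ref` are hypothetical names for this example, not library functions.

```c
#include <math.h>    /* NAN macro only */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one FP8 E4M3 byte to float. */
static float e4m3_to_float(uint8_t bits) {
    int sign = bits >> 7;
    int exp  = (bits >> 3) & 0xF;
    int man  = bits & 0x7;
    float v;
    if (exp == 0xF && man == 0x7) {
        v = NAN;                              /* the only NaN encoding */
    } else if (exp == 0) {
        v = (float)man / 512.0f;              /* subnormal: man/8 * 2^-6 */
    } else {
        /* normal: (1 + man/8) * 2^(exp-7) == (8 + man) * 2^exp / 1024 */
        v = (float)(8 + man) / 1024.0f * (float)(1u << exp);
    }
    return sign ? -v : v;
}

/* Hypothetical reference for a scaled cast from FP8 (assumed semantics):
   output[i] = float(input[i]) * scale_inv. */
void fp8_dequantize_ref(const uint8_t *input, size_t n, float scale_inv,
                        float *output) {
    for (size_t i = 0; i < n; ++i) {
        output[i] = e4m3_to_float(input[i]) * scale_inv;
    }
}
```

For example, the byte 0x38 decodes to 1.0 and 0x7E decodes to 448, so with a scale_inv of 0.5 they would dequantize to 0.5 and 224 in this sketch.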