{
"cells": [
{
"cell_type": "markdown",
"id": "7b3e6954",
"metadata": {},
"source": [
"# Using FP8 and FP4 with Transformer Engine\n",
"\n",
"H100 GPU introduced support for a new datatype, FP8 (8-bit floating point), enabling higher throughput of matrix multiplies and convolutions. Blackwell added support for NVFP4 and MXFP8 datatypes. In this example we will introduce these low precision datatypes and show how to use them with Transformer Engine.\n",
"\n",
"## Introduction to FP8\n",
"\n",
"### Structure\n",
"\n",
"The FP8 datatype supported by H100 is actually 2 distinct datatypes, useful in different parts of the training of neural networks:\n",
"\n",
"* E4M3 - it consists of 1 sign bit, 4 exponent bits and 3 bits of mantissa. It can store values up to +/-448 and `nan`.\n",
"* E5M2 - it consists of 1 sign bit, 5 exponent bits and 2 bits of mantissa. It can store values up to +/-57344, +/- `inf` and `nan`. The tradeoff of the increased dynamic range is lower precision of the stored values.\n",
"\n",
"\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"