Release Notes: Release 0.3.0 (BETA)

Key Features and Enhancements

Added support for conditional computation of gradients: gradients are computed only where they are needed, which improves performance when Transformer weights are frozen.
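The idea behind conditional gradient computation can be illustrated with a small, framework-free sketch. It is not the library's implementation or API: the `Linear` class, its `requires_grad` flag, and the `wgrad_calls` counter are hypothetical names chosen for illustration. The sketch only shows why skipping the weight gradient saves work when a layer is frozen.

```python
# Conceptual sketch only: a hand-written linear layer, not the library's code.
class Linear:
    def __init__(self, weight, requires_grad=True):
        self.weight = weight              # list of rows, shape [out][in]
        self.requires_grad = requires_grad
        self.weight_grad = None
        self.wgrad_calls = 0              # how many times the weight gradient was computed

    def forward(self, x):
        self.x = x                        # saved for the backward pass
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weight]

    def backward(self, grad_out):
        # In this sketch the input gradient is always computed,
        # so that earlier layers could still be trained.
        grad_in = [
            sum(self.weight[o][i] * grad_out[o] for o in range(len(self.weight)))
            for i in range(len(self.x))
        ]
        # The weight gradient is computed only for trainable weights.
        if self.requires_grad:
            self.weight_grad = [[g * xi for xi in self.x] for g in grad_out]
            self.wgrad_calls += 1
        return grad_in


frozen = Linear([[1.0, 2.0], [3.0, 4.0]], requires_grad=False)
frozen.forward([1.0, 1.0])
frozen_grad_in = frozen.backward([1.0, 0.5])

trainable = Linear([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
trainable.forward([1.0, 1.0])
trainable_grad_in = trainable.backward([1.0, 0.5])
```

Both layers return the same input gradient, but only the trainable one spends time on the weight gradient. In a real Transformer layer, that skipped product is a large matrix multiplication.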

Added support for activation recomputation using the new checkpoint API.
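As background, activation recomputation trades compute for memory: the forward pass keeps only the input of a segment, and the backward pass re-runs the segment to rebuild the intermediate activations it needs. The sketch below is a scalar, framework-free illustration under that assumption. The `Checkpointed` class and its methods are hypothetical and do not represent the checkpoint API mentioned above, whose signature is not given in these notes.

```python
# Conceptual sketch only: scalar functions, hand-written derivatives.
class Checkpointed:
    """Runs a chain of steps but stores only the chain's input."""

    def __init__(self, steps):
        # steps: list of (f, df) pairs, where df is the derivative of f
        self.steps = steps
        self.forward_evals = 0            # counts every evaluation of a step

    def forward(self, x):
        self.saved_input = x              # intermediate activations are not kept
        for f, _ in self.steps:
            x = f(x)
            self.forward_evals += 1
        return x

    def backward(self, grad_out):
        # Recompute the input of every step from the saved chain input.
        acts = [self.saved_input]
        for f, _ in self.steps[:-1]:
            acts.append(f(acts[-1]))
            self.forward_evals += 1
        # Apply the chain rule, walking the steps in reverse order.
        grad = grad_out
        for (_, df), a in zip(reversed(self.steps), reversed(acts)):
            grad *= df(a)
        return grad


# y = 3 * x**2, built from two steps
segment = Checkpointed([
    (lambda x: x * x, lambda x: 2 * x),
    (lambda x: 3 * x, lambda x: 3.0),
])
y = segment.forward(2.0)
dy_dx = segment.backward(1.0)
```

The extra step evaluation counted during `backward` is the cost of recomputation; the benefit is that nothing but `saved_input` has to stay in memory between the two passes.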

Fixed Issues in This Release

Fixed a crash that occurred when training in FP8 with certain input shapes.

Fixed an issue where, in some cases, the FC2 bias was not applied in the LayerNormMLP block.

Known Issues in This Release

There are no known issues in this release.

Breaking Changes in This Release

There are no breaking changes in this release.

Deprecated Features

The checkpoint format used with FP8 has changed. The current release can still read checkpoints in the old format, but future releases will drop that capability.