Release Notes

Release 0.6.0 (BETA)

Key Features and Enhancements

Added ONNX export support for Transformer Engine modules.
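A minimal sketch of how export can be driven through the standard torch.onnx.export path; the te.onnx_export context manager usage and the example module and shapes here are illustrative assumptions, not a prescribed recipe:

    import torch
    import transformer_engine.pytorch as te

    # Assumed example module; any supported Transformer Engine module applies.
    model = te.Linear(768, 768).cuda().eval()
    inp = torch.randn(32, 768, device="cuda")

    # te.onnx_export switches Transformer Engine modules into an
    # ONNX-exportable mode for the duration of the context.
    with te.onnx_export(enabled=True):
        torch.onnx.export(model, (inp,), "te_linear.onnx")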

Added the ability to calibrate the Linear layer for FP8 from a higher-precision checkpoint.
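A minimal sketch of the intended calibration flow, assuming the calibrating flag on te.fp8_autocast; module sizes, iteration count, and checkpoint handling are illustrative:

    import torch
    import transformer_engine.pytorch as te

    model = te.Linear(1024, 1024).cuda()
    # Weights are loaded from a higher-precision (e.g. BF16/FP32) checkpoint
    # in the usual way, e.g. model.load_state_dict(...).

    # Forward passes run in high precision while FP8 scaling statistics
    # (amax history) are recorded for each layer.
    with torch.no_grad(), te.fp8_autocast(enabled=False, calibrating=True):
        for _ in range(16):
            model(torch.randn(32, 1024, device="cuda"))

    # Later runs can enable FP8 execution using the calibrated scales.
    with te.fp8_autocast(enabled=True):
        out = model(torch.randn(32, 1024, device="cuda"))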

Added a zero_centered_gamma option to LayerNorm() (and to other modules that apply LayerNorm() internally) for increased precision when the LayerNorm() parameters are stored in FP16 or BFloat16. With this option the scale parameter is stored as an offset from one, so the layer computes (1 + gamma) * x_norm + beta; since the learned scale typically stays close to one, the stored value stays near zero, where FP16 and BFloat16 are most precise.
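A minimal usage sketch; the hidden size, dtype handling, and input shapes are illustrative assumptions:

    import torch
    import transformer_engine.pytorch as te

    # gamma is stored as an offset from 1: the layer computes
    # (1 + gamma) * x_norm + beta, keeping the stored parameter near zero.
    ln = te.LayerNorm(1024, zero_centered_gamma=True).to(torch.bfloat16).cuda()
    y = ln(torch.randn(8, 1024, dtype=torch.bfloat16, device="cuda"))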

Added DotProductAttention(), which provides optimized implementations of attention, including an integration with FlashAttention.
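A minimal usage sketch; the constructor arguments shown and the [sequence, batch, heads, head_dim] input layout are assumptions based on the module's interface, and the sizes are illustrative:

    import torch
    import transformer_engine.pytorch as te

    # Assumed configuration: 16 heads with 64 channels per head.
    attn = te.DotProductAttention(num_attention_heads=16, kv_channels=64)

    # Query/key/value in [sequence, batch, heads, head_dim] layout.
    q = torch.randn(128, 4, 16, 64, dtype=torch.float16, device="cuda")
    k = torch.randn(128, 4, 16, 64, dtype=torch.float16, device="cuda")
    v = torch.randn(128, 4, 16, 64, dtype=torch.float16, device="cuda")

    # Dispatches to an optimized backend (e.g. FlashAttention) when available.
    out = attn(q, k, v)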

Optimized handling of separate QKV parameters in TransformerLayer().

Fixed Issues in This Release

Fixed an issue where gradients were not propagated properly when training with automatic mixed precision (AMP).

Fixed an issue in the custom Softmax kernel that occurred for very large inputs.

Known Issues in This Release

There are no known issues in this release.

Breaking Changes in This Release

There are no breaking changes in this release.

Deprecated Features

There are no deprecated features in this release.