.. include:: /content/common.rsts

Release Notes |ndash| Release 1.8
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Key Features and Enhancements
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- [pyTorch] Added a new argument, ``softmax_scale``, to the ``DotProductAttention`` API (a usage sketch appears at the end of these notes).
- [pyTorch] Extended Transformer Engine's pyTorch build to always compile with tensor parallelism (TP) communication overlap support, and to remove the MPI dependency. Also exposed the ``initialize_ub`` and ``destroy_ub`` APIs for configuring communication-GEMM overlap (see the sketch at the end of these notes).
- [pyTorch] Improved documentation for the ``DotProductAttention`` API, including benchmarks and end-to-end test scripts.
- [pyTorch] Incorporated the Fused Adam and Fused SGD optimizers into Transformer Engine (see the sketch at the end of these notes). Previously they had to be installed from https://github.com/NVIDIA/apex.

Fixed Issues
@@@@@@@@@@@@

- [pyTorch] Made internal changes to reduce CPU overhead.
- [pyTorch] Fixed a crash that occurred when using TorchDynamo with the ``checkpoint`` API.
- [pyTorch] Fixed an issue with loading an FP8 checkpoint when using FP8 attention.

Known Issues in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no known issues in this release.

Breaking Changes in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no breaking changes in this release.

Deprecated Features
@@@@@@@@@@@@@@@@@@@

There are no deprecated features in this release.
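
Usage Sketches
@@@@@@@@@@@@@@

The sketches below illustrate the APIs mentioned in the feature list above. They are illustrative rather than canonical: shapes, layouts, and default values shown here are assumptions, so consult the API documentation for authoritative signatures.

First, a minimal sketch of the new ``softmax_scale`` argument to ``DotProductAttention``. The tensor layout and sizes are chosen for illustration; when ``softmax_scale`` is left unset, the usual 1/sqrt(head_dim) scaling is presumed to apply.

.. code-block:: python

    import torch
    from transformer_engine.pytorch import DotProductAttention

    # Override the default scaling applied to the attention scores.
    attn = DotProductAttention(
        num_attention_heads=16,
        kv_channels=64,          # per-head hidden size
        softmax_scale=0.125,     # custom scale instead of the default
    )

    # Query/key/value in the "sbhd" layout: (seq, batch, heads, head_dim).
    q = torch.randn(128, 2, 16, 64, dtype=torch.bfloat16, device="cuda")
    k = torch.randn(128, 2, 16, 64, dtype=torch.bfloat16, device="cuda")
    v = torch.randn(128, 2, 16, 64, dtype=torch.bfloat16, device="cuda")
    out = attn(q, k, v)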
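
Next, a hedged sketch of setting up and tearing down the communication-GEMM overlap buffers with ``initialize_ub`` and ``destroy_ub``. The buffer shape and the surrounding distributed setup are assumptions here.

.. code-block:: python

    import torch
    import torch.distributed as dist
    import transformer_engine.pytorch as te

    # Assumes torch.distributed is already initialized, one process per GPU.
    seq_len, batch_size, hidden_size = 2048, 2, 4096
    tp_size = dist.get_world_size()

    # Allocate the shared buffers used for comm-GEMM overlap. The shape
    # below (tokens x hidden) is an assumption for illustration.
    te.initialize_ub(
        shape=[seq_len * batch_size, hidden_size],
        tp_size=tp_size,
    )

    # ... build and run TP-overlapped modules here (e.g. te.Linear layers
    # constructed with their userbuffers overlap options enabled) ...

    # Release the shared communication buffers when finished.
    te.destroy_ub()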
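
Finally, a short sketch of the fused optimizers now shipped with Transformer Engine. The constructor arguments shown mirror their ``torch.optim`` counterparts; treat the exact defaults as assumptions.

.. code-block:: python

    import torch
    from transformer_engine.pytorch.optimizers import FusedAdam, FusedSGD

    model = torch.nn.Linear(1024, 1024).cuda()

    # Drop-in replacements for torch.optim.Adam / torch.optim.SGD.
    optimizer = FusedAdam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
    # optimizer = FusedSGD(model.parameters(), lr=1e-2, momentum=0.9)

    loss = model(torch.randn(8, 1024, device="cuda")).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()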