.. include:: /content/common.rsts

Release Notes |ndash| Release 1.11
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Key Features and Enhancements
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- [pyTorch] Added dtensor support for optimizers.
- [pyTorch] Added a context parallel implementation that uses QKV all-to-all collectives.
- [pyTorch] Added support for CPU offloading when using FP8 attention (see the FP8 usage sketch at the end of these notes).
- [pyTorch] Implemented padding and unpadding modules for FP8 that improve the end-to-end performance of MoE models by ~2%.
- [C/pyTorch] Added support for permutation operations for MoE and exposed them in the C API.
- [pyTorch] Added support for RoPE when using FP8 attention.
- [pyTorch] Added support for FlashAttention-3.
- [JAX] Implemented context parallel fused attention using all-gather and reduce-scatter collectives.

Fixed Issues
@@@@@@@@@@@@

- [pyTorch] Fixed a crash in the fused Adam optimizer when master parameters are not set.
- [pyTorch] Fixed a crash when using activation recompute with Python 3.10.
- [pyTorch] Made miscellaneous fixes to the logic that selects the correct attention backend.

Known Issues in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no known issues in this release.

Breaking Changes in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no breaking changes in this release.

Deprecated Features
@@@@@@@@@@@@@@@@@@@

There are no deprecated features in this release.
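
FP8 Usage Example
@@@@@@@@@@@@@@@@@

Several of the items above (CPU offloading with FP8 attention, RoPE with FP8 attention, and the FP8 padding and unpadding modules) apply when modules run under FP8 autocast. The following is a minimal sketch of that execution context using the standard delayed-scaling recipe; the layer sizes and recipe arguments are illustrative placeholders, and the feature-specific options added in this release are not shown here.

.. code-block:: python

    # Minimal sketch: run a Transformer Engine module under FP8 autocast.
    # Layer sizes and recipe arguments below are illustrative placeholders.
    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common.recipe import DelayedScaling, Format

    # Delayed-scaling FP8 recipe: E4M3 in the forward pass, E5M2 in the backward pass.
    fp8_recipe = DelayedScaling(
        fp8_format=Format.HYBRID,
        amax_history_len=16,
        amax_compute_algo="max",
    )

    layer = te.Linear(1024, 1024, bias=True).cuda()
    inp = torch.randn(128, 1024, device="cuda", requires_grad=True)

    # Supported GEMMs inside this context execute in FP8.
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        out = layer(inp)

    out.sum().backward()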