.. include:: /content/common.rsts

Release Notes |ndash| Release 1.1.0
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Key Features and Enhancements
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- [pyTorch] Memory usage is reduced when using the ``fp8_model_init`` API during inference (a usage sketch appears at the end of these notes).
- [pyTorch] Memory usage is reduced when using the ``LayerNormLinear``, ``LayerNormMLP``, and ``TransformerLayer`` APIs.
- [JAX] Transformer Engine is migrated to the new Custom Partitioning mechanism for parallelism of custom ops in JAX.
- [JAX] The attention operation's performance is improved when using cuDNN version 8.9.6 or greater.
- [C/C++] Transformer Engine can now be built as a subproject.

Fixed Issues
@@@@@@@@@@@@

- In some cases, passing non-contiguous tensors as Q, K, or V to ``DotProductAttention`` would result in the error "Exception: The provided qkv memory layout is not supported!"

Known Issues in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- FlashAttention v2, which is a dependency of this release of Transformer Engine, has a known issue with excessive memory usage during installation (https://github.com/Dao-AILab/flash-attention/issues/358). You can work around this issue by either of these means:

  - Setting the ``MAX_JOBS`` environment variable to ``1`` during Transformer Engine installation
  - Installing FlashAttention v1 (e.g. by ``pip install flash-attn==1.0.9``) before attempting to install Transformer Engine

- [pyTorch] FlashAttention v2.1 changed the behavior of the causal mask when performing cross-attention (see https://github.com/Dao-AILab/flash-attention#21-change-behavior-of-causal-flag for reference). For Transformer Engine to preserve consistent behavior between versions and back ends, FlashAttention is disabled for this use case (i.e. cross-attention with causal masking) when FlashAttention version 2.1+ is installed.

Breaking Changes in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- There are no breaking changes in this release.

Deprecated Features
@@@@@@@@@@@@@@@@@@@

- There are no deprecated features in this release.
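
Appendix: Usage Sketch for fp8_model_init
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

The snippet below is a minimal, illustrative sketch of the ``fp8_model_init`` API referenced in the feature list above. It assumes an FP8-capable GPU, and the specific module and recipe choices (``te.Linear``, ``DelayedScaling``) are examples only; they are not part of this release's changes.

.. code-block:: python

    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common.recipe import DelayedScaling

    # Build the module inside fp8_model_init so its parameters are stored
    # directly in FP8, avoiding the higher-precision parameter copies that
    # would otherwise be kept around (the memory reduction noted above).
    with te.fp8_model_init(enabled=True):
        linear = te.Linear(1024, 1024, bias=True)

    # Run inference under fp8_autocast as usual.
    inp = torch.randn(16, 1024, device="cuda")
    with torch.no_grad(), te.fp8_autocast(enabled=True, fp8_recipe=DelayedScaling()):
        out = linear(inp)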