.. include:: /content/common.rsts

Release Notes |ndash| Release 1.10
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Key Features and Enhancements
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- [pyTorch] Added an option to use keyword arguments with CUDA graphs.
- [pyTorch] Implemented a new load-balanced offloading algorithm that makes maximum use of the CPU/GPU interconnect bandwidth.
- [pyTorch] Added support for multi-latent attention.
- [pyTorch] Added additional documentation, scripts, and benchmarks for the attention backend.
- [pyTorch] Added a context-parallel implementation with KV allgather for causal attention.
- [pyTorch] Added support for data type casting in the fused Adam kernel.
- [pyTorch] Added arguments for cumulative and maximum sequence lengths to the ``TransformerLayer`` and ``MultiheadAttention`` APIs.
- [pyTorch] Added support for the padding mask in the unfused backend for dot product attention.
- [pyTorch] Expanded operation support in the fusion API (``transformer_engine.pytorch.ops``); a usage sketch appears at the end of these notes.
- [PaddlePaddle] Added an option to run dot product attention deterministically.
- [JAX] Added support for non-deterministic algorithms in the cuDNN flash attention backend for improved performance.
- [pyTorch] Made several improvements to reduce the amount of CPU overhead during execution.

Fixed Issues
@@@@@@@@@@@@

- [pyTorch] Fixed miscellaneous bugs in communication-GEMM overlap with Userbuffers.
- [pyTorch] Removed an additional copy of weights stored when using CPU offloading.
- [pyTorch] Fixed a crash when running non-causal training with context parallelism.
- [pyTorch] Fixed the calculation of tensor-parallel size when using MQA/GQA.
- [pyTorch] Fixed a crash when using context parallelism with the THD format.
- [pyTorch] Fixed a crash in CUDA graphs when skipping warm-up iterations.
- [pyTorch] Fixed a bug in ``TransformerLayer`` for the cross-attention case where arguments were incorrectly propagated to ``DotProductAttention``.
- [C] Hid arbitrary symbols exposed globally in the shared object in order to avoid symbol conflict errors, which could cause a crash during library loading and imports.

Known Issues in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no known issues in this release.

Breaking Changes in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no breaking changes in this release.

Deprecated Features
@@@@@@@@@@@@@@@@@@@

There are no deprecated features in this release.
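
Example: Operation Fusion API
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

The following is a minimal sketch of using the fusion API mentioned above, not taken from these notes. It assumes that ``transformer_engine.pytorch.ops`` exposes ``Sequential``, ``Linear``, and ``GELU`` modules with PyTorch-style constructor arguments and that they default to GPU initialization; sizes and shapes are illustrative only.

.. code-block:: python

    # Sketch: build an MLP from fusible ops. Assumes te_ops provides
    # Sequential, Linear, and GELU, and that modules are placed on the
    # GPU by default (illustrative assumptions, not confirmed here).
    import torch
    import transformer_engine.pytorch.ops as te_ops

    # Sequential composes fusible ops; adjacent operations (e.g.
    # GEMM + bias + activation) may be fused when a fused kernel exists.
    mlp = te_ops.Sequential(
        te_ops.Linear(1024, 4096),
        te_ops.GELU(),
        te_ops.Linear(4096, 1024),
    )

    x = torch.randn(32, 1024, device="cuda")
    y = mlp(x)

In this sketch, ``Sequential`` plays the role of ``torch.nn.Sequential`` but is built from fusible operations, which is what allows the library to combine neighboring ops into fewer kernels.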