.. include:: /content/common.rsts

Release Notes |ndash| Release 1.12
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Key Features and Enhancements
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- [pyTorch] Added a ``rotary_base`` argument for RoPE instead of hard-coding the value to 10000.
- [pyTorch] Added support for the ``pool`` argument in the ``make_graphed_callables`` API.
- [pyTorch] Made miscellaneous minor improvements to mitigate CPU overhead.
- [pyTorch] Expanded fused RoPE kernel support to include context parallelism and the ``thd`` qkv-format.
- [pyTorch] Made ``flash-attn`` an optional dependency.
- [JAX] Added support for sliding window attention.

Fixed Issues
@@@@@@@@@@@@

- [pyTorch/C] Fixed the window size calculation when using the cuDNN attention backend.
- [pyTorch] Fixed miscellaneous bugs in the ``flash-attn`` version 3 backend.
- [pyTorch] Fixed an issue when using the ``flash-attn`` backend with context parallelism.
- [pyTorch] Fixed a numerical error when using FP8 with activation recompute.
- [pyTorch] Fixed an issue in the backward pass of the ``GroupedLinear`` class when weights don't require gradients.
- [JAX] Fixed a numerical bug in the cuDNN attention backend when using context parallelism.

Known Issues in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no known issues in this release.

Breaking Changes in This Release
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

There are no breaking changes in this release.

Deprecated Features
@@@@@@@@@@@@@@@@@@@

There are no deprecated features in this release.