Release Notes – Release 2.11¶
Key Features and Enhancements¶
[PyTorch] Enabled the reference Current Scaling recipe for FP8 training. (#2368)
[PyTorch] Improved Random Hadamard Transform (RHT) device tensor caching to reduce memory allocations and improve performance for NVFP4 quantization. (#2395)
[PyTorch] Implemented selective activation checkpointing for the LayerNormMLP module. (#2311)
[C, PyTorch, JAX] Improved performance of MXFP8 quantization. (#2062)
[C, PyTorch] Improved performance of NVFP4 quantization. (#2351)
[PyTorch] Improved FSDP2 all-gather performance and added support for the FusedAdam optimizer with FSDP2. (#2370)
[PyTorch] Extended debug tools to support GroupedLinear layers. (#1953)
[JAX] Added Triton kernel bindings for JAX, enabling custom Triton kernels in JAX workflows. (#2437)
[C] Introduced the experimental NVTEGroupedTensor class and helper functions. (#2388)
[C, PyTorch, JAX] Added FP8 support for primary weights in MXFP8 format with partial casting and amax calculations. (#2055)
[JAX] Added support for context parallelism (CP) with the THD format and sliding window attention (SWA), using all-gather (AG) and striped load balancing with a stripe size greater than 1. (#2379)
[JAX] Implemented JAX primitives for token permutation operations on single GPU for mixture-of-experts routing. (#2473)
[PyTorch] Added THD format support for max_logit clipping and MuonClip gradient clipping operations. (#2480)
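The Current Scaling recipe mentioned above computes the FP8 quantization scale from the amax of the tensor being cast at that moment, instead of from a running history of past amaxes as in delayed scaling. A minimal plain-Python sketch of that idea (illustrative only, not the Transformer Engine implementation):

```python
# Illustrative sketch of per-tensor "current scaling" for FP8 casts.
# The scale is derived from the amax of the tensor being quantized
# *right now*, rather than from a history of earlier amaxes.
FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in E4M3

def current_scale(values):
    """Scale that maps the tensor's current amax onto the FP8 dynamic range."""
    amax = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale is valid.
    return FP8_E4M3_MAX / amax if amax > 0.0 else 1.0
```

In actual FP8 training the scale is applied before casting to the FP8 format and divided back out after the GEMM; the sketch only shows how the scale itself is chosen.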
Fixed Issues¶
[PyTorch] Fixed a numerical issue when a noncontiguous tensor was passed to the cross_entropy backward pass. (#2402)
[PyTorch] Fixed CUDA graph execution order for backward weight gradient computation when using chunked layers. (#2376)
[C] Fixed runtime library loading logic to properly handle missing dependencies and load order. (#2297)
[JAX] Removed the scan loop as the default for ring attention to improve performance. (#2503)
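For context on the noncontiguous-tensor fix above: a tensor is contiguous when its strides match the row-major layout implied by its shape, and a kernel that assumes that layout can read wrong values from, say, a transposed view. A plain-Python sketch of the contiguity check (illustrative only, not PyTorch or Transformer Engine code):

```python
# Illustrative sketch: row-major contiguity in terms of shapes and strides.
def row_major_strides(shape):
    """Strides (in elements) of a contiguous row-major tensor of `shape`."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides

def is_contiguous(shape, strides):
    """True when the strides match the contiguous row-major layout."""
    return strides == row_major_strides(shape)
```

For example, a 2x3 tensor with strides [3, 1] is contiguous, while the transposed view of a 2x3 tensor (shape [3, 2], strides [1, 3]) is not; feeding the latter to a kernel that assumes row-major layout is the class of bug fixed in #2402.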
Known Issues in This Release¶
There are no known issues in this release.
Breaking Changes in This Release¶
There are no breaking changes in this release.
Deprecated Features¶
There are no deprecated features in this release.