Release Notes – Release 0.4.0 (BETA)
Key Features and Enhancements
Added the nvte_multi_cast_transpose() C function for handling multiple casts at the same time.
Moved softmax kernels to the framework-agnostic C API layer.
Added a performance optimization tutorial.
Fixed Issues in This Release
Fixed a crash occurring for some inputs in the LayerNorm() backward call.
Known Issues in This Release
There are no known issues in this release.
Breaking Changes in This Release
The C API is reworked to be more flexible when handling scaling parameters.
LayerNorm module parameter names are changed to weight and bias to match PyTorch.
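Because of the LayerNorm rename, checkpoints saved with an earlier release may carry parameter keys that no longer match. The following is a minimal sketch of one way to migrate such a state dict before loading it. These notes do not state the previous parameter names, so the old names `layer_norm_weight` and `layer_norm_bias` in the mapping are assumptions, and the helper `rename_layernorm_keys` is hypothetical (not part of the library). Check the keys in your own checkpoint and adjust the mapping accordingly.

```python
from collections import OrderedDict

# ASSUMPTION: the old parameter names below are placeholders for illustration.
# Inspect your checkpoint's keys and replace them with the names it really uses.
ASSUMED_RENAMES = {"layer_norm_weight": "weight", "layer_norm_bias": "bias"}


def rename_layernorm_keys(state_dict, layernorm_prefixes, renames=ASSUMED_RENAMES):
    """Return a copy of state_dict with LayerNorm parameter keys renamed.

    layernorm_prefixes: set of module paths (e.g. {"encoder.ln"}) that are
    standalone LayerNorm modules. Use "" for a top-level LayerNorm.
    Keys under any other prefix are copied unchanged.
    """
    migrated = OrderedDict()
    for key, value in state_dict.items():
        # Split "encoder.ln.layer_norm_weight" into module path and leaf name.
        prefix, _, leaf = key.rpartition(".")
        if prefix in layernorm_prefixes and leaf in renames:
            new_leaf = renames[leaf]
            key = f"{prefix}.{new_leaf}" if prefix else new_leaf
        migrated[key] = value
    return migrated


# Example with plain values standing in for tensors.
old_state = {"ln.layer_norm_weight": 1, "ln.layer_norm_bias": 2, "fc.weight": 3}
new_state = rename_layernorm_keys(old_state, {"ln"})
print(list(new_state))
```

Restricting the rename to an explicit set of module prefixes is a deliberate choice: it leaves untouched any other modules that happen to use similarly named parameters.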
Deprecated Features
There are no deprecated features in this release.