Release Notes – Release 0.4.0 (BETA)
Key Features and Enhancements
Added a new nvte_multi_cast_transpose() C function for handling multiple casts at the same time.
Moved softmax kernels to the framework-agnostic C API layer.
Added a performance optimization tutorial.
Fixed Issues in This Release
Fixed a crash that occurred for some inputs in the LayerNorm() backward call.
Known Issues in This Release
There are no known issues in this release.
Breaking Changes in This Release
The C API was reworked to be more flexible when handling scaling parameters.
The LayerNorm module parameter names were changed to weight and bias to match PyTorch.
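Checkpoints saved before the LayerNorm rename may need their keys updated before they load into the new module. The following is a minimal sketch of one way to do that on a plain dictionary of parameters. The old names in OLD_TO_NEW (layer_norm_weight, layer_norm_bias) and the helper rename_layernorm_keys are assumptions made for illustration, not names confirmed by these notes; inspect your own checkpoint for the actual keys.

```python
# ASSUMPTION: the old parameter names below are hypothetical placeholders.
# Replace them with the keys actually present in your checkpoint.
OLD_TO_NEW = {"layer_norm_weight": "weight", "layer_norm_bias": "bias"}


def rename_layernorm_keys(state_dict, module_prefixes, old_to_new=OLD_TO_NEW):
    """Return a copy of state_dict with LayerNorm parameter keys renamed.

    Only keys whose dotted module path is listed in module_prefixes are
    touched, so identically named parameters in other modules are left alone.
    """
    renamed = {}
    for key, value in state_dict.items():
        # Split "encoder.ln.layer_norm_weight" into its module path and leaf name.
        prefix, _, leaf = key.rpartition(".")
        if prefix in module_prefixes and leaf in old_to_new:
            new_leaf = old_to_new[leaf]
            key = f"{prefix}.{new_leaf}" if prefix else new_leaf
        renamed[key] = value
    return renamed
```

A caller would pass the dotted paths of its LayerNorm modules, for example rename_layernorm_keys(checkpoint, {"encoder.ln"}), and then load the returned dictionary as usual.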
Deprecated Features
There are no deprecated features in this release.