Release Notes

cuBLASMp v0.5.0

Released: June 16, 2025

Breaking changes

  • cuBLASMp now uses NCCL directly instead of the Communication Abstraction Library (libcal). Applications must update their cuBLASMp initialization code accordingly.

New features

cuBLASMp v0.4.0

Released: March 10, 2025

cuBLASMp v0.3.1

Released: December 10, 2024

  • Add option to set the number of SMs used for communication (currently relevant only for Atomic GEMM + ReduceScatter).

  • Decrease workspace size requirement in TP overlap GEMMs.

  • Remove extra synchronization in TP overlap GEMMs.

  • Allow C matrix to be null when beta is 0.

  • Fix the GEMM implementation for complex types when transA or transB is CUBLAS_OP_T.
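
The last two items concern GEMM semantics, which can be illustrated with a small single-process reference in Python. This is only a sketch of the mathematical operation alpha * op(A) * op(B) + beta * C, not cuBLASMp's API or its distributed implementation. The names `apply_op` and `reference_gemm` are hypothetical, and the strings "N", "T", and "C" stand in for the cuBLAS operations CUBLAS_OP_N, CUBLAS_OP_T, and CUBLAS_OP_C. It shows that C is never read when beta is 0, and that for complex types a plain transpose differs from a conjugate transpose.

```python
def apply_op(m, op):
    """Return op(m), where op is "N" (none), "T" (transpose) or "C" (conjugate transpose)."""
    if op == "N":
        return [list(row) for row in m]
    if op == "T":
        return [list(col) for col in zip(*m)]
    if op == "C":
        return [[complex(x).conjugate() for x in col] for col in zip(*m)]
    raise ValueError(f"unknown op: {op!r}")


def reference_gemm(alpha, a, b, beta, c=None, trans_a="N", trans_b="N"):
    """Compute alpha * op(A) @ op(B) + beta * C on nested lists (hypothetical helper)."""
    op_a = apply_op(a, trans_a)
    op_b = apply_op(b, trans_b)
    if len(op_a[0]) != len(op_b):
        raise ValueError("inner dimensions do not match")
    if c is None and beta != 0:
        raise ValueError("C may only be omitted when beta is 0")
    rows, cols, inner = len(op_a), len(op_b[0]), len(op_b)
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            acc = sum(op_a[i][k] * op_b[k][j] for k in range(inner))
            value = alpha * acc
            if beta != 0:
                # C is only read when it actually contributes to the result.
                value += beta * c[i][j]
            out_row.append(value)
        out.append(out_row)
    return out


if __name__ == "__main__":
    a = [[1 + 1j, 2], [0, 1j]]
    identity = [[1, 0], [0, 1]]
    # beta == 0, so no C matrix is passed at all.
    print(reference_gemm(1, a, identity, 0, trans_a="T"))
    print(reference_gemm(1, a, identity, 0, trans_a="C"))
```

For real-valued inputs the "T" and "C" results coincide, so the two operations can only be told apart with complex data.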

cuBLASMp v0.3.0

Released: November 4, 2024

Breaking changes

cuBLASMp v0.2.1

Released: May 29, 2024

  • Added mixed and lower precision support.

  • Bug fixes.

cuBLASMp v0.2.0

Released: April 4, 2024

cuBLASMp v0.1.2

Released: February 22, 2024

cuBLASMp v0.1.1

Released: January 11, 2024

cuBLASMp v0.1.0

Released: December 11, 2023

  • Early access release.

  • This release focuses on functionality.