NVIDIA Deep Learning NCCL Documentation
Release Notes (PDF) - v2.21.5 - Last updated April 5, 2024

NCCL Release 2.11.4

These are the NCCL 2.11.4 release notes. For previous NCCL release notes, refer to the NCCL Archives.

Compatibility

NCCL 2.11.4 has been tested with the following:

Key Features and Enhancements

This NCCL release includes the following key features and enhancements.

  • Added a new API for creating a reduction operation that multiplies the input by a rank-specific scalar before doing an inter-rank summation (see ncclRedOpCreatePreMulSum).

  • Improved CollNet (SHARP) performance of ncclAllReduce when captured in a CUDA Graph via user buffer registration.

  • Added the environment variable NCCL_NET_PLUGIN="<suffix>", which lets the user choose among multiple NCCL net plugins by substituting the suffix into libnccl-net-<suffix>.so.
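
The suffix substitution described for NCCL_NET_PLUGIN can be illustrated with a small C sketch. It is not NCCL's loader code: the helper names net_plugin_library_name and net_plugin_library_name_from_env are hypothetical, and the fallback to libnccl-net.so when no suffix is set is an assumption made for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build the plugin library file name for a given suffix.
 * Returns 0 on success, -1 if the output buffer is too small. */
int net_plugin_library_name(const char *suffix, char *out, size_t out_len)
{
    int n;
    if (suffix != NULL && suffix[0] != '\0') {
        /* Substitute the suffix into libnccl-net-<suffix>.so. */
        n = snprintf(out, out_len, "libnccl-net-%s.so", suffix);
    } else {
        /* Assumption for this sketch: with no suffix, use the unsuffixed name. */
        n = snprintf(out, out_len, "libnccl-net.so");
    }
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Hypothetical helper: read the suffix from the NCCL_NET_PLUGIN environment variable. */
int net_plugin_library_name_from_env(char *out, size_t out_len)
{
    return net_plugin_library_name(getenv("NCCL_NET_PLUGIN"), out, out_len);
}
```

With NCCL_NET_PLUGIN=example in the environment (a made-up suffix), net_plugin_library_name_from_env would produce libnccl-net-example.so.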

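The pre-multiplied sum listed above (ncclRedOpCreatePreMulSum) can be read as out[i] = sum over ranks r of scalar[r] * in[r][i]. The following host-side C sketch spells out that arithmetic only. It does not call NCCL, the function name premul_sum_reference and the flat rank-major input layout are assumptions for illustration, and the summation order inside a real collective may differ.

```c
#include <stddef.h>

/* Hypothetical reference for the pre-multiplied sum semantics.
 * inputs  : nranks * count values, rank-major (rank r starts at inputs[r * count])
 * scalars : one scalar per rank
 * out     : count values, out[i] = sum_r scalars[r] * inputs[r * count + i] */
void premul_sum_reference(const float *inputs, const float *scalars,
                          size_t nranks, size_t count, float *out)
{
    for (size_t i = 0; i < count; ++i) {
        float acc = 0.0f;
        for (size_t r = 0; r < nranks; ++r) {
            /* Scale this rank's contribution first, then accumulate across ranks. */
            acc += scalars[r] * inputs[r * count + i];
        }
        out[i] = acc;
    }
}
```

In NCCL itself, the operator handle produced by ncclRedOpCreatePreMulSum would be passed as the reduction operation of a collective such as ncclAllReduce, with each rank supplying its own scalar. A common use is averaging, where every rank passes 1/nranks as its scalar.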
Fixed Issues

The following issues have been resolved in NCCL 2.11.4:

  • Fixed memory leak of NVB connections.

  • Fixed crash of ncclGroup() containing mixed datatypes/operations (GitHub issue #560, introduced in NCCL 2.10.3).

  • Fixed topology detection of IB Virtual Functions (SR-IOV).