NVIDIA Deep Learning NCCL Documentation
Release Notes (PDF) - v2.20.5 - Last updated March 6, 2024

NCCL Release 2.4.7

These are the release notes for NCCL 2.4.7. This release includes the fixes from previous NCCL 2.4.x releases as well as the following additional changes. For previous NCCL release notes, see the archived NCCL Release Notes.

Key Features and Enhancements

This NCCL release includes the following key features and enhancements.
  • Improved bootstrap socket connection reliability at scale.
  • Added detection of IBM/Power NVLink bridge device.
  • Added NUMA support to PCI-E distance calculations on x86 architectures.
    Note: This adds a new level (5) for the NCCL_P2P_LEVEL and NCCL_NET_GDR_LEVEL environment variables. See the NCCL documentation for more details.
  • Added the NCCL_IGNORE_CPU_AFFINITY environment variable.
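
As an illustration of how the variables named above might be set for a job, the following is a minimal Python sketch, not part of NCCL itself. The helper name `nccl_level_env` is hypothetical. It assumes that the levels are integers starting at 0 with 5 as the highest (only level 5 is confirmed by these notes) and that NCCL_IGNORE_CPU_AFFINITY is enabled with the value 1. Check the NCCL documentation for the meaning of each level before relying on it.

```python
import os

# Highest level mentioned in these release notes. The lower bound of 0
# is an assumption; see the NCCL documentation for the exact range.
MAX_LEVEL = 5


def nccl_level_env(p2p_level=None, net_gdr_level=None,
                   ignore_cpu_affinity=False, base=None):
    """Return a copy of `base` (default: os.environ) with NCCL variables set.

    Hypothetical helper for illustration only; NCCL reads these variables
    from the process environment when it initializes.
    """
    env = dict(os.environ if base is None else base)
    requested = (
        ("NCCL_P2P_LEVEL", p2p_level),
        ("NCCL_NET_GDR_LEVEL", net_gdr_level),
    )
    for name, level in requested:
        if level is None:
            continue  # leave the variable untouched
        if not isinstance(level, int) or not 0 <= level <= MAX_LEVEL:
            raise ValueError(f"{name} must be an integer in 0..{MAX_LEVEL}")
        env[name] = str(level)
    if ignore_cpu_affinity:
        # Assumption: "1" enables the option.
        env["NCCL_IGNORE_CPU_AFFINITY"] = "1"
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the job, which keeps the settings scoped to that job rather than the whole shell session.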

Compatibility

NCCL 2.4.7 has been tested with the following:

Fixed Issues

The following issues have been resolved in NCCL 2.4.7:
  • Fixed hostname hashing issue. (GitHub issue #187)
  • Fixed memory leaks. (GitHub issue #180)
  • Fixed compiler warning. (GitHub issue #178)
  • Replaced non-standard variable length arrays. (GitHub issue #171)
  • Fixed Tree and Shared Memory crash. (GitHub PR #185)
  • Fixed hangs during long-running jobs.
  • Fixed the NCCL_RINGS environment variable handling.
  • Added extra checks to catch duplicate calls to ncclCommDestroy(). (GitHub issue #191)

Known Issues

  • On single-node Power systems with 4 GPUs, some performance regressions have been observed compared to NCCL 2.4.2. These will be addressed in future NCCL releases.
  • By default, NCCL does not enable direct P2P communication across different PCIe root ports on Intel Skylake and later CPUs, because of a known P2P performance issue on these CPUs. Intel and its OEM vendors now provide a BIOS performance tuning option (PCIe Peer-to-Peer Serialization) that resolves this P2P bandwidth issue. If that BIOS option has been enabled, NCCL direct P2P connections can be re-enabled by setting NCCL_P2P_LEVEL=5.
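
The re-enable step in the last item can be sketched as a small Python launcher. This is an illustrative sketch only: the function name `run_with_p2p` and the `bios_p2p_serialization_enabled` flag are hypothetical, and the flag must be supplied by the operator because the script does not inspect the BIOS setting itself.

```python
import os
import subprocess


def run_with_p2p(cmd, bios_p2p_serialization_enabled):
    """Run `cmd`, setting NCCL_P2P_LEVEL=5 only when the operator confirms
    that the PCIe Peer-to-Peer Serialization BIOS option is enabled.

    Hypothetical helper for illustration; `cmd` is an argument list such
    as ["python", "train.py"].
    """
    env = dict(os.environ)  # copy, so the parent environment is unchanged
    if bios_p2p_serialization_enabled:
        env["NCCL_P2P_LEVEL"] = "5"
    # check=True raises CalledProcessError if the job exits non-zero.
    return subprocess.run(cmd, env=env, check=True,
                          capture_output=True, text=True)
```

Leaving the variable unset when the flag is false keeps the default NCCL behavior described above, so the override is applied only on systems where the BIOS option is known to be on.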