NCCL Release 2.4.7
These are the NCCL 2.4.7 release notes. This release includes fixes from the previous NCCL 2.4.x releases as well as the following additional changes. For previous NCCL release notes, see the archived NCCL Release Notes.
Key Features and Enhancements
This NCCL release includes the following key features and enhancements.
- Improved bootstrap socket connection reliability at scale.
- Added detection of IBM/Power NVLink bridge device.
- Added NUMA support to PCI-E distance calculations on x86 architectures.
Note: This adds a new level (5) for the NCCL_P2P_LEVEL and NCCL_NET_GDR_LEVEL environment variables. See the NCCL documentation for more details.
- Added the NCCL_IGNORE_CPU_AFFINITY environment variable.
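The environment variables named above are read from the process environment, so a launcher script can prepare them before starting a job. The following is a minimal sketch, not NCCL's own tooling: the helper name nccl_env is hypothetical, the accepted range of 0 through 5 for the two level variables is an assumption based on the note above (check the NCCL documentation for the meaning of each level), and the value "1" for NCCL_IGNORE_CPU_AFFINITY is likewise assumed to be the enabling value.

```python
import os

# Assumption: level variables take integers 0..5, where 5 is the level
# added in this release. Consult the NCCL documentation to confirm.
MAX_LEVEL = 5


def nccl_env(p2p_level=None, net_gdr_level=None,
             ignore_cpu_affinity=False, base=None):
    """Return a copy of `base` (default: os.environ) with NCCL settings applied.

    Hypothetical helper for illustration; it only builds a dictionary and
    does not call into NCCL.
    """
    env = dict(os.environ if base is None else base)
    levels = (("NCCL_P2P_LEVEL", p2p_level),
              ("NCCL_NET_GDR_LEVEL", net_gdr_level))
    for name, level in levels:
        if level is None:
            continue  # leave the variable untouched
        if not isinstance(level, int) or not 0 <= level <= MAX_LEVEL:
            raise ValueError(f"{name} must be an integer in 0..{MAX_LEVEL}")
        env[name] = str(level)
    if ignore_cpu_affinity:
        # Assumption: "1" is the value that enables this option.
        env["NCCL_IGNORE_CPU_AFFINITY"] = "1"
    return env


# Example: build an environment from an empty base so the result is explicit.
example = nccl_env(p2p_level=5, ignore_cpu_affinity=True, base={})
print(example)
```

The returned dictionary can be passed as the env argument when spawning the training process, which keeps the settings scoped to that job instead of the whole shell session.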
Compatibility
NCCL 2.4.7 has been tested with the following:
- Deep learning framework 19.04 containers
- This NCCL release supports CUDA 9.0, CUDA 9.2, CUDA 10.0, and CUDA 10.1.
Fixed Issues
The following issues have been resolved in NCCL 2.4.7:
- Fixed hostname hashing issue. (GitHub issue #187)
- Fixed memory leaks. (GitHub issue #180)
- Fixed compiler warning. (GitHub issue #178)
- Replaced non-standard variable length arrays. (GitHub issue #171)
- Fixed Tree and Shared Memory crash. (GitHub PR #185)
- Fixed hangs during long-running jobs.
- Fixed the NCCL_RINGS environment variable handling.
- Added extra checks to catch duplicate calls to ncclCommDestroy(). (GitHub issue #191)
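The last item concerns checks inside NCCL itself; application code can also avoid issuing a duplicate ncclCommDestroy() call in the first place. The sketch below illustrates one common pattern, an idempotent handle wrapper. It is not part of NCCL: the class name CommHandle and the destroy_fn parameter are hypothetical, and destroy_fn stands in for whatever binding actually calls ncclCommDestroy() in a given application.

```python
class CommHandle:
    """Hypothetical wrapper that makes destroying a communicator idempotent."""

    def __init__(self, raw_comm, destroy_fn):
        self._raw = raw_comm          # opaque communicator object
        self._destroy_fn = destroy_fn  # stand-in for a ncclCommDestroy() binding

    def destroy(self):
        """Destroy the communicator once; return False on repeated calls."""
        if self._raw is None:
            return False  # already destroyed, do not call into the library
        raw, self._raw = self._raw, None  # clear first so a retry cannot repeat
        self._destroy_fn(raw)
        return True


# Demonstration with a stub that records calls instead of touching NCCL.
calls = []
handle = CommHandle("comm0", calls.append)
first = handle.destroy()
second = handle.destroy()
print(first, second, calls)
```

Clearing the stored handle before invoking the destroy function means that a second call is a no-op even if the first one raised partway through.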
Known Issues
- On single-node Power systems with 4 GPUs, some performance regressions have been observed compared to NCCL 2.4.2. These will be addressed in future NCCL releases.
- By default, NCCL does not enable direct P2P communication through different PCIe root ports on Intel Skylake and later CPUs. This is due to a known performance issue when using P2P on these CPU versions. A new BIOS performance tuning option (PCIe Peer-to-Peer Serialization) is now available from Intel and their OEM vendors that resolves this P2P bandwidth issue. If the BIOS performance tuning option has been enabled, then NCCL direct P2P connections can be re-enabled by setting NCCL_P2P_LEVEL=5.
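As an illustration of the last step, the sketch below launches a child process with NCCL_P2P_LEVEL=5 in its environment. It assumes the BIOS option described above has already been enabled. The function name run_with_p2p_level is hypothetical, and the child command is only a stand-in that echoes the variable; a real job would run the training or benchmark command instead.

```python
import os
import subprocess
import sys


def run_with_p2p_level(cmd, level=5):
    """Run `cmd` with NCCL_P2P_LEVEL set for that process only.

    Hypothetical helper; the parent environment is left unchanged.
    """
    env = dict(os.environ)
    env["NCCL_P2P_LEVEL"] = str(level)
    return subprocess.run(cmd, env=env, capture_output=True,
                          text=True, check=True)


# Stand-in child process that prints the value it received.
child = [sys.executable, "-c",
         "import os; print(os.environ['NCCL_P2P_LEVEL'])"]
result = run_with_p2p_level(child)
print(result.stdout.strip())  # prints 5
```

Setting the variable per process, instead of exporting it globally, makes it easier to compare runs with and without direct P2P on the same machine.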