Release Notes :: NVIDIA Deep Learning NCCL Documentation

These are the release notes for NCCL 2.14.3. For previous NCCL release notes, refer to the NCCL Archives.

Compatibility

NCCL 2.14.3 has been tested with the following:

Deep learning framework containers. Refer to the Support Matrix for the supported container version.
This NCCL release supports CUDA 10.2, CUDA 11.0, and CUDA 11.7.

Key Features and Enhancements

This NCCL release includes the following key features and enhancements.

Add support for improved fault tolerance: a non-blocking mode, a new init function that takes a configuration (ncclCommInitRankConfig), and a new finalize function (ncclCommFinalize).
Reintroduce the collnet+chain algorithm.
Add the LL protocol for intra-node send/recv communication and for inter-node send/recv communication (the latter is disabled by default).
Use the network instead of shared memory for communication within a node when that would perform better.

Fixed Issues

The following issues have been resolved in NCCL 2.14.3:

Wait for CUDA graph destruction before freeing communicator.
Remove aggressive polling during enqueue.
Fix DMABUF fallback on MOFED 5.4 and earlier.
Fix NCCL_DEBUG_FILE functionality.
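
As an illustration of the NCCL_DEBUG_FILE setting mentioned above, the following minimal sketch builds an environment that sends NCCL debug output to one file per process. It assumes the documented %h (hostname) and %p (process ID) placeholders, which NCCL expands itself. The log directory, the file name pattern, and the helpers nccl_debug_env and preview_log_name are hypothetical names chosen for this example and are not part of NCCL.

```python
import os
import socket


def nccl_debug_env(log_dir, level="INFO"):
    """Return a copy of the current environment with NCCL file logging enabled."""
    env = dict(os.environ)
    env["NCCL_DEBUG"] = level
    # %h and %p are left as-is here: NCCL replaces them with the hostname
    # and the process ID, so each process writes to its own file.
    env["NCCL_DEBUG_FILE"] = os.path.join(log_dir, "nccl.%h.%p.log")
    return env


def preview_log_name(pattern, pid):
    """Hypothetical helper: approximate the file name NCCL would produce."""
    return pattern.replace("%h", socket.gethostname()).replace("%p", str(pid))


# Assumed log directory; it is expected to exist before the job starts.
env = nccl_debug_env("/tmp/nccl-logs")
print(env["NCCL_DEBUG_FILE"])
print(preview_log_name(env["NCCL_DEBUG_FILE"], os.getpid()))
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching an NCCL application, so that the setting applies only to that job rather than to the whole shell session.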

Updating the GPG Repository Key

To best ensure the security and reliability of our RPM and Debian package repositories, NVIDIA is updating and rotating the signing keys used by apt, dnf/yum, and zypper package managers beginning on April 27, 2022. Failure to update your repository signing keys will result in package management errors when attempting to access or install NCCL packages. To ensure continued access to the latest NCCL release, please follow the updated NCCL installation guide.
