NVIDIA Deep Learning NCCL Documentation
Release Notes (PDF) - v2.20.5 - Last updated March 6, 2024

NCCL Release 2.8.4

These are the release notes for NCCL 2.8.4. For release notes from earlier versions, refer to the NCCL Archives.

Compatibility

NCCL 2.8.4 has been tested with the following:

Key Features and Enhancements

This NCCL release includes the following key features and enhancements.
  • Added support for Zhaoxin CPUs

Known Issues

Send/receive operations have a number of limitations:

  • Using send/receive operations when a single process launches work on multiple GPUs can fail or hang if the GPUs process different amounts of data. Setting NCCL_LAUNCH_MODE=PARALLEL can work around the issue, but it can also cause other problems. For more information, see the NCCL User Guide section Troubleshooting > Known Issues > Concurrency Between NCCL and CUDA calls.
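
One way to try the workaround without changing the shell's global environment is to set the variable only for the affected process. The sketch below is a minimal illustration, not an NVIDIA-provided tool: the helper names `nccl_env` and `run_with_launch_mode` are hypothetical, and the command passed to it stands in for whatever NCCL application you run. Only the `NCCL_LAUNCH_MODE` variable and its `PARALLEL` value come from the note above.

```python
import os
import subprocess


def nccl_env(launch_mode="PARALLEL", base=None):
    """Return a copy of the environment with NCCL_LAUNCH_MODE set.

    `base` defaults to the current process environment; the input
    mapping is copied, never modified in place.
    """
    env = dict(os.environ if base is None else base)
    env["NCCL_LAUNCH_MODE"] = launch_mode
    return env


def run_with_launch_mode(cmd, launch_mode="PARALLEL"):
    """Run `cmd` (a hypothetical NCCL application) as a child process
    with NCCL_LAUNCH_MODE applied to that child only."""
    return subprocess.run(cmd, env=nccl_env(launch_mode), check=False)
```

For example, `run_with_launch_mode(["./my_nccl_app"])` would start a (hypothetical) binary with the workaround active. Because the setting can cause other problems, scoping it to a single process makes it easy to compare behavior with and without it.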

Fixed Issues

The following issues have been resolved in NCCL 2.8.4:
  • Fixed a hang in some imbalanced send/recv operations (alltoallv).
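
To illustrate what "imbalanced" means here, the following sketch models an alltoallv exchange as a set of per-peer send/recv pairs with different element counts. It is pure Python bookkeeping and makes no NCCL calls; the function names and the example count matrix are assumptions chosen for illustration, not part of NCCL.

```python
def alltoallv_schedule(send_counts):
    """Build the send/recv pairs for an alltoallv exchange.

    send_counts[i][j] is the number of elements rank i sends to rank j.
    Returns, for each rank, a list of (peer, send_count, recv_count).
    """
    n = len(send_counts)
    if any(len(row) != n for row in send_counts):
        raise ValueError("send_counts must be a square matrix")
    return [
        [(peer, send_counts[rank][peer], send_counts[peer][rank])
         for peer in range(n)]
        for rank in range(n)
    ]


def is_imbalanced(send_counts):
    """True if ranks send different total amounts of data."""
    totals = [sum(row) for row in send_counts]
    return len(set(totals)) > 1


# Hypothetical 3-rank exchange: rank 0 sends far more than the others.
counts = [
    [0, 1000, 1000],
    [10, 0, 10],
    [10, 10, 0],
]
schedule = alltoallv_schedule(counts)
```

In this example rank 0 sends 2000 elements in total while ranks 1 and 2 send 20 each, which is the kind of uneven exchange the fix addresses.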