NCCL Release 2.8.4
These are the NCCL 2.8.4 release notes. For previous NCCL release notes, refer to the NCCL Archives.
Compatibility
Key Features and Enhancements
- Added support for Zhaoxin CPUs.
Known Issues
Send/receive operations have a number of limitations:
- Using send/receive operations while launching work on multiple GPUs from a single process can fail or hang if the GPUs process different amounts of data. Setting NCCL_LAUNCH_MODE=PARALLEL can work around the issue, but can also cause other problems. For more information, see the NCCL User Guide section Troubleshooting > Known Issues > Concurrency Between NCCL and CUDA calls.
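The workaround is applied through the process environment. The following is a minimal sketch, assuming the variable needs to be in place before NCCL initializes in the target process. The helper name `nccl_env_with_parallel_launch` and the `train.py` command are hypothetical and not part of NCCL.

```python
import os


def nccl_env_with_parallel_launch(base_env=None):
    """Return a copy of an environment with NCCL_LAUNCH_MODE=PARALLEL set.

    Hypothetical helper: it only prepares environment variables and does
    not call into NCCL itself.
    """
    # Copy so the caller's (or the current process's) environment is untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_LAUNCH_MODE"] = "PARALLEL"
    return env


# Usage (hypothetical command), passing the environment to a child process:
# import subprocess
# subprocess.run(["python", "train.py"], env=nccl_env_with_parallel_launch(), check=True)
```

Scoping the variable to the one job that needs it, rather than exporting it globally, limits exposure to the other problems the setting can cause.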
Fixed Issues
- Fixed a hang in some imbalanced send/recv operations (alltoallv).