NCCL Release 2.5.6
These are the release notes for NCCL 2.5.6. For previous NCCL release notes, refer to the NCCL Archives.
Key Features and Enhancements
- Added a new protocol to improve performance on small operations.
- Improved topology detection and tree/ring creation (#179, #262).
- Improved multi-node tree performance by sending and receiving from different GPUs.
- Added model-based tuning to switch between the different algorithms and protocols.
- Added detection of duplicate CUDA devices; NCCL now returns an error in that case (#231).
- Added tuning for Google Cloud's gVNIC platform.
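The duplicate-device check listed above means that a device list naming the same GPU twice is now rejected. An application can catch the same mistake earlier with a pre-flight check of its own. The following is a minimal sketch of that idea, not NCCL's internal implementation: the helper names `find_duplicate_devices` and `check_device_list` are hypothetical, and the input is assumed to be a list of CUDA device ordinals, one per rank.

```python
from typing import Dict, List, Sequence


def find_duplicate_devices(devices: Sequence[int]) -> Dict[int, List[int]]:
    """Map each repeated device ordinal to the rank positions that use it.

    Hypothetical helper for illustration; not part of the NCCL API.
    """
    positions: Dict[int, List[int]] = {}
    for rank, dev in enumerate(devices):
        positions.setdefault(dev, []).append(rank)
    # Keep only the ordinals that more than one rank refers to.
    return {dev: ranks for dev, ranks in positions.items() if len(ranks) > 1}


def check_device_list(devices: Sequence[int]) -> None:
    """Raise ValueError if any device ordinal appears more than once."""
    duplicates = find_duplicate_devices(devices)
    if duplicates:
        details = ", ".join(
            f"device {dev} used by ranks {ranks}"
            for dev, ranks in sorted(duplicates.items())
        )
        raise ValueError(f"duplicate CUDA devices in list: {details}")
```

Calling `check_device_list([0, 1, 2, 3])` returns silently, while `check_device_list([0, 1, 0])` raises a `ValueError` naming device 0 and the two ranks that share it. Running such a check before communicator creation gives a more specific message than a generic library error.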
Compatibility
- Deep learning framework containers. Refer to the Support Matrix for the supported container version.
- This NCCL release supports CUDA 9.0, CUDA 10.0, and CUDA 10.2.