NCCL Release 2.10.3
These are the release notes for NCCL 2.10.3. For previous NCCL release notes, refer to the NCCL Archives.
Compatibility
- Deep learning framework containers. Refer to the Support Matrix for the supported container version.
- This NCCL release supports CUDA 10.2, CUDA 11.0, and CUDA 11.4.
Key Features and Enhancements
This NCCL release includes the following key features and enhancements.
- Added ncclAvg operation
- Added support for bfloat16 type
- Added support for multiple IB queue pairs
- Added WSL2 support (single GPU only)
- Improved performance for aggregation
- Improved performance for medium sizes
- Improved network error reporting
- Added NCCL_NET environment variable to use a specific network
- Added auto-load of XML topology from default location
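As an illustration of two of the items above, the following is a minimal pure-Python sketch, not NCCL code. It emulates the result an averaging reduction such as ncclAvg is expected to produce (an element-wise sum across ranks divided by the number of ranks), and it builds a process environment with NCCL_NET set before an NCCL application is launched. The helper names `allreduce_avg` and `env_with_nccl_net` are hypothetical, and the value "Socket" is an assumed example of a network name; check the NCCL documentation for the names accepted by your build.

```python
import os


def allreduce_avg(per_rank_buffers):
    """Emulate the outcome of an averaging all-reduce (hypothetical helper).

    per_rank_buffers holds one list of numbers per rank. This only mimics
    the arithmetic; it does not call NCCL or touch any GPU.
    """
    nranks = len(per_rank_buffers)
    if nranks == 0:
        raise ValueError("need at least one rank")
    length = len(per_rank_buffers[0])
    if any(len(buf) != length for buf in per_rank_buffers):
        raise ValueError("all ranks must contribute buffers of equal length")
    # Element-wise sum across ranks, divided by the number of ranks.
    averaged = [sum(column) / nranks for column in zip(*per_rank_buffers)]
    # After an all-reduce, every rank holds the same reduced buffer.
    return [list(averaged) for _ in range(nranks)]


def env_with_nccl_net(network_name, base_env=None):
    """Return a copy of the environment with NCCL_NET set (hypothetical helper).

    "network_name" is whatever network the application should be forced to
    use; the example value used below is an assumption, not a recommendation.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_NET"] = network_name
    return env


if __name__ == "__main__":
    print(allreduce_avg([[1.0, 2.0], [3.0, 6.0]]))
    print(env_with_nccl_net("Socket", base_env={})["NCCL_NET"])
```

The environment returned by `env_with_nccl_net` could be passed to `subprocess.run(..., env=env)` when starting a training job, so the setting applies only to that process rather than to the whole shell session.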
Fixed Issues
- Fixed graph search on cubemesh topologies.
- Fixed all-to-all affinity to improve latency.
- Fixed hang in cubemesh NVB connections during initialization.