NCCL Release 2.10.3
These are the release notes for NCCL 2.10.3. For previous NCCL release notes, refer to the NCCL Archives.
Compatibility
- Deep learning framework containers. Refer to the Support Matrix for the supported container version.
- This NCCL release supports CUDA 10.2, CUDA 11.0, and CUDA 11.4.
Key Features and Enhancements
This NCCL release includes the following key features and enhancements.
- Added ncclAvg operation
- Added support for bfloat16 type
- Added support for multiple IB queue pairs
- Added WSL2 support (single GPU only)
- Improved performance for aggregation
- Improved performance for medium sizes
- Improved network error reporting
- Added NCCL_NET environment variable to use a specific network
- Added auto-load of XML topology from default location
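
To illustrate what the new ncclAvg operation computes, the following is a minimal CPU-only sketch of an averaging all-reduce: every rank ends up with the element-wise sum of all ranks' buffers divided by the number of ranks. It does not call NCCL. The function name `all_reduce_avg_reference`, its parameters, and the flat buffer layout are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for an averaging all-reduce (not an NCCL API).
 * bufs holds nranks contiguous buffers of count floats each, so the
 * element i of rank r lives at bufs[r * count + i].
 * On return, every rank's buffer holds the element-wise average. */
void all_reduce_avg_reference(float *bufs, size_t nranks, size_t count)
{
    if (nranks == 0 || count == 0)
        return; /* nothing to reduce */
    assert(bufs != NULL);

    for (size_t i = 0; i < count; ++i) {
        /* Sum element i across all ranks. */
        float sum = 0.0f;
        for (size_t r = 0; r < nranks; ++r)
            sum += bufs[r * count + i];

        /* Divide by the rank count and write the result back to every rank. */
        float avg = sum / (float)nranks;
        for (size_t r = 0; r < nranks; ++r)
            bufs[r * count + i] = avg;
    }
}
```

In an NCCL program, the equivalent result would be requested by passing `ncclAvg` as the `op` argument of a collective such as `ncclAllReduce`, instead of summing with `ncclSum` and scaling afterwards in a separate step.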
Fixed Issues
- Fixed graph search on cubemesh topologies.
- Fixed all-to-all affinity to improve latency.
- Fixed hang in cubemesh NVB connections during initialization.