Release Notes :: NVIDIA Deep Learning NCCL Documentation

NCCL Release 2.4.2

This NCCL release includes the following key features and enhancements.

Added tree-based algorithms for better AllReduce performance at scale and with small and medium-sized messages.
Added support for external network plugins (e.g., libfabric).
Added the ncclCommGetAsyncError() function to report errors that occur during collective operations.
Added the ncclCommAbort() function to destroy a communicator and abort any outstanding operations.
Added support for different ranks having different CUDA_VISIBLE_DEVICES values.
Added a best-effort mechanism to check for size mismatches among collective calls.
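
As an illustration of the CUDA_VISIBLE_DEVICES item above, a launcher can give each rank its own device list before CUDA is initialized. The following C sketch is only an example under stated assumptions: the helper name set_visible_device_for_rank() and the one-GPU-per-local-rank mapping are hypothetical and are not part of NCCL.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not an NCCL function): expose a single GPU to this
 * process, chosen from its local rank on the node. Returns 0 on success
 * and -1 if setenv() fails. */
int set_visible_device_for_rank(int local_rank)
{
    char value[16];

    assert(local_rank >= 0);

    /* Assumption: local rank N should use GPU index N. */
    snprintf(value, sizeof value, "%d", local_rank);

    /* Overwrite any value inherited from the parent environment. */
    return setenv("CUDA_VISIBLE_DEVICES", value, 1);
}
```

A process would call this helper at startup, before its first CUDA or NCCL call, because the CUDA runtime reads CUDA_VISIBLE_DEVICES when it initializes.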

Added support for communication between Mesos containers (GitHub issue #155).
Fixed the case where posix_fallocate() returns EINTR (GitHub issue #137).
NCCL threads no longer escape the CPU affinity set by the user or job scheduler.
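
For readers unfamiliar with the posix_fallocate() item above, the usual way to tolerate EINTR is to retry the call when a signal interrupts it. The following C sketch shows that general pattern; it is not NCCL's actual code, and the helper name fallocate_retry() is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical helper (not NCCL source): reserve `length` bytes for `fd`,
 * starting at offset 0, and retry if a signal interrupts the call.
 * Returns 0 on success or an error number on failure. */
int fallocate_retry(int fd, off_t length)
{
    int rc;

    assert(length > 0);

    do {
        /* posix_fallocate() returns the error number directly;
         * it does not report failures through errno. */
        rc = posix_fallocate(fd, 0, length);
    } while (rc == EINTR);

    return rc;
}
```

Any error other than EINTR, such as EBADF for an invalid descriptor or ENOSPC when space runs out, is returned to the caller unchanged.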