Point-to-point communication
(Since NCCL 2.7)
Point-to-point communication can be used to express any communication pattern between ranks.
Any point-to-point communication needs two NCCL calls: a call to ncclSend() on one rank and a corresponding ncclRecv() on the other rank, with the same count and data type.
Multiple calls to ncclSend() and ncclRecv() targeting different peers can be fused together with ncclGroupStart() and ncclGroupEnd() to form more complex communication patterns such as one-to-all (scatter), all-to-one (gather), all-to-all, or communication with neighbors in an N-dimensional space.
Point-to-point calls within a group block until the whole group of calls completes, but the calls inside a group can be seen as progressing independently and hence should never block each other. Calls that need to progress concurrently must therefore be merged into the same group to avoid deadlocks.
Below are a few examples of classic point-to-point communication patterns used by parallel applications. NCCL semantics allow for all variants with different sizes, datatypes, and buffers, per rank.
Sendrecv
In MPI terms, a sendrecv operation is when two ranks exchange data, both sending and receiving at the same time. This can be done by merging the ncclSend and ncclRecv calls into one group:
ncclGroupStart();
ncclSend(sendbuff, sendcount, sendtype, peer, comm, stream);
ncclRecv(recvbuff, recvcount, recvtype, peer, comm, stream);
ncclGroupEnd();
One-to-all (scatter)
A one-to-all operation from a root rank can be expressed by merging all send and receive operations in a group:
ncclGroupStart();
if (rank == root) {
  for (int r=0; r<nranks; r++)
    ncclSend(sendbuff[r], size, type, r, comm, stream);
}
ncclRecv(recvbuff, size, type, root, comm, stream);
ncclGroupEnd();
All-to-one (gather)
Similarly, an all-to-one operation to a root rank would be implemented this way:
ncclGroupStart();
if (rank == root) {
  for (int r=0; r<nranks; r++)
    ncclRecv(recvbuff[r], size, type, r, comm, stream);
}
ncclSend(sendbuff, size, type, root, comm, stream);
ncclGroupEnd();
All-to-all
An all-to-all operation would be a merged loop of send/recv operations to/from all peers:
ncclGroupStart();
for (int r=0; r<nranks; r++) {
  ncclSend(sendbuff[r], sendcount, sendtype, r, comm, stream);
  ncclRecv(recvbuff[r], recvcount, recvtype, r, comm, stream);
}
ncclGroupEnd();
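The examples above index sendbuff[r] and recvbuff[r] per peer without saying how those buffers are laid out. One possible layout, shown here purely as an assumption, is a single contiguous allocation holding nranks equally sized chunks, with the chunk for peer r starting r * count elements from the base. The helper peer_chunk below is hypothetical and not part of NCCL; it only performs the address arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of NCCL): returns the address of the chunk
 * associated with peer r inside one contiguous buffer that holds one chunk
 * of `count` elements per rank, each element being `elem_size` bytes.
 * Assumes all chunks have the same size. */
static void *peer_chunk(void *base, int r, size_t count, size_t elem_size) {
  assert(base != NULL && r >= 0);
  return (char *)base + (size_t)r * count * elem_size;
}
```

Under this assumption, a call such as ncclSend(peer_chunk(sendbuff, r, count, sizeof(float)), count, ncclFloat, r, comm, stream) would take the place of ncclSend(sendbuff[r], ...) in the loops above. An array of separately allocated per-peer pointers works just as well, since each call takes its own buffer argument.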
Neighbor exchange
Finally, exchanging data with neighbors in an N-dimensional space could be done with:
ncclGroupStart();
for (int d=0; d<ndims; d++) {
  ncclSend(sendbuff[d], sendcount, sendtype, next[d], comm, stream);
  ncclRecv(recvbuff[d], recvcount, recvtype, prev[d], comm, stream);
}
ncclGroupEnd();
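The loop above relies on next[d] and prev[d] holding the neighbor ranks along each dimension, which the application has to compute itself. A minimal sketch of one way to fill them follows. It assumes a periodic (wrap-around) grid, row-major rank ordering with the last dimension varying fastest, and a rank count equal to the product of the dimension sizes. The function grid_neighbors and the dims array are hypothetical names introduced for illustration and are not part of NCCL.

```c
#include <assert.h>

/* Hypothetical helper (not part of NCCL): fills next[d] and prev[d] with the
 * ranks of the +1 and -1 neighbors of `rank` along each dimension d.
 * Assumptions: periodic boundaries, row-major rank ordering (last dimension
 * varies fastest), and nranks == dims[0] * ... * dims[ndims-1]. */
static void grid_neighbors(int rank, int ndims, const int *dims,
                           int *next, int *prev) {
  int stride = 1;  /* rank distance between adjacent coordinates along d */
  for (int d = ndims - 1; d >= 0; d--) {
    assert(dims[d] > 0);
    int coord = (rank / stride) % dims[d];       /* this rank's coordinate */
    int up = (coord + 1) % dims[d];              /* wraps past the last one */
    int down = (coord + dims[d] - 1) % dims[d];  /* wraps before the first */
    next[d] = rank + (up - coord) * stride;
    prev[d] = rank + (down - coord) * stride;
    stride *= dims[d];
  }
}
```

Because every rank sends to next[d] and receives from prev[d] in the same group, each send is matched by a receive on the peer. With a dimension of size 1 a rank becomes its own neighbor, and with size 2 next[d] and prev[d] refer to the same peer.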