Point To Point Communication Functions
(Since NCCL 2.7) Point-to-point communication primitives are needed when ranks must exchange arbitrary data with each other in a pattern that cannot be expressed as a broadcast or allgather, i.e. when all data sent and received is different.
ncclSend

ncclResult_t ncclSend(const void* sendbuff, size_t count, ncclDataType_t datatype, int peer, ncclComm_t comm, cudaStream_t stream)

Send data from sendbuff to rank peer.

Rank peer needs to call ncclRecv with the same datatype and the same count from this rank.

This operation is blocking for the GPU. If multiple ncclSend() and ncclRecv() operations need to progress concurrently to complete, they must be fused within a ncclGroupStart()/ncclGroupEnd() section.
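
The grouping requirement can be illustrated with a ring exchange, in which every rank sends to its next neighbour and receives from its previous one at the same time. The following is a minimal sketch, not a reference implementation: ring_next, ring_prev and ring_exchange are hypothetical helper names, the element type is assumed to be float, and the buffers are assumed to be device pointers that already hold count elements. The NCCL part is wrapped in a __has_include check (an assumption about the compiler, supported by recent GCC and Clang) so that the file still compiles on a machine without the NCCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Ring neighbours of a rank: pure arithmetic, no NCCL involved. */
static inline int ring_next(int rank, int nranks)
{
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    return (rank + 1) % nranks;
}

static inline int ring_prev(int rank, int nranks)
{
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    return (rank + nranks - 1) % nranks;
}

/* Build the NCCL part only where the headers are installed. */
#if defined(__has_include)
#  if __has_include(<nccl.h>) && __has_include(<cuda_runtime.h>)
#    include <cuda_runtime.h>
#    include <nccl.h>

/* Hypothetical helper: send `count` floats to the next rank and receive
 * `count` floats from the previous rank. Both buffers are device memory. */
static inline ncclResult_t ring_exchange(const float* sendbuff, float* recvbuff,
                                         size_t count, int rank, int nranks,
                                         ncclComm_t comm, cudaStream_t stream)
{
    ncclResult_t res = ncclGroupStart();
    if (res != ncclSuccess) return res;

    /* The send and the receive must progress concurrently on every rank,
     * so both calls are placed inside the same group. */
    ncclResult_t s = ncclSend(sendbuff, count, ncclFloat,
                              ring_next(rank, nranks), comm, stream);
    ncclResult_t r = ncclRecv(recvbuff, count, ncclFloat,
                              ring_prev(rank, nranks), comm, stream);

    /* Always close the group, then report the first error seen. */
    res = ncclGroupEnd();
    if (s != ncclSuccess) return s;
    if (r != ncclSuccess) return r;
    return res;
}
#  endif
#endif
```

Every rank in the communicator would call ring_exchange with the same count, so that each send is paired with a receive of the same datatype and count on the neighbouring rank. The transfer is enqueued on stream, so the caller still has to synchronize that stream before reading recvbuff on the host side.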
Related links: Point-to-point communication.
ncclRecv

ncclResult_t ncclRecv(void* recvbuff, size_t count, ncclDataType_t datatype, int peer, ncclComm_t comm, cudaStream_t stream)

Receive data from rank peer into recvbuff.

Rank peer needs to call ncclSend with the same datatype and the same count to this rank.

This operation is blocking for the GPU. If multiple ncclSend() and ncclRecv() operations need to progress concurrently to complete, they must be fused within a ncclGroupStart()/ncclGroupEnd() section.
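
The pairing rule between a send and its receive can be written down as a small host-side check, which may help when debugging a communication schedule before any NCCL call is made. This is only a sketch under stated assumptions: P2PKind, P2POp and p2p_ops_match are hypothetical names that are not part of NCCL, and the datatype field is a plain int standing in for ncclDataType_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one point-to-point call (not an NCCL type). */
typedef enum { P2P_SEND, P2P_RECV } P2PKind;

typedef struct {
    P2PKind kind;   /* send or receive */
    int rank;       /* rank that makes the call */
    int peer;       /* the peer argument of the call */
    size_t count;   /* the count argument, in elements */
    int datatype;   /* stand-in for ncclDataType_t */
} P2POp;

/* Returns 1 if `recv` is the receive that pairs with `send`: each call names
 * the other rank as its peer, and count and datatype agree. Returns 0 otherwise. */
static inline int p2p_ops_match(const P2POp* send, const P2POp* recv)
{
    assert(send != NULL && recv != NULL);
    return send->kind == P2P_SEND && recv->kind == P2P_RECV
        && send->peer == recv->rank && recv->peer == send->rank
        && send->count == recv->count
        && send->datatype == recv->datatype;
}
```

For example, a send of 1024 elements from rank 0 to rank 1 matches a receive of 1024 elements of the same datatype posted on rank 1 with peer 0, and stops matching as soon as either the count or the datatype differs.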
Related links: Point-to-point communication.