Point To Point Communication Functions
(Since NCCL 2.7) Point-to-point communication primitives are used when ranks need to send arbitrary data to, and receive arbitrary data from, each other in a pattern that cannot be expressed as a broadcast or allgather, i.e. when every piece of data sent and received is different.
ncclSend
ncclResult_t ncclSend(const void* sendbuff, size_t count, ncclDataType_t datatype, int peer, ncclComm_t comm, cudaStream_t stream)
Send data from sendbuff to rank peer.
Rank peer needs to call ncclRecv with the same datatype and the same count, naming this rank as its peer.
This operation is blocking for the GPU. If multiple ncclSend() and ncclRecv() operations need to progress concurrently to complete, they must be fused within a ncclGroupStart()/ncclGroupEnd() section.
Related links: Point-to-point communication.
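As an illustration of why concurrent sends and receives are fused in a group, the following is a minimal sketch of a ring shift, in which every rank sends to its next neighbor and receives from its previous one. The helper names ring_next, ring_prev and ring_shift, and the WITH_NCCL build flag, are hypothetical and not part of NCCL; the sketch assumes float data and an already initialized communicator and stream.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: peer this rank sends to in a ring of nranks ranks. */
int ring_next(int rank, int nranks) {
  assert(nranks > 0 && rank >= 0 && rank < nranks);
  return (rank + 1) % nranks;
}

/* Hypothetical helper: peer this rank receives from in the same ring. */
int ring_prev(int rank, int nranks) {
  assert(nranks > 0 && rank >= 0 && rank < nranks);
  return (rank + nranks - 1) % nranks;
}

#ifdef WITH_NCCL /* hypothetical flag: define it when nccl.h is available */
#include <cuda_runtime.h>
#include <nccl.h>

/* Send count floats to the next rank and receive count floats from the
 * previous rank. Both calls are issued inside one group so that they can
 * progress concurrently instead of each rank blocking in its send. */
ncclResult_t ring_shift(const float* sendbuff, float* recvbuff, size_t count,
                        int rank, int nranks,
                        ncclComm_t comm, cudaStream_t stream) {
  ncclResult_t res = ncclGroupStart();
  if (res != ncclSuccess) return res;

  ncclResult_t s = ncclSend(sendbuff, count, ncclFloat,
                            ring_next(rank, nranks), comm, stream);
  ncclResult_t r = ncclRecv(recvbuff, count, ncclFloat,
                            ring_prev(rank, nranks), comm, stream);

  res = ncclGroupEnd(); /* close the group even if a call above failed */
  if (s != ncclSuccess) return s;
  if (r != ncclSuccess) return r;
  return res;
}
#endif
```

Every rank would call ring_shift with the same count. The operations are enqueued on stream, so the caller would typically synchronize the stream (for example with cudaStreamSynchronize) before reading recvbuff on the host.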
ncclRecv
ncclResult_t ncclRecv(void* recvbuff, size_t count, ncclDataType_t datatype, int peer, ncclComm_t comm, cudaStream_t stream)
Receive data from rank peer into recvbuff.
Rank peer needs to call ncclSend with the same datatype and the same count, naming this rank as its peer.
This operation is blocking for the GPU. If multiple ncclSend() and ncclRecv() operations need to progress concurrently to complete, they must be fused within a ncclGroupStart()/ncclGroupEnd() section.
Related links: Point-to-point communication.
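The matching rule between ncclSend and ncclRecv can be sketched with a one-to-all (scatter) pattern: the root issues one send per rank, and every rank, including the root, posts a single receive with the same count and datatype. The names chunk_offset and scatter_floats and the WITH_NCCL build flag are hypothetical; the sketch assumes float data and a root send buffer holding nranks contiguous chunks of count elements each.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the chunk destined for peer inside
 * the root's send buffer, with count elements of elem_size bytes per chunk. */
size_t chunk_offset(int peer, size_t count, size_t elem_size) {
  assert(peer >= 0);
  return (size_t)peer * count * elem_size;
}

#ifdef WITH_NCCL /* hypothetical flag: define it when nccl.h is available */
#include <cuda_runtime.h>
#include <nccl.h>

/* Scatter count floats from root to every rank. Each ncclSend on the root
 * is matched by one ncclRecv on the destination rank with the same count
 * and datatype; all calls are fused in a single group. */
ncclResult_t scatter_floats(const float* sendbuff, float* recvbuff,
                            size_t count, int root, int rank, int nranks,
                            ncclComm_t comm, cudaStream_t stream) {
  ncclResult_t first_err = ncclSuccess;
  ncclResult_t res = ncclGroupStart();
  if (res != ncclSuccess) return res;

  if (rank == root) {
    for (int r = 0; r < nranks; r++) {
      const char* chunk =
          (const char*)sendbuff + chunk_offset(r, count, sizeof(float));
      res = ncclSend(chunk, count, ncclFloat, r, comm, stream);
      if (res != ncclSuccess && first_err == ncclSuccess) first_err = res;
    }
  }

  /* Every rank, including root, receives its own chunk from root. */
  res = ncclRecv(recvbuff, count, ncclFloat, root, comm, stream);
  if (res != ncclSuccess && first_err == ncclSuccess) first_err = res;

  res = ncclGroupEnd(); /* close the group even if a call above failed */
  return first_err != ncclSuccess ? first_err : res;
}
#endif
```

On non-root ranks sendbuff is not read in this sketch, so it could be passed as NULL. If the count or datatype differed between a send and its matching receive, the pairing requirement described above would be violated.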