.. _point-to-point: **************************** Point-to-point communication **************************** (Since NCCL 2.7) Point-to-point communication can be used to express any communication pattern between ranks. Any point-to-point communication needs two NCCL calls : a call to :c:func:`ncclSend` on one rank and a corresponding :c:func:`ncclRecv` on the other rank, with the same count and data type. Multiple calls to :c:func:`ncclSend` and :c:func:`ncclRecv` targeting different peers can be fused together with :c:func:`ncclGroupStart` and :c:func:`ncclGroupEnd` to form more complex communication patterns such as one-to-all (scatter), all-to-one (gather), all-to-all or communication with neighbors in an N-dimensional space. Point-to-point calls within a group will be blocking until that group of calls completes, but calls within a group can be seen as progressing independently, hence should never block each other. It is therefore important to merge calls that need to progress concurrently to avoid deadlocks. Below are a few examples of classic point-to-point communication patterns used by parallel applications. NCCL semantics allow for all variants with different sizes, datatypes, and buffers, per rank. Sendrecv -------- In MPI terms, a sendrecv operation is when two ranks exchange data, both sending and receiving at the same time. This can be done by merging both ncclSend and ncclRecv calls into one : .. code:: C ncclGroupStart(); ncclSend(sendbuff, sendcount, sendtype, peer, comm, stream); ncclRecv(recvbuff, recvcount, recvtype, peer, comm, stream); ncclGroupEnd(); One-to-all (scatter) -------------------- A one-to-all operation from a ``root`` rank can be expressed by merging all send and receive operations in a group : .. code:: C ncclGroupStart(); if (rank == root) { for (int r=0; r