Group primitives set the behavior of the calling thread so that NCCL calls issued inside a group do not block. Because this behavior is defined per thread, they can be used from multiple threads independently.
Related links: Group Calls.
End a group call.
Returns when all operations since ncclGroupStart have been processed. This means communication primitives have been enqueued to the provided streams, but are not necessarily complete.
When used with the ncclCommInitRank call, the ncclGroupEnd call waits for all communicators to be initialized.
Note: There is a maximum of 2048 NCCL operations that can be inserted between the ncclGroupStart and ncclGroupEnd calls. If this limit is exceeded, then a warning message will be emitted and the NCCL operation will return a failure code.
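One way to stay under this limit is to split a long sequence of operations across several ncclGroupStart/ncclGroupEnd pairs. The following is a sketch of the batching arithmetic only, in plain C so that it compiles without NCCL. The names MAX_OPS_PER_GROUP, groups_needed, and ops_in_group are hypothetical, and the value 2048 is taken from the note above.

```c
#include <assert.h>
#include <stddef.h>

/* Limit quoted in the note above; not a constant exported by NCCL. */
#define MAX_OPS_PER_GROUP 2048

/* Number of ncclGroupStart/ncclGroupEnd pairs needed for num_ops operations. */
size_t groups_needed(size_t num_ops)
{
    return num_ops / MAX_OPS_PER_GROUP + (num_ops % MAX_OPS_PER_GROUP != 0);
}

/* Number of operations to place in group g (0-based). */
size_t ops_in_group(size_t num_ops, size_t g)
{
    assert(g < groups_needed(num_ops));
    size_t remaining = num_ops - g * MAX_OPS_PER_GROUP;
    return remaining < MAX_OPS_PER_GROUP ? remaining : MAX_OPS_PER_GROUP;
}
```

A caller would loop over g from 0 to groups_needed(n) - 1, call ncclGroupStart, enqueue ops_in_group(n, g) operations, and call ncclGroupEnd. For example, 5000 operations would be split into three groups of 2048, 2048, and 904 operations.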