In general NCCL will accept any CUDA pointers that are accessible from the CUDA device associated to the communicator object. This includes:
- device memory local to the CUDA device
- host memory registered using CUDA SDK APIs cudaHostRegister or cudaGetDevicePointer
- managed and unified memory
The only exception is device memory located on another device but accessible from the current device using peer access. NCCL will return an error in that case to avoid programming errors (only when NCCL_CHECK_POINTERS=1 since 2.2.12).