core.resharding.copy_services.nccl_copy_service#
Module Contents#
Classes#
SendOp | Simple container describing a single NCCL send operation.
RecvOp | Simple container describing a single NCCL receive operation.
NCCLCopyService | Thin wrapper around torch.distributed batch_isend_irecv to submit and execute a batch of point-to-point sends and recvs.
Data#
API#
- core.resharding.copy_services.nccl_copy_service.logger#
‘getLogger(…)’
- class core.resharding.copy_services.nccl_copy_service.SendOp#
Simple container describing a single NCCL send operation.
- task_id: int | None#
None
- tensor: torch.Tensor#
None
- dest_rank: int#
None
- class core.resharding.copy_services.nccl_copy_service.RecvOp#
Simple container describing a single NCCL receive operation.
- task_id: int | None#
None
- tensor: torch.Tensor#
None
- src_rank: int#
None
- class core.resharding.copy_services.nccl_copy_service.NCCLCopyService#
Bases: core.resharding.copy_services.base.CopyService
Thin wrapper around torch.distributed batch_isend_irecv to submit and execute a batch of point-to-point sends and recvs.
Initialization
- submit_send(src_tensor: torch.Tensor, dest_rank: int)#
Submit a send operation.
- submit_send_with_id(task_id: int, src_tensor: torch.Tensor, dest_rank: int)#
Submit a send operation with a unique task identifier.
- submit_recv(dest_tensor: torch.Tensor, src_rank: int)#
Submit a receive operation.
- submit_recv_with_id(task_id: int, dest_tensor: torch.Tensor, src_rank: int)#
Submit a receive operation with a unique task identifier.
- run()#
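The submit-then-run pattern described above can be illustrated with a minimal, torch-free sketch. It is not the module's implementation: `QueuedCopyService`, the `execute_batch` callback, and `loopback_backend` are hypothetical names introduced here, and plain Python lists stand in for `torch.Tensor`. Going by the class description, the real service presumably hands its queued operations to torch.distributed `batch_isend_irecv` at the point where this sketch calls the callback.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class SendOp:
    task_id: Optional[int]
    tensor: Any  # torch.Tensor in the documented module; any buffer here
    dest_rank: int


@dataclass
class RecvOp:
    task_id: Optional[int]
    tensor: Any  # destination buffer that the transport fills in place
    src_rank: int


class QueuedCopyService:
    """Hypothetical stand-in: queue sends/recvs, then execute them as one batch."""

    def __init__(self, execute_batch: Callable[[List[SendOp], List[RecvOp]], None]):
        # Assumption: the transport is injected so the sketch runs without NCCL.
        self._execute_batch = execute_batch
        self.send_ops: List[SendOp] = []
        self.recv_ops: List[RecvOp] = []

    def submit_send(self, src_tensor: Any, dest_rank: int) -> None:
        self.send_ops.append(SendOp(None, src_tensor, dest_rank))

    def submit_send_with_id(self, task_id: int, src_tensor: Any, dest_rank: int) -> None:
        self.send_ops.append(SendOp(task_id, src_tensor, dest_rank))

    def submit_recv(self, dest_tensor: Any, src_rank: int) -> None:
        self.recv_ops.append(RecvOp(None, dest_tensor, src_rank))

    def submit_recv_with_id(self, task_id: int, dest_tensor: Any, src_rank: int) -> None:
        self.recv_ops.append(RecvOp(task_id, dest_tensor, src_rank))

    def run(self) -> None:
        # Execute everything queued so far, then reset the queues.
        if self.send_ops or self.recv_ops:
            self._execute_batch(self.send_ops, self.recv_ops)
        self.send_ops = []
        self.recv_ops = []


def loopback_backend(sends: List[SendOp], recvs: List[RecvOp]) -> None:
    """Toy single-process transport: match ops by task_id and copy in place."""
    by_id = {op.task_id: op for op in sends}
    for recv in recvs:
        recv.tensor[:] = by_id[recv.task_id].tensor


service = QueuedCopyService(loopback_backend)
src, dst = [1, 2, 3], [0, 0, 0]
service.submit_send_with_id(7, src, dest_rank=0)
service.submit_recv_with_id(7, dst, src_rank=0)
service.run()  # dst now holds a copy of src
```

Nothing moves until `run()` is called, which is what allows all queued point-to-point operations to be issued together as a single batch.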