core.resharding.copy_services.base#
Module Contents#
Classes#
CopyService | Abstract interface for submitting and executing batched P2P copy operations.
API#
- class core.resharding.copy_services.base.CopyService#
Bases: abc.ABC
Abstract interface for submitting and executing batched P2P copy operations.
All backends accept an optional task_id on submit calls. The task_id is a globally unique identifier shared between the matching send and recv for the same transfer. It is required for local (same-rank) copy matching and for the NVSHMEM backend’s scheduling. Backends that do not need it for remote transfers simply ignore it.
- abstractmethod submit_send(src_tensor: torch.Tensor, dest_rank: int, task_id: Optional[int] = None)#
Register a tensor send from the current rank to dest_rank.
- abstractmethod submit_recv(dest_tensor: torch.Tensor, src_rank: int, task_id: Optional[int] = None)#
Register a tensor receive into dest_tensor from src_rank.
- abstractmethod run()#
Execute all previously submitted send/recv operations as a single batch.
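The submit-then-run pattern and the role of task_id in same-rank matching can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the library's implementation: CopyServiceSketch is a stand-in that mirrors the three documented methods, LocalCopyService is a hypothetical backend that handles only local copies, the `-> None` return annotations are guesses, and plain Python lists stand in for torch.Tensor so the example runs without PyTorch.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class CopyServiceSketch(ABC):
    """Stand-in mirroring the documented interface (lists replace torch.Tensor)."""

    @abstractmethod
    def submit_send(self, src_tensor: List[float], dest_rank: int,
                    task_id: Optional[int] = None) -> None: ...

    @abstractmethod
    def submit_recv(self, dest_tensor: List[float], src_rank: int,
                    task_id: Optional[int] = None) -> None: ...

    @abstractmethod
    def run(self) -> None: ...


class LocalCopyService(CopyServiceSketch):
    """Hypothetical same-rank backend: pairs each send with a recv by task_id."""

    def __init__(self, rank: int) -> None:
        self.rank = rank
        self._sends: Dict[int, List[float]] = {}
        self._recvs: Dict[int, List[float]] = {}

    def submit_send(self, src_tensor, dest_rank, task_id=None):
        # Only queue the operation; nothing is copied until run().
        if dest_rank != self.rank or task_id is None:
            raise ValueError("sketch handles only local copies with a task_id")
        self._sends[task_id] = src_tensor

    def submit_recv(self, dest_tensor, src_rank, task_id=None):
        if src_rank != self.rank or task_id is None:
            raise ValueError("sketch handles only local copies with a task_id")
        self._recvs[task_id] = dest_tensor

    def run(self):
        # Every send must have a recv with the same task_id, and vice versa.
        if self._sends.keys() != self._recvs.keys():
            raise RuntimeError("unmatched send/recv task ids")
        for task_id, src in self._sends.items():
            dest = self._recvs[task_id]
            if len(dest) != len(src):
                raise RuntimeError(f"size mismatch for task {task_id}")
            dest[:] = src  # in-place copy into the receive buffer
        # Clear the queues so the service can be reused for the next batch.
        self._sends.clear()
        self._recvs.clear()


service = LocalCopyService(rank=0)
src_a, dst_a = [1.0, 2.0], [0.0, 0.0]
src_b, dst_b = [3.0], [0.0]

# Submission order does not matter; task_id ties each send to its recv.
service.submit_send(src_a, dest_rank=0, task_id=10)
service.submit_recv(dst_b, src_rank=0, task_id=11)
service.submit_send(src_b, dest_rank=0, task_id=11)
service.submit_recv(dst_a, src_rank=0, task_id=10)
service.run()
```

In this sketch the submit calls only record intent, and all data movement happens inside run(), which is what lets a backend treat the queued operations as one batch. A backend for remote transfers would presumably replace the in-place list copy with its own transport.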