core.resharding.execution

Module Contents

Functions

execute_reshard_plan

Execute a reshard plan from a centralized controller. The caller must provide a communication service that abstracts the transport. The service is expected to expose submit_send(tensor, dest_rank), submit_recv(tensor, src_rank), and run().

Data

API

core.resharding.execution.logger

‘getLogger(…)’

core.resharding.execution.execute_reshard_plan(
    plan: core.resharding.utils.ReshardPlan,
    src_module: torch.nn.Module,
    dst_module: torch.nn.Module,
    service: core.resharding.copy_services.base.CopyService,
) → None

Execute a reshard plan from a centralized controller. The caller must provide a communication service that abstracts the transport. The service is expected to expose submit_send(tensor, dest_rank), submit_recv(tensor, src_rank), and run().
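To make the expected service API concrete, the following is a minimal sketch of an object with the three methods named above. The class name LoopbackCopyService is hypothetical and is not part of core.resharding. It runs in a single process and uses plain Python lists in place of tensors so that it works without torch. The matching of sends to receives in submission order is an assumption of this sketch, not documented behavior of any real CopyService.

```python
from typing import List, Tuple


class LoopbackCopyService:
    """Hypothetical single-process stand-in exposing the expected service API."""

    def __init__(self, rank: int = 0) -> None:
        self.rank = rank
        # Queued (buffer, peer_rank) pairs; nothing moves until run() is called.
        self._sends: List[Tuple[list, int]] = []
        self._recvs: List[Tuple[list, int]] = []

    def submit_send(self, tensor: list, dest_rank: int) -> None:
        # Queue a send of `tensor` to `dest_rank`.
        self._sends.append((tensor, dest_rank))

    def submit_recv(self, tensor: list, src_rank: int) -> None:
        # Queue a receive into `tensor` from `src_rank`.
        self._recvs.append((tensor, src_rank))

    def run(self) -> None:
        # Assumption of this sketch: sends and receives pair up in submission
        # order, and only self-to-self transfers are supported.
        if len(self._sends) != len(self._recvs):
            raise RuntimeError("unmatched send/recv operations")
        for (src_buf, dest_rank), (dst_buf, src_rank) in zip(self._sends, self._recvs):
            if dest_rank != self.rank or src_rank != self.rank:
                raise ValueError("loopback sketch only supports local transfers")
            dst_buf[:] = src_buf  # copy in place into the receive buffer
        self._sends.clear()
        self._recvs.clear()


service = LoopbackCopyService(rank=0)
source = [1.0, 2.0, 3.0]
target = [0.0, 0.0, 0.0]
service.submit_send(source, dest_rank=0)
service.submit_recv(target, src_rank=0)
service.run()  # executes all queued transfers
```

The queue-then-run shape mirrors the submit_send / submit_recv / run() split in the description: operations are collected first and executed together.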

Although both parameters are annotated as torch.nn.Module, src_module and/or dst_module may be None, so that a rank can take part in non-collocated mode:

  • src_module=None: Rank only receives data (destination-only)

  • dst_module=None: Rank only sends data (source-only)

  • Both provided: Rank participates in both send and recv (collocated mode)
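The three cases above can be sketched as a small helper that classifies a rank by which modules it holds. The function rank_role and the returned labels are hypothetical and used only for illustration. The source does not say what happens when both arguments are None, so the "idle" branch is an assumption of this sketch.

```python
from typing import Optional


def rank_role(src_module: Optional[object], dst_module: Optional[object]) -> str:
    """Hypothetical helper: classify a rank by the modules it was given."""
    has_src = src_module is not None
    has_dst = dst_module is not None
    if has_src and has_dst:
        return "collocated"        # participates in both send and recv
    if has_src:
        return "source-only"       # dst_module=None: only sends data
    if has_dst:
        return "destination-only"  # src_module=None: only receives data
    return "idle"                  # assumption: both None is not covered by the source


# Any non-None object stands in for a torch.nn.Module here.
module = object()
role_both = rank_role(module, module)
role_send = rank_role(module, None)
role_recv = rank_role(None, module)
```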