bridge.models.mimo.mimo_ddp

DDP wrapping utilities for MIMO models.

The training layer calls these utilities after MimoModelProvider.provide().

Note: This module only supports DDP wrapping. FSDP is not yet implemented.

Module Contents

Functions

wrap_mimo_model_distributed

Wrap a MIMO model's submodules with DDP.

API

bridge.models.mimo.mimo_ddp.wrap_mimo_model_distributed(
    mimo_model: megatron.core.models.mimo.MimoModel,
    ddp_config: megatron.core.distributed.DistributedDataParallelConfig,
    mimo_parallelism_config: megatron.bridge.models.mimo.mimo_config.MimoParallelismConfig,
    grids: Dict[str, megatron.core.hyper_comm_grid.HyperCommGrid],
    pg_collections: Dict[str, Optional[megatron.core.process_groups_config.ProcessGroupCollection]],
) → megatron.core.models.mimo.MimoModel

Wrap a MIMO model's submodules with DDP.

Modifies mimo_model in place and returns it.

Parameters:
  • mimo_model – The MimoModel to wrap.

  • ddp_config – DDP configuration from Bridge.

  • mimo_parallelism_config – MIMO parallelism configuration.

  • grids – Module name to HyperCommGrid mapping.

  • pg_collections – Module name to ProcessGroupCollection mapping.
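
Both grids and pg_collections are keyed by module name, and the pg_collections values are Optional. The sketch below illustrates only the shape of these two mappings. It uses plain stand-in classes instead of real HyperCommGrid and ProcessGroupCollection objects, the module names "language" and "vision" are hypothetical, and reading a None entry as "this module has no process groups on the current rank" is an assumption suggested by the type hint, not something this page confirms.

```python
from typing import Dict, List, Optional


class FakeGrid:
    """Stand-in for HyperCommGrid (illustration only)."""


class FakePGCollection:
    """Stand-in for ProcessGroupCollection (illustration only)."""


def mismatched_module_names(
    grids: Dict[str, FakeGrid],
    pg_collections: Dict[str, Optional[FakePGCollection]],
) -> List[str]:
    """Return module names that appear in only one of the two mappings."""
    return sorted(set(grids) ^ set(pg_collections))


# Hypothetical module names; a real MIMO model defines its own.
grids = {"language": FakeGrid(), "vision": FakeGrid()}
pg_collections = {"language": FakePGCollection(), "vision": None}

mismatched = mismatched_module_names(grids, pg_collections)

# Assumption: a None entry means the module is not active on this rank.
active = [name for name, pgc in pg_collections.items() if pgc is not None]
```

A check like this, run before the wrapping call, may help catch a module name that is present in one mapping but missing from the other.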

Returns:

The same mimo_model with wrapped submodules.
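
The "modify in place and return" contract can be illustrated with a small self-contained sketch. Nothing here is the actual implementation: StubWrapper stands in for a DDP wrapper, StubMimoModel and its submodules attribute are hypothetical, and skipping entries whose process-group collection is None is an assumption.

```python
from typing import Dict, Optional


class StubWrapper:
    """Stand-in for a DDP wrapper; it only holds the wrapped module."""

    def __init__(self, module: object) -> None:
        self.module = module


class StubMimoModel:
    """Stand-in for MimoModel with a hypothetical `submodules` dict."""

    def __init__(self) -> None:
        self.submodules: Dict[str, object] = {
            "language": object(),
            "vision": object(),
        }


def wrap_in_place(
    model: StubMimoModel,
    pg_collections: Dict[str, Optional[object]],
) -> StubMimoModel:
    """Replace each active submodule with a wrapper, then return the same model."""
    for name, pgc in pg_collections.items():
        if pgc is None:
            continue  # assumption: nothing to wrap for this module on this rank
        model.submodules[name] = StubWrapper(model.submodules[name])
    return model


model = StubMimoModel()
returned = wrap_in_place(model, {"language": "pg-collection", "vision": None})
```

Because the returned object is the argument itself, any reference to the model held before the call also sees the wrapped submodules, so reassigning the return value and ignoring it behave the same way.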