bridge.models.megatron_mimo.megatron_mimo_ddp#
DDP wrapping utilities for MegatronMIMO models.
Called from the training layer after MegatronMIMOProvider.provide().
Note: This module only supports DDP wrapping. FSDP is not yet implemented.
Module Contents#
Functions#
wrap_megatron_mimo_model_distributed – Wrap MegatronMIMO model’s submodules with DDP.
API#
bridge.models.megatron_mimo.megatron_mimo_ddp.wrap_megatron_mimo_model_distributed(
    megatron_mimo_model: megatron.core.models.mimo.MimoModel,
    ddp_config: megatron.core.distributed.DistributedDataParallelConfig,
    megatron_mimo_parallelism_config: megatron.bridge.models.megatron_mimo.megatron_mimo_config.MegatronMIMOParallelismConfig,
    grids: Dict[str, megatron.core.hyper_comm_grid.HyperCommGrid],
    pg_collections: Dict[str, Optional[megatron.core.process_groups_config.ProcessGroupCollection]],
)
Wrap MegatronMIMO model’s submodules with DDP.
Modifies megatron_mimo_model in-place and returns it.
- Parameters:
megatron_mimo_model – The MimoModel to wrap.
ddp_config – DDP configuration from Bridge.
megatron_mimo_parallelism_config – MegatronMIMO parallelism configuration.
grids – Module name to HyperCommGrid mapping.
pg_collections – Module name to ProcessGroupCollection mapping.
- Returns:
The same megatron_mimo_model with wrapped submodules.
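The in-place contract described above (submodules are replaced inside the model, and the same model object is returned) can be illustrated with a minimal, self-contained sketch. It does not import Megatron or reproduce the real implementation: `FakeDDP`, `FakeMimoModel`, `wrap_submodules_in_place`, and the module names `"language"` and `"vision"` are all hypothetical stand-ins. Skipping entries whose process-group collection is `None` is an assumption inferred only from the `Optional[...]` type of `pg_collections`.

```python
from typing import Dict, Optional


class FakeDDP:
    """Hypothetical stand-in for a DDP wrapper; it only records what it wraps."""

    def __init__(self, module: object, pg_collection: object) -> None:
        self.module = module
        self.pg_collection = pg_collection


class FakeMimoModel:
    """Hypothetical stand-in for MimoModel: a container of named submodules."""

    def __init__(self, submodules: Dict[str, object]) -> None:
        self.submodules = dict(submodules)


def wrap_submodules_in_place(
    model: FakeMimoModel,
    pg_collections: Dict[str, Optional[object]],
) -> FakeMimoModel:
    """Wrap each submodule that has a process-group collection; return the same model."""
    # Iterate over a snapshot so entries can be reassigned during the loop.
    for name, submodule in list(model.submodules.items()):
        pg = pg_collections.get(name)
        if pg is None:
            # Assumption: None means there is nothing to wrap for this module here.
            continue
        model.submodules[name] = FakeDDP(submodule, pg)
    return model


# Hypothetical module names and placeholder process-group values.
model = FakeMimoModel({"language": object(), "vision": object()})
pg_collections = {"language": "pg-language", "vision": None}
wrapped = wrap_submodules_in_place(model, pg_collections)
```

Because the model is modified in place, `wrapped` and `model` refer to the same object, so callers may either use the return value or keep their existing reference.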