bridge.training.pretrain_megatron_mimo#
Entry point for MegatronMIMO pretraining.
Thin entry point that orchestrates runtime config updates, setup, and training.
Mirrors the standard pretrain.py → setup() → train() pattern.
See also:
- setup_megatron_mimo.py: MegatronMIMO-specific setup logic (analogous to setup.py)
- train_megatron_mimo.py: MegatronMIMO training loop (analogous to train.py)
- config.py: megatron_mimo_runtime_config_update() (analogous to runtime_config_update())
Module Contents#
Functions#
pretrain_megatron_mimo – Entry point for MegatronMIMO pretraining.
Data#
API#
- bridge.training.pretrain_megatron_mimo.logger#
'getLogger(…)'
- bridge.training.pretrain_megatron_mimo.pretrain_megatron_mimo(
- cfg: megatron.bridge.training.config.ConfigContainer,
- forward_step_func: Callable,
- build_data_iterators_fn: Callable,
- global_state: Optional[megatron.bridge.training.state.GlobalState] = None,
- )
Entry point for MegatronMIMO pretraining.
Steps:
1. Apply MegatronMIMO runtime config updates (finalize sub-configs, set data_parallel_size=1)
2. Call setup_megatron_mimo() to get the model, optimizer, schedulers, infra, and communicators
3. Call train_megatron_mimo() with all components
- Parameters:
cfg – ConfigContainer with training configuration. cfg.model must be a MegatronMIMOProvider. cfg.optimizer (a BridgeOptimizerConfig) is used to create the MimoOptimizer and per-module LR schedulers.
forward_step_func – Forward step function for training.
build_data_iterators_fn – Function to build data iterators. Signature: (cfg, megatron_mimo_infra) -> (train_iter, valid_iter)
global_state – Optional GlobalState for testing. If not provided, creates a new one. Production callers should not pass this.
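A minimal sketch of a build_data_iterators_fn matching the documented (cfg, megatron_mimo_infra) -> (train_iter, valid_iter) signature. The sample data and the function name are placeholders, not part of the real library:

```python
# Hypothetical build_data_iterators_fn following the documented signature
# (cfg, megatron_mimo_infra) -> (train_iter, valid_iter).
# The sample dicts are placeholders, not a real data pipeline.

def build_toy_data_iterators(cfg, megatron_mimo_infra):
    train_samples = [{"tokens": [1, 2, 3]}, {"tokens": [4, 5, 6]}]
    valid_samples = [{"tokens": [7, 8, 9]}]
    return iter(train_samples), iter(valid_samples)

train_iter, valid_iter = build_toy_data_iterators(cfg={}, megatron_mimo_infra=None)
first = next(train_iter)  # -> {"tokens": [1, 2, 3]}
```

A function with this shape would be passed as the build_data_iterators_fn argument; pretrain_megatron_mimo calls it after setup and hands the resulting iterators to the training loop.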