bridge.training.pretrain_megatron_mimo#

Entry point for MegatronMIMO pretraining.

Thin entry point that orchestrates runtime config updates, setup, and training. Mirrors the standard pretrain.py → setup() → train() pattern.

See also:

  • setup_megatron_mimo.py: MegatronMIMO-specific setup logic (analogous to setup.py)

  • train_megatron_mimo.py: MegatronMIMO training loop (analogous to train.py)

  • config.py: megatron_mimo_runtime_config_update() (analogous to runtime_config_update())

Module Contents#

Functions#

pretrain_megatron_mimo

Entry point for MegatronMIMO pretraining.

Data#

API#

bridge.training.pretrain_megatron_mimo.logger#

'getLogger(…)'

bridge.training.pretrain_megatron_mimo.pretrain_megatron_mimo(
cfg: megatron.bridge.training.config.ConfigContainer,
forward_step_func: Callable,
build_data_iterators_fn: Callable,
global_state: Optional[megatron.bridge.training.state.GlobalState] = None,
) -> None#

Entry point for MegatronMIMO pretraining.

Steps:

  1. Apply MegatronMIMO runtime config updates (finalize sub-configs, set data_parallel_size=1)

  2. Call setup_megatron_mimo() to get model, optimizer, schedulers, infra, communicators

  3. Call train_megatron_mimo() with all components
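The three steps above can be sketched as follows. This is a hypothetical, self-contained illustration of the control flow only: the stub bodies stand in for the real helpers in setup_megatron_mimo.py, train_megatron_mimo.py, and config.py, and their return values are invented for the example.

```python
# Hypothetical sketch of the pretrain -> setup -> train orchestration.
# The stubs below stand in for the real megatron.bridge helpers; only the
# call order and the data_parallel_size=1 constraint are taken from the docs.
trace = []

def megatron_mimo_runtime_config_update(cfg):
    # Step 1: finalize sub-configs; MegatronMIMO forces data_parallel_size=1.
    trace.append("update")
    cfg["data_parallel_size"] = 1
    return cfg

def setup_megatron_mimo(cfg):
    # Step 2: build model, optimizer, schedulers, infra, and communicators
    # (represented here by placeholder strings).
    trace.append("setup")
    return {"model": "model", "optimizer": "opt", "infra": "infra"}

def train_megatron_mimo(components):
    # Step 3: run the training loop with all components.
    trace.append("train")

def pretrain_megatron_mimo(cfg):
    cfg = megatron_mimo_runtime_config_update(cfg)
    components = setup_megatron_mimo(cfg)
    train_megatron_mimo(components)

cfg = {}
pretrain_megatron_mimo(cfg)
print(trace)                      # ['update', 'setup', 'train']
print(cfg["data_parallel_size"])  # 1
```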

Parameters:
  • cfg – ConfigContainer with training configuration. cfg.model must be a MegatronMIMOProvider. cfg.optimizer (a BridgeOptimizerConfig) is used to create the MimoOptimizer and per-module LR schedulers.

  • forward_step_func – Forward step function for training.

  • build_data_iterators_fn – Function to build data iterators. Signature: (cfg, megatron_mimo_infra) -> (train_iter, valid_iter)

  • global_state – Optional GlobalState for testing. If not provided, creates a new one. Production callers should not pass this.
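A hypothetical shape for the two callable parameters, matching only the signatures documented above. The bodies are toy stand-ins (real implementations consume project-specific configs, batches, and models), and the forward-step argument order is an assumption borrowed from the common Megatron convention.

```python
# Hypothetical callables matching the documented parameter signatures;
# the bodies do no real work and exist only to show the expected shapes.
def forward_step_func(data_iterator, model):
    # Assumed (data_iterator, model) argument order; consumes one batch
    # and returns a toy "loss" for illustration.
    batch = next(data_iterator)
    return sum(batch)

def build_data_iterators_fn(cfg, megatron_mimo_infra):
    # Documented signature: (cfg, megatron_mimo_infra) -> (train_iter, valid_iter)
    train_iter = iter([[1, 2], [3, 4]])
    valid_iter = iter([[5, 6]])
    return train_iter, valid_iter

train_iter, valid_iter = build_data_iterators_fn(cfg=None, megatron_mimo_infra=None)
loss = forward_step_func(train_iter, model=None)
print(loss)  # 3
```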