bridge.training.optim

Module Contents

Functions

| Function | Description |
|---|---|
| setup_optimizer | Set up the optimizer and scheduler. |
| _get_scheduler | Get the optimizer parameter scheduler. |

API
- bridge.training.optim.setup_optimizer(
- optimizer_config: megatron.core.optimizer.OptimizerConfig,
- scheduler_config: megatron.bridge.training.config.SchedulerConfig,
- model: Union[megatron.core.transformer.module.MegatronModule, list[megatron.core.transformer.module.MegatronModule]],
- use_gloo_process_groups: bool = False,
- no_weight_decay_cond: Optional[Callable[[str, torch.nn.Parameter], bool]] = None,
- scale_lr_cond: Optional[Callable[[str, torch.nn.Parameter], bool]] = None,
- lr_mult: float = 1.0,
- )
Set up the optimizer and scheduler.
- Parameters:
optimizer_config – Configuration for the optimizer
scheduler_config – Configuration for the scheduler
model – The model (or list of model chunks) to optimize
use_gloo_process_groups – Whether to use Gloo process groups
no_weight_decay_cond – Condition selecting parameters to exclude from weight decay
scale_lr_cond – Condition selecting parameters whose learning rate is scaled by lr_mult
lr_mult – Learning rate multiplier applied to parameters matching scale_lr_cond
- Returns:
tuple containing the optimizer and scheduler
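A minimal usage sketch follows. The config field values and the predicate shown are illustrative assumptions, not library defaults; consult OptimizerConfig and SchedulerConfig for the actual fields, and assume `model` is a MegatronModule built elsewhere.

```python
# Hedged sketch: wiring setup_optimizer into a training script.
# Config field values below are assumptions for illustration only.
import torch
from megatron.core.optimizer import OptimizerConfig
from megatron.bridge.training.config import SchedulerConfig
from megatron.bridge.training.optim import setup_optimizer

opt_cfg = OptimizerConfig(lr=3e-4, weight_decay=0.1)  # illustrative values
sched_cfg = SchedulerConfig()  # defaults assumed for brevity

# Optional predicate: exclude biases and norm parameters from weight decay.
def no_weight_decay(name: str, param: torch.nn.Parameter) -> bool:
    return name.endswith(".bias") or "norm" in name

optimizer, scheduler = setup_optimizer(
    optimizer_config=opt_cfg,
    scheduler_config=sched_cfg,
    model=model,  # a MegatronModule (or list of chunks) built elsewhere
    no_weight_decay_cond=no_weight_decay,
    lr_mult=1.0,
)
```

Passing a list for `model` covers virtual-pipeline setups where the model is split into multiple chunks.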
- bridge.training.optim._get_scheduler(
- optimizer_config: megatron.core.optimizer.OptimizerConfig,
- scheduler_config: megatron.bridge.training.config.SchedulerConfig,
- optimizer: megatron.core.optimizer.MegatronOptimizer,
- )
Get the optimizer parameter scheduler.
- Parameters:
optimizer_config – Configuration for the optimizer
scheduler_config – Configuration for the scheduler
optimizer – The optimizer to schedule
- Returns:
The optimizer parameter scheduler
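The leading underscore marks this as a module-internal helper: setup_optimizer constructs the MegatronOptimizer first and then hands it to _get_scheduler. A hedged sketch of that internal call pattern, assuming `optimizer_config`, `scheduler_config`, and `optimizer` are already built as described above:

```python
# Hedged sketch: the internal call made by setup_optimizer. Not part of
# the public API; shown only to clarify how the pieces fit together.
from megatron.bridge.training.optim import _get_scheduler

scheduler = _get_scheduler(
    optimizer_config=optimizer_config,  # OptimizerConfig instance
    scheduler_config=scheduler_config,  # SchedulerConfig instance
    optimizer=optimizer,                # MegatronOptimizer from the same configs
)
```

Prefer calling setup_optimizer, which returns both objects together, rather than invoking this helper directly.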