Migrate Optimizer Configuration from NeMo 1.0 to NeMo 2.0
In NeMo 2.0, the optimizer configuration has changed from a YAML-based approach to using the OptimizerConfig class from Megatron Core. This guide will help you migrate your optimizer setup.
NeMo 1.0 (Previous Release)
In NeMo 1.0, the optimizer was configured in the YAML configuration file.
model:
  optim:
    name: fused_adam
    lr: 2e-4
    weight_decay: 0.01
    betas:
      - 0.9
      - 0.98
    sched:
      name: CosineAnnealing
      warmup_steps: 500
      constant_steps: 0
      min_lr: 2e-5
NeMo 2.0 (New Release)
In NeMo 2.0, we use the OptimizerConfig class from Megatron Core, which is wrapped by NeMo's MegatronOptimizerModule. Here's how to set it up:
from nemo.collections import llm
from nemo import lightning as nl
from megatron.core.optimizer import OptimizerConfig

optim = nl.MegatronOptimizerModule(
    config=OptimizerConfig(
        optimizer="adam",
        lr=0.001,
        use_distributed_optimizer=True
    ),
    lr_scheduler=nl.lr_scheduler.CosineAnnealingScheduler(),
)

llm.train(..., optim=optim)
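The example above constructs CosineAnnealingScheduler with its defaults. If your NeMo 1.0 YAML specified sched values, they can be passed to the scheduler class instead. Below is a minimal sketch that carries over the sched values from the YAML example; it assumes CosineAnnealingScheduler accepts warmup_steps, constant_steps, min_lr, and max_steps keyword arguments, so verify the exact signature in the nemo.lightning API for your release.

from nemo import lightning as nl

# Sketch: map the NeMo 1.0 "sched" section onto the NeMo 2.0 scheduler class.
# The keyword arguments below are assumed; check your installed nemo.lightning version.
lr_scheduler = nl.lr_scheduler.CosineAnnealingScheduler(
    max_steps=100000,   # total training steps (hypothetical value for this sketch)
    warmup_steps=500,   # sched.warmup_steps in the NeMo 1.0 YAML
    constant_steps=0,   # sched.constant_steps
    min_lr=2e-5,        # sched.min_lr
)

The resulting lr_scheduler object can then be passed to MegatronOptimizerModule in place of the default-constructed scheduler shown above.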
Migration Steps
1. Remove the optim section from your YAML configuration file.
2. Import the necessary modules in your Python script:
   from nemo.collections import llm
   from nemo import lightning as nl
   from megatron.core.optimizer import OptimizerConfig
3. Create an instance of MegatronOptimizerModule with the appropriate OptimizerConfig.
4. Configure the OptimizerConfig with parameters similar to your previous YAML configuration (a complete mapping example follows these steps):
   - optimizer: string name of the optimizer (e.g., "adam" instead of "fused_adam")
   - lr: learning rate
   - use_distributed_optimizer: set to True to use the distributed optimizer
5. Set up the learning rate scheduler separately using NeMo's scheduler classes.
6. Pass the optim object to the llm.train() function.
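Putting these steps together, the NeMo 1.0 YAML shown earlier could be translated roughly as follows. This is a sketch rather than a definitive mapping: it assumes the OptimizerConfig fields weight_decay, adam_beta1, and adam_beta2 cover the YAML's weight_decay and betas entries, and that the scheduler keyword arguments shown are available; confirm the exact names in the OptimizerConfig and scheduler documentation for your release.

from nemo.collections import llm
from nemo import lightning as nl
from megatron.core.optimizer import OptimizerConfig

# Sketch: the NeMo 1.0 "optim" section translated field by field (names assumed, verify).
optim = nl.MegatronOptimizerModule(
    config=OptimizerConfig(
        optimizer="adam",               # "fused_adam" -> "adam"
        lr=2e-4,                        # optim.lr
        weight_decay=0.01,              # optim.weight_decay
        adam_beta1=0.9,                 # optim.betas[0]
        adam_beta2=0.98,                # optim.betas[1]
        use_distributed_optimizer=True,
    ),
    lr_scheduler=nl.lr_scheduler.CosineAnnealingScheduler(
        warmup_steps=500,               # sched.warmup_steps
        constant_steps=0,               # sched.constant_steps
        min_lr=2e-5,                    # sched.min_lr
    ),
)

llm.train(..., optim=optim)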
By following these steps, you'll successfully migrate your optimizer configuration from NeMo 1.0 to NeMo 2.0. Be aware that exact parameter names and available options may differ from their NeMo 1.0 counterparts, so consult the OptimizerConfig documentation for a complete list of supported parameters.