# nemo_automodel.components.optim.scheduler

Learning rate decay and weight decay increment functions.

## Module Contents

### Classes

| Name                                                                                            | Description                             |
| ----------------------------------------------------------------------------------------------- | --------------------------------------- |
| [`OptimizerParamScheduler`](#nemo_automodel-components-optim-scheduler-OptimizerParamScheduler) | Anneals learning rate and weight decay. |

### Data

[`_T`](#nemo_automodel-components-optim-scheduler-_T)

[`logger`](#nemo_automodel-components-optim-scheduler-logger)

### API

```python
class nemo_automodel.components.optim.scheduler.OptimizerParamScheduler(
    optimizer: torch.optim.optimizer.Optimizer,
    init_lr: float,
    max_lr: float,
    min_lr: float,
    lr_warmup_steps: int,
    lr_decay_steps: int,
    lr_decay_style: str,
    start_wd: float,
    end_wd: float,
    wd_incr_steps: int,
    wd_incr_style: str,
    use_checkpoint_opt_param_scheduler: typing.Optional[bool] = True,
    override_opt_param_scheduler: typing.Optional[bool] = False,
    wsd_decay_steps: typing.Optional[int] = None,
    lr_wsd_decay_style: typing.Optional[str] = None
)
```

Anneals learning rate and weight decay.

**Parameters:**

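| Name                                 | Description                                                                                             |
| ------------------------------------ | ------------------------------------------------------------------------------------------------------- |
| `optimizer`                          | the optimizer to be used                                                                                |
| `init_lr`                            | initial learning rate                                                                                   |
| `max_lr`                             | maximum learning rate                                                                                   |
| `min_lr`                             | minimum learning rate                                                                                   |
| `lr_warmup_steps`                    | number of warmup steps                                                                                  |
| `lr_decay_steps`                     | number of decay steps                                                                                   |
| `lr_decay_style`                     | decay style for the learning rate                                                                       |
| `start_wd`                           | initial weight decay                                                                                    |
| `end_wd`                             | final weight decay                                                                                      |
| `wd_incr_steps`                      | number of weight decay increment steps                                                                  |
| `wd_incr_style`                      | weight decay increment style                                                                            |
| `use_checkpoint_opt_param_scheduler` | whether to use the checkpoint values for the optimizer param scheduler. Defaults to True.               |
| `override_opt_param_scheduler`       | whether to override the optimizer param scheduler values with the class values. Defaults to False.      |
| `wsd_decay_steps`                    | number of steps in the decay phase of the WSD (warmup-stable-decay) schedule. Defaults to None.         |
| `lr_wsd_decay_style`                 | decay style for the learning rate during the WSD decay phase. Defaults to None.                         |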

```python
nemo_automodel.components.optim.scheduler.OptimizerParamScheduler.__repr__() -> str
```

Return a string representation of the OptimizerParamScheduler.

```python
nemo_automodel.components.optim.scheduler.OptimizerParamScheduler._check_and_set(
    cls_value: nemo_automodel.components.optim.scheduler._T,
    sd_value: nemo_automodel.components.optim.scheduler._T,
    name: str
) -> nemo_automodel.components.optim.scheduler._T
```

Auxiliary function for checking the values in the checkpoint and setting them.

**Parameters:**

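| Name        | Description           |
| ----------- | --------------------- |
| `cls_value` | class value           |
| `sd_value`  | checkpoint value      |
| `name`      | name of the parameter |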
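The sketch below shows one plausible way a class value and a checkpoint value could be reconciled, based only on the descriptions of `override_opt_param_scheduler` and `use_checkpoint_opt_param_scheduler` above. The function `sketch_check_and_set`, its keyword arguments, and the error raised on a mismatch are assumptions for illustration, not the library's implementation.

```python
def sketch_check_and_set(cls_value, sd_value, name, *, use_checkpoint=True, override=False):
    """Hypothetical helper: pick between a class value and a checkpoint value."""
    if override:
        # Assumed behaviour: the class (constructor) value wins outright.
        return cls_value
    if not use_checkpoint and cls_value != sd_value:
        # Assumed behaviour: with neither flag set, the two values must agree.
        raise ValueError(
            f"{name}: class value {cls_value!r} differs from checkpoint value {sd_value!r}"
        )
    # Otherwise the checkpoint value is used.
    return sd_value


print(sketch_check_and_set(3e-4, 1e-4, "max_lr"))
print(sketch_check_and_set(3e-4, 1e-4, "max_lr", override=True))
```

Under these assumptions, the first call returns the checkpoint value and the second returns the class value.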

```python
nemo_automodel.components.optim.scheduler.OptimizerParamScheduler.get_lr(
    param_group: dict[str, typing.Any]
) -> float
```

Compute the learning rate for the given parameter group. The decay functions follow [https://openreview.net/pdf?id=BJYwwY9ll](https://openreview.net/pdf?id=BJYwwY9ll), p. 4.
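As a rough illustration of the kind of schedule the constructor parameters describe, the sketch below implements linear warmup from `init_lr` to `max_lr` followed by cosine decay to `min_lr` in plain Python. The function `sketch_lr` is hypothetical, and cosine is only one assumed decay style; the values `lr_decay_style` actually accepts are not listed here, and the real method may differ in detail. The sketch assumes `lr_decay_steps > lr_warmup_steps`.

```python
import math


def sketch_lr(step, init_lr, max_lr, min_lr, lr_warmup_steps, lr_decay_steps):
    """Hypothetical helper: linear warmup, then cosine decay to min_lr."""
    # Warmup phase: interpolate linearly from init_lr to max_lr.
    if lr_warmup_steps > 0 and step <= lr_warmup_steps:
        return init_lr + (max_lr - init_lr) * step / lr_warmup_steps
    # Past the decay horizon: hold the learning rate at its floor.
    if step > lr_decay_steps:
        return min_lr
    # Decay phase: cosine interpolation from max_lr down to min_lr.
    ratio = (step - lr_warmup_steps) / (lr_decay_steps - lr_warmup_steps)
    coeff = 0.5 * (math.cos(math.pi * ratio) + 1.0)
    return min_lr + coeff * (max_lr - min_lr)


for s in (0, 50, 100, 550, 1000, 2000):
    print(s, sketch_lr(s, 0.0, 1e-3, 1e-5, lr_warmup_steps=100, lr_decay_steps=1000))
```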

```python
nemo_automodel.components.optim.scheduler.OptimizerParamScheduler.get_wd() -> float
```

Return the current weight decay, computed by the weight decay increment functions.
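To make the roles of `start_wd`, `end_wd`, and `wd_incr_steps` concrete, here is a minimal sketch of a weight decay ramp in plain Python. The function `sketch_wd` and the style names `"constant"`, `"linear"`, and `"cosine"` are assumptions for illustration; the values `wd_incr_style` actually accepts are not documented on this page.

```python
import math


def sketch_wd(step, start_wd, end_wd, wd_incr_steps, wd_incr_style="linear"):
    """Hypothetical helper: move weight decay from start_wd to end_wd."""
    # Once the ramp is over (or if it has zero length), stay at end_wd.
    if step >= wd_incr_steps or wd_incr_style == "constant":
        return end_wd
    ratio = step / wd_incr_steps
    if wd_incr_style == "linear":
        coeff = ratio
    elif wd_incr_style == "cosine":
        # Rises from 0 at ratio=0 to 1 at ratio=1 along a half cosine.
        coeff = 0.5 * (math.cos(math.pi * (1.0 - ratio)) + 1.0)
    else:
        raise ValueError(f"unsupported style in this sketch: {wd_incr_style!r}")
    return start_wd + coeff * (end_wd - start_wd)


for s in (0, 250, 500, 1000):
    print(s, sketch_wd(s, 0.0, 0.1, 1000, "linear"), sketch_wd(s, 0.0, 0.1, 1000, "cosine"))
```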

```python
nemo_automodel.components.optim.scheduler.OptimizerParamScheduler.load_state_dict(
    state_dict: dict[str, typing.Any]
) -> None
```

Load the state dict.

**Parameters:**

| Name         | Description                |
| ------------ | -------------------------- |
| `state_dict` | the state dict to load     |

```python
nemo_automodel.components.optim.scheduler.OptimizerParamScheduler.state_dict() -> dict[str, typing.Any]
```

Return the state dict.

```python
nemo_automodel.components.optim.scheduler.OptimizerParamScheduler.step(
    increment: int
) -> None
```

Set the learning rate for all parameter groups.

**Parameters:**

| Name        | Description                  |
| ----------- | ---------------------------- |
| `increment` | number of steps to increment |
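The toy class below sketches the general pattern of a `step(increment)` call: advance an internal step counter by `increment`, recompute the learning rate, and write it into every parameter group. `ToyScheduler` is a hypothetical stand-in with linear warmup only, and its attribute names are assumptions; it is not the library class and does not need PyTorch.

```python
class ToyScheduler:
    """Hypothetical stand-in: linear warmup to max_lr, then constant."""

    def __init__(self, param_groups, max_lr, lr_warmup_steps):
        self.param_groups = param_groups  # list of dicts, like optimizer.param_groups
        self.max_lr = max_lr
        self.lr_warmup_steps = lr_warmup_steps
        self.num_steps = 0

    def get_lr(self):
        # Fraction of warmup completed, capped at 1.0.
        frac = min(self.num_steps / self.lr_warmup_steps, 1.0)
        return self.max_lr * frac

    def step(self, increment):
        # Advance the counter, then push the new value to every group.
        self.num_steps += increment
        lr = self.get_lr()
        for group in self.param_groups:
            group["lr"] = lr


groups = [{"lr": 0.0}, {"lr": 0.0}]
sched = ToyScheduler(groups, max_lr=1e-3, lr_warmup_steps=10)
sched.step(1)
sched.step(4)  # the counter is now at 5, halfway through warmup
print(sched.num_steps, groups)
```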

```python
nemo_automodel.components.optim.scheduler._T = TypeVar('_T')
```

```python
nemo_automodel.components.optim.scheduler.logger = logging.getLogger(__name__)
```