
# nemo_automodel.components.checkpoint.config

Public config surface for the checkpoint component.

`CheckpointingConfig` holds the typed parameters that drive checkpointing
behaviour and exposes `.build()` to construct the `Checkpointer`
engine (defined in `checkpointing.py`). Every field has a default,
so the recipe layer can construct it directly from the YAML `checkpoint:`
block plus the model-derived `model_repo_id`, `model_cache_dir`, and
`is_peft` arguments; there is no separate builder or adapter.

## Module Contents

### Classes

| Name                                                                                        | Description                                             |
| ------------------------------------------------------------------------------------------- | ------------------------------------------------------- |
| [`CheckpointingConfig`](#nemo_automodel-components-checkpoint-config-CheckpointingConfig)   | Configuration for checkpointing.                        |
| [`SaveConsolidatedMode`](#nemo_automodel-components-checkpoint-config-SaveConsolidatedMode) | Controls when consolidated HF safetensors are exported. |

### Functions

| Name                                                                                                        | Description                                                              |
| ----------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ |
| [`_is_geq_torch_2_9`](#nemo_automodel-components-checkpoint-config-_is_geq_torch_2_9)                       | Check if the current torch version is greater than or equal to 2.9.0.    |
| [`_normalize_save_consolidated`](#nemo_automodel-components-checkpoint-config-_normalize_save_consolidated) | Normalize legacy bools and string aliases to a consolidated export mode. |

### Data

[`__all__`](#nemo_automodel-components-checkpoint-config-__all__)

### API

```python
class nemo_automodel.components.checkpoint.config.CheckpointingConfig(
    enabled: bool = True,
    checkpoint_dir: str | pathlib.Path = 'checkpoints/',
    model_save_format: str = 'safetensors',
    model_cache_dir: str | pathlib.Path | None = None,
    model_repo_id: str | None = None,
    save_consolidated: bool | str | nemo_automodel.components.checkpoint.config.SaveConsolidatedMode = 'final',
    is_peft: bool = False,
    model_state_dict_keys: list[str] | None = None,
    is_async: bool = False,
    dequantize_base_checkpoint: bool | None = None,
    original_model_root_dir: str | None = None,
    skip_task_head_prefixes_for_base_model: list[str] | None = None,
    single_rank_consolidation: bool = False,
    staging_dir: str | None = None,
    v4_compatible: bool = False,
    diffusers_compatible: bool = False,
    best_metric_key: str = 'default'
)
```

Dataclass

Configuration for checkpointing.

Every field has a default so the recipe layer can construct this directly
from the YAML `checkpoint:` block merged with the model-derived
`model_repo_id` / `model_cache_dir` / `is_peft` values.  When
`model_cache_dir` is `None` it falls back to the HF hub cache.
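The merge described above can be sketched with a small stand-in dataclass. `CheckpointingConfigSketch` is hypothetical and copies only a few fields and defaults from the signature shown earlier; the dictionary contents are illustrative values, not taken from any real recipe, and none of the real `__post_init__` logic is reproduced.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass
class CheckpointingConfigSketch:
    """Hypothetical stand-in mirroring a few fields of CheckpointingConfig."""

    enabled: bool = True
    checkpoint_dir: str | Path = "checkpoints/"
    model_save_format: str = "safetensors"
    model_cache_dir: str | Path | None = None
    model_repo_id: str | None = None
    save_consolidated: bool | str = "final"
    is_peft: bool = False
    is_async: bool = False


# What a parsed YAML `checkpoint:` block might contain (illustrative values).
yaml_checkpoint_block = {"checkpoint_dir": "runs/exp1/ckpt", "is_async": True}

# Values the recipe layer would derive from the model (illustrative values).
model_derived = {
    "model_repo_id": "example-org/example-model",
    "model_cache_dir": None,
    "is_peft": True,
}

# Fields missing from both sources keep their dataclass defaults.
cfg = CheckpointingConfigSketch(**yaml_checkpoint_block, **model_derived)
```

Because every field has a default, the YAML block only needs to list the values that differ from those defaults.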

```python
nemo_automodel.components.checkpoint.config.CheckpointingConfig.__post_init__()
```

Resolve the cache dir, enforce PEFT constraints, and coerce the save format/mode.

```python
nemo_automodel.components.checkpoint.config.CheckpointingConfig.build(
    dp_rank: int,
    tp_rank: int,
    pp_rank: int,
    moe_mesh: torch.distributed.device_mesh.DeviceMesh | None = None
) -> nemo_automodel.components.checkpoint.checkpointing.Checkpointer
```

Build the `Checkpointer` engine for this config.

`Checkpointer` is imported lazily to avoid a circular import
(`checkpointing.py` imports `CheckpointingConfig` from this module)
and to keep the heavy DCP/safetensors deps out of module load.

**Parameters:**

`dp_rank`: Data-parallel rank.

`tp_rank`: Tensor-parallel rank.

`pp_rank`: Pipeline-parallel rank.

`moe_mesh`: Optional device mesh for MoE checkpointing.

**Returns:** `Checkpointer`

A `Checkpointer` instance.
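The lazy import that `build()` relies on is a general Python pattern, sketched below in isolation. `LazyBuildConfigSketch` is hypothetical, and the standard-library `json` module is only a lightweight stand-in for the heavy engine module; this is not the actual body of `CheckpointingConfig.build()`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class LazyBuildConfigSketch:
    """Hypothetical config; not the real CheckpointingConfig."""

    indent: int = 2

    def build(self):
        # The import statement executes when build() is called, not when this
        # module is imported, so importing the config stays cheap and cannot
        # trigger an import cycle with the engine module.
        import json  # stand-in for the heavy engine module

        return json.JSONEncoder(indent=self.indent)


engine = LazyBuildConfigSketch(indent=2).build()
```

Python caches loaded modules in `sys.modules`, so repeated `build()` calls do not reload the module.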

```python
class nemo_automodel.components.checkpoint.config.SaveConsolidatedMode
```

**Bases:** `enum.Enum`

Controls when consolidated HF safetensors are exported.

```python
nemo_automodel.components.checkpoint.config._is_geq_torch_2_9() -> bool
```

Check if the current torch version is greater than or equal to 2.9.0.
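A version gate of this kind can be sketched with the standard library alone. The helper name `is_geq_version` and its parsing rules are assumptions for illustration; the source does not show how `_is_geq_torch_2_9` is implemented, and the real function takes no arguments.

```python
from __future__ import annotations

import re


def is_geq_version(version: str, minimum: tuple[int, int, int] = (2, 9, 0)) -> bool:
    """Hypothetical helper: compare the leading X.Y[.Z] of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = (int(g) if g is not None else 0 for g in match.groups())
    # Suffixes such as "+cu128" or "a0" are ignored, so a pre-release of
    # 2.9.0 compares equal to 2.9.0 in this simplified sketch.
    return (major, minor, patch) >= minimum


checks = {v: is_geq_version(v) for v in ("2.8.1", "2.9.0+cu128", "2.10.0.dev20250901")}
```

With PyTorch installed, the string to pass would be `torch.__version__`.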

```python
nemo_automodel.components.checkpoint.config._normalize_save_consolidated(
    value: bool | str | nemo_automodel.components.checkpoint.config.SaveConsolidatedMode
) -> nemo_automodel.components.checkpoint.config.SaveConsolidatedMode
```

Normalize legacy bools and string aliases to a consolidated export mode.
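The normalization step can be sketched as follows. Everything here is a hypothetical stand-in: the source only confirms that `'final'` is an accepted value, so the other member names (`NEVER`, `EVERY`) and the mapping of legacy bools onto them are assumptions made purely for illustration.

```python
from __future__ import annotations

from enum import Enum


class ExportModeSketch(Enum):
    """Hypothetical stand-in for SaveConsolidatedMode."""

    NEVER = "never"  # assumed member, not confirmed by the source
    EVERY = "every"  # assumed member, not confirmed by the source
    FINAL = "final"  # matches the documented default value


def normalize_mode_sketch(value: bool | str | ExportModeSketch) -> ExportModeSketch:
    """Coerce a bool, string, or enum member into an ExportModeSketch."""
    if isinstance(value, ExportModeSketch):
        return value
    if isinstance(value, bool):
        # Assumed legacy mapping: True -> EVERY, False -> NEVER.
        return ExportModeSketch.EVERY if value else ExportModeSketch.NEVER
    if isinstance(value, str):
        try:
            return ExportModeSketch(value.strip().lower())
        except ValueError:
            raise ValueError(f"Unknown export mode: {value!r}") from None
    raise TypeError(f"Unsupported type: {type(value).__name__}")


mode = normalize_mode_sketch(" Final ")
```

Accepting all three input types in one function lets older configs that used plain booleans keep working alongside the enum.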

```python
nemo_automodel.components.checkpoint.config.__all__ = ['CheckpointingConfig', 'SaveConsolidatedMode']
```