# nemo_automodel.components.models.common.hf_checkpointing_mixin

HuggingFace-compatible checkpointing mixin for NeMo Automodel.

This module provides a mixin class that gives models HuggingFace-compatible
`save_pretrained()` and `from_pretrained()` methods while using NeMo's checkpointing
infrastructure internally.

Key design principle: the mixin does NOT override `state_dict()` or `load_state_dict()`.
PyTorch's Distributed Checkpoint (DCP) expects these to behave like standard `nn.Module` methods.
HF format conversions happen only in `save_pretrained()` and `from_pretrained()`, via `Checkpointer`.

The `Checkpointer` is passed explicitly (dependency injection); there is no global state.
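
The design principle above can be illustrated with a minimal, dependency-free sketch. `ToyModule`, `ToyCheckpointer`, `ToyHFMixin`, and the `model.` key prefix are hypothetical stand-ins invented for illustration; they are not NeMo Automodel or PyTorch classes. The sketch only shows the shape of the idea: `state_dict()` stays untouched, and format conversion lives in a checkpointer that the caller passes in.

```python
class ToyModule:
    """Hypothetical stand-in for nn.Module with a native state dict."""

    def __init__(self):
        self._params = {"layers.0.weight": [1.0, 2.0]}

    def state_dict(self):
        # Native keys are returned unmodified; nothing is overridden here.
        return dict(self._params)


class ToyCheckpointer:
    """Hypothetical checkpointer that owns the format conversion."""

    def __init__(self):
        self.saved = {}

    def save(self, model, path):
        # Conversion happens here, not inside state_dict().
        # The "model." prefix is an invented example of a key rename.
        self.saved[path] = {f"model.{k}": v for k, v in model.state_dict().items()}


class ToyHFMixin:
    def save_pretrained(self, save_directory, checkpointer):
        # The checkpointer is an explicit argument: no module-level singleton.
        checkpointer.save(self, save_directory)


class ToyModel(ToyHFMixin, ToyModule):
    pass


model = ToyModel()
ckpt = ToyCheckpointer()
model.save_pretrained("out_dir", checkpointer=ckpt)
```

Because the checkpointer is injected, a test can substitute a fake one (as here) without patching any global state.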

## Module Contents

### Classes

| Name                                                                                                           | Description                                                                  |
| -------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
| [`HFCheckpointingMixin`](#nemo_automodel-components-models-common-hf_checkpointing_mixin-HFCheckpointingMixin) | Mixin providing HF-compatible API using NeMo's checkpointing infrastructure. |

### Data

[`logger`](#nemo_automodel-components-models-common-hf_checkpointing_mixin-logger)

### API

```python
class nemo_automodel.components.models.common.hf_checkpointing_mixin.HFCheckpointingMixin()
```

Mixin providing HF-compatible API using NeMo's checkpointing infrastructure.

Provides `save_pretrained()` and `from_pretrained()` methods that use `Checkpointer`
for unified distributed/async support with HF format conversion.

Key design: the mixin does NOT override `state_dict()` or `load_state_dict()` because
PyTorch's DCP expects these to behave like standard `nn.Module` methods.

For `PreTrainedModel` subclasses:

* `super().from_pretrained()` handles downloads, quantization config, and meta-device initialization.
* `Checkpointer.load_base_model()` handles the actual weight loading, with format conversion.

For `nn.Module` subclasses (no parent `from_pretrained()`):

* Falls back to manual config loading plus `Checkpointer`.
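
The two loading paths described above could be sketched as follows. This is a toy illustration under stated assumptions, not the library's implementation: every class here is hypothetical, the `hasattr(super(), ...)` check is just one possible way to detect a parent `from_pretrained()`, and the `load_base_model(model, name)` signature is invented for the example.

```python
class ToyCheckpointer:
    """Hypothetical stand-in; the real load_base_model() signature may differ."""

    def load_base_model(self, model, name):
        # Pretend to load and convert the weights for `name`.
        model.weights_from = name


class ToyPreTrainedModel:
    """Hypothetical parent that already defines from_pretrained()."""

    @classmethod
    def from_pretrained(cls, name):
        model = cls()
        model.init_path = "parent"  # stands in for download/config/meta init
        return model


class ToyPlainModule:
    """Hypothetical parent without from_pretrained()."""


class ToyDispatchMixin:
    @classmethod
    def from_pretrained(cls, name, checkpointer):
        parent = super()
        if hasattr(parent, "from_pretrained"):
            # A class later in the MRO provides from_pretrained(): delegate setup.
            model = parent.from_pretrained(name)
        else:
            # No parent implementation: build the model manually.
            model = cls()
            model.init_path = "manual"
        # In both cases the injected checkpointer loads the weights.
        checkpointer.load_base_model(model, name)
        return model


class HFStyleModel(ToyDispatchMixin, ToyPreTrainedModel):
    pass


class PlainModel(ToyDispatchMixin, ToyPlainModule):
    pass


hf_model = HFStyleModel.from_pretrained("some-model", ToyCheckpointer())
plain_model = PlainModel.from_pretrained("some-model", ToyCheckpointer())
```

In this sketch the mixin must come first in the base-class list so that its `from_pretrained()` is found before the parent's.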

```python
nemo_automodel.components.models.common.hf_checkpointing_mixin.HFCheckpointingMixin.save_pretrained(
    save_directory: str,
    checkpointer: typing.Optional[nemo_automodel.components.checkpoint.checkpointing.Checkpointer] = None,
    tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizerBase] = None,
    **kwargs
) -> None
```

Save model in HF-compatible format using Checkpointer infrastructure.

Supports distributed saving, sharding, and async checkpointing.

**Parameters:**

* `save_directory`: Output path.
* `checkpointer`: Checkpointer instance. Uses `self._checkpointer` if not provided.
* `tokenizer`: Optional tokenizer to save alongside the model.
* `kwargs`: Additional arguments, including `peft_config` and
  `is_final_checkpoint`. When called directly, without recipe
  step-scheduler context, `is_final_checkpoint` defaults to `False`.
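
The documented defaults can be summarized in a small sketch. `resolve_save_options` is a hypothetical helper written for illustration, not part of the library. Raising `ValueError` when no checkpointer is available and defaulting `peft_config` to `None` are assumptions; the source does not say what happens in those cases.

```python
def resolve_save_options(model, checkpointer=None, **kwargs):
    """Hypothetical helper mirroring the documented parameter defaults."""
    # Fall back to the instance attribute when no checkpointer is passed.
    if checkpointer is None:
        checkpointer = getattr(model, "_checkpointer", None)
    if checkpointer is None:
        # Assumption: fail loudly; the real behavior is not documented here.
        raise ValueError("no checkpointer passed and model._checkpointer is unset")
    return {
        "checkpointer": checkpointer,
        # Assumption: a missing peft_config is treated as None.
        "peft_config": kwargs.get("peft_config"),
        # Direct callers without step-scheduler context get False.
        "is_final_checkpoint": kwargs.get("is_final_checkpoint", False),
    }


class ToyModel:
    """Hypothetical model carrying a previously attached checkpointer."""

    _checkpointer = "attached-checkpointer"


defaults = resolve_save_options(ToyModel())
explicit = resolve_save_options(
    ToyModel(), checkpointer="passed-checkpointer", is_final_checkpoint=True
)
```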

```python
nemo_automodel.components.models.common.hf_checkpointing_mixin.logger = logging.getLogger(__name__)
```