nemo_automodel.components.checkpoint.state_dict_adapter


Module Contents

Classes

Name | Description
StateDictAdapter | Abstract base class for state dict transformations.

API

class nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter()
Abstract

Abstract base class for state dict transformations.

This class defines the interface for converting between the native model state dict format and other model state dict formats, such as the HuggingFace format.
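As a rough illustration of the interface, the sketch below defines a local stand-in base class that mirrors the three documented method signatures, plus a hypothetical subclass. Nothing here comes from the library itself: `StateDictAdapterSketch`, `PrefixAdapter`, and the `model.` key prefix are assumptions made for the example, and plain Python lists stand in for tensors. Building `to_hf` on top of `convert_single_tensor_to_hf` is one possible arrangement, not necessarily how concrete adapters are written.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class StateDictAdapterSketch(ABC):
    """Local stand-in mirroring the documented interface (not the library class)."""

    @abstractmethod
    def to_hf(self, state_dict: dict[str, Any], **kwargs) -> dict[str, Any]: ...

    @abstractmethod
    def from_hf(
        self,
        hf_state_dict: dict[str, Any],
        device_mesh: Optional[Any] = None,
        **kwargs,
    ) -> dict[str, Any]: ...

    @abstractmethod
    def convert_single_tensor_to_hf(
        self, fqn: str, tensor: Any, **kwargs
    ) -> list[tuple[str, Any]]: ...


class PrefixAdapter(StateDictAdapterSketch):
    """Hypothetical adapter: assumes the HF format prefixes every key with 'model.'."""

    HF_PREFIX = "model."

    def convert_single_tensor_to_hf(self, fqn, tensor, **kwargs):
        # One native tensor maps to exactly one HF tensor in this toy case.
        return [(self.HF_PREFIX + fqn, tensor)]

    def to_hf(self, state_dict, **kwargs):
        hf_state_dict = {}
        for fqn, tensor in state_dict.items():
            # dict.update accepts an iterable of (key, value) pairs.
            hf_state_dict.update(self.convert_single_tensor_to_hf(fqn, tensor, **kwargs))
        return hf_state_dict

    def from_hf(self, hf_state_dict, device_mesh=None, **kwargs):
        # The device mesh is ignored in this sketch.
        return {k.removeprefix(self.HF_PREFIX): v for k, v in hf_state_dict.items()}


adapter = PrefixAdapter()
native = {"layers.0.weight": [1.0, 2.0]}
hf = adapter.to_hf(native)
roundtrip = adapter.from_hf(hf)
```

Checking that `from_hf(to_hf(x))` reproduces `x` is a simple way to test that an adapter's two directions agree.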

nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter.convert_single_tensor_to_hf(
fqn: str,
tensor: typing.Any,
**kwargs
) -> list[tuple[str, typing.Any]]
abstract

Convert a single tensor from native format to HuggingFace format.

Parameters:

fqn
str

Fully qualified name of the tensor in native format

tensor
Any

The tensor to convert

**kwargs
Defaults to {}

Additional arguments for conversion

Returns: list[tuple[str, Any]]

List of (fqn, tensor) tuples in HuggingFace format.
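The list return type suggests that one native tensor may map to several HuggingFace tensors. The sketch below shows one way that could look, assuming a hypothetical native layout in which gate and up projections are fused row-wise under a `gate_up_proj.weight` key. The key names and the split rule are illustrative assumptions, and nested lists stand in for tensors.

```python
from typing import Any

FUSED_SUFFIX = "gate_up_proj.weight"  # hypothetical native key suffix


def convert_single_tensor_to_hf(
    fqn: str, tensor: Any, **kwargs
) -> list[tuple[str, Any]]:
    """Split a fused tensor into two HF entries; pass other tensors through."""
    if fqn.endswith(FUSED_SUFFIX):
        base = fqn[: -len(FUSED_SUFFIX)]
        half = len(tensor) // 2  # assumes the two halves are stacked along dim 0
        return [
            (base + "gate_proj.weight", tensor[:half]),
            (base + "up_proj.weight", tensor[half:]),
        ]
    # Any other tensor keeps its name and value.
    return [(fqn, tensor)]


fused = [[1, 2], [3, 4], [5, 6], [7, 8]]
split = convert_single_tensor_to_hf("layers.0.mlp.gate_up_proj.weight", fused)
passthrough = convert_single_tensor_to_hf("layers.0.norm.weight", [1.0])
```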

nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter.from_hf(
hf_state_dict: dict[str, typing.Any],
device_mesh: typing.Optional[torch.distributed.device_mesh.DeviceMesh] = None,
**kwargs
) -> dict[str, typing.Any]
abstract

Obtain native model state dict from HuggingFace format.

Parameters:

hf_state_dict
dict[str, Any]

The HuggingFace format state dict

device_mesh
Optional[DeviceMesh]
Defaults to None

Optional device mesh for DTensor expert parallelism. If provided, only loads experts needed for the current rank.

Returns: dict[str, Any]

The converted native model state dict
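The "only loads experts needed for the current rank" behaviour can be pictured as a key filter applied before conversion. The following is a minimal sketch under stated assumptions: it replaces the device mesh with explicit `ep_rank` and `ep_size` integers, assumes experts are divided into contiguous equal blocks per rank, and assumes expert keys contain `.experts.<index>.`. None of these details are confirmed by the documentation above, and `filter_local_experts` is a hypothetical helper.

```python
import re
from typing import Any

# Hypothetical key pattern for per-expert parameters.
EXPERT_KEY = re.compile(r"\.experts\.(\d+)\.")


def filter_local_experts(
    hf_state_dict: dict[str, Any], num_experts: int, ep_rank: int, ep_size: int
) -> dict[str, Any]:
    """Keep non-expert tensors and only the experts owned by this rank."""
    per_rank = num_experts // ep_size  # assumes num_experts divides evenly
    start = ep_rank * per_rank
    local_ids = range(start, start + per_rank)

    kept = {}
    for key, tensor in hf_state_dict.items():
        match = EXPERT_KEY.search(key)
        if match is None or int(match.group(1)) in local_ids:
            kept[key] = tensor
    return kept


hf_sd = {
    "embed.weight": 0,
    "layers.0.mlp.experts.0.w": 1,
    "layers.0.mlp.experts.1.w": 2,
    "layers.0.mlp.experts.2.w": 3,
    "layers.0.mlp.experts.3.w": 4,
}
local = filter_local_experts(hf_sd, num_experts=4, ep_rank=1, ep_size=2)
```

In a real distributed run, the rank and group size would presumably be derived from the `device_mesh` argument rather than passed by hand.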

nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter.to_hf(
state_dict: dict[str, typing.Any],
**kwargs
) -> dict[str, typing.Any]
abstract

Convert from native model state dict to HuggingFace format.

Parameters:

state_dict
dict[str, Any]

The native model state dict

Returns: dict[str, Any]

The converted HuggingFace format state dict