nemo_automodel.components.checkpoint.state_dict_adapter

Module Contents

Classes

Name	Description
`StateDictAdapter`	Abstract base class for state dict transformations.

API

class nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter()

abstract

Abstract base class for state dict transformations.

This class defines the interface for converting between the native model state dict format and other model state dict formats, such as HuggingFace.
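The shape of the interface can be sketched with a stand-in class. This is a minimal illustration that assumes the real class is built on Python's `abc` machinery; `StateDictAdapterSketch` and `IdentityAdapter` are hypothetical names, not part of `nemo_automodel`, and the method signatures are copied from the entries on this page.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class StateDictAdapterSketch(ABC):
    """Hypothetical stand-in mirroring the three abstract methods documented here."""

    @abstractmethod
    def from_hf(
        self,
        hf_state_dict: dict[str, Any],
        device_mesh: Optional[Any] = None,  # a DeviceMesh in the real signature
        **kwargs,
    ) -> dict[str, Any]: ...

    @abstractmethod
    def to_hf(self, state_dict: dict[str, Any], **kwargs) -> dict[str, Any]: ...

    @abstractmethod
    def convert_single_tensor_to_hf(
        self, fqn: str, tensor: Any, **kwargs
    ) -> list[tuple[str, Any]]: ...


class IdentityAdapter(StateDictAdapterSketch):
    """Toy adapter for a model whose native keys already match HuggingFace."""

    def from_hf(self, hf_state_dict, device_mesh=None, **kwargs):
        return dict(hf_state_dict)

    def to_hf(self, state_dict, **kwargs):
        return dict(state_dict)

    def convert_single_tensor_to_hf(self, fqn, tensor, **kwargs):
        return [(fqn, tensor)]


# The base class cannot be instantiated while its abstract methods are unimplemented.
try:
    StateDictAdapterSketch()
    instantiated = True
except TypeError:
    instantiated = False

adapter = IdentityAdapter()
```

A concrete adapter has to implement all three methods before it can be instantiated.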

nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter.convert_single_tensor_to_hf(
    fqn: str,
    tensor: typing.Any,
    **kwargs
) -> list[tuple[str, typing.Any]]

abstract

Convert a single tensor from native format to HuggingFace format.

Parameters:

fqn

str

Fully qualified name of the tensor in native format

tensor

Any

The tensor to convert

**kwargs

Additional keyword arguments for conversion

Returns: list[tuple[str, Any]]

List of (fqn, tensor) tuples in HuggingFace format.
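Because the return type is a list, one native tensor can map to several HuggingFace entries. The sketch below illustrates that with a fused gate/up projection split into two tensors. The key names are hypothetical, the split rule is an assumption chosen for illustration, and plain Python lists stand in for tensors so the example runs without `torch`.

```python
from typing import Any


def convert_single_tensor_to_hf(fqn: str, tensor: Any, **kwargs) -> list[tuple[str, Any]]:
    """Toy conversion: split a fused gate/up projection, pass other tensors through.

    Tensors are plain lists of rows here so the sketch runs without torch.
    """
    fused_suffix = "mlp.gate_up_proj.weight"  # hypothetical native key suffix
    if fqn.endswith(fused_suffix):
        prefix = fqn[: -len(fused_suffix)]
        half = len(tensor) // 2  # assume the first half is gate, the second half is up
        return [
            (prefix + "mlp.gate_proj.weight", tensor[:half]),
            (prefix + "mlp.up_proj.weight", tensor[half:]),
        ]
    # Any other tensor keeps its name and value.
    return [(fqn, tensor)]


fused = [[1, 2], [3, 4], [5, 6], [7, 8]]  # 4 rows: 2 gate rows, then 2 up rows
split = convert_single_tensor_to_hf("layers.0.mlp.gate_up_proj.weight", fused)
passthrough = convert_single_tensor_to_hf("layers.0.input_layernorm.weight", [1.0, 1.0])
```

Here `split` holds two `(fqn, tensor)` pairs and `passthrough` holds one.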

nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter.from_hf(
    hf_state_dict: dict[str, typing.Any],
    device_mesh: typing.Optional[torch.distributed.device_mesh.DeviceMesh] = None,
    **kwargs
) -> dict[str, typing.Any]

abstract

Convert a HuggingFace format state dict to the native model state dict.

Parameters:

hf_state_dict

dict[str, Any]

The HuggingFace format state dict

device_mesh

Optional[DeviceMesh]

Defaults to None

Optional device mesh for DTensor expert parallelism. If provided, only the experts needed by the current rank are loaded.

Returns: dict[str, Any]

The converted native model state dict
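The per-rank expert filtering described for `device_mesh` can be pictured roughly as follows. This is a sketch of the idea only, not the library's implementation: `select_local_experts` is a hypothetical helper, `ep_rank` and `ep_size` are plain integers standing in for values a real adapter would read from the device mesh, the key pattern is an assumed HuggingFace naming scheme, and experts are assumed to be divided into contiguous, equal-sized blocks per rank.

```python
import re
from typing import Any

EXPERT_KEY = re.compile(r"\.experts\.(\d+)\.")  # hypothetical HF expert key pattern


def select_local_experts(
    hf_state_dict: dict[str, Any], ep_rank: int, ep_size: int, num_experts: int
) -> dict[str, Any]:
    """Keep non-expert tensors plus the experts owned by `ep_rank`.

    Assumes experts are split into contiguous, equal-sized blocks per rank.
    """
    per_rank = num_experts // ep_size
    start, end = ep_rank * per_rank, (ep_rank + 1) * per_rank
    local = {}
    for key, tensor in hf_state_dict.items():
        match = EXPERT_KEY.search(key)
        # Non-expert tensors are always kept; expert tensors only if in range.
        if match is None or start <= int(match.group(1)) < end:
            local[key] = tensor
    return local


# Four experts with integer placeholders instead of real tensors.
hf_state_dict = {f"model.layers.0.mlp.experts.{i}.down_proj.weight": i for i in range(4)}
hf_state_dict["model.embed_tokens.weight"] = "embed"

# Rank 1 of 2 owns experts 2 and 3 under the contiguous-block assumption.
local = select_local_experts(hf_state_dict, ep_rank=1, ep_size=2, num_experts=4)
```

Filtering before conversion means a rank never has to materialize expert weights it does not own.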

nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter.to_hf(
    state_dict: dict[str, typing.Any],
    **kwargs
) -> dict[str, typing.Any]

abstract

Convert from native model state dict to HuggingFace format.

Parameters:

state_dict

dict[str, Any]

The native model state dict

Returns: dict[str, Any]

The converted HuggingFace format state dict
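One plausible way to implement `to_hf` is to apply `convert_single_tensor_to_hf` to every entry and merge the results. The sketch below assumes that design and a hypothetical key layout in which native keys lack the `model.` prefix carried by HuggingFace keys; `PrefixAdapter` is an illustrative name, and plain lists stand in for tensors.

```python
from typing import Any


class PrefixAdapter:
    """Toy adapter: native keys lack the "model." prefix of HF keys (hypothetical layout)."""

    def convert_single_tensor_to_hf(self, fqn: str, tensor: Any, **kwargs) -> list[tuple[str, Any]]:
        return [("model." + fqn, tensor)]

    def to_hf(self, state_dict: dict[str, Any], **kwargs) -> dict[str, Any]:
        hf_state_dict: dict[str, Any] = {}
        for fqn, tensor in state_dict.items():
            # Each native tensor may yield one or more HF entries.
            hf_state_dict.update(self.convert_single_tensor_to_hf(fqn, tensor, **kwargs))
        return hf_state_dict

    def from_hf(self, hf_state_dict: dict[str, Any], device_mesh=None, **kwargs) -> dict[str, Any]:
        # Inverse mapping: strip the prefix that to_hf added.
        return {key.removeprefix("model."): tensor for key, tensor in hf_state_dict.items()}


adapter = PrefixAdapter()
native = {"embed_tokens.weight": [0.1, 0.2], "norm.weight": [1.0]}
hf = adapter.to_hf(native)
round_trip = adapter.from_hf(hf)
```

A round trip through `to_hf` and `from_hf` should reproduce the original native state dict, which makes a convenient check when writing a new adapter.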