nemo_automodel.components.models.step3p7.state_dict_adapter

Module Contents

Classes

Step3p7StateDictAdapter

Adapter for Step3.7 VLM checkpoints.

Functions

_mtp_layer_range

API

nemo_automodel.components.models.step3p7.state_dict_adapter._mtp_layer_range(config: Any) → tuple[int, int]

class nemo_automodel.components.models.step3p7.state_dict_adapter.Step3p7StateDictAdapter(
    config: Any,
    moe_config: nemo_automodel.components.moe.config.MoEConfig,
    backend: nemo_automodel.components.models.common.BackendConfig,
    dtype: torch.dtype = torch.bfloat16,
)

Bases: nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter

Adapter for Step3.7 VLM checkpoints.

The released checkpoint stores the Step3.5 language backbone under top-level keys such as model.layers.*, and the vision weights under vision_model.* and vit_large_projector.*. The native AutoModel VLM instead keeps the language backbone under model.language_model, so that pipeline parallelism (PP) can split it as a nested text module, and it reuses the Step3p5 expert-weight adapter for expert-parallel (EP) sharding.
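
The key layout described above can be illustrated with a minimal, standalone sketch. It is not the adapter's implementation: the function names are hypothetical, the text mapping is assumed to be a plain prefix swap between model. and model.language_model., and vision keys are passed through unchanged because the native names for them are not documented here.

```python
# Hypothetical illustration of the key layout; not the library's code.
HF_TEXT_PREFIX = "model."                      # checkpoint side, e.g. model.layers.*
NATIVE_TEXT_PREFIX = "model.language_model."   # native side, per the description above
VISION_PREFIXES = ("vision_model.", "vit_large_projector.")


def is_text_key_sketch(hf_key: str) -> bool:
    """Treat a checkpoint key as a text key if it sits under model.* (assumption)."""
    return hf_key.startswith(HF_TEXT_PREFIX) and not hf_key.startswith(VISION_PREFIXES)


def to_native_text_key_sketch(hf_key: str) -> str:
    """Nest a checkpoint text key under model.language_model (assumed prefix swap)."""
    return NATIVE_TEXT_PREFIX + hf_key[len(HF_TEXT_PREFIX):]


def to_hf_text_key_sketch(native_key: str) -> str:
    """Inverse of to_native_text_key_sketch."""
    return HF_TEXT_PREFIX + native_key[len(NATIVE_TEXT_PREFIX):]


def remap_from_hf_sketch(hf_keys: list[str]) -> dict[str, str]:
    """Map each checkpoint key to a native key; non-text keys pass through unchanged."""
    return {
        key: to_native_text_key_sketch(key) if is_text_key_sketch(key) else key
        for key in hf_keys
    }
```

Under these assumptions, a key such as model.layers.0.mlp.gate.weight would land at model.language_model.layers.0.mlp.gate.weight, and applying the inverse restores the original name.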

Initialization

static _is_text_key(key: str) → bool
static _to_text_hf_key(key: str) → str
static _to_native_text_key(key: str) → str
static _map_non_text_from_hf(key: str) → str | None
static _map_non_text_to_hf(key: str) → str
_map_mtp_from_hf(key: str) → str | None
_map_mtp_to_hf(key: str) → str | None
from_hf(
    hf_state_dict: dict[str, Any],
    device_mesh: torch.distributed.device_mesh.DeviceMesh | None = None,
    **kwargs: Any,
) → dict[str, Any]
to_hf(
    state_dict: dict[str, Any],
    exclude_key_regex: str | None = None,
    quantization: bool = False,
    **kwargs: Any,
) → dict[str, Any]
convert_single_tensor_to_hf(
    fqn: str,
    tensor: Any,
    **kwargs: Any,
) → list[tuple[str, Any]]