nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter

Module Contents

Classes

Name	Description
`Qwen2_5OmniStateDictAdapter`	HF Qwen2.5-Omni checkpoint adapter (thinker-only path).

Data

_DROP_PREFIXES

_DROP_THINKER_KEY_SUBSTRINGS

_THINKER_PREFIX

logger

API

class nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter.Qwen2_5OmniStateDictAdapter(
    config: typing.Any,
    backend: nemo_automodel.components.models.common.BackendConfig | None = None,
    dtype: torch.dtype = torch.float32
)

Bases: StateDictAdapter

HF Qwen2.5-Omni checkpoint adapter (thinker-only path).

HF `Qwen/Qwen2.5-Omni-*` checkpoints store keys under three top-level prefixes: `thinker.*`, `talker.*`, and `token2wav.*`. For ASR/text fine-tuning we only train the Thinker, so this adapter:

- On `from_hf`: drops `talker.*` and `token2wav.*` keys and strips the `thinker.` prefix so keys align with our NeMo Thinker class.
- On `to_hf`: re-adds the `thinker.` prefix so the saved checkpoint can be merged back with the original talker/token2wav shards.

Qwen2.5-Omni-3B is dense (no MoE), so no expert-grouping logic is needed; this is a thin key-renaming adapter.
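
The key renaming described above can be sketched with plain dictionaries. This is a minimal illustration, not the adapter's actual code: the helper names `strip_thinker_prefix` and `add_thinker_prefix` and the example key `thinker.model.layers.0.weight` are hypothetical, and the constants are copied from the module-level data documented on this page.

```python
from typing import Any

# Values as documented for this module.
_DROP_PREFIXES = ("talker.", "token2wav.")
_THINKER_PREFIX = "thinker."


def strip_thinker_prefix(hf_state_dict: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper mirroring the from_hf direction."""
    out: dict[str, Any] = {}
    for key, value in hf_state_dict.items():
        # Skip anything that belongs to the talker or token2wav sub-models.
        if key.startswith(_DROP_PREFIXES):
            continue
        # Strip the leading "thinker." so the key matches the thinker-only model.
        if key.startswith(_THINKER_PREFIX):
            key = key[len(_THINKER_PREFIX):]
        out[key] = value
    return out


def add_thinker_prefix(state_dict: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper mirroring the to_hf direction."""
    return {_THINKER_PREFIX + key: value for key, value in state_dict.items()}


# Integers stand in for tensors; the key names are illustrative only.
hf_keys = {
    "thinker.model.layers.0.weight": 1,
    "talker.model.layers.0.weight": 2,
    "token2wav.decoder.weight": 3,
}
native = strip_thinker_prefix(hf_keys)
restored = add_thinker_prefix(native)
```

In this sketch the round trip restores only the `thinker.*` entry; the talker and token2wav entries would have to come from the original shards, which is consistent with the merge step mentioned above.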

backend = backend or BackendConfig()

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter.Qwen2_5OmniStateDictAdapter.convert_single_tensor_to_hf(
    fqn: str,
    tensor: typing.Any,
    kwargs = {}
) -> list[tuple[str, typing.Any]]

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter.Qwen2_5OmniStateDictAdapter.from_hf(
    hf_state_dict: dict[str, typing.Any],
    device_mesh: typing.Optional[typing.Any] = None,
    kwargs = {}
) -> dict[str, typing.Any]

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter.Qwen2_5OmniStateDictAdapter.to_hf(
    state_dict: dict[str, typing.Any],
    exclude_key_regex: typing.Optional[str] = None,
    quantization: bool = False,
    kwargs = {}
) -> dict[str, typing.Any]
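
The signatures above suggest that `to_hf` can be built from per-tensor conversion plus an optional key filter. The sketch below shows one way that composition could look. It rests on several assumptions that this page does not confirm: that each tensor maps to exactly one HF key, that `exclude_key_regex` is matched against the converted (HF-side) key, and that matching is anchored at the start of the key. The names `convert_one` and `to_hf_sketch` are hypothetical.

```python
import re
from typing import Any, Optional

_THINKER_PREFIX = "thinker."


def convert_one(fqn: str, tensor: Any) -> list[tuple[str, Any]]:
    # Assumption: a one-to-one rename that re-adds the "thinker." prefix.
    # Returning a list keeps the shape of convert_single_tensor_to_hf.
    return [(_THINKER_PREFIX + fqn, tensor)]


def to_hf_sketch(
    state_dict: dict[str, Any],
    exclude_key_regex: Optional[str] = None,
) -> dict[str, Any]:
    pattern = re.compile(exclude_key_regex) if exclude_key_regex else None
    out: dict[str, Any] = {}
    for fqn, tensor in state_dict.items():
        for hf_key, hf_tensor in convert_one(fqn, tensor):
            # Assumption: exclusion is applied to the HF-side key.
            if pattern is not None and pattern.match(hf_key):
                continue
            out[hf_key] = hf_tensor
    return out


# Integers stand in for tensors; the key names are illustrative only.
state = {"model.embed_tokens.weight": 1, "lm_head.weight": 2}
everything = to_hf_sketch(state)
without_head = to_hf_sketch(state, exclude_key_regex=r"thinker\.lm_head\..*")
```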

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter._DROP_PREFIXES = ('talker.', 'token2wav.')

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter._DROP_THINKER_KEY_SUBSTRINGS = ('audio_tower.audio_bos_eos_token',)
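
The name `_DROP_THINKER_KEY_SUBSTRINGS` suggests a substring filter applied to thinker keys, in contrast to the prefix match used for `_DROP_PREFIXES`. A minimal sketch under that assumption follows; the helper `should_drop_thinker_key` and the example keys are hypothetical, and this page does not state where in the adapter the filter is applied.

```python
# Value as documented for this module.
_DROP_THINKER_KEY_SUBSTRINGS = ("audio_tower.audio_bos_eos_token",)


def should_drop_thinker_key(key: str) -> bool:
    """Hypothetical helper: True if the key contains any listed substring."""
    return any(substring in key for substring in _DROP_THINKER_KEY_SUBSTRINGS)


# Illustrative key names; only the substring itself comes from this page.
dropped = should_drop_thinker_key("audio_tower.audio_bos_eos_token.weight")
kept = not should_drop_thinker_key("audio_tower.conv1.weight")
```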

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter._THINKER_PREFIX = 'thinker.'

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter.logger = logging.getLogger(__name__)