nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter

Module Contents

Classes

Qwen2_5OmniStateDictAdapter

HF Qwen2.5-Omni checkpoint adapter (thinker-only path).

Data

API

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter.logger

'getLogger(…)'

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter._THINKER_PREFIX

'thinker.'

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter._DROP_PREFIXES

('talker.', 'token2wav.')

nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter._DROP_THINKER_KEY_SUBSTRINGS

('audio_tower.audio_bos_eos_token',)

class nemo_automodel.components.models.qwen2_5_omni.state_dict_adapter.Qwen2_5OmniStateDictAdapter(
config: Any,
backend: nemo_automodel.components.models.common.BackendConfig | None = None,
dtype: torch.dtype = torch.float32,
)

Bases: nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter

HF Qwen2.5-Omni checkpoint adapter (thinker-only path).

HF Qwen/Qwen2.5-Omni-* checkpoints store keys under three top-level prefixes: thinker.*, talker.*, token2wav.*. For ASR/text fine-tuning we only train the Thinker, so this adapter:

  • on from_hf: drops talker.* and token2wav.* keys and strips the thinker. prefix so keys align with our NeMo Thinker class.

  • on to_hf: re-adds the thinker. prefix so the saved checkpoint can be merged back with the original talker/token2wav shards.

Qwen2.5-Omni-3B is dense (no MoE), so no expert grouping logic is needed; this is a thin key-renaming adapter.
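
The key handling described above can be sketched in plain Python. This is a minimal illustration, not the adapter's actual implementation: `sketch_from_hf` and `sketch_to_hf` are hypothetical names, the example key names and string placeholder values are made up, and applying `_DROP_THINKER_KEY_SUBSTRINGS` inside the `from_hf` direction is an assumption based only on the constant's name.

```python
# Values copied from the module-level constants documented on this page.
_THINKER_PREFIX = "thinker."
_DROP_PREFIXES = ("talker.", "token2wav.")
_DROP_THINKER_KEY_SUBSTRINGS = ("audio_tower.audio_bos_eos_token",)


def sketch_from_hf(hf_state_dict):
    """Hypothetical stand-in for from_hf: drop unused keys, strip 'thinker.'."""
    out = {}
    for key, value in hf_state_dict.items():
        # Talker and token2wav weights are not trained, so skip them.
        if key.startswith(_DROP_PREFIXES):
            continue
        # Assumption: keys containing these substrings are dropped here too.
        if any(sub in key for sub in _DROP_THINKER_KEY_SUBSTRINGS):
            continue
        if key.startswith(_THINKER_PREFIX):
            key = key[len(_THINKER_PREFIX):]
        out[key] = value
    return out


def sketch_to_hf(state_dict):
    """Hypothetical stand-in for to_hf: re-add the 'thinker.' prefix."""
    return {_THINKER_PREFIX + key: value for key, value in state_dict.items()}


# Illustrative keys only; real checkpoints hold tensors, not strings.
hf_checkpoint = {
    "thinker.model.layers.0.self_attn.q_proj.weight": "W_q",
    "thinker.audio_tower.audio_bos_eos_token.weight": "unused",
    "talker.model.embed_tokens.weight": "T",
    "token2wav.code2wav_dit_model.weight": "V",
}
native = sketch_from_hf(hf_checkpoint)
restored = sketch_to_hf(native)
```

In this sketch only the first key survives `sketch_from_hf`, and `sketch_to_hf` restores its original `thinker.`-prefixed name, which is what lets the saved shard sit alongside the untouched talker/token2wav shards.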

Initialization

to_hf(
state_dict: dict[str, Any],
exclude_key_regex: Optional[str] = None,
quantization: bool = False,
**kwargs,
) → dict[str, Any]

from_hf(
hf_state_dict: dict[str, Any],
device_mesh: Optional[Any] = None,
**kwargs,
) → dict[str, Any]

convert_single_tensor_to_hf(
fqn: str,
tensor: Any,
**kwargs,
) → list[tuple[str, Any]]
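
The signatures suggest that `to_hf` can be built from per-tensor conversion, with `convert_single_tensor_to_hf` returning a list of `(key, tensor)` pairs for each input tensor. The following is a rough sketch under that assumption. The function names are hypothetical, and the interpretation of `exclude_key_regex` (skip any output key matched by `re.match`) is a guess rather than documented behavior. The `quantization` flag is not modeled.

```python
import re
from typing import Any, Optional

THINKER_PREFIX = "thinker."  # same value as _THINKER_PREFIX above


def sketch_convert_single_tensor_to_hf(
    fqn: str,
    tensor: Any,
    exclude_key_regex: Optional[str] = None,
) -> list[tuple[str, Any]]:
    """Map one native key to zero or one HF key (dense model, no fan-out)."""
    hf_key = THINKER_PREFIX + fqn
    # Assumed semantics: drop the entry if the HF key matches the regex.
    if exclude_key_regex and re.match(exclude_key_regex, hf_key):
        return []
    return [(hf_key, tensor)]


def sketch_to_hf_per_tensor(
    state_dict: dict[str, Any],
    exclude_key_regex: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the HF state dict by converting each tensor independently."""
    hf_state_dict: dict[str, Any] = {}
    for fqn, tensor in state_dict.items():
        pairs = sketch_convert_single_tensor_to_hf(fqn, tensor, exclude_key_regex)
        hf_state_dict.update(pairs)
    return hf_state_dict
```

Returning a list rather than a single pair keeps the per-tensor interface uniform with adapters that split or merge tensors, even though a pure renaming adapter would only ever need one entry per input.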