nemo_automodel.components.models.llava_onevision.state_dict_adapter
State dict adapter for LLaVA-OneVision-1.5.
Module Contents
Classes
LlavaOneVisionStateDictAdapter | Converts between HF LLaVA-OneVision checkpoints and NeMo format.
API
- class nemo_automodel.components.models.llava_onevision.state_dict_adapter.LlavaOneVisionStateDictAdapter(config: Any, **kwargs)
Bases: nemo_automodel.components.checkpoint.state_dict_adapter.StateDictAdapter
Converts between HF LLaVA-OneVision checkpoints and NeMo format.
HF checkpoint key patterns:
  model.visual.{…} -> Rice ViT weights
  model.language_model.{…} -> Qwen3 LLM weights
  lm_head.{…} -> Language model head
NeMo model key patterns:
  model.vision_tower.{…} -> Rice ViT weights
  model.language_model.{…} -> Qwen3 LLM weights
  lm_head.{…} -> Language model head
The adapter primarily handles key renaming to bridge HF and NeMo formats.
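Going by the key patterns listed above, only the vision-tower prefix differs between the two formats. The following is a minimal sketch of that prefix swap, not the adapter's actual implementation: the names `HF_TO_NEMO_PREFIXES`, `rename_prefix`, `hf_to_nemo_key`, and `nemo_to_hf_key` are hypothetical, and the real adapter may handle additional cases.

```python
# Hypothetical prefix table derived from the key patterns documented above.
# Only the vision-tower prefix differs; other keys are assumed to pass through.
HF_TO_NEMO_PREFIXES = {"model.visual.": "model.vision_tower."}


def rename_prefix(key: str, mapping: dict[str, str]) -> str:
    """Swap the first matching prefix; return the key unchanged if none match."""
    for old, new in mapping.items():
        if key.startswith(old):
            return new + key[len(old):]
    return key


def hf_to_nemo_key(hf_key: str) -> str:
    """Illustrative HF -> NeMo key conversion."""
    return rename_prefix(hf_key, HF_TO_NEMO_PREFIXES)


def nemo_to_hf_key(nemo_key: str) -> str:
    """Illustrative NeMo -> HF key conversion (inverse of the table above)."""
    inverse = {new: old for old, new in HF_TO_NEMO_PREFIXES.items()}
    return rename_prefix(nemo_key, inverse)
```

Because the mapping is a one-to-one prefix swap, converting a key to NeMo format and back should return the original key. The suffixes used when trying this out (for example `blocks.0.attn.qkv.weight`) are made up for illustration.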
Initialization
- to_hf(state_dict: dict[str, Any], exclude_key_regex: Optional[str] = None, quantization: bool = False, **kwargs)
Rename NeMo keys to HF keys. Tensors are passed through unchanged.
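A rough sketch of what a rename-only `to_hf` could look like is given here. It is not the library's code: `to_hf_sketch` is a hypothetical stand-in, and the handling of `exclude_key_regex` (dropping any renamed key that the pattern matches from its start) is an assumption, since this page does not describe that parameter. The `quantization` flag is omitted because its behavior is not documented here.

```python
import re
from typing import Any, Optional


def to_hf_sketch(
    state_dict: dict[str, Any],
    exclude_key_regex: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical rename-only conversion from NeMo keys to HF keys."""
    pattern = re.compile(exclude_key_regex) if exclude_key_regex else None
    nemo_prefix, hf_prefix = "model.vision_tower.", "model.visual."

    hf_state_dict: dict[str, Any] = {}
    for nemo_key, value in state_dict.items():
        # Only the vision-tower prefix is renamed; other keys are kept as-is.
        if nemo_key.startswith(nemo_prefix):
            hf_key = hf_prefix + nemo_key[len(nemo_prefix):]
        else:
            hf_key = nemo_key
        # ASSUMPTION: keys matching the regex (anchored at the start) are skipped.
        if pattern is not None and pattern.match(hf_key):
            continue
        # Values are stored by reference, so tensors are not copied or modified.
        hf_state_dict[hf_key] = value
    return hf_state_dict
```

Under these assumptions, calling `to_hf_sketch(sd, exclude_key_regex=r"lm_head\.")` would return the renamed dictionary without the language model head entries.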
- from_hf(hf_state_dict: dict[str, Any], **kwargs)
Rename HF keys to NeMo keys.
This adapter only performs key renaming. Tensor shapes are expected to match because the NeMo model uses the same architecture as the HF implementation.
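The rename-only behavior described above can be illustrated with a small sketch. This is an assumption-laden stand-in rather than the adapter itself: `from_hf_sketch` is a hypothetical name, the example keys are invented, and plain Python lists take the place of tensors so the snippet runs without any framework installed.

```python
from typing import Any


def from_hf_sketch(hf_state_dict: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical rename-only conversion from HF keys to NeMo keys."""
    hf_prefix, nemo_prefix = "model.visual.", "model.vision_tower."
    nemo_state_dict: dict[str, Any] = {}
    for hf_key, value in hf_state_dict.items():
        if hf_key.startswith(hf_prefix):
            nemo_key = nemo_prefix + hf_key[len(hf_prefix):]
        else:
            nemo_key = hf_key
        # The value object itself is reused, so its shape cannot change.
        nemo_state_dict[nemo_key] = value
    return nemo_state_dict


# Stand-in "tensors": lists are used here purely for illustration.
vision_weight = [0.1, 0.2]
head_weight = [0.3]
hf_checkpoint = {
    "model.visual.patch_embed.weight": vision_weight,
    "lm_head.weight": head_weight,
}
nemo_checkpoint = from_hf_sketch(hf_checkpoint)
```

In this sketch each value in `nemo_checkpoint` is the same object as in `hf_checkpoint`, which is why no shape mismatch can be introduced by the conversion step alone.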
- _hf_to_nemo_key(hf_key: str) -> str
Convert HF checkpoint key to NeMo key.
- _nemo_to_hf_key(nemo_key: str) -> str
Convert NeMo key to HF checkpoint key.