nemo_automodel.components.models.llama.state_dict_adapter#
State dict adapter for Llama model.
The model uses separate q_proj / k_proj / v_proj / gate_proj / up_proj that match HuggingFace key names exactly, so the adapter is a passthrough (only tied-weight handling is applied in from_hf).
Module Contents#
Classes#
LlamaStateDictAdapter: State dict adapter for Llama models.
Data#
API#
- nemo_automodel.components.models.llama.state_dict_adapter.logger#
'getLogger(...)'
- class nemo_automodel.components.models.llama.state_dict_adapter.LlamaStateDictAdapter(config: transformers.LlamaConfig)#
State dict adapter for Llama models.
Uses separate projections that match HuggingFace key names exactly, so from_hf / to_hf are simple passthroughs (only tied-weight handling in from_hf).
.. rubric:: Example

from transformers import LlamaConfig

config = LlamaConfig.from_pretrained("meta-llama/Llama-3-8B")
adapter = LlamaStateDictAdapter(config)

# Convert HF checkpoint to custom format
custom_state_dict = adapter.from_hf(hf_state_dict)

# Convert custom checkpoint back to HF format
hf_state_dict = adapter.to_hf(custom_state_dict)
Initialization
Initialize adapter with Llama config.
- from_hf(hf_state_dict: dict[str, Any], **kwargs)#
- to_hf(state_dict: dict[str, Any], exclude_key_regex: Optional[str] = None, **kwargs)#
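The passthrough behavior described above can be sketched as follows. This is a minimal illustration, not the actual NeMo Automodel implementation: the tied-weight handling shown (materializing `lm_head.weight` from `model.embed_tokens.weight` when embeddings are tied and the checkpoint omits the head) and the `exclude_key_regex` filtering semantics are assumptions made for the sketch.

```python
import re
from typing import Any, Optional


class PassthroughAdapterSketch:
    """Minimal sketch of a passthrough state-dict adapter (assumed behavior)."""

    def __init__(self, tie_word_embeddings: bool = False):
        self.tie_word_embeddings = tie_word_embeddings

    def from_hf(self, hf_state_dict: dict[str, Any], **kwargs) -> dict[str, Any]:
        # Keys already match HuggingFace names, so copy through unchanged.
        state_dict = dict(hf_state_dict)
        # Hypothetical tied-weight handling: if embeddings are tied and the
        # checkpoint omits lm_head, duplicate the embedding table into it.
        if self.tie_word_embeddings and "lm_head.weight" not in state_dict:
            state_dict["lm_head.weight"] = state_dict["model.embed_tokens.weight"]
        return state_dict

    def to_hf(
        self,
        state_dict: dict[str, Any],
        exclude_key_regex: Optional[str] = None,
        **kwargs,
    ) -> dict[str, Any]:
        # Passthrough, with optional filtering of keys by regex.
        if exclude_key_regex is None:
            return dict(state_dict)
        pattern = re.compile(exclude_key_regex)
        return {k: v for k, v in state_dict.items() if not pattern.match(k)}
```

A round trip through `from_hf` and `to_hf` therefore leaves matching keys untouched, which is what makes the adapter effectively a no-op for Llama checkpoints.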