nemo_automodel.components.models.llama.state_dict_adapter#

State dict adapter for Llama model.

The model uses separate q_proj / k_proj / v_proj / gate_proj / up_proj projections that match the HuggingFace key names exactly, so the adapter is a passthrough (only tied-weight handling is applied in from_hf).

Module Contents#

Classes#

LlamaStateDictAdapter

State dict adapter for Llama models.

Data#

API#

nemo_automodel.components.models.llama.state_dict_adapter.logger#

'getLogger(…)'

class nemo_automodel.components.models.llama.state_dict_adapter.LlamaStateDictAdapter(config: transformers.LlamaConfig)#

State dict adapter for Llama models.

Uses separate projections that match HuggingFace key names exactly, so from_hf / to_hf are simple passthroughs (only tied-weight handling in from_hf).

.. rubric:: Example

.. code-block:: python

    from transformers import LlamaConfig

    config = LlamaConfig.from_pretrained("meta-llama/Llama-3-8B")
    adapter = LlamaStateDictAdapter(config)

    # Convert HF checkpoint to custom format
    custom_state_dict = adapter.from_hf(hf_state_dict)

    # Convert custom checkpoint back to HF format
    hf_state_dict = adapter.to_hf(custom_state_dict)

Initialization

Initialize adapter with Llama config.

from_hf(
    hf_state_dict: dict[str, Any],
    **kwargs,
) -> dict[str, Any]#

Convert a HuggingFace state dict to the custom format. A passthrough for Llama; only tied-weight handling is applied.

to_hf(
    state_dict: dict[str, Any],
    exclude_key_regex: Optional[str] = None,
    **kwargs,
) -> dict[str, Any]#

Convert a custom-format state dict back to HuggingFace key names. A passthrough for Llama; keys matching exclude_key_regex are dropped from the output.
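Since the adapter is described as a passthrough with only tied-weight handling, its behavior can be illustrated with a minimal sketch. This is a hypothetical stand-in (the class name `PassthroughAdapter`, the `tie_word_embeddings` flag, and the specific key names `lm_head.weight` / `model.embed_tokens.weight` are assumptions for illustration, not the actual implementation):

```python
import re
from typing import Any, Optional


class PassthroughAdapter:
    """Hypothetical sketch of a passthrough state-dict adapter."""

    def __init__(self, tie_word_embeddings: bool = True):
        self.tie_word_embeddings = tie_word_embeddings

    def from_hf(self, hf_state_dict: dict[str, Any], **kwargs) -> dict[str, Any]:
        state_dict = dict(hf_state_dict)
        # Tied-weight handling (assumed): when the output head shares weights
        # with the input embeddings, HF checkpoints may omit lm_head.weight,
        # so restore it from the embedding weight.
        if self.tie_word_embeddings and "lm_head.weight" not in state_dict:
            emb = state_dict.get("model.embed_tokens.weight")
            if emb is not None:
                state_dict["lm_head.weight"] = emb
        return state_dict

    def to_hf(
        self,
        state_dict: dict[str, Any],
        exclude_key_regex: Optional[str] = None,
        **kwargs,
    ) -> dict[str, Any]:
        out = dict(state_dict)
        # Optionally drop keys matching the exclusion pattern.
        if exclude_key_regex:
            out = {k: v for k, v in out.items() if not re.match(exclude_key_regex, k)}
        return out
```

Usage mirrors the example above: `from_hf` fills in the tied head weight if it is missing, and `to_hf` returns the dict unchanged unless an exclusion regex is given.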