nemo_automodel.components.models.bagel.hf_backbone_loader#
Helpers for initializing BAGEL pretraining runs from HF backbones.
Module Contents#
Functions#
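| _load_siglip_vision_config | Load a SigLIP vision config from a vision-only or full SigLIP HF folder. |
| _normalize_wrapped_param_name | Remove wrapper path fragments that are not part of logical parameter FQNs. |
| _reset_qwen_qk_norms_for_hf_backbone | Reset BAGEL-added Q/K norm weights missing from vanilla Qwen checkpoints. |
| _resolve_hf_weight_path | Resolve a local path or download a HF snapshot containing model weights. |
| _load_hf_state_dict | Load a HF safetensors/bin checkpoint as a full CPU state dict. |
| _copy_qwen_mot_weights_from_und | Copy UND Qwen weights into *_moe_gen siblings after sharding/wrapping. |
| _load_qwen_backbone_into_bagel | Load vanilla Qwen weights into BAGEL’s language model after AM sharding. |
| _load_siglip_backbone_into_bagel | Load SigLIP weights into BAGEL’s packed-NaViT vision model after AM sharding. |
| initialize_bagel_non_backbone_weights | Initialize BAGEL-owned modules not loaded from Qwen/SigLIP checkpoints. |
| load_bagel_hf_backbone_weights | Load Qwen/SigLIP HF backbone weights into an already-built BAGEL model. |
| build_bagel_from_hf_backbones | Build BAGEL from upstream Qwen/SigLIP backbone configs. |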
Data#
API#
- nemo_automodel.components.models.bagel.hf_backbone_loader.logger#
‘getLogger(…)’
- nemo_automodel.components.models.bagel.hf_backbone_loader._load_siglip_vision_config(vit_path: str)#
Load a SigLIP vision config from a vision-only or full SigLIP HF folder.
- nemo_automodel.components.models.bagel.hf_backbone_loader._normalize_wrapped_param_name(name: str) str#
Remove wrapper path fragments that are not part of logical parameter FQNs.
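To illustrate what this kind of normalization involves, here is a minimal sketch. The fragment list is an assumption based on common PyTorch wrappers (FSDP, activation checkpointing, `torch.compile`); the actual helper may strip a different set, and the function name below is hypothetical.

```python
# Hypothetical wrapper fragments; the real helper may strip a different set.
_WRAPPER_FRAGMENTS = (
    "_fsdp_wrapped_module",
    "_checkpoint_wrapped_module",
    "_orig_mod",
)


def normalize_wrapped_param_name(name: str) -> str:
    """Drop wrapper-only path components from a dotted parameter name."""
    parts = [p for p in name.split(".") if p not in _WRAPPER_FRAGMENTS]
    return ".".join(parts)


example = (
    "language_model._orig_mod.model.layers.0."
    "_checkpoint_wrapped_module.self_attn.q_proj.weight"
)
print(normalize_wrapped_param_name(example))
# language_model.model.layers.0.self_attn.q_proj.weight
```

Matching on whole dotted components, rather than substrings, avoids accidentally rewriting a legitimate parameter name that happens to contain one of the fragments.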
- nemo_automodel.components.models.bagel.hf_backbone_loader._reset_qwen_qk_norms_for_hf_backbone(language_model) int#
Reset BAGEL-added Q/K norm weights missing from vanilla Qwen checkpoints.
- nemo_automodel.components.models.bagel.hf_backbone_loader._resolve_hf_weight_path(model_path: str) str#
Resolve a local path or download a HF snapshot containing model weights.
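A local-path-or-download resolver of this kind can be sketched as follows. This is an illustration rather than the module's implementation: the function name and the file patterns are assumptions, and only `huggingface_hub.snapshot_download` is a real library call.

```python
import os

# Assumed file patterns; the real helper may filter differently.
_WEIGHT_PATTERNS = ["*.safetensors", "*.bin", "*.json"]


def resolve_hf_weight_path(model_path: str) -> str:
    """Return a local directory that holds the checkpoint files."""
    if os.path.isdir(model_path):
        # Already on disk: use the directory as-is.
        return model_path
    # Otherwise treat the string as a Hub repo id and fetch a snapshot.
    # Imported lazily so that purely local use needs no extra dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=model_path, allow_patterns=_WEIGHT_PATTERNS)
```

Restricting the download to weight and config files keeps the snapshot small when the repository also ships unrelated assets.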
- nemo_automodel.components.models.bagel.hf_backbone_loader._load_hf_state_dict(model_path: str) dict[str, torch.Tensor]#
Load a HF safetensors/bin checkpoint as a full CPU state dict.
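The file-discovery part of such a loader might look like the sketch below. It relies on the standard Hugging Face layout, where a sharded checkpoint ships a `model.safetensors.index.json` whose `weight_map` maps parameter names to shard files. The function name is hypothetical, and the real helper may support additional layouts.

```python
import json
import os


def list_checkpoint_files(model_dir: str) -> list[str]:
    """Return the weight files that together form the full state dict."""
    index_path = os.path.join(model_dir, "model.safetensors.index.json")
    if os.path.isfile(index_path):
        with open(index_path, encoding="utf-8") as f:
            weight_map = json.load(f)["weight_map"]
        # Many parameters share one shard, so de-duplicate the file names.
        return sorted({os.path.join(model_dir, name) for name in weight_map.values()})
    # Unsharded checkpoint: prefer safetensors, fall back to a .bin file.
    for name in ("model.safetensors", "pytorch_model.bin"):
        candidate = os.path.join(model_dir, name)
        if os.path.isfile(candidate):
            return [candidate]
    raise FileNotFoundError(f"no weight files found in {model_dir}")
```

Each returned file would then be read onto the CPU, for example with `safetensors.torch.load_file(path, device="cpu")` or `torch.load(path, map_location="cpu")`, and the resulting dictionaries merged into one.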
- nemo_automodel.components.models.bagel.hf_backbone_loader._copy_qwen_mot_weights_from_und(language_model) int#
Copy UND Qwen weights into *_moe_gen siblings after sharding/wrapping.
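The name mapping behind this copy can be sketched on a plain dictionary. This is an illustration only: the documented function takes the language model module and works on its already sharded parameters, whereas the sketch below uses a hypothetical function over a state-dict-like mapping with Python lists standing in for tensors.

```python
import copy


def copy_und_into_moe_gen(state: dict) -> int:
    """Copy each understanding-branch entry into its `_moe_gen` sibling.

    Returns the number of entries that were copied.
    """
    copied = 0
    for name in list(state):
        if "_moe_gen" not in name:
            continue
        # The sibling name is the same path without the `_moe_gen` suffix.
        source = name.replace("_moe_gen", "")
        if source in state:
            state[name] = copy.deepcopy(state[source])
            copied += 1
    return copied


state = {
    "layers.0.self_attn.q_proj.weight": [1.0, 2.0],
    "layers.0.self_attn.q_proj_moe_gen.weight": [0.0, 0.0],
    "layers.0.mlp.down_proj.weight": [3.0],
}
print(copy_und_into_moe_gen(state))  # 1
```

Returning a count mirrors the `int` return type in the signature above and makes it easy to log or assert how many generation-branch entries were initialized.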
- nemo_automodel.components.models.bagel.hf_backbone_loader._load_qwen_backbone_into_bagel(model, llm_path: str, *, copy_init_moe: bool)#
Load vanilla Qwen weights into BAGEL’s language model after AM sharding.
- nemo_automodel.components.models.bagel.hf_backbone_loader._load_siglip_backbone_into_bagel(model, vit_path: str) None#
Load SigLIP weights into BAGEL’s packed-NaViT vision model after AM sharding.
- nemo_automodel.components.models.bagel.hf_backbone_loader.initialize_bagel_non_backbone_weights(model: torch.nn.Module) None#
Initialize BAGEL-owned modules not loaded from Qwen/SigLIP checkpoints.
- nemo_automodel.components.models.bagel.hf_backbone_loader.load_bagel_hf_backbone_weights(model: torch.nn.Module, model_cfg: Any)#
Load Qwen/SigLIP HF backbone weights into an already-built BAGEL model.
- nemo_automodel.components.models.bagel.hf_backbone_loader.build_bagel_from_hf_backbones(*, model_cfg: Any, stage: int, vae_config: Dict[str, int] | None, meta_init: bool = False, load_backbone_weights: bool = True)#
Build BAGEL from upstream Qwen/SigLIP backbone configs.
- nemo_automodel.components.models.bagel.hf_backbone_loader.__all__#
[‘build_bagel_from_hf_backbones’, ‘initialize_bagel_non_backbone_weights’, ‘load_bagel_hf_backbone_w…