nemo_automodel.components.models.bagel.hf_backbone_loader

Helpers for initializing BAGEL pretraining runs from HF backbones.

Module Contents

Functions

_load_siglip_vision_config

Load a SigLIP vision config from a vision-only or full SigLIP HF folder.

_normalize_wrapped_param_name

Remove wrapper path fragments that are not part of logical parameter FQNs.

_reset_qwen_qk_norms_for_hf_backbone

Reset BAGEL-added Q/K norm weights missing from vanilla Qwen checkpoints.

_resolve_hf_weight_path

Resolve a local path or download a HF snapshot containing model weights.

_load_hf_state_dict

Load a HF safetensors/bin checkpoint as a full CPU state dict.

_copy_qwen_mot_weights_from_und

Copy UND Qwen weights into *_moe_gen siblings after sharding/wrapping.

_load_qwen_backbone_into_bagel

Load vanilla Qwen weights into BAGEL’s language model after AM sharding.

_load_siglip_backbone_into_bagel

Load SigLIP weights into BAGEL’s packed-NaViT vision model after AM sharding.

initialize_bagel_non_backbone_weights

Initialize BAGEL-owned modules not loaded from Qwen/SigLIP checkpoints.

load_bagel_hf_backbone_weights

Load Qwen/SigLIP HF backbone weights into an already-built BAGEL model.

build_bagel_from_hf_backbones

Build BAGEL from upstream Qwen/SigLIP backbone configs.

Data

API

nemo_automodel.components.models.bagel.hf_backbone_loader.logger

'getLogger(…)'

nemo_automodel.components.models.bagel.hf_backbone_loader._load_siglip_vision_config(vit_path: str)

Load a SigLIP vision config from a vision-only or full SigLIP HF folder.

nemo_automodel.components.models.bagel.hf_backbone_loader._normalize_wrapped_param_name(name: str) -> str

Remove wrapper path fragments that are not part of logical parameter FQNs.
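The one-line description above does not say which fragments are removed. As a rough illustration only, the sketch below strips three fragments that PyTorch wrappers are known to insert into parameter names (FSDP, activation checkpointing, and `torch.compile`). The `WRAPPER_FRAGMENTS` tuple and the public function name are assumptions made for this example and may not match the module's actual behavior.

```python
# Assumed set of wrapper fragments; the real helper may strip a different list.
WRAPPER_FRAGMENTS = (
    "_fsdp_wrapped_module.",        # inserted by FSDP wrapping
    "_checkpoint_wrapped_module.",  # inserted by activation-checkpoint wrappers
    "_orig_mod.",                   # inserted by torch.compile
)


def normalize_wrapped_param_name(name: str) -> str:
    """Return `name` with every known wrapper fragment removed."""
    for fragment in WRAPPER_FRAGMENTS:
        name = name.replace(fragment, "")
    return name


wrapped = "model.layers.0._checkpoint_wrapped_module.self_attn.q_proj.weight"
print(normalize_wrapped_param_name(wrapped))
# prints "model.layers.0.self_attn.q_proj.weight"
```

Normalizing names this way lets keys from an unwrapped HF checkpoint be matched against the parameters of a sharded or wrapped model.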

nemo_automodel.components.models.bagel.hf_backbone_loader._reset_qwen_qk_norms_for_hf_backbone(language_model) -> int

Reset BAGEL-added Q/K norm weights missing from vanilla Qwen checkpoints.

nemo_automodel.components.models.bagel.hf_backbone_loader._resolve_hf_weight_path(model_path: str) -> str

Resolve a local path or download a HF snapshot containing model weights.
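A minimal sketch of the local-path-or-download pattern described above, assuming that an existing directory is returned unchanged and that anything else is treated as a Hub repo ID. The function name and the `allow_patterns` filter are illustrative choices, not taken from the module; `snapshot_download` is the standard `huggingface_hub` call for fetching a repo snapshot.

```python
import os


def resolve_hf_weight_path(model_path: str) -> str:
    """Return a local directory that contains the checkpoint files."""
    # An existing local directory is used as-is.
    if os.path.isdir(model_path):
        return model_path

    # Otherwise treat the string as a Hub repo ID and download a snapshot.
    # Imported lazily so purely local runs do not require huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=model_path,
        # Assumed filter: weights plus config/index JSON files only.
        allow_patterns=["*.safetensors", "*.bin", "*.json"],
    )
```

Keeping the import inside the function means offline jobs that always pass a local folder never touch the network code path.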

nemo_automodel.components.models.bagel.hf_backbone_loader._load_hf_state_dict(model_path: str) -> dict[str, torch.Tensor]

Load a HF safetensors/bin checkpoint as a full CPU state dict.
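One plausible way to load a safetensors/bin checkpoint as a full CPU state dict is sketched below. It assumes safetensors shards are preferred over `.bin` files when both exist and that all shards are merged into a single dict; both choices, and the two function names, are assumptions for illustration rather than the module's documented behavior.

```python
from pathlib import Path


def find_weight_files(checkpoint_dir: str) -> list[Path]:
    """List weight shards, preferring *.safetensors over *.bin (assumed policy)."""
    root = Path(checkpoint_dir)
    safetensors_files = sorted(root.glob("*.safetensors"))
    if safetensors_files:
        return safetensors_files
    return sorted(root.glob("*.bin"))


def load_full_cpu_state_dict(checkpoint_dir: str) -> dict:
    """Merge every shard in `checkpoint_dir` into one state dict on CPU."""
    # Lazy imports keep find_weight_files usable without torch installed.
    import torch
    from safetensors.torch import load_file

    state_dict = {}
    for path in find_weight_files(checkpoint_dir):
        if path.suffix == ".safetensors":
            state_dict.update(load_file(str(path), device="cpu"))
        else:
            # weights_only=True avoids unpickling arbitrary objects.
            state_dict.update(torch.load(path, map_location="cpu", weights_only=True))
    return state_dict
```

Loading onto CPU first keeps the full checkpoint off the accelerator, so each rank can later copy only the slices its shards need.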

nemo_automodel.components.models.bagel.hf_backbone_loader._copy_qwen_mot_weights_from_und(language_model) -> int

Copy UND Qwen weights into *_moe_gen siblings after sharding/wrapping.
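The pairing step behind this copy can be sketched on parameter names alone. The example assumes that each generation-branch parameter carries a `_moe_gen` marker in its name and that its understanding-branch (UND) sibling is the same name with that marker removed. The helper name and the sample parameter names are hypothetical.

```python
def pair_moe_gen_with_und(param_names):
    """Map each *_moe_gen parameter name to its UND sibling, if one exists."""
    names = set(param_names)
    pairs = {}
    for name in param_names:
        if "_moe_gen" not in name:
            continue
        # Assumed convention: dropping the marker yields the UND name.
        und_name = name.replace("_moe_gen", "")
        if und_name in names:
            pairs[name] = und_name
    return pairs


example_names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.q_proj_moe_gen.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.mlp_moe_gen.gate_proj.weight",
]
for gen_name, und_name in pair_moe_gen_with_und(example_names).items():
    print(f"{und_name} -> {gen_name}")
```

With real tensors, each pair would typically be copied in place under `torch.no_grad()`. The documented `int` return value may be a count of copied parameters, although the page does not say so.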

nemo_automodel.components.models.bagel.hf_backbone_loader._load_qwen_backbone_into_bagel(
    model,
    llm_path: str,
    *,
    copy_init_moe: bool,
) -> None

Load vanilla Qwen weights into BAGEL’s language model after AM sharding.

nemo_automodel.components.models.bagel.hf_backbone_loader._load_siglip_backbone_into_bagel(model, vit_path: str) -> None

Load SigLIP weights into BAGEL’s packed-NaViT vision model after AM sharding.

nemo_automodel.components.models.bagel.hf_backbone_loader.initialize_bagel_non_backbone_weights(model: torch.nn.Module) -> None

Initialize BAGEL-owned modules not loaded from Qwen/SigLIP checkpoints.

nemo_automodel.components.models.bagel.hf_backbone_loader.load_bagel_hf_backbone_weights(
    model: torch.nn.Module,
    model_cfg: Any,
) -> None

Load Qwen/SigLIP HF backbone weights into an already-built BAGEL model.

nemo_automodel.components.models.bagel.hf_backbone_loader.build_bagel_from_hf_backbones(
    *,
    model_cfg: Any,
    stage: int,
    vae_config: Dict[str, int] | None,
    meta_init: bool = False,
    load_backbone_weights: bool = True,
) -> torch.nn.Module

Build BAGEL from upstream Qwen/SigLIP backbone configs.

nemo_automodel.components.models.bagel.hf_backbone_loader.__all__

[‘build_bagel_from_hf_backbones’, ‘initialize_bagel_non_backbone_weights’, ‘load_bagel_hf_backbone_w…