# nemo_automodel.components.models.bagel.hf_backbone_loader

Helpers for initializing BAGEL pretraining runs from Hugging Face (HF) Qwen and SigLIP backbone checkpoints.

## Module Contents

### Functions

| Name                                                                                                                                        | Description                                                                   |
| ------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------- |
| [`_copy_qwen_mot_weights_from_und`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_copy_qwen_mot_weights_from_und)             | Copy UND Qwen weights into `*_moe_gen` siblings after sharding/wrapping.      |
| [`_load_hf_state_dict`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_load_hf_state_dict)                                     | Load a HF safetensors/bin checkpoint as a full CPU state dict.                |
| [`_load_qwen_backbone_into_bagel`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_load_qwen_backbone_into_bagel)               | Load vanilla Qwen weights into BAGEL's language model after AM sharding.      |
| [`_load_siglip_backbone_into_bagel`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_load_siglip_backbone_into_bagel)           | Load SigLIP weights into BAGEL's packed-NaViT vision model after AM sharding. |
| [`_load_siglip_vision_config`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_load_siglip_vision_config)                       | Load a SigLIP vision config from a vision-only or full SigLIP HF folder.      |
| [`_normalize_wrapped_param_name`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_normalize_wrapped_param_name)                 | Remove wrapper path fragments that are not part of logical parameter FQNs.    |
| [`_reset_qwen_qk_norms_for_hf_backbone`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_reset_qwen_qk_norms_for_hf_backbone)   | Reset BAGEL-added Q/K norm weights missing from vanilla Qwen checkpoints.     |
| [`_resolve_hf_weight_path`](#nemo_automodel-components-models-bagel-hf_backbone_loader-_resolve_hf_weight_path)                             | Resolve a local path or download a HF snapshot containing model weights.      |
| [`build_bagel_from_hf_backbones`](#nemo_automodel-components-models-bagel-hf_backbone_loader-build_bagel_from_hf_backbones)                 | Build BAGEL from upstream Qwen/SigLIP backbone configs.                       |
| [`initialize_bagel_non_backbone_weights`](#nemo_automodel-components-models-bagel-hf_backbone_loader-initialize_bagel_non_backbone_weights) | Initialize BAGEL-owned modules not loaded from Qwen/SigLIP checkpoints.       |
| [`load_bagel_hf_backbone_weights`](#nemo_automodel-components-models-bagel-hf_backbone_loader-load_bagel_hf_backbone_weights)               | Load Qwen/SigLIP HF backbone weights into an already-built BAGEL model.       |

### Data

[`__all__`](#nemo_automodel-components-models-bagel-hf_backbone_loader-__all__)

[`logger`](#nemo_automodel-components-models-bagel-hf_backbone_loader-logger)

### API

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._copy_qwen_mot_weights_from_und(
    language_model
) -> int
```

Copy UND Qwen weights into `*_moe_gen` siblings after sharding/wrapping.
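The copy can be pictured as a name-based mapping from each `*_moe_gen` entry back to its non-suffixed sibling. The following is a minimal sketch on a plain dictionary, not the module's implementation: `copy_und_to_gen` is a hypothetical helper, the real function operates on a sharded/wrapped `language_model`, and treating the returned `int` as the number of copied entries is an assumption.

```python
from typing import Dict, List


def copy_und_to_gen(state: Dict[str, List[float]], suffix: str = "_moe_gen") -> int:
    """Copy each non-suffixed entry into its `*_moe_gen` sibling.

    Hypothetical illustration on a plain dict of lists. The name-resolution
    rule (strip the suffix to find the source) is an assumption.
    """
    copied = 0
    for name in state:
        if suffix not in name:
            continue  # not a `*_moe_gen` entry, nothing to fill
        source = name.replace(suffix, "")
        if source in state:
            state[name] = list(state[source])  # copy values, do not alias
            copied += 1
    return copied
```

Entries whose source sibling is absent are left untouched in this sketch, so the return value counts only the entries that were actually overwritten.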

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._load_hf_state_dict(
    model_path: str
) -> dict[str, torch.Tensor]
```

Load a HF safetensors/bin checkpoint as a full CPU state dict.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._load_qwen_backbone_into_bagel(
    model,
    llm_path: str,
    copy_init_moe: bool
) -> None
```

Load vanilla Qwen weights into BAGEL's language model after AutoModel (AM) sharding.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._load_siglip_backbone_into_bagel(
    model,
    vit_path: str
) -> None
```

Load SigLIP weights into BAGEL's packed-NaViT vision model after AutoModel (AM) sharding.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._load_siglip_vision_config(
    vit_path: str
)
```

Load a SigLIP vision config from a vision-only or full SigLIP HF folder.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._normalize_wrapped_param_name(
    name: str
) -> str
```

Remove wrapper path fragments that are not part of logical parameter FQNs.
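To illustrate what such a normalization might look like, here is a minimal sketch. The fragment list is an assumption based on prefixes that common PyTorch wrappers insert (FSDP, activation checkpointing, `torch.compile`); the actual set handled by this module is not documented here, and `normalize_param_name` is a hypothetical name.

```python
# Assumed wrapper fragments; the real function may strip a different set.
WRAPPER_FRAGMENTS = (
    "_fsdp_wrapped_module",        # FSDP wrapper
    "_checkpoint_wrapped_module",  # activation checkpoint wrapper
    "_orig_mod",                   # torch.compile wrapper
)


def normalize_param_name(name: str) -> str:
    """Drop wrapper path components so the name matches the logical FQN."""
    parts = [part for part in name.split(".") if part not in WRAPPER_FRAGMENTS]
    return ".".join(parts)
```

Matching on whole dotted components, rather than on substrings, avoids accidentally rewriting a parameter whose own name merely contains one of the fragments.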

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._reset_qwen_qk_norms_for_hf_backbone(
    language_model
) -> int
```

Reset BAGEL-added Q/K norm weights missing from vanilla Qwen checkpoints.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader._resolve_hf_weight_path(
    model_path: str
) -> str
```

Resolve a local path or download a HF snapshot containing model weights.
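A rough sketch of the local-or-download pattern follows. It assumes that an existing directory is returned unchanged and that anything else is treated as a Hub repo ID passed to `huggingface_hub.snapshot_download`; `resolve_weight_path` and the chosen file patterns are hypothetical, and the module's own resolution logic may differ.

```python
import os


def resolve_weight_path(model_path: str) -> str:
    """Return a local directory holding the checkpoint files.

    Hypothetical sketch: existing directories are used as-is, anything else
    is treated as a Hugging Face Hub repo ID and downloaded.
    """
    if os.path.isdir(model_path):
        return model_path

    # Imported lazily so that purely local use does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=model_path,
        # Assumed patterns: weight files plus configs and index files.
        allow_patterns=["*.safetensors", "*.bin", "*.json"],
    )
```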

```python
nemo_automodel.components.models.bagel.hf_backbone_loader.build_bagel_from_hf_backbones(
    model_cfg: typing.Any,
    stage: int,
    vae_config: typing.Dict[str, int] | None,
    meta_init: bool = False,
    load_backbone_weights: bool = True
) -> torch.nn.Module
```

Build BAGEL from upstream Qwen/SigLIP backbone configs.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader.initialize_bagel_non_backbone_weights(
    model: torch.nn.Module
) -> None
```

Initialize BAGEL-owned modules not loaded from Qwen/SigLIP checkpoints.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader.load_bagel_hf_backbone_weights(
    model: torch.nn.Module,
    model_cfg: typing.Any
) -> None
```

Load Qwen/SigLIP HF backbone weights into an already-built BAGEL model.

```python
nemo_automodel.components.models.bagel.hf_backbone_loader.__all__ = ['build_bagel_from_hf_backbones', 'initialize_bagel_non_backbone_weights', 'load...
```

```python
nemo_automodel.components.models.bagel.hf_backbone_loader.logger = logging.getLogger(__name__)
```