bridge.models.llama_nemotron.llama_nemotron_bridge#

Module Contents#

Classes#

LlamaNemotronBridge

Megatron Bridge for Heterogeneous Llama-Nemotron models (Super/Ultra).

API#

class bridge.models.llama_nemotron.llama_nemotron_bridge.LlamaNemotronBridge#

Bases: megatron.bridge.models.conversion.model_bridge.MegatronModelBridge

Megatron Bridge for Heterogeneous Llama-Nemotron models (Super/Ultra).

This bridge handles heterogeneous Llama-Nemotron models that use the DeciLMForCausalLM architecture with block_configs for heterogeneous layer specifications. These models require special handling because:

  1. They use custom modeling code (DeciLMForCausalLM) loaded via auto_map

  2. They have heterogeneous block configurations (different layers have different specs)

  3. They require trust_remote_code=True to load from HuggingFace
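To make point 2 concrete, a heterogeneous model carries one configuration per layer rather than a single shared one. The sketch below is purely illustrative: the class and field names (`BlockConfig`, `attention`, `ffn_mult`) are hypothetical and do not reflect the actual DeciLM config schema, but they show the shape of the idea.

```python
# Illustrative sketch of per-layer ("heterogeneous") block configs.
# Names are hypothetical, not the real DeciLM schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockConfig:
    attention: bool   # some layers may skip attention entirely
    ffn_mult: float   # per-layer FFN width multiplier

block_configs = [
    BlockConfig(attention=True, ffn_mult=4.0),   # standard transformer block
    BlockConfig(attention=False, ffn_mult=2.0),  # attention skipped, slimmer FFN
    BlockConfig(attention=True, ffn_mult=4.0),
]

# A homogeneous model is the special case where every entry is identical,
# which is why Nano/70B can be handled by the plain LlamaBridge instead.
homogeneous = len(set(block_configs)) == 1
print(homogeneous)  # False for this heterogeneous example
```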

Supported models (examples):

  • nvidia/Llama-3_3-Nemotron-Super-49B-v1 (80 layers, 8192 hidden)

  • nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 (updated v1.5 release)

  • nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 (162 layers, 16384 hidden)

Homogeneous Llama-Nemotron models (Nano/70B) use standard LlamaForCausalLM architecture and are handled by the regular LlamaBridge.

.. rubric:: Example

```python
from megatron.bridge import AutoBridge

# DeciLMForCausalLM models will automatically use this bridge
bridge = AutoBridge.from_hf_pretrained(
    "nvidia/Llama-3_3-Nemotron-Super-49B-v1_5",
    trust_remote_code=True,
)
provider = bridge.to_megatron_provider()
```

provider_bridge(
hf_pretrained: megatron.bridge.models.hf_pretrained.causal_lm.PreTrainedCausalLM,
) → megatron.bridge.models.llama.llama_provider.Llama31ModelProvider#

mapping_registry() → megatron.bridge.models.conversion.mapping_registry.MegatronMappingRegistry#
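The registry returned by `mapping_registry()` pairs HuggingFace parameter names with their Megatron-side counterparts so weights can be converted in either direction. A minimal schematic, assuming nothing about the real `MegatronMappingRegistry` API (the dict-based lookup and the parameter names below are illustrative only):

```python
# Schematic weight-name mapping, illustrative only; the actual
# MegatronMappingRegistry API and parameter names differ.
hf_to_megatron = {
    "model.embed_tokens.weight": "embedding.word_embeddings.weight",
    "lm_head.weight": "output_layer.weight",
}

# Conversion walks the HF checkpoint and renames each tensor.
megatron_name = hf_to_megatron["lm_head.weight"]
print(megatron_name)  # output_layer.weight
```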