bridge.models.llama_nemotron.llama_nemotron_provider#

Module Contents#

Classes#

Llama31NemotronNano8BProvider

Configuration class for the Llama3.1-Nemotron-Nano-8B model. Maps to: nvidia/Llama-3.1-Nemotron-Nano-8B-v1. Based on Llama31Config8B with kv_channels=128.

Llama31Nemotron70BProvider

Configuration class for the Llama3.1-Nemotron-70B model. Maps to: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF. Based on Llama31Config70B with kv_channels=128.

Llama33NemotronSuper49BProvider

Configuration class for the Llama3.3-Nemotron-Super-49B model. Maps to: nvidia/Llama-3_3-Nemotron-Super-49B-v1. Based on Llama31Config70B with heterogeneous architecture and kv_channels=128.

Llama31NemotronUltra253BProvider

Configuration class for the Llama3.1-Nemotron-Ultra-253B model. Maps to: nvidia/Llama-3_1-Nemotron-Ultra-253B-v1. Based on Llama31Config405B with heterogeneous architecture and kv_channels=128.

LlamaNemotronHeterogeneousProvider

Generic provider for heterogeneous (NAS) Llama-Nemotron models using DeciLMForCausalLM.

Functions#

heterogeneous_layer_spec

Determine the most appropriate layer specification based on availability.

Data#

API#

bridge.models.llama_nemotron.llama_nemotron_provider.logger#

‘getLogger(…)’

bridge.models.llama_nemotron.llama_nemotron_provider.heterogeneous_layer_spec(config) → megatron.core.transformer.spec_utils.ModuleSpec#

Determine the most appropriate layer specification based on availability.

Uses Transformer Engine specs since TE is a required dependency.

Parameters:

config – GPT configuration object

Returns:

The selected module specification

Return type:

ModuleSpec
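A minimal usage sketch, assuming the importable path is megatron.bridge.models.llama_nemotron.llama_nemotron_provider (matching the base-class paths listed below) and that a provider dataclass instance can be passed as the GPT configuration object; neither assumption is confirmed by this page:

```python
# Sketch only: obtain the Transformer Engine-based layer spec for a
# heterogeneous provider. The import path and the use of a provider instance
# as the config argument are assumptions, not verified against the package.
from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    Llama33NemotronSuper49BProvider,
    heterogeneous_layer_spec,
)

config = Llama33NemotronSuper49BProvider()
layer_spec = heterogeneous_layer_spec(config)
print(type(layer_spec))  # expected: megatron.core.transformer.spec_utils.ModuleSpec
```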

class bridge.models.llama_nemotron.llama_nemotron_provider.Llama31NemotronNano8BProvider#

Bases: megatron.bridge.models.llama.llama_provider.Llama31ModelProvider8B

Configuration class for the Llama3.1-Nemotron-Nano-8B model. Maps to: nvidia/Llama-3.1-Nemotron-Nano-8B-v1. Based on Llama31Config8B with kv_channels=128.

kv_channels: int#

128

bf16: bool#

True

fp16: bool#

False

params_dtype: torch.dtype#

None

autocast_dtype: torch.dtype#

None
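As a rough illustration of how this provider might be used (field names are those documented above; the commented provide() call is an assumption about the inherited provider interface, not something this page documents):

```python
# Hedged sketch: construct the Nano-8B provider and override the dtype-related
# fields. Documented defaults on this page: bf16=True, fp16=False, kv_channels=128.
import torch

from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    Llama31NemotronNano8BProvider,
)

provider = Llama31NemotronNano8BProvider(
    params_dtype=torch.bfloat16,    # override the None default documented above
    autocast_dtype=torch.bfloat16,  # likewise an explicit override
)
assert provider.kv_channels == 128 and provider.bf16
# model = provider.provide()  # assumed entry point inherited from the base GPT provider
```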

class bridge.models.llama_nemotron.llama_nemotron_provider.Llama31Nemotron70BProvider#

Bases: megatron.bridge.models.llama.llama_provider.Llama31ModelProvider70B

Configuration class for the Llama3.1-Nemotron-70B model. Maps to: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF. Based on Llama31Config70B with kv_channels=128.

kv_channels: int#

128

bf16: bool#

True

fp16: bool#

False

params_dtype: torch.dtype#

None

autocast_dtype: torch.dtype#

None

class bridge.models.llama_nemotron.llama_nemotron_provider.Llama33NemotronSuper49BProvider#

Bases: megatron.bridge.models.llama.llama_provider.Llama31ModelProvider70B, megatron.bridge.models.transformer_config.HeterogeneousTransformerConfig

Configuration class for the Llama3.3-Nemotron-Super-49B model. Maps to: nvidia/Llama-3_3-Nemotron-Super-49B-v1. Based on Llama31Config70B with heterogeneous architecture and kv_channels=128.

Developer Note: In the MRO, Llama31ModelProvider70B must come first so that provider functionality resolves correctly, followed by HeterogeneousTransformerConfig for heterogeneous support.

hidden_size: int#

8192

num_attention_heads: int#

64

num_layers: int#

80

kv_channels: int#

128

bf16: bool#

True

fp16: bool#

False

params_dtype: torch.dtype#

None

autocast_dtype: torch.dtype#

None

heterogeneous_layers_config_path: str | None#

None

heterogeneous_layers_config_encoded_json: str = <Multiline-String>#

transformer_layer_spec: megatron.core.transformer.spec_utils.ModuleSpec | Callable#

None
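A hedged configuration sketch for the heterogeneous (NAS) fields this class inherits from HeterogeneousTransformerConfig; the JSON file name below is hypothetical and its schema is not documented on this page:

```python
# Sketch only: supply the per-layer block structure either as a file path or as
# an encoded JSON string (heterogeneous_layers_config_encoded_json).
from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    Llama33NemotronSuper49BProvider,
)

provider = Llama33NemotronSuper49BProvider(
    heterogeneous_layers_config_path="nemotron_super_49b_block_config.json",  # hypothetical path
)
print(provider.num_layers, provider.hidden_size, provider.kv_channels)  # 80 8192 128
```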

class bridge.models.llama_nemotron.llama_nemotron_provider.Llama31NemotronUltra253BProvider#

Bases: megatron.bridge.models.llama.llama_provider.Llama31ModelProvider405B, megatron.bridge.models.transformer_config.HeterogeneousTransformerConfig

Configuration class for the Llama3.1-Nemotron-Ultra-253B model. Maps to: nvidia/Llama-3_1-Nemotron-Ultra-253B-v1. Based on Llama31Config405B with heterogeneous architecture and kv_channels=128.

Developer Note: In the MRO, Llama31ModelProvider405B must come first so that provider functionality resolves correctly, followed by HeterogeneousTransformerConfig for heterogeneous support.

num_layers: int#

162

hidden_size: int#

16384

num_attention_heads: int#

128

kv_channels: int#

128

bf16: bool#

True

fp16: bool#

False

params_dtype: torch.dtype#

None

autocast_dtype: torch.dtype#

None

heterogeneous_layers_config_path: str | None#

None

heterogeneous_layers_config_encoded_json: str = <Multiline-String>#

transformer_layer_spec: megatron.core.transformer.spec_utils.ModuleSpec | Callable#

None

class bridge.models.llama_nemotron.llama_nemotron_provider.LlamaNemotronHeterogeneousProvider#

Bases: megatron.bridge.models.llama.llama_provider.Llama31ModelProvider, megatron.bridge.models.transformer_config.HeterogeneousTransformerConfig

Generic provider for heterogeneous (NAS) Llama-Nemotron models using DeciLMForCausalLM.

All sizes and architectural details are taken directly from the HF config supplied at runtime via kwargs (num_layers, hidden_size, heads, kv_channels, etc.).

bf16: bool#

True

fp16: bool#

False

params_dtype: torch.dtype#

None

autocast_dtype: torch.dtype#

None

heterogeneous_layers_config_path: str | None#

None

heterogeneous_layers_config_encoded_json: str = <Multiline-String>#

transformer_layer_spec: megatron.core.transformer.spec_utils.ModuleSpec | Callable#

None
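Since every architectural detail arrives through kwargs, a usage sketch might look like the following; all numeric values are placeholders and the JSON source is hypothetical, as this page does not document the expected block-config schema:

```python
# Placeholder sketch: drive the generic heterogeneous provider from values
# mirrored out of an HF (DeciLMForCausalLM) config. None of these numbers
# corresponds to a released checkpoint.
from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    LlamaNemotronHeterogeneousProvider,
)

with open("hf_block_config.json") as f:  # hypothetical export of the HF config's block structure
    block_config_json = f.read()

provider = LlamaNemotronHeterogeneousProvider(
    num_layers=32,            # placeholder
    hidden_size=4096,         # placeholder
    num_attention_heads=32,   # placeholder
    kv_channels=128,          # placeholder
    heterogeneous_layers_config_encoded_json=block_config_json,
)
```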