bridge.models.llama_nemotron.llama_nemotron_provider#
Module Contents#
Classes#
| Class | Description |
|---|---|
| Llama31NemotronNano8BProvider | Configuration class for the Llama3.1-Nemotron-Nano-8B model. Maps to: nvidia/Llama-3.1-Nemotron-Nano-8B-v1. Based on Llama31Config8B with kv_channels=128. |
| Llama31Nemotron70BProvider | Configuration class for the Llama3.1-Nemotron-70B model. Maps to: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF. Based on Llama31Config70B with kv_channels=128. |
| Llama33NemotronSuper49BProvider | Configuration class for the Llama3.3-Nemotron-Super-49B model. Maps to: nvidia/Llama-3_3-Nemotron-Super-49B-v1. Based on Llama31Config70B with heterogeneous architecture and kv_channels=128. |
| Llama31NemotronUltra253BProvider | Configuration class for the Llama3.1-Nemotron-Ultra-253B model. Maps to: nvidia/Llama-3_1-Nemotron-Ultra-253B-v1. Based on Llama31Config405B with heterogeneous architecture and kv_channels=128. |
| LlamaNemotronHeterogeneousProvider | Generic provider for heterogeneous (NAS) Llama-Nemotron models using DeciLMForCausalLM. |
Functions#
| Function | Description |
|---|---|
| heterogeneous_layer_spec | Determine the most appropriate layer specification based on availability. |
Data#
API#
- bridge.models.llama_nemotron.llama_nemotron_provider.logger#
'getLogger(…)'
- bridge.models.llama_nemotron.llama_nemotron_provider.heterogeneous_layer_spec(config)#
Determine the most appropriate layer specification based on availability.
Uses Transformer Engine specs since TE is a required dependency.
- Parameters:
config – GPT configuration object
- Returns:
The selected module specification
- Return type:
ModuleSpec
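For illustration, a minimal sketch of calling the helper. It assumes the module is importable under the megatron.bridge prefix shown in the base-class paths on this page, and that a heterogeneous provider instance (which doubles as the GPT configuration object) can be constructed from its documented defaults; exact usage may differ in your Megatron Bridge version.

```python
# Hedged sketch: import paths follow the base-class paths documented above.
from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    Llama33NemotronSuper49BProvider,
    heterogeneous_layer_spec,
)

# A heterogeneous provider instance serves as the GPT configuration object
# (assumption: the dataclass defaults are sufficient for construction).
config = Llama33NemotronSuper49BProvider()

# Returns the selected megatron.core ModuleSpec (Transformer Engine based).
layer_spec = heterogeneous_layer_spec(config)
print(type(layer_spec))
```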
- class bridge.models.llama_nemotron.llama_nemotron_provider.Llama31NemotronNano8BProvider#
Bases:
megatron.bridge.models.llama.llama_provider.Llama31ModelProvider8B
Configuration class for the Llama3.1-Nemotron-Nano-8B model. Maps to: nvidia/Llama-3.1-Nemotron-Nano-8B-v1. Based on Llama31Config8B with kv_channels=128.
- kv_channels: int#
128
- bf16: bool#
True
- fp16: bool#
False
- params_dtype: torch.dtype#
None
- autocast_dtype: torch.dtype#
None
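A minimal sketch of instantiating the provider and inspecting the defaults documented above; whether any additional arguments are needed at construction time depends on your Megatron Bridge version.

```python
from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    Llama31NemotronNano8BProvider,
)

# Dataclass-style provider: defaults mirror the fields documented above.
provider = Llama31NemotronNano8BProvider()
assert provider.kv_channels == 128          # Nemotron override on top of Llama31Config8B
assert provider.bf16 and not provider.fp16  # bf16 by default
print(provider.params_dtype)                # None here; assumption: resolved from bf16/fp16 downstream
```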
- class bridge.models.llama_nemotron.llama_nemotron_provider.Llama31Nemotron70BProvider#
Bases:
megatron.bridge.models.llama.llama_provider.Llama31ModelProvider70B
Configuration class for the Llama3.1-Nemotron-70B model. Maps to: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF. Based on Llama31Config70B with kv_channels=128.
- kv_channels: int#
128
- bf16: bool#
True
- fp16: bool#
False
- params_dtype: torch.dtype#
None
- autocast_dtype: torch.dtype#
None
- class bridge.models.llama_nemotron.llama_nemotron_provider.Llama33NemotronSuper49BProvider#
Bases:
megatron.bridge.models.llama.llama_provider.Llama31ModelProvider70B, megatron.bridge.models.transformer_config.HeterogeneousTransformerConfig
Configuration class for the Llama3.3-Nemotron-Super-49B model. Maps to: nvidia/Llama-3_3-Nemotron-Super-49B-v1. Based on Llama31Config70B with heterogeneous architecture and kv_channels=128.
Developer note: in the MRO, Llama31ModelProvider70B must come first to ensure proper provider functionality, followed by HeterogeneousTransformerConfig for heterogeneous support.
- hidden_size: int#
8192
- num_attention_heads: int#
64
- num_layers: int#
80
- kv_channels: int#
128
- bf16: bool#
True
- fp16: bool#
False
- params_dtype: torch.dtype#
None
- autocast_dtype: torch.dtype#
None
- heterogeneous_layers_config_path: str | None#
None
- heterogeneous_layers_config_encoded_json: str = <Multiline-String>#
- transformer_layer_spec: megatron.core.transformer.spec_utils.ModuleSpec | Callable#
None
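The heterogeneous fields above can be pointed at a block-config JSON instead of relying on the encoded default. A hedged sketch follows, assuming the dataclass accepts its fields as keyword arguments; block_config.json is a hypothetical path, and the explicit transformer_layer_spec assignment may be redundant if the provider already selects heterogeneous_layer_spec on its own.

```python
from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    Llama33NemotronSuper49BProvider,
    heterogeneous_layer_spec,
)

provider = Llama33NemotronSuper49BProvider(
    heterogeneous_layers_config_path="block_config.json",  # hypothetical path to a NAS block config
)

# Optional and possibly redundant: wire the heterogeneous layer spec explicitly.
# The field is typed ModuleSpec | Callable, so passing the function is allowed.
provider.transformer_layer_spec = heterogeneous_layer_spec

print(provider.num_layers, provider.hidden_size, provider.kv_channels)  # 80 8192 128 per the defaults above
```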
- class bridge.models.llama_nemotron.llama_nemotron_provider.Llama31NemotronUltra253BProvider#
Bases:
megatron.bridge.models.llama.llama_provider.Llama31ModelProvider405B, megatron.bridge.models.transformer_config.HeterogeneousTransformerConfig
Configuration class for the Llama3.1-Nemotron-Ultra-253B model. Maps to: nvidia/Llama-3_1-Nemotron-Ultra-253B-v1. Based on Llama31Config405B with heterogeneous architecture and kv_channels=128.
Developer note: in the MRO, Llama31ModelProvider405B must come first to ensure proper provider functionality, followed by HeterogeneousTransformerConfig for heterogeneous support.
- num_layers: int#
162
- hidden_size: int#
16384
- num_attention_heads: int#
128
- kv_channels: int#
128
- bf16: bool#
True
- fp16: bool#
False
- params_dtype: torch.dtype#
None
- autocast_dtype: torch.dtype#
None
- heterogeneous_layers_config_path: str | None#
None
- heterogeneous_layers_config_encoded_json: str = <Multiline-String>#
- transformer_layer_spec: megatron.core.transformer.spec_utils.ModuleSpec | Callable#
None
- class bridge.models.llama_nemotron.llama_nemotron_provider.LlamaNemotronHeterogeneousProvider#
Bases:
megatron.bridge.models.llama.llama_provider.Llama31ModelProvider, megatron.bridge.models.transformer_config.HeterogeneousTransformerConfig
Generic provider for heterogeneous (NAS) Llama-Nemotron models using DeciLMForCausalLM.
Sizes and all architectural details are driven directly from the HF config provided at runtime via kwargs (num_layers, hidden_size, heads, kv_channels, etc.).
- bf16: bool#
True
- fp16: bool#
False
- params_dtype: torch.dtype#
None
- autocast_dtype: torch.dtype#
None
- heterogeneous_layers_config_path: str | None#
None
- heterogeneous_layers_config_encoded_json: str = <Multiline-String>#
- transformer_layer_spec: megatron.core.transformer.spec_utils.ModuleSpec | Callable#
None
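Since all architectural details are supplied via kwargs, a hedged sketch of constructing the generic provider looks like the following. Every numeric value and the block-config path are hypothetical placeholders; in practice they are read from the HF (DeciLMForCausalLM) config of the checkpoint being converted.

```python
from megatron.bridge.models.llama_nemotron.llama_nemotron_provider import (
    LlamaNemotronHeterogeneousProvider,
)

# Illustrative values only: real values come from the HF config at runtime.
provider = LlamaNemotronHeterogeneousProvider(
    num_layers=32,            # hypothetical
    hidden_size=4096,         # hypothetical
    num_attention_heads=32,   # hypothetical
    kv_channels=128,
    heterogeneous_layers_config_path="block_config.json",  # hypothetical path to the NAS block config
)

print(provider.bf16, provider.fp16)  # True False, per the defaults documented above
```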