bridge.models.llama_nemotron.llama_nemotron_provider#

Module Contents#

Classes#

LlamaNemotronHeterogeneousProvider

Generic provider for heterogeneous (NAS) Llama-Nemotron models using DeciLMForCausalLM.

Functions#

heterogeneous_layer_spec

Determine the most appropriate layer specification based on availability.

Data#

API#

bridge.models.llama_nemotron.llama_nemotron_provider.logger#

'getLogger(…)'

bridge.models.llama_nemotron.llama_nemotron_provider.heterogeneous_layer_spec(
config,
) megatron.core.transformer.spec_utils.ModuleSpec#

Determine the most appropriate layer specification based on availability.

Uses Transformer Engine specs since TE is a required dependency.

Parameters:

config – GPT configuration object

Returns:

The selected module specification

Return type:

ModuleSpec
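The availability-based selection described above can be sketched as follows. This is an illustrative stand-in, not the real implementation: the helper name `pick_layer_spec_sketch` and the string return values are placeholders for the actual `ModuleSpec` objects the real function builds.

```python
import importlib.util

def pick_layer_spec_sketch(config):
    # Hypothetical sketch of availability-driven spec selection.
    # The real heterogeneous_layer_spec returns a Transformer Engine
    # backed ModuleSpec, since TE is a required dependency; the strings
    # here merely stand in for the actual spec objects.
    if importlib.util.find_spec("transformer_engine") is not None:
        return "te_layer_spec"
    return "local_layer_spec"  # fallback shown for illustration only
```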

class bridge.models.llama_nemotron.llama_nemotron_provider.LlamaNemotronHeterogeneousProvider#

Bases: megatron.bridge.models.gpt_provider.GPTModelProvider, megatron.bridge.models.transformer_config.HeterogeneousTransformerConfig

Generic provider for heterogeneous (NAS) Llama-Nemotron models using DeciLMForCausalLM.

All sizes and architectural details (num_layers, hidden_size, heads, kv_channels, etc.) are driven directly by the HF config provided at runtime via kwargs.
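A minimal sketch of this pattern, assuming the provider behaves like a dataclass: architectural constants live as class defaults (mirroring the attributes listed below), while size-related fields arrive as runtime kwargs derived from the HF config. `ProviderSketch` and the example values are illustrative, not the real class.

```python
from dataclasses import dataclass

# Illustrative stand-in for the provider: fixed architectural choices
# are dataclass defaults, while model sizes come in as runtime kwargs
# derived from the Hugging Face config.
@dataclass
class ProviderSketch:
    normalization: str = "RMSNorm"
    position_embedding_type: str = "rope"
    num_query_groups: int = 8
    num_layers: int = 0    # supplied at runtime from the HF config
    hidden_size: int = 0   # supplied at runtime from the HF config

# Hypothetical kwargs, as if extracted from an HF config object.
hf_derived_kwargs = {"num_layers": 32, "hidden_size": 4096}
provider = ProviderSketch(**hf_derived_kwargs)
print(provider.normalization, provider.num_layers)  # RMSNorm 32
```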

normalization: str#

'RMSNorm'

activation_func: Callable#

None

gated_linear_unit: bool#

True

position_embedding_type: str#

'rope'

add_bias_linear: bool#

False

hidden_dropout: float#

0.0

attention_dropout: float#

0.0

share_embeddings_and_output_weights: bool#

False

bias_activation_fusion: bool#

True

masked_softmax_fusion: bool#

True

persist_layer_norm: bool#

True

bias_dropout_fusion: bool#

True

apply_rope_fusion: bool#

True

rotary_percent: float#

1.0

num_query_groups: int#

8

init_method_std: float#

0.02

bf16: bool#

True

fp16: bool#

False

params_dtype: torch.dtype#

None

autocast_dtype: torch.dtype#

None

heterogeneous_layers_config_path: str | None#

None

heterogeneous_layers_config_encoded_json: str#

<Multiline-String>

transformer_layer_spec: megatron.core.transformer.spec_utils.ModuleSpec | Callable#

None