bridge.models.falcon_h1.falconh1_provider#
Module Contents#
Classes#
FalconH1ModelProvider | Configuration and provider for FalconH1 hybrid models.
Functions#
get_default_falconh1_stack_spec | Return the default FalconH1 stack spec.
Data#
logger
API#
- bridge.models.falcon_h1.falconh1_provider.logger#
‘getLogger(…)’
- bridge.models.falcon_h1.falconh1_provider.get_default_falconh1_stack_spec()#
Return the default FalconH1 stack spec.
This is a named function (not a lambda) to allow proper serialization and reconstruction from checkpoints. Named functions can be imported via their module path, unlike lambdas.
- Returns:
Default FalconH1 stack specification
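The point about named functions versus lambdas can be illustrated with a small, self-contained sketch. It does not use Megatron Bridge itself: `to_import_path` and `from_import_path` are hypothetical helpers written for this example, and `json.dumps` stands in for any module-level function such as `get_default_falconh1_stack_spec`. The actual checkpoint serialization mechanism may differ.

```python
import importlib
import json


def to_import_path(fn):
    """Return 'module.qualname' for a callable that can be re-imported later.

    Hypothetical helper: lambdas and nested functions are rejected because
    their qualified names cannot be looked up as module attributes.
    """
    name = getattr(fn, "__qualname__", "")
    if "<lambda>" in name or "<locals>" in name:
        raise ValueError(f"{name!r} cannot be located by import path")
    return f"{fn.__module__}.{name}"


def from_import_path(path):
    """Re-import a module-level callable from its dotted path."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


# A named, module-level function survives the round trip.
path = to_import_path(json.dumps)
restored = from_import_path(path)

# A lambda has no importable name, so it cannot be recorded this way.
try:
    to_import_path(lambda: None)
    lambda_ok = True
except ValueError:
    lambda_ok = False
```

Storing a dotted path instead of the function object is one common way to keep a configuration reconstructible; under that assumption, a default expressed as a named function can be written out and resolved again, while a lambda cannot.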
- class bridge.models.falcon_h1.falconh1_provider.FalconH1ModelProvider#
Bases: megatron.bridge.models.falcon_h1.modeling_falconh1.falconh1_model.FalconH1Config, megatron.bridge.models.model_provider.ModelProviderMixin[megatron.bridge.models.falcon_h1.modeling_falconh1.falconh1_model.FalconH1Model]
Configuration and provider for FalconH1 hybrid models.
This class extends FalconH1Config with model instantiation capabilities and provides a method to create configured FalconH1 models.
- seq_length: int#
4096
- fp16_lm_cross_entropy: bool#
False
- parallel_output: bool#
True
False
- params_dtype: torch.dtype#
None
- fp16: bool#
False
- bf16: bool#
True
- hybrid_attention_ratio: float#
0.0
- hybrid_mlp_ratio: float#
0.0
- falconh1_ratio: float#
1.0
- hybrid_override_pattern: Optional[str]#
None
- position_embedding_type: Literal[learned_absolute, rope, none]#
‘rope’
- rotary_percent: float#
1.0
- rotary_base: int#
100000000000
- seq_len_interpolation_factor: Optional[float]#
None
- apply_rope_fusion: bool#
False
- make_vocab_size_divisible_by: int#
128
- vocab_size: Optional[int]#
None
- should_pad_vocab: bool#
False
- gated_linear_unit: bool#
True
- normalization: str#
‘RMSNorm’
- add_bias_linear: bool#
False
0.0
- attention_dropout: float#
0.0
- layernorm_epsilon: float#
1e-05
- attention_backend: megatron.core.transformer.enums.AttnBackend#
None
- deallocate_pipeline_outputs: bool#
True
- bias_dropout_fusion: bool#
False
- cross_entropy_loss_fusion: bool#
False
- transformer_impl: str#
‘local’
- embedding_multiplier: float#
1.0
- lm_head_multiplier: float#
1.0
- key_multiplier: float#
1.0
- attention_in_multiplier: float#
1.0
- attention_out_multiplier: float#
1.0
- ssm_in_multiplier: float#
1.0
- ssm_out_multiplier: float#
1.0
- mlp_multipliers: tuple#
(1.0, 1.0)
- ssm_multipliers: tuple#
(1.0, 1.0, 1.0, 1.0, 1.0)
- falconh1_stack_spec: Union[megatron.core.transformer.ModuleSpec, Callable[[], megatron.core.transformer.ModuleSpec]]#
None
- provide(pre_process=None, post_process=None, vp_stage=None)
Configure and instantiate a FalconH1 model based on this configuration.
- Parameters:
pre_process – Whether to include pre-processing in the model, defaults to first pipeline stage
post_process – Whether to include post-processing in the model, defaults to last pipeline stage
vp_stage – Virtual pipeline stage (currently unsupported)
- Returns:
Configured FalconH1 model instance
- Return type:
FalconH1Model
- finalize() → None#
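The `provide` parameters documented above default to `None`, with `pre_process` falling back to "first pipeline stage" and `post_process` to "last pipeline stage". A minimal sketch of that defaulting rule follows. `resolve_stage_flags`, `pp_rank`, and `pp_size` are hypothetical names introduced only for illustration; the real method presumably queries the pipeline-parallel state itself rather than taking a rank and size as arguments.

```python
from typing import Optional, Tuple


def resolve_stage_flags(
    pre_process: Optional[bool] = None,
    post_process: Optional[bool] = None,
    *,
    pp_rank: int,
    pp_size: int,
) -> Tuple[bool, bool]:
    """Fill in unset flags from the pipeline position (illustrative only).

    pp_rank and pp_size are assumed inputs: the index of this pipeline
    stage and the total number of pipeline stages.
    """
    if pre_process is None:
        # Pre-processing belongs to the first stage.
        pre_process = pp_rank == 0
    if post_process is None:
        # Post-processing belongs to the last stage.
        post_process = pp_rank == pp_size - 1
    return pre_process, post_process


first = resolve_stage_flags(pp_rank=0, pp_size=4)
middle = resolve_stage_flags(pp_rank=1, pp_size=4)
last = resolve_stage_flags(pp_rank=3, pp_size=4)
forced = resolve_stage_flags(pre_process=True, pp_rank=2, pp_size=4)
```

Explicitly passed values take precedence over the stage-based defaults, so a caller can still force either flag on a given stage.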