core.post_training.modelopt.gpt.model_specs
Module Contents
Functions
`get_gpt_modelopt_spec` – Mix the native spec with TENorm.
API
- core.post_training.modelopt.gpt.model_specs.get_gpt_modelopt_spec(
- config: megatron.core.transformer.transformer_config.TransformerConfig,
- local_core_attention: bool = False,
- remap_te_layernorm: bool = False,
- real_quant_cfg: str = 'None',
- qk_l2_norm: bool = False,
- use_arbitrary_attention_mask: bool = False,
- )
Mix the native spec with TENorm.
This is essentially the native local spec, except that the layernorm implementation uses TENorm from Transformer Engine. The change is needed because FusedLayerNorm from apex has stopped supporting the RMSNorm required by Llama.
- Parameters:
  - config – model's transformer config
  - local_core_attention – whether to use local DotProductAttention or TEDotProductAttention
  - remap_te_layernorm – whether to perform sharded state_dict prefix remapping on layernorm
  - real_quant_cfg – Model Optimizer real quantization config
  - qk_l2_norm – whether to use Llama4 L2 norm for Q and K
  - use_arbitrary_attention_mask – whether to use an arbitrary attention mask instead of a causal one
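A minimal usage sketch follows. It assumes Transformer Engine is installed and Megatron's model-parallel state has already been initialized; the `TransformerConfig` values and `GPTModel` arguments below are illustrative placeholders, not recommendations.

```python
from megatron.core.models.gpt import GPTModel
from megatron.core.post_training.modelopt.gpt.model_specs import get_gpt_modelopt_spec
from megatron.core.transformer.transformer_config import TransformerConfig

# Illustrative toy config; real values depend on the target checkpoint.
config = TransformerConfig(
    num_layers=2,
    hidden_size=128,
    num_attention_heads=4,
    normalization="RMSNorm",  # RMSNorm is the reason TENorm is used instead of apex FusedLayerNorm
    use_cpu_initialization=True,
)

# Build the ModelOpt GPT layer spec: TE attention (local_core_attention=False)
# and layernorm prefix remapping for sharded checkpoint compatibility.
transformer_layer_spec = get_gpt_modelopt_spec(
    config,
    local_core_attention=False,
    remap_te_layernorm=True,
    real_quant_cfg="None",
    qk_l2_norm=False,
)

# Plug the spec into a GPT model (placeholder sizes; a real run requires
# initialized model-parallel state).
model = GPTModel(
    config=config,
    transformer_layer_spec=transformer_layer_spec,
    vocab_size=32000,
    max_sequence_length=2048,
)
```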