core.post_training.modelopt.gpt.model_specs#

Module Contents#

Functions#

get_gpt_modelopt_spec

Mix the native spec with TENorm.

API#

core.post_training.modelopt.gpt.model_specs.get_gpt_modelopt_spec(
config: megatron.core.transformer.transformer_config.TransformerConfig,
local_core_attention: bool = False,
remap_te_layernorm: bool = False,
real_quant_cfg: str = 'None',
qk_l2_norm: bool = False,
use_arbitrary_attention_mask: bool = False,
)#

Mix the native spec with TENorm.

This is essentially the native local spec, except that the layernorm implementation uses TENorm from Transformer Engine. The switch is needed because Apex's FusedLayerNorm no longer supports the RMSNorm variant required by Llama.

Parameters:
  • config – the model's transformer config

  • local_core_attention – whether to use the local DotProductAttention instead of TEDotProductAttention

  • remap_te_layernorm – whether to perform sharded state_dict prefix remapping on the layernorm

  • real_quant_cfg – Model Optimizer real quantization config

  • qk_l2_norm – whether to use the Llama4-style L2 norm for Q and K

  • use_arbitrary_attention_mask – whether to use an arbitrary attention mask instead of a causal mask
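A minimal usage sketch follows. It assumes the module is importable as megatron.core.post_training.modelopt.gpt.model_specs, that Megatron's parallel state has already been initialized, and the config values shown are illustrative only; the returned spec is passed to GPTModel as its transformer layer spec.

```python
from megatron.core.models.gpt import GPTModel
from megatron.core.post_training.modelopt.gpt.model_specs import get_gpt_modelopt_spec
from megatron.core.transformer.transformer_config import TransformerConfig

# Illustrative config; real models use much larger sizes.
config = TransformerConfig(
    num_layers=2,
    hidden_size=128,
    num_attention_heads=4,
    normalization="RMSNorm",  # TENorm supports RMSNorm, unlike Apex FusedLayerNorm
)

# Build the ModelOpt layer spec; remap_te_layernorm keeps sharded
# state_dict layernorm prefixes consistent with the TE-based layout.
transformer_layer_spec = get_gpt_modelopt_spec(
    config,
    remap_te_layernorm=True,
)

# The spec plugs into GPTModel like any other layer spec
# (assumes model-parallel state has been initialized beforehand).
model = GPTModel(
    config=config,
    transformer_layer_spec=transformer_layer_spec,
    vocab_size=32000,
    max_sequence_length=2048,
)
```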