core.post_training.modelopt.mamba.model_specs

Module Contents

Functions

get_mamba_stack_modelopt_spec
    Mix the native spec with TENorm.

API

core.post_training.modelopt.mamba.model_specs.get_mamba_stack_modelopt_spec(
    local_core_attention: bool = False,
    remap_te_layernorm: bool = False,
) -> megatron.core.transformer.spec_utils.ModuleSpec

Mix the native spec with TENorm.

This is essentially the native local spec, except that the layernorm implementation uses TENorm from Transformer Engine.
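
A minimal usage sketch follows. The import path and signature are taken from this page; the flag semantics in the comments and the model wiring are assumptions inferred from the parameter names and common Megatron-LM conventions, not guaranteed by this module.

```python
from megatron.core.post_training.modelopt.mamba.model_specs import (
    get_mamba_stack_modelopt_spec,
)

# Build the ModelOpt-compatible Mamba stack spec. With the defaults,
# core attention comes from Transformer Engine and no layernorm
# state-dict key remapping is applied.
stack_spec = get_mamba_stack_modelopt_spec(
    local_core_attention=False,  # assumption: True selects the local (non-TE) core attention
    remap_te_layernorm=True,     # assumption: remaps TE layernorm keys in the sharded state dict
)

# The returned ModuleSpec is typically handed to a model constructor,
# e.g. MambaModel(..., mamba_stack_spec=stack_spec) -- hypothetical wiring;
# the exact constructor signature varies across Megatron versions.
```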