core.post_training.modelopt.mamba.model_specs
Module Contents
Functions
| Function | Description |
| --- | --- |
| `get_mamba_stack_modelopt_spec` | Mix the native spec with TENorm. |
API
core.post_training.modelopt.mamba.model_specs.get_mamba_stack_modelopt_spec(
    local_core_attention: bool = False,
    remap_te_layernorm: bool = False,
)
Mix the native spec with TENorm.
This is essentially the native local spec, except that the layernorm implementation uses TENorm from Transformer Engine.
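Below is a minimal usage sketch. It assumes the module is importable as `megatron.core.post_training.modelopt.mamba.model_specs` and that the returned spec is consumed by a Mamba model constructor via a `mamba_stack_spec` argument; both the import path and the consuming model API are assumptions based on surrounding Megatron-Core conventions, not confirmed by this page.

```python
# Minimal usage sketch (import path assumed; parameters shown with their defaults).
from megatron.core.post_training.modelopt.mamba.model_specs import (
    get_mamba_stack_modelopt_spec,
)

# Build the Mamba stack spec; layernorms in the resulting spec use TENorm
# from Transformer Engine, as described above.
mamba_stack_spec = get_mamba_stack_modelopt_spec(
    local_core_attention=False,
    remap_te_layernorm=False,
)

# The spec would then typically be passed to a Mamba model constructor,
# e.g. (hypothetical):
# model = MambaModel(config=config, mamba_stack_spec=mamba_stack_spec, ...)
```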