core.ssm.mlp_layer

Module Contents

Classes

MLPLayer

Drop-in replacement for TransformerLayer that initializes only an MLP from the spec.

API

class core.ssm.mlp_layer.MLPLayer(
    config: megatron.core.transformer.TransformerConfig,
    submodules: megatron.core.transformer.TransformerLayerSubmodules,
    layer_number: int = 1,
    hidden_dropout: float = None,
    pg_collection: Optional[megatron.core.process_groups_config.ProcessGroupCollection] = None,
)

Bases: megatron.core.transformer.TransformerLayer

Drop-in replacement for TransformerLayer that initializes only an MLP from the spec.
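
The "drop-in replacement" idea can be illustrated with a toy sketch in plain Python. It does not use Megatron-Core and makes no claim about how MLPLayer is implemented internally. All names below (ToySubmodules, ToyTransformerLayer, ToyMLPLayer) are hypothetical stand-ins. The sketch only shows a subclass that accepts the same spec and keeps the same forward interface as its parent while building just the MLP sub-block.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A "block" here is just a function on floats; a factory builds one on demand.
Block = Callable[[float], float]
BlockFactory = Callable[[], Block]


@dataclass
class ToySubmodules:
    """Hypothetical stand-in for a layer spec: one factory per sub-block."""
    self_attention: Optional[BlockFactory] = None
    mlp: Optional[BlockFactory] = None


class ToyTransformerLayer:
    """Hypothetical full layer: builds every sub-block named in the spec."""

    def __init__(self, submodules: ToySubmodules, layer_number: int = 1):
        self.layer_number = layer_number
        self.self_attention = (
            submodules.self_attention() if submodules.self_attention else None
        )
        self.mlp = submodules.mlp() if submodules.mlp else None

    def forward(self, x: float) -> float:
        # Residual-style update for each sub-block that was built.
        if self.self_attention is not None:
            x = x + self.self_attention(x)
        if self.mlp is not None:
            x = x + self.mlp(x)
        return x


class ToyMLPLayer(ToyTransformerLayer):
    """Same constructor and forward interface, but only the MLP is built."""

    def __init__(self, submodules: ToySubmodules, layer_number: int = 1):
        # Ignore any attention entry in the spec and pass along only the MLP.
        mlp_only = ToySubmodules(self_attention=None, mlp=submodules.mlp)
        super().__init__(mlp_only, layer_number)


if __name__ == "__main__":
    spec = ToySubmodules(
        self_attention=lambda: (lambda x: 10.0),
        mlp=lambda: (lambda x: 2 * x),
    )
    print(ToyTransformerLayer(spec).forward(1.0))  # attention + MLP applied
    print(ToyMLPLayer(spec).forward(1.0))          # MLP only
```

Because the subclass keeps the parent's constructor arguments and forward signature, code that stacks layers by type can swap one for the other without changing how it calls them, which is the sense of "drop-in" assumed in this sketch.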

Initialization