`core.ssm.mamba_hybrid_layer_allocation`#

Module Contents#

Classes#

`Symbols`	Symbols for different layer types.

Functions#

`_allocate_auto`
`_allocate_override`
`_layer_counts_match`
`allocate_layers`	Allocates layers according to the requested distribution of layer types.
`get_layer_maps_from_layer_type_list`	Returns maps from global layer index to the corresponding layer index for each layer type in [Attention, Mamba, MLP, MoE] given a layer type list.

Data#

`logger`

API#

core.ssm.mamba_hybrid_layer_allocation.logger#: ‘getLogger(…)’

class core.ssm.mamba_hybrid_layer_allocation.Symbols#

Symbols for different layer types.

MAMBA#: ‘M’

ATTENTION#: ‘*’

MLP#: ‘-’

MOE#: ‘E’

VALID#: None
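As a rough illustration of how these symbols compose into a layer pattern string, the following self-contained sketch mirrors the four symbol values listed above. It does not import the library: `LayerSymbols` and `validate_pattern` are hypothetical names, and treating `VALID` as the set of the four symbols is an assumption, since the rendered value above is `None`.

```python
class LayerSymbols:
    """Hypothetical stand-in mirroring the documented symbol values."""

    MAMBA = "M"
    ATTENTION = "*"
    MLP = "-"
    MOE = "E"
    # Assumption: the valid set is exactly the four symbols above.
    VALID = {MAMBA, ATTENTION, MLP, MOE}


def validate_pattern(pattern: str) -> list:
    """Split a pattern string into per-layer symbols, rejecting unknown characters."""
    layers = list(pattern)
    for index, symbol in enumerate(layers):
        if symbol not in LayerSymbols.VALID:
            raise ValueError(f"Unknown layer symbol {symbol!r} at position {index}")
    return layers


# One character per layer: Mamba, attention, Mamba, MLP, Mamba, MoE.
print(validate_pattern("M*M-ME"))  # ['M', '*', 'M', '-', 'M', 'E']
```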

core.ssm.mamba_hybrid_layer_allocation._allocate_auto( total_layers_count: int, target_attention_ratio: float, target_mlp_ratio: float, ) → list#

core.ssm.mamba_hybrid_layer_allocation._allocate_override( total_layers_count: int, override_pattern: str, ) → list#

core.ssm.mamba_hybrid_layer_allocation._layer_counts_match(a: list, b: list) → bool#

core.ssm.mamba_hybrid_layer_allocation.allocate_layers( total_layers_count: int, target_attention_ratio: float, target_mlp_ratio: float, override_pattern: str = None, ) → list#

Allocates layers according to the requested distribution of layer types.
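To make the parameters concrete, here is a minimal sketch of one way target ratios could be turned into a list of layer symbols. It is not the library's algorithm and does not call it. `sketch_allocate` and `_spread` are hypothetical helpers, and the rounding and even-spacing rules are assumptions chosen for illustration. The sketch also assumes that an override pattern, when given, is used as-is after a length and symbol check.

```python
from typing import List, Optional

MAMBA, ATTENTION, MLP, MOE = "M", "*", "-", "E"


def _spread(layers: List[str], symbol: str, count: int) -> None:
    """Overwrite `count` Mamba slots with `symbol`, spaced roughly evenly."""
    if count == 0:
        return
    free = [i for i, s in enumerate(layers) if s == MAMBA]
    step = len(free) / count
    for k in range(count):
        # Pick the slot nearest the middle of the k-th equal-width segment.
        layers[free[int(k * step + step / 2)]] = symbol


def sketch_allocate(
    total_layers_count: int,
    target_attention_ratio: float,
    target_mlp_ratio: float,
    override_pattern: Optional[str] = None,
) -> List[str]:
    """Illustrative allocation only; the rounding and spacing rules are assumptions."""
    if override_pattern is not None:
        layers = list(override_pattern)
        if len(layers) != total_layers_count:
            raise ValueError("override pattern length must equal total_layers_count")
        unknown = set(layers) - {MAMBA, ATTENTION, MLP, MOE}
        if unknown:
            raise ValueError(f"unknown layer symbols: {sorted(unknown)}")
        return layers

    attention_count = round(total_layers_count * target_attention_ratio)
    mlp_count = round(total_layers_count * target_mlp_ratio)
    if attention_count + mlp_count > total_layers_count:
        raise ValueError("attention and MLP ratios exceed the layer budget")

    layers = [MAMBA] * total_layers_count  # start all-Mamba, then carve out slots
    _spread(layers, ATTENTION, attention_count)
    _spread(layers, MLP, mlp_count)
    return layers


print("".join(sketch_allocate(8, 0.25, 0.25)))  # M-*MM-*M
```

In this sketch, every layer that is not claimed by attention or MLP stays a Mamba layer, so the two ratios together must not exceed 1.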

core.ssm.mamba_hybrid_layer_allocation.get_layer_maps_from_layer_type_list( layer_type_list: List[str], ) → Tuple[Dict[int, int], Dict[int, int], Dict[int, int]]#

Returns maps from global layer index to the corresponding layer index for each layer type in [Attention, Mamba, MLP, MoE] given a layer type list.
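The mapping from global to per-type layer indices can be sketched as follows. This is an illustrative re-implementation based only on the description above, not the library's code. `sketch_layer_maps` is a hypothetical name. The sketch returns one map per type named in the description, in the order [Attention, Mamba, MLP, MoE], which is an assumption given that the annotated return type lists three dictionaries.

```python
from typing import Dict, List

# Assumed ordering, following the description: Attention, Mamba, MLP, MoE.
LAYER_TYPES = ["*", "M", "-", "E"]


def sketch_layer_maps(layer_type_list: List[str]) -> List[Dict[int, int]]:
    """Map each global layer index to its index among layers of the same type."""
    maps: Dict[str, Dict[int, int]] = {layer_type: {} for layer_type in LAYER_TYPES}
    for global_index, layer_type in enumerate(layer_type_list):
        layer_map = maps[layer_type]
        # The local index is the number of same-type layers seen so far.
        layer_map[global_index] = len(layer_map)
    return [maps[layer_type] for layer_type in LAYER_TYPES]


attention_map, mamba_map, mlp_map, moe_map = sketch_layer_maps(list("M*M-"))
print(attention_map)  # {1: 0}
print(mamba_map)      # {0: 0, 2: 1}
print(mlp_map)        # {3: 0}
print(moe_map)        # {}
```

For the pattern `M*M-`, global layer 2 is the second Mamba layer, so it maps to local index 1 in the Mamba map.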