core.ssm.mamba_hybrid_layer_allocation#

Module Contents#

Classes#

Symbols

Symbols for different layer types.

Functions#

_allocate_auto

_allocate_override

_layer_counts_match

allocate_layers

Allocates layers according to the requested distribution of layer types.

get_layer_maps_from_layer_type_list

Returns maps from global layer index to the corresponding layer index for each layer type in [Attention, Mamba, MLP, MoE] given a layer type list.

Data#

API#

core.ssm.mamba_hybrid_layer_allocation.logger#

‘getLogger(…)’

class core.ssm.mamba_hybrid_layer_allocation.Symbols#

Symbols for different layer types.

MAMBA#

‘M’

ATTENTION#

‘*’

MLP#

‘-’

MOE#

‘E’

VALID#

None
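
The symbols above suggest that a hybrid layer pattern can be written as a string with one character per layer. The following is a minimal sketch of that idea, not the module's own code: the `Symbols` class here is a local stand-in that copies the documented values, the contents of `VALID` are an assumption (the page only shows `None` for it), and `validate_pattern` is a hypothetical helper.

```python
class Symbols:
    """Local stand-in that mirrors the documented symbol values."""

    MAMBA = "M"
    ATTENTION = "*"
    MLP = "-"
    MOE = "E"
    # Assumption: VALID is the set of all symbols defined above.
    VALID = {MAMBA, ATTENTION, MLP, MOE}


def validate_pattern(pattern: str) -> list:
    """Hypothetical helper: split a pattern into one symbol per layer.

    Raises ValueError on any character that is not a known layer symbol.
    """
    layers = list(pattern)
    for index, symbol in enumerate(layers):
        if symbol not in Symbols.VALID:
            raise ValueError(f"Invalid symbol {symbol!r} at layer {index}")
    return layers


layers = validate_pattern("M*M-ME")
print(layers)
```

Read left to right, the pattern `"M*M-ME"` would describe six layers: Mamba, attention, Mamba, MLP, Mamba, MoE.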

core.ssm.mamba_hybrid_layer_allocation._allocate_auto(
total_layers_count: int,
target_attention_ratio: float,
target_mlp_ratio: float,
) → list#

core.ssm.mamba_hybrid_layer_allocation._allocate_override(
total_layers_count: int,
override_pattern: str,
) → list#

core.ssm.mamba_hybrid_layer_allocation._layer_counts_match(a: list, b: list) → bool#

core.ssm.mamba_hybrid_layer_allocation.allocate_layers(
total_layers_count: int,
target_attention_ratio: float,
target_mlp_ratio: float,
override_pattern: str = None,
) → list#

Allocates layers according to the requested distribution of layer types.
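
To make the parameters concrete, here is a rough sketch of how a requested distribution could be turned into a layer list. It is an illustration under stated assumptions, not the behavior of `allocate_layers` itself: it assumes both ratios are fractions of `total_layers_count`, that an override pattern has exactly one character per layer, and it uses a naive grouped layout rather than whatever placement the real function chooses. The name `sketch_allocate` is hypothetical.

```python
from collections import Counter
from typing import List, Optional


def sketch_allocate(
    total_layers_count: int,
    target_attention_ratio: float,
    target_mlp_ratio: float,
    override_pattern: Optional[str] = None,
) -> List[str]:
    """Hypothetical stand-in; does not reproduce the real placement logic."""
    if override_pattern is not None:
        # Assumption: the override must describe every layer exactly once.
        layers = list(override_pattern)
        if len(layers) != total_layers_count:
            raise ValueError(
                f"Pattern has {len(layers)} layers, expected {total_layers_count}"
            )
        return layers

    # Assumption: ratios are fractions of the total number of layers.
    attention_count = round(total_layers_count * target_attention_ratio)
    mlp_count = round(total_layers_count * target_mlp_ratio)
    mamba_count = total_layers_count - attention_count - mlp_count
    if mamba_count < 0:
        raise ValueError("Attention and MLP ratios exceed the layer budget")

    # Naive layout: group layers by type instead of interleaving them.
    return ["M"] * mamba_count + ["*"] * attention_count + ["-"] * mlp_count


auto_layers = sketch_allocate(8, 0.25, 0.25)
override_layers = sketch_allocate(8, 0.25, 0.25, override_pattern="M*M-M*M-")

# Comparing per-type counts shows whether an override matches the ratios.
counts_match = Counter(auto_layers) == Counter(override_layers)
print(auto_layers, override_layers, counts_match)
```

Comparing the two results with `Counter` is one plausible way to check whether an override pattern honors the requested ratios; whether `_layer_counts_match` works this way is not stated on this page.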

core.ssm.mamba_hybrid_layer_allocation.get_layer_maps_from_layer_type_list(
layer_type_list: List[str],
) → Tuple[Dict[int, int], Dict[int, int], Dict[int, int]]#

Returns maps from global layer index to the corresponding layer index for each layer type in [Attention, Mamba, MLP, MoE] given a layer type list.
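
The mapping described above can be sketched as follows. This is an illustrative reimplementation, not the library function: `sketch_layer_maps` is a hypothetical name, it returns a dictionary keyed by layer symbol rather than the tuple shown in the signature, and it assumes each layer type is numbered from zero in order of appearance.

```python
from typing import Dict, List


def sketch_layer_maps(layer_type_list: List[str]) -> Dict[str, Dict[int, int]]:
    """Hypothetical helper: map global layer index to per-type layer index."""
    # One map per documented symbol: attention, Mamba, MLP, MoE.
    maps: Dict[str, Dict[int, int]] = {"*": {}, "M": {}, "-": {}, "E": {}}
    for global_index, symbol in enumerate(layer_type_list):
        if symbol not in maps:
            raise ValueError(f"Unknown layer type {symbol!r} at {global_index}")
        per_type = maps[symbol]
        # Assumption: the local index is the number of layers of this type
        # seen so far.
        per_type[global_index] = len(per_type)
    return maps


layer_maps = sketch_layer_maps(["M", "*", "M", "-", "*"])
print(layer_maps["M"])
print(layer_maps["*"])
```

In this example, global layer 2 is the second Mamba layer, so it maps to local index 1 in the Mamba map, while global layer 4 is the second attention layer and maps to local index 1 in the attention map.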