core.models.backends

Module Contents

Classes

BackendSpecProvider

A protocol for providing the submodules used in Spec building.

LocalSpecProvider

A protocol for providing Local submodules used in Spec building.

InferenceSpecProvider

A protocol for providing the submodules used in Spec building.

API

class core.models.backends.BackendSpecProvider

Bases: typing.Protocol

A protocol for providing the submodules used in Spec building.

abstractmethod column_parallel_linear() → type

Which column parallel linear module the backend uses

abstractmethod row_parallel_linear() → type

Which row parallel linear module the backend uses

abstractmethod fuse_layernorm_and_linear() → bool

Does the backend support a single module for layernorm and linear

abstractmethod column_parallel_layer_norm_linear() → Optional[type]

Which module for sequential layernorm and linear

abstractmethod layer_norm(
    rms_norm: bool = False,
    for_qk: bool = False,
) → megatron.core.transformer.torch_norm.LayerNormBuilder

Which module for layernorm

abstractmethod core_attention() → type

Which module to use for attention

abstractmethod grouped_mlp_modules(
    moe_use_grouped_gemm: bool,
    moe_use_legacy_grouped_gemm: bool,
) → tuple[type, megatron.core.transformer.mlp.MLPSubmodules | megatron.core.transformer.moe.experts.TEGroupedMLPSubmodules | None]

Which module and submodules to use for grouped mlp

abstractmethod activation_func() → megatron.core.transformer.mlp.TEActivationFunctionBuilder | None

Which module to use for activation function
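Because `BackendSpecProvider` is a `typing.Protocol`, spec-building code can consume any backend through these accessors without knowing which concrete backend it holds. The sketch below illustrates one plausible consumption pattern implied by the method set: branch on `fuse_layernorm_and_linear()` to choose between a fused layernorm+linear module and a plain column-parallel linear. The protocol here is trimmed to three methods, and `FusedLayerNormLinear`, `PlainColumnParallelLinear`, `ToyBackend`, and `pick_input_projection` are hypothetical placeholders, not Megatron classes.

```python
from typing import Optional, Protocol


class SpecProvider(Protocol):
    """Trimmed stand-in for BackendSpecProvider (three methods only)."""

    def column_parallel_linear(self) -> type: ...
    def fuse_layernorm_and_linear(self) -> bool: ...
    def column_parallel_layer_norm_linear(self) -> Optional[type]: ...


class FusedLayerNormLinear:  # hypothetical placeholder module class
    pass


class PlainColumnParallelLinear:  # hypothetical placeholder module class
    pass


class ToyBackend:
    """Hypothetical backend; satisfies SpecProvider structurally."""

    def column_parallel_linear(self) -> type:
        return PlainColumnParallelLinear

    def fuse_layernorm_and_linear(self) -> bool:
        return True

    def column_parallel_layer_norm_linear(self) -> Optional[type]:
        return FusedLayerNormLinear


def pick_input_projection(backend: SpecProvider) -> type:
    # If the backend supplies a fused layernorm+linear module, prefer it;
    # otherwise fall back to the plain column-parallel linear.
    if backend.fuse_layernorm_and_linear():
        fused = backend.column_parallel_layer_norm_linear()
        if fused is not None:
            return fused
    return backend.column_parallel_linear()
```

Any class with matching method signatures is accepted where `SpecProvider` is expected; no inheritance is required.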

class core.models.backends.LocalSpecProvider

Bases: core.models.backends.BackendSpecProvider

A protocol for providing Local submodules used in Spec building.

column_parallel_linear() → type

Which column parallel linear module the backend uses

row_parallel_linear() → type

Which row parallel linear module the backend uses

fuse_layernorm_and_linear() → bool

Does the backend choose a single module for layernorm and linear

column_parallel_layer_norm_linear() → Optional[type]

Which module for sequential layernorm and linear

layer_norm(
    rms_norm: bool = False,
    for_qk: bool = False,
) → megatron.core.transformer.torch_norm.LayerNormBuilder

Which module to use for layer norm

core_attention() → type

Which module to use for attention

grouped_mlp_modules(
    moe_use_grouped_gemm: bool,
    moe_use_legacy_grouped_gemm: bool,
) → tuple[type[megatron.core.transformer.moe.experts.GroupedMLP], None] | tuple[type[megatron.core.transformer.moe.experts.SequentialMLP], megatron.core.transformer.mlp.MLPSubmodules]

Which module and submodules to use for grouped mlp

activation_func() → megatron.core.transformer.mlp.TEActivationFunctionBuilder | None

Which module to use for activation function
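The return type of `LocalSpecProvider.grouped_mlp_modules` is informative on its own: it is either `(GroupedMLP, None)` or `(SequentialMLP, MLPSubmodules)`. The sketch below shows the branching that signature implies; the class bodies are empty hypothetical placeholders for the `megatron.core` types named above, and the exact flag handling is an assumption inferred from the signature, not the actual implementation.

```python
from typing import NamedTuple, Optional


# Hypothetical placeholders for the megatron.core classes in the signature.
class GroupedMLP: ...
class SequentialMLP: ...
class ColumnParallelLinear: ...
class RowParallelLinear: ...


class MLPSubmodules(NamedTuple):
    linear_fc1: type
    linear_fc2: type


def grouped_mlp_modules(
    moe_use_grouped_gemm: bool,
    moe_use_legacy_grouped_gemm: bool,  # ignored in this sketch
) -> tuple[type, Optional[MLPSubmodules]]:
    # A grouped-GEMM MLP fuses all experts into grouped matrix multiplies
    # and builds its own internals, hence the (GroupedMLP, None) variant.
    # A sequential MLP runs experts one at a time and must be told which
    # linear classes to construct for fc1 and fc2.
    if moe_use_grouped_gemm:
        return GroupedMLP, None
    return SequentialMLP, MLPSubmodules(
        linear_fc1=ColumnParallelLinear, linear_fc2=RowParallelLinear
    )
```

The `None` in the grouped case is why callers must handle an optional submodules value, as the union return type spells out.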

class core.models.backends.InferenceSpecProvider

Bases: core.models.backends.BackendSpecProvider

A protocol for providing the submodules used in Spec building.

linear() → type

Which linear module the TE backend uses

column_parallel_linear() → type

Which column parallel linear module the TE backend uses

row_parallel_linear() → type

Which row parallel linear module the TE backend uses

fuse_layernorm_and_linear() → bool

The TE backend chooses a single module for layernorm and linear

column_parallel_layer_norm_linear() → type[megatron.core.tensor_parallel.inference_layers.InferenceLayerNormColumnParallelLinear]

Which module for sequential layernorm and linear

layer_norm(
    rms_norm: bool = False,
    for_qk: bool = False,
) → megatron.core.transformer.torch_norm.LayerNormBuilder

Which module to use for layer norm

core_attention() → type[megatron.core.extensions.transformer_engine.TEDotProductAttention]

Which module to use for attention

activation_func() → megatron.core.transformer.mlp.TEActivationFunctionBuilder | None

Which module to use for activation function

abstractmethod grouped_mlp_modules(
    moe_use_grouped_gemm: bool,
    moe_use_legacy_grouped_gemm: bool,
) → tuple[type, megatron.core.transformer.mlp.MLPSubmodules | megatron.core.transformer.moe.experts.TEGroupedMLPSubmodules | None]

Which module and submodules to use for grouped mlp
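Note that `LocalSpecProvider` and `InferenceSpecProvider` subclass `BackendSpecProvider` nominally, but nominal inheritance is not required: `typing.Protocol` uses structural typing, so a type checker accepts any class whose methods match. The sketch below demonstrates that property with a one-method stand-in for the protocol; `MyDotProductAttention` and `CustomBackend` are hypothetical names for illustration.

```python
from typing import Protocol


class SpecProvider(Protocol):
    """Trimmed stand-in for BackendSpecProvider; one method shown."""

    def core_attention(self) -> type: ...


class MyDotProductAttention:  # hypothetical attention module class
    pass


class CustomBackend:
    # No inheritance from SpecProvider: because Protocols are structural,
    # this class is accepted by static type checkers anywhere a
    # SpecProvider is expected, purely by matching method signatures.
    def core_attention(self) -> type:
        return MyDotProductAttention


def attention_module(provider: SpecProvider) -> type:
    return provider.core_attention()
```

This is what lets third-party backends plug into spec building without importing or subclassing the protocol itself.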