core.models.backends#

Module Contents#

Classes#

BackendSpecProvider

A protocol for providing the submodules used in Spec building.

LocalSpecProvider

A protocol for providing Local submodules used in Spec building.

InferenceSpecProvider

A protocol for providing the submodules used in Spec building.

API#

class core.models.backends.BackendSpecProvider#

Bases: typing.Protocol

A protocol for providing the submodules used in Spec building.

abstractmethod column_parallel_linear() → type#

Which column parallel linear module the backend uses

abstractmethod row_parallel_linear() → type#

Which row parallel linear module the backend uses

abstractmethod fuse_layernorm_and_linear() → bool#

Does the backend support a single module for layernorm and linear

abstractmethod column_parallel_layer_norm_linear() → Optional[type]#

Which module for sequential layernorm and linear

abstractmethod layer_norm(rms_norm: bool = False, for_qk: bool = False) → type#

Which module for layernorm

abstractmethod core_attention() → type#

Which module to use for attention

abstractmethod grouped_mlp_modules(
moe_use_grouped_gemm: bool,
moe_use_legacy_grouped_gemm: bool,
) → Tuple[type, Optional[megatron.core.transformer.mlp.MLPSubmodules]]#

Which module and submodules to use for grouped mlp

abstractmethod activation_func() → type#

Which module to use for activation function
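
To make the protocol concrete, here is a minimal sketch of a custom backend that satisfies BackendSpecProvider structurally. All of the My* module classes are hypothetical placeholders standing in for a backend's real nn.Module implementations; only the method names and signatures come from the protocol documented above.

```python
from typing import Optional, Tuple

from megatron.core.transformer.mlp import MLPSubmodules


# Hypothetical placeholder modules; a real backend would return its own classes.
class MyColumnParallelLinear: ...
class MyRowParallelLinear: ...
class MyLayerNorm: ...
class MyRMSNorm: ...
class MyDotProductAttention: ...
class MyGroupedMLP: ...
class MyGELU: ...


class MyBackendSpecProvider:
    """Satisfies BackendSpecProvider structurally (Protocol: no inheritance required)."""

    def column_parallel_linear(self) -> type:
        return MyColumnParallelLinear

    def row_parallel_linear(self) -> type:
        return MyRowParallelLinear

    def fuse_layernorm_and_linear(self) -> bool:
        # This backend keeps layernorm and linear as separate modules.
        return False

    def column_parallel_layer_norm_linear(self) -> Optional[type]:
        # No fused layernorm+linear module is available.
        return None

    def layer_norm(self, rms_norm: bool = False, for_qk: bool = False) -> type:
        return MyRMSNorm if rms_norm else MyLayerNorm

    def core_attention(self) -> type:
        return MyDotProductAttention

    def grouped_mlp_modules(
        self,
        moe_use_grouped_gemm: bool,
        moe_use_legacy_grouped_gemm: bool,
    ) -> Tuple[type, Optional[MLPSubmodules]]:
        # Placeholder: one grouped-MLP class, no extra submodules.
        return MyGroupedMLP, None

    def activation_func(self) -> type:
        return MyGELU
```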

class core.models.backends.LocalSpecProvider#

Bases: core.models.backends.BackendSpecProvider

A protocol for providing Local submodules used in Spec building.

column_parallel_linear() → type#

Which column parallel linear module the backend uses

row_parallel_linear() → type#

Which row parallel linear module the backend uses

fuse_layernorm_and_linear() → bool#

Does the backend choose a single module for layernorm and linear

column_parallel_layer_norm_linear() → Optional[type]#

Which module for sequential layernorm and linear

layer_norm(rms_norm: bool = False, for_qk: bool = False) → type#

Which module to use for layer norm

core_attention() → type#

Which module to use for attention

grouped_mlp_modules(
moe_use_grouped_gemm: bool,
moe_use_legacy_grouped_gemm: bool,
) → Tuple[type, Optional[megatron.core.transformer.mlp.MLPSubmodules]]#

Which module and submodules to use for grouped mlp

activation_func() → type#

Which module to use for activation function
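
As a hedged usage sketch (the helper below is illustrative, not part of this module), a spec-building routine typically branches on fuse_layernorm_and_linear() to decide whether the pre-attention layernorm is folded into the QKV projection. The import path follows the module name documented above.

```python
from typing import Optional, Tuple

from core.models.backends import BackendSpecProvider, LocalSpecProvider


def pick_input_layernorm_and_qkv(
    backend: BackendSpecProvider,
) -> Tuple[Optional[type], type]:
    """Illustrative helper: choose modules for the input layernorm and QKV projection.

    If the backend fuses layernorm and linear, use its fused column-parallel
    module and skip the standalone layernorm; otherwise fall back to separate
    layernorm and column-parallel linear modules.
    """
    if backend.fuse_layernorm_and_linear():
        fused = backend.column_parallel_layer_norm_linear()
        assert fused is not None, "backend reports fusion but provides no fused module"
        return None, fused
    return backend.layer_norm(rms_norm=False), backend.column_parallel_linear()


# Hedged usage example with the local provider documented above:
norm_cls, qkv_cls = pick_input_layernorm_and_qkv(LocalSpecProvider())
```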

class core.models.backends.InferenceSpecProvider#

Bases: core.models.backends.BackendSpecProvider

A protocol for providing the submodules used in Spec building.

linear() → type#

Which linear module the TE backend uses

column_parallel_linear() → type#

Which column parallel linear module the TE backend uses

row_parallel_linear() → type#

Which row parallel linear module the TE backend uses

fuse_layernorm_and_linear() → bool#

The TE backend chooses a single module for layernorm and linear

column_parallel_layer_norm_linear() → Optional[type]#

Which module for sequential layernorm and linear

layer_norm(rms_norm: bool = False, for_qk: bool = False) → type#

Which module to use for layer norm

core_attention() → type#

Which module to use for attention

activation_func() → type#

Which module to use for activation function

abstractmethod grouped_mlp_modules(
moe_use_grouped_gemm: bool,
moe_use_legacy_grouped_gemm: bool,
) → Tuple[type, Optional[megatron.core.transformer.mlp.MLPSubmodules]]#
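
Relative to the base protocol, the inference provider additionally exposes a plain linear() module. Below is a hedged sketch of a caller that prefers that module when present; the helper name and the hasattr-based duck-typing check are illustrative assumptions, not part of the module.

```python
from core.models.backends import BackendSpecProvider


def pick_plain_linear(backend: BackendSpecProvider) -> type:
    """Illustrative helper: prefer a plain linear module when the backend
    provides one (as InferenceSpecProvider does via linear()); otherwise
    fall back to the column-parallel linear module."""
    if hasattr(backend, "linear"):
        return backend.linear()
    return backend.column_parallel_linear()


# In practice a concrete subclass of InferenceSpecProvider (one that also
# implements grouped_mlp_modules) would be passed to this helper.
```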