core.models.backends#
Module Contents#
Classes#
| Class | Description |
| --- | --- |
| BackendSpecProvider | A protocol for providing the submodules used in Spec building. |
| LocalSpecProvider | A protocol for providing Local submodules used in Spec building. |
| InferenceSpecProvider | A protocol for providing the submodules used in Spec building. |
API#
- class core.models.backends.BackendSpecProvider#
Bases:
typing.Protocol

A protocol for providing the submodules used in Spec building.
- abstractmethod column_parallel_linear() type#
Which column parallel linear module the backend uses
- abstractmethod row_parallel_linear() type#
Which row parallel linear module the backend uses
- abstractmethod fuse_layernorm_and_linear() bool#
Does the backend support a single module for layernorm and linear
- abstractmethod column_parallel_layer_norm_linear() Optional[type]#
Which module for sequential layernorm and linear
- abstractmethod layer_norm(rms_norm: bool = False, for_qk: bool = False) type#
Which module for layernorm
- abstractmethod core_attention() type#
Which module to use for attention
- abstractmethod grouped_mlp_modules(moe_use_grouped_gemm: bool, moe_use_legacy_grouped_gemm: bool)#
Which module and submodules to use for grouped mlp
- abstractmethod activation_func() type#
Which module to use for activation function
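The protocol only specifies *which* module classes a backend exposes; it does not instantiate them. Below is a minimal, hypothetical sketch of a class that satisfies `BackendSpecProvider` using plain `torch.nn` modules as stand-ins for a backend's real parallel and fused implementations. The class name, the module choices, and the return shape of `grouped_mlp_modules` are illustrative assumptions, not part of this module.

```python
# Hypothetical provider satisfying BackendSpecProvider with plain torch.nn
# classes. Real backends return their own parallel/fused modules; every
# concrete choice here is a placeholder for illustration only.
from typing import Optional, Tuple

import torch


class ToyBackendSpecProvider:
    """Answers which module classes the spec builder should instantiate."""

    def column_parallel_linear(self) -> type:
        return torch.nn.Linear  # stand-in for a column-parallel linear

    def row_parallel_linear(self) -> type:
        return torch.nn.Linear  # stand-in for a row-parallel linear

    def fuse_layernorm_and_linear(self) -> bool:
        return False  # this toy backend has no fused layernorm+linear module

    def column_parallel_layer_norm_linear(self) -> Optional[type]:
        return None  # only meaningful when fuse_layernorm_and_linear() is True

    def layer_norm(self, rms_norm: bool = False, for_qk: bool = False) -> type:
        # A real backend would return an RMSNorm class when rms_norm=True.
        return torch.nn.LayerNorm

    def core_attention(self) -> type:
        return torch.nn.MultiheadAttention  # stand-in attention module

    def activation_func(self) -> type:
        return torch.nn.GELU

    def grouped_mlp_modules(
        self, moe_use_grouped_gemm: bool, moe_use_legacy_grouped_gemm: bool
    ) -> Tuple[type, Optional[type]]:
        # Return shape assumed for illustration: (mlp class, optional submodules).
        return torch.nn.Linear, None
```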
- class core.models.backends.LocalSpecProvider#
Bases:
core.models.backends.BackendSpecProvider

A protocol for providing Local submodules used in Spec building.
- column_parallel_linear() type#
Which column parallel linear module the backend uses
- row_parallel_linear() type#
Which row parallel linear module the backend uses
- fuse_layernorm_and_linear() bool#
Does the backend choose a single module for layernorm and linear
- column_parallel_layer_norm_linear() Optional[type]#
Which module for sequential layernorm and linear
- layer_norm(rms_norm: bool = False, for_qk: bool = False) type#
Which module to use for layer norm
- core_attention() type#
Which module to use for attention
- grouped_mlp_modules(moe_use_grouped_gemm: bool, moe_use_legacy_grouped_gemm: bool)#
Which module and submodules to use for grouped mlp
- activation_func() type#
Which module to use for activation function
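A provider such as `LocalSpecProvider` is consumed by spec-building code that asks for module classes and composes them. The sketch below is a hypothetical consumer; the function name and the dictionary keys are made up for illustration and are not part of this module.

```python
# Hypothetical consumer of a BackendSpecProvider: ask the provider which
# classes to use and defer instantiation to the spec builder.
def collect_layer_submodules(provider) -> dict:
    submodules = {
        "core_attention": provider.core_attention(),
        "activation": provider.activation_func(),
        "output_linear": provider.row_parallel_linear(),
    }
    if provider.fuse_layernorm_and_linear():
        # The backend offers a single fused module for layernorm + linear.
        submodules["qkv_projection"] = provider.column_parallel_layer_norm_linear()
    else:
        # Otherwise compose a separate layernorm with a column-parallel linear.
        submodules["input_layernorm"] = provider.layer_norm(rms_norm=True)
        submodules["qkv_projection"] = provider.column_parallel_linear()
    return submodules


# Works with any conforming provider, e.g. the toy provider sketched above:
# layer_classes = collect_layer_submodules(ToyBackendSpecProvider())
```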
- class core.models.backends.InferenceSpecProvider#
Bases:
core.models.backends.BackendSpecProvider

A protocol for providing the submodules used in Spec building.
- linear() type#
Which linear module the TE backend uses
- column_parallel_linear() type#
Which column parallel linear module the TE backend uses
- row_parallel_linear() type#
Which row parallel linear module the TE backend uses
- fuse_layernorm_and_linear() bool#
Whether the TE backend chooses a single module for layernorm and linear
- column_parallel_layer_norm_linear() Optional[type]#
Which module for sequential layernorm and linear
- layer_norm(rms_norm: bool = False, for_qk: bool = False) type#
Which module to use for layer norm
- core_attention() type#
Which module to use for attention
- activation_func() type#
Which module to use for activation function
- abstractmethod grouped_mlp_modules(moe_use_grouped_gemm: bool, moe_use_legacy_grouped_gemm: bool)#
Which module and submodules to use for grouped mlp