nemo_automodel.components.moe.config
MoE model configuration.
Module Contents
Classes
API
Dataclass
Configuration for routed and shared MoE expert modules.
activation_alpha
activation_limit
aux_loss_coeff
dim
dtype
expert_activation
expert_bias
expert_dim
Dimension used for expert projections (latent size when set, otherwise model dim).
force_e_score_correction_bias
gate_bias_update_factor
inter_dim
moe_inter_dim
moe_latent_size
n_activated_experts
n_expert_groups
n_limited_groups
n_routed_experts
n_shared_experts
norm_topk_prob
route_scale
router_bias
score_func
shared_expert_activation
shared_expert_gate
shared_expert_inter_dim
softmax_before_topk
swiglu_limit
train_gate
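The expert_dim description above (latent size when set, otherwise model dim) can be illustrated with a minimal sketch. The class name SketchMoEConfig is hypothetical and is not the library's class. The sketch also assumes that moe_latent_size is the "latent size" and that "not set" means None; the page does not confirm either point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchMoEConfig:
    """Hypothetical stand-in with only the two fields needed for the fallback."""

    dim: int                                # model dimension
    moe_latent_size: Optional[int] = None   # assumed: None means "not set"

    @property
    def expert_dim(self) -> int:
        # Use the latent size when one is configured, otherwise the model dim.
        if self.moe_latent_size is not None:
            return self.moe_latent_size
        return self.dim


# Without a latent size, expert projections would use the model dimension.
plain = SketchMoEConfig(dim=1024)
# With a latent size, expert projections would use it instead.
latent = SketchMoEConfig(dim=1024, moe_latent_size=256)
print(plain.expert_dim, latent.expert_dim)
```

Modelling expert_dim as a read-only property keeps it derived from the other two fields, so the values cannot drift apart. Whether the real configuration does this is not stated here.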
Dataclass
Configuration for MoE load balance metrics logging.
detailed_every_steps
enabled
mode
top_k_experts