nemo_automodel.components._peft.lora_experts#

Module Contents#

Classes#

GroupedExpertsLoRA

GroupedExperts + LoRA.

GroupedExpertsDeepEPLoRA

GroupedExpertsDeepEP + LoRA.

Functions#

_to_local

Convert DTensor to local tensor, or return as-is.

API#

nemo_automodel.components._peft.lora_experts._to_local(proj)#

Convert DTensor to local tensor, or return as-is.

class nemo_automodel.components._peft.lora_experts.GroupedExpertsLoRA(
orig_module: nemo_automodel.components.moe.experts.GroupedExperts,
lora_dim=8,
alpha=32,
lora_A_init_method='xavier',
lora_dtype=None,
)#

Bases: nemo_automodel.components.moe.experts.GroupedExperts

GroupedExperts + LoRA.

This class wraps GroupedExperts to apply LoRA to the expert weights.

.. attribute:: lora_dim

Rank of the LoRA adapter.

Type:

int

.. attribute:: scale

Scaling factor for the LoRA adapter (alpha / lora_dim).

Type:

float

.. attribute:: lora_gate_and_up_A

LoRA A matrix for gate and up projections.

Type:

nn.Parameter

.. attribute:: lora_gate_and_up_B

LoRA B matrix for gate and up projections.

Type:

nn.Parameter

.. attribute:: lora_down_A

LoRA A matrix for down projection.

Type:

nn.Parameter

.. attribute:: lora_down_B

LoRA B matrix for down projection.

Type:

nn.Parameter

Initialization

Initializes the GroupedExpertsLoRA module by wrapping an existing GroupedExperts instance and attaching LoRA adapters to its expert weights.

Parameters:
  • orig_module – The GroupedExperts module to wrap.

  • lora_dim – Rank of the LoRA adapters. Defaults to 8.

  • alpha – LoRA scaling numerator; the applied scale is alpha / lora_dim. Defaults to 32.

  • lora_A_init_method – Initialization method for the LoRA A matrices ('xavier' or 'kaiming'). Defaults to 'xavier'.

  • lora_dtype – Optional dtype for the LoRA parameters.

The forward path follows the wrapped module's backend configuration: when backend.experts == "torch_mm", torch._grouped_mm is used instead of the per-expert loop.
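The adapter arithmetic implied by lora_dim and alpha can be sketched in plain Python. The matrix shapes and the zero-initialized B factor follow common LoRA practice and are illustrative assumptions, not taken from this module's source:

```python
import random

lora_dim, alpha = 8, 32
scale = alpha / lora_dim          # the `scale` attribute: alpha / lora_dim

d_in, d_out = 4, 6
random.seed(0)

# Frozen base projection weight W: (d_in, d_out).
W = [[random.gauss(0, 0.02) for _ in range(d_out)] for _ in range(d_in)]
# LoRA factors: A maps d_in -> lora_dim, B maps lora_dim -> d_out.
A = [[random.gauss(0, 0.02) for _ in range(lora_dim)] for _ in range(d_in)]
B = [[0.0] * d_out for _ in range(lora_dim)]  # zero init: adapter starts as a no-op

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_projection(x, W, A, B, scale):
    # y = x @ W + scale * (x @ A) @ B  -- frozen base path plus low-rank update
    base = matmul(x, W)
    delta = matmul(matmul(x, A), B)
    return [[b + scale * d for b, d in zip(br, dr)] for br, dr in zip(base, delta)]

x = [[1.0] * d_in]
# With B all zeros, the adapted projection equals the base projection.
assert lora_projection(x, W, A, B, scale) == matmul(x, W)
```

Because the delta is rank-lora_dim, the adapter adds only (d_in + d_out) * lora_dim trainable values per projection instead of d_in * d_out.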

static _init_adapter(
obj,
lora_dim=8,
alpha=32,
lora_A_init_method='xavier',
lora_dtype=None,
)#
init_lora_weights(init_method)#

Initialize LoRA weights.

IMPORTANT: This method is called by the PEFT framework's _init_peft_adapters after the model is materialized from the meta device onto the target device. The method name is critical: it serves as a hook for the framework. Do not rename or remove this method.

Parameters:

init_method (str) – Initialization method ('xavier' or 'kaiming').
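The two supported methods can be approximated in plain Python; the module itself presumably delegates to torch.nn.init, so the uniform bounds below (xavier: sqrt(6 / (fan_in + fan_out)); kaiming: sqrt(6 / fan_in)) are a sketch of the standard formulas rather than the exact implementation:

```python
import math
import random

def init_lora_A(fan_in, fan_out, init_method, seed=0):
    """Fill a (fan_in, fan_out) LoRA A matrix with a uniform initializer.

    Sketch of the two supported schemes using the standard uniform bounds:
      xavier:  U(-b, b) with b = sqrt(6 / (fan_in + fan_out))
      kaiming: U(-b, b) with b = sqrt(6 / fan_in)
    """
    if init_method == "xavier":
        bound = math.sqrt(6.0 / (fan_in + fan_out))
    elif init_method == "kaiming":
        bound = math.sqrt(6.0 / fan_in)
    else:
        raise ValueError(f"unknown init_method: {init_method}")
    rng = random.Random(seed)
    return [[rng.uniform(-bound, bound) for _ in range(fan_out)]
            for _ in range(fan_in)]
```

In common LoRA practice only the A matrices get a random initializer while the B matrices start at zero, so the adapters are a no-op at step 0; whether this module follows that convention is an assumption.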

forward(
x: torch.Tensor,
token_mask: torch.Tensor,
weights: torch.Tensor,
indices: torch.Tensor,
)#

Forward pass for GroupedExpertsLoRA with LoRA injection.

Mirrors GroupedExperts.forward but injects LoRA computations into the expert processing at the projection level.
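A minimal routing loop with the LoRA delta injected at both projections might look like the following. The SiLU activation, top-1 routing, and all shapes are illustrative assumptions; the real forward handles top-k routing, token masks, and batched tensors:

```python
import math

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def silu(v):
    return v / (1.0 + math.exp(-v))

def expert_forward(x, W_up, A_up, B_up, W_down, A_down, B_down, scale):
    # Up projection with LoRA: h = x @ W_up + scale * (x @ A_up) @ B_up
    h = [[b + scale * d for b, d in zip(br, dr)]
         for br, dr in zip(matmul(x, W_up), matmul(matmul(x, A_up), B_up))]
    h = [[silu(v) for v in row] for row in h]  # activation (assumed SiLU)
    # Down projection with LoRA, same base-plus-delta pattern.
    return [[b + scale * d for b, d in zip(br, dr)]
            for br, dr in zip(matmul(h, W_down), matmul(matmul(h, A_down), B_down))]

def moe_forward(x, indices, weights, experts, scale):
    # Top-1 routing sketch: token t goes to experts[indices[t]],
    # and its output is scaled by the router weight weights[t].
    out = []
    for t, row in enumerate(x):
        e = experts[indices[t]]
        y = expert_forward([row], *e, scale=scale)[0]
        out.append([weights[t] * v for v in y])
    return out

# One expert with identity base weights and zero-initialized LoRA B factors:
I2 = [[1.0, 0.0], [0.0, 1.0]]
zeroB = [[0.0, 0.0]]
expert = (I2, [[1.0], [0.0]], zeroB, I2, [[0.0], [1.0]], zeroB)
y = moe_forward([[1.0, -1.0]], [0], [1.0], [expert], scale=4.0)
```

With zero B matrices the LoRA terms vanish, so `y` here is just SiLU applied elementwise to the input token.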

_forward_loop(
x,
weights,
indices,
token_mask,
gate_and_up_projs,
down_projs,
lora_gate_and_up_A,
lora_gate_and_up_B,
lora_down_A,
lora_down_B,
n_local_experts,
experts_start_idx,
experts_end_idx,
)#

Per-expert loop forward path with LoRA injection.

_forward_grouped_mm(
x,
token_mask,
weights,
indices,
gate_and_up_projs,
down_projs,
lora_gate_and_up_A,
lora_gate_and_up_B,
lora_down_A,
lora_down_B,
n_local_experts,
experts_start_idx,
)#

Grouped GEMM forward path with LoRA injection using torch._grouped_mm.
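The grouped-GEMM path avoids a Python-level loop by ordering tokens so each expert's inputs are contiguous, then issuing the per-expert multiplies as one grouped kernel. A pure-Python sketch of just the permutation bookkeeping (torch._grouped_mm itself fuses the per-group multiplies; the helper name here is hypothetical):

```python
def group_tokens_by_expert(indices, n_experts):
    """Return (perm, group_sizes) for a grouped-GEMM dispatch.

    perm[i] is the original position of the i-th token in expert-sorted
    order (stable, so ties keep their original order); group_sizes[e] is
    how many tokens were routed to expert e. Gathering tokens with perm
    yields the contiguous per-expert batches a grouped GEMM consumes.
    """
    perm = sorted(range(len(indices)), key=lambda t: indices[t])
    group_sizes = [0] * n_experts
    for e in indices:
        group_sizes[e] += 1
    return perm, group_sizes

# Four tokens routed to experts [2, 0, 1, 0] across 3 local experts:
perm, sizes = group_tokens_by_expert([2, 0, 1, 0], 3)
# perm == [1, 3, 2, 0]  (expert 0's tokens first, then 1's, then 2's)
# sizes == [2, 1, 1]
```

The LoRA deltas reuse the same grouping: once tokens are expert-contiguous, the (x @ A) @ B products can be batched per expert exactly like the base projections.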

class nemo_automodel.components._peft.lora_experts.GroupedExpertsDeepEPLoRA(
orig_module: nemo_automodel.components.moe.experts.GroupedExpertsDeepEP,
lora_dim=8,
alpha=32,
lora_A_init_method='xavier',
lora_dtype=None,
)#

Bases: nemo_automodel.components.moe.experts.GroupedExpertsDeepEP

GroupedExpertsDeepEP + LoRA.

This class wraps GroupedExpertsDeepEP to apply LoRA to the expert weights using DeepEP kernels.

.. attribute:: lora_dim

Rank of the LoRA adapter.

Type:

int

.. attribute:: scale

Scaling factor for the LoRA adapter (alpha / lora_dim).

Type:

float

.. attribute:: lora_gate_and_up_A

LoRA A matrix for gate and up projections.

Type:

nn.Parameter

.. attribute:: lora_gate_and_up_B

LoRA B matrix for gate and up projections.

Type:

nn.Parameter

.. attribute:: lora_down_A

LoRA A matrix for down projection.

Type:

nn.Parameter

.. attribute:: lora_down_B

LoRA B matrix for down projection.

Type:

nn.Parameter

Initialization

Initializes the GroupedExpertsDeepEPLoRA module by wrapping an existing GroupedExpertsDeepEP instance and attaching LoRA adapters to its expert weights.

Parameters:
  • orig_module – The GroupedExpertsDeepEP module to wrap.

  • lora_dim – Rank of the LoRA adapters. Defaults to 8.

  • alpha – LoRA scaling numerator; the applied scale is alpha / lora_dim. Defaults to 32.

  • lora_A_init_method – Initialization method for the LoRA A matrices ('xavier' or 'kaiming'). Defaults to 'xavier'.

  • lora_dtype – Optional dtype for the LoRA parameters.

The forward path follows the wrapped module's backend configuration: when backend.experts == "torch_mm", torch._grouped_mm is used; otherwise grouped_gemm.ops.gmm.

static _init_adapter(
obj,
lora_dim=8,
alpha=32,
lora_A_init_method='xavier',
lora_dtype=None,
)#
init_lora_weights(init_method)#

Initialize LoRA weights.

IMPORTANT: This method is called by the PEFT framework's _init_peft_adapters after the model is materialized from the meta device onto the target device. The method name is critical: it serves as a hook for the framework. Do not rename or remove this method.

Parameters:

init_method (str) – Initialization method ('xavier' or 'kaiming').

forward(
x: torch.Tensor,
token_mask: torch.Tensor,
weights: torch.Tensor,
indices: torch.Tensor,
)#

Forward pass for GroupedExpertsDeepEPLoRA with LoRA injection.

Mirrors GroupedExpertsDeepEP.forward but injects LoRA computations into the expert processing at the projection level.