nemo_automodel.components._peft.lora#
Module Contents#
Classes#
| Class | Description |
|---|---|
| LinearLoRA | Linear + LoRA; maintains the checkpoint structure (i.e. Linear's weight/bias remain at the same FQN). |
| TritonLinearLoRA | Subclass of LinearLoRA that uses Triton kernels for the forward and backward passes. |
| LoRATritonFunction | Autograd function that calls the Triton kernel wrappers for the LoRA forward and backward passes. |
Functions#
| Function | Description |
|---|---|
| patch_linear_module | Monkey-patches an nn.Linear (the orig_linear argument) to be a LinearLoRA. |
| apply_lora_to_linear_modules | Replace selected nn.Linear layers with LinearLoRA layers (in-place). |
API#
- class nemo_automodel.components._peft.lora.PeftConfig[source]#
- target_modules: list#
‘field(…)’
- exclude_modules: list#
‘field(…)’
- match_all_linear: bool#
False
- dim: int#
8
- alpha: int#
32
- dropout: float#
0.0
- dropout_position: Literal['pre', 'post']#
‘post’
- lora_A_init: str#
‘xavier’
- lora_dtype: Optional[torch.dtype]#
None
- use_triton: bool#
False
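A minimal usage sketch, assuming PeftConfig is a dataclass that accepts the fields above as keyword arguments and is consumed by apply_lora_to_linear_modules (documented below); the toy model is only for illustration:

```python
import torch.nn as nn

from nemo_automodel.components._peft.lora import PeftConfig, apply_lora_to_linear_modules

# Hypothetical toy model, used only for illustration.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))

# Rank-8 adapter on every nn.Linear; field names mirror the attributes listed above.
cfg = PeftConfig(
    match_all_linear=True,   # patch all Linear layers instead of matching target_modules
    dim=8,
    alpha=32,
    dropout=0.0,
    dropout_position="post",
    lora_A_init="xavier",
    use_triton=False,        # pure-PyTorch LinearLoRA instead of the Triton variant
)

apply_lora_to_linear_modules(model, cfg)
```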
- class nemo_automodel.components._peft.lora.LinearLoRA(
- orig_linear,
- dim=8,
- alpha=32,
- dropout=0.0,
- dropout_position='post',
- lora_A_init_method='xavier',
- lora_dtype=None,
- )[source]#
Bases:
torch.nn.Linear
Linear + LoRA; maintains the checkpoint structure (i.e. Linear's weight/bias remain at the same FQN).
The _init_adapter and _forward helpers provide the LoRA functionality. We want to use them both inside LinearLoRA and for monkey-patching modules without repeating the same code, so they are decorated with @staticmethod.
Initialization
LinearLoRA constructor.
- Parameters:
orig_linear (nn.Module) – the linear module to augment.
dim (int) – LoRA rank; the adapter maps in_features -> dim -> out_features.
alpha (int) – LoRA scaling factor.
dropout (float) – dropout probability (default: 0.0).
dropout_position (str) – where to apply dropout relative to the LoRA pathway (choices: 'pre', 'post'; default: 'post').
lora_A_init_method (str) – initialization method for lora_A (choices: 'xavier', 'uniform').
lora_dtype (torch.dtype) – LoRA weights' dtype. By default uses orig_linear's dtype, but if orig_linear's weights are quantized, specify the dtype manually.
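A short usage sketch of wrapping an existing nn.Linear directly; passing lora_dtype is optional and shown only for illustration:

```python
import torch
import torch.nn as nn

from nemo_automodel.components._peft.lora import LinearLoRA

base = nn.Linear(512, 512)

# Wrap the existing layer; the original weight/bias keep their FQNs,
# and only the low-rank adapter parameters are new.
lora_layer = LinearLoRA(
    base,
    dim=8,
    alpha=32,
    dropout=0.1,
    dropout_position="post",
    lora_A_init_method="xavier",
    lora_dtype=torch.bfloat16,  # optional; defaults to the base layer's dtype
)

x = torch.randn(4, 512)
y = lora_layer(x)  # shape: (4, 512)
```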
- init_lora_weights(init_method: str)[source]#
Initialize the LoRA weights.
- Parameters:
init_method (str) – Method to initialize the LoRA weights.
- static _init_adapter(
- obj,
- dim=8,
- alpha=32,
- dropout=0.0,
- dropout_position='post',
- lora_A_init_method='xavier',
- lora_dtype=None,
- )[source]#
Adds LoRA weights to obj. Obj is either a LinearLoRA or an nn.Module (when monkey-patching).
- Parameters:
obj (LinearLoRA | nn.Module) – input module to adapt.
dim (int) – LoRA rank; the adapter maps in_features -> dim -> out_features.
alpha (int) – LoRA scaling factor.
dropout (float) – dropout probability (default: 0.0).
dropout_position (str) – where to apply dropout relative to the LoRA pathway (choices: 'pre', 'post'; default: 'post').
lora_A_init_method (str) – initialization method for lora_A (choices: 'xavier', 'uniform').
lora_dtype (torch.dtype) – LoRA weights' dtype. By default uses the original layer's dtype, but if its weights are quantized, specify the dtype manually.
- forward(x)[source]#
Forward pass through the original linear layer augmented with the LoRA pathway.
Applies LoRA either before or after the dropout, depending on the configuration. The result of the original linear transformation is combined with the LoRA output.
- Parameters:
x (Tensor) – Input tensor of shape (batch_size, in_features).
- Returns:
Output tensor of shape (batch_size, out_features).
- Return type:
Tensor
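Conceptually, the forward pass combines the base projection with the scaled low-rank update. The sketch below illustrates that math; the lora_a/lora_b shapes and the alpha/dim scaling are assumptions based on standard LoRA, not the exact implementation:

```python
import torch
import torch.nn.functional as F


def lora_linear_forward(x, weight, bias, lora_a, lora_b, alpha, dim,
                        dropout_p=0.0, dropout_position="post"):
    """Illustrative LoRA-augmented linear forward (not the actual implementation)."""
    # Original linear path: the untouched weight/bias.
    res = F.linear(x, weight, bias)

    # LoRA path: dropout is applied either before or after the low-rank projection.
    adapter_in = F.dropout(x, p=dropout_p) if dropout_position == "pre" else x
    lora_out = adapter_in @ lora_a @ lora_b   # in_features -> dim -> out_features
    if dropout_position == "post":
        lora_out = F.dropout(lora_out, p=dropout_p)

    # Scale the adapter contribution and add it to the base output.
    return res + (alpha / dim) * lora_out
```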
- class nemo_automodel.components._peft.lora.TritonLinearLoRA(
- orig_linear,
- dim=8,
- alpha=32,
- dropout=0.0,
- dropout_position='post',
- lora_A_init_method='xavier',
- lora_dtype=None,
- )[source]#
Bases:
nemo_automodel.components._peft.lora.LinearLoRA
Subclass of LinearLoRA that uses Triton kernels for the forward and backward passes.
- Parameters:
orig_linear (nn.Module) – the linear module to augment.
dim (int) – LoRA rank; the adapter maps in_features -> dim -> out_features.
alpha (int) – LoRA scaling factor.
dropout (float) – dropout probability (default: 0.0).
dropout_position (str) – where to apply dropout relative to the LoRA pathway (choices: 'pre', 'post'; default: 'post').
lora_A_init_method (str) – initialization method for lora_A (choices: 'xavier', 'uniform').
lora_dtype (torch.dtype) – LoRA weights' dtype. By default uses orig_linear's dtype, but if orig_linear's weights are quantized, specify the dtype manually.
Initialization
LinearLoRA constructor.
- Parameters:
orig_linear (nn.Module) – the linear module to augment.
dim (int) – LoRA rank; the adapter maps in_features -> dim -> out_features.
alpha (int) – LoRA scaling factor.
dropout (float) – dropout probability (default: 0.0).
dropout_position (str) – where to apply dropout relative to the LoRA pathway (choices: 'pre', 'post'; default: 'post').
lora_A_init_method (str) – initialization method for lora_A (choices: 'xavier', 'uniform').
lora_dtype (torch.dtype) – LoRA weights' dtype. By default uses orig_linear's dtype, but if orig_linear's weights are quantized, specify the dtype manually.
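Construction mirrors LinearLoRA. A brief sketch, assuming a CUDA device where the Triton kernels can run:

```python
import torch
import torch.nn as nn

from nemo_automodel.components._peft.lora import TritonLinearLoRA

# Wrap a GPU-resident linear layer with the Triton-backed LoRA variant.
base = nn.Linear(1024, 1024).cuda()
lora_layer = TritonLinearLoRA(base, dim=8, alpha=32)

y = lora_layer(torch.randn(2, 1024, device="cuda"))
```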
- nemo_automodel.components._peft.lora.patch_linear_module(
- orig_linear,
- dim=8,
- alpha=32,
- dropout=0.0,
- dropout_position='post',
- lora_A_init_method='xavier',
- lora_dtype=None,
- use_triton=True,
- )[source]#
Monkey-patches an nn.Linear (the orig_linear argument) to be a LinearLoRA.
The orig_linear might not contain valid weights; for example, it may have been initialized within a context manager that uses a "meta" device. In that case we cannot copy the weight/bias from orig_linear to the LinearLoRA, since they have not been allocated.
To circumvent this scenario, LinearLoRA's additional functionality (_init_adapter, _forward) is based on static functions, so that we can use them for patching or when allocating a new LinearLoRA object.
- Parameters:
orig_linear (nn.Linear) – the module we add the adapter to.
dim (int, optional) – LoRA rank. Defaults to 8.
alpha (int, optional) – LoRA scaling factor. Defaults to 32.
dropout (float, optional) – dropout probability. Defaults to 0.0.
dropout_position (str, optional) – where to apply dropout relative to the LoRA pathway (choices: 'pre', 'post'). Defaults to 'post'.
lora_A_init_method (str, optional) – lora_A initialization method. Defaults to 'xavier'.
lora_dtype (torch.dtype, optional) – LoRA weights' dtype. By default uses orig_linear's dtype, but since orig_linear might use a non-trainable dtype, you can specify the dtype manually. Defaults to None.
use_triton (bool, optional) – whether to use the Triton kernel LoRA implementation. Defaults to True.
- Returns:
the monkey-patched (nn.Linear + LoRA) nn.Module
- Return type:
(nn.Module)
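A minimal sketch of patching a single layer in place; the Attention module and its q_proj attribute are hypothetical:

```python
import torch.nn as nn

from nemo_automodel.components._peft.lora import patch_linear_module

# Hypothetical module holding a q_proj linear layer.
class Attention(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(1024, 1024)

attn = Attention()

# Replace q_proj with a LoRA-augmented version; the patched module is returned.
attn.q_proj = patch_linear_module(
    attn.q_proj,
    dim=16,
    alpha=32,
    use_triton=False,  # set True (the default) to use the Triton kernels
)
```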
- nemo_automodel.components._peft.lora.apply_lora_to_linear_modules(
- model: torch.nn.Module,
- peft_config: nemo_automodel.components._peft.lora.PeftConfig,
- )[source]#
Replace selected nn.Linear layers with LinearLoRA layers (in-place).
target_modules accepts wildcard fragments, e.g. ["q_proj", "k_proj", "*.fc.*"].
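For example, restricting the adapters to attention projections by name fragment (the Block module and its submodule names are hypothetical, and passing target_modules as a keyword argument assumes PeftConfig is a regular dataclass):

```python
import torch.nn as nn

from nemo_automodel.components._peft.lora import PeftConfig, apply_lora_to_linear_modules

# Hypothetical model with named projections.
class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(256, 256)
        self.k_proj = nn.Linear(256, 256)
        self.fc = nn.Linear(256, 1024)

model = nn.Sequential(Block(), Block())

# Only modules whose names match these fragments are replaced with LinearLoRA.
cfg = PeftConfig(target_modules=["q_proj", "k_proj"], dim=8, alpha=32)
apply_lora_to_linear_modules(model, cfg)
```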
- class nemo_automodel.components._peft.lora.LoRATritonFunction(*args, **kwargs)[source]#
Bases:
torch.autograd.Function
Autograd function that calls the Triton kernel wrappers for the LoRA forward and backward passes.
Initialization
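The Triton kernel wrappers themselves are not reproduced here; the sketch below only illustrates the generic torch.autograd.Function pattern such a class follows, with plain matmuls standing in for the kernels and an assumed (x, lora_a, lora_b, scale) signature:

```python
import torch


class LoRAFunctionSketch(torch.autograd.Function):
    """Illustrative custom autograd function for the LoRA path (not the Triton version)."""

    @staticmethod
    def forward(ctx, x, lora_a, lora_b, scale):
        # x: (n, in_features), lora_a: (in_features, dim), lora_b: (dim, out_features)
        ctx.save_for_backward(x, lora_a, lora_b)
        ctx.scale = scale
        return scale * (x @ lora_a @ lora_b)

    @staticmethod
    def backward(ctx, grad_out):
        x, lora_a, lora_b = ctx.saved_tensors
        scale = ctx.scale
        # Gradients of y = scale * x @ A @ B with respect to x, A, and B.
        grad_x = scale * grad_out @ lora_b.t() @ lora_a.t()
        grad_a = scale * x.t() @ (grad_out @ lora_b.t())
        grad_b = scale * (x @ lora_a).t() @ grad_out
        return grad_x, grad_a, grad_b, None


# Usage: LoRAFunctionSketch.apply(x, lora_a, lora_b, alpha / dim)
```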