nemo_microservices.types.customization.lora_parameters
Module Contents
Classes
API
- class nemo_microservices.types.customization.lora_parameters.LoraParameters(/, **data: Any)
- Bases: nemo_microservices._models.BaseModel
 - adapter_dim: int | None
- None - Size of adapter layers added throughout the model. - This is the size of the tunable layers that LoRA adds to various transformer blocks in the base model. This parameter is a power of 2. 
 - adapter_dropout: float | None
- None - Dropout probability in the adapter layer. 
 - alpha: int | None
- None - Scaling factor for the LoRA update. - Controls the magnitude of the low-rank approximation. A higher alpha value increases the impact of the LoRA weights, effectively amplifying the changes made to the original model. Alpha needs careful tuning: too small a value can leave the model under-adapted, and too large a value can cause overfitting. It is often set to a multiple of the adapter dimension (adapter_dim).
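As a rough illustration of how alpha and adapter_dim interact, the sketch below assumes the common LoRA convention in which the low-rank update is scaled by alpha / adapter_dim before it is added to the frozen weights. Whether this service applies exactly that scaling is an assumption, and the helper names (lora_scale, is_power_of_two) and the numeric values are hypothetical.

```python
def lora_scale(alpha: int, adapter_dim: int) -> float:
    """Scale applied to the low-rank update under the alpha / rank convention (assumed)."""
    if adapter_dim <= 0:
        raise ValueError("adapter_dim must be positive")
    return alpha / adapter_dim


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of 2, as the adapter_dim description expects."""
    return n > 0 and (n & (n - 1)) == 0


# Hypothetical values: alpha chosen as a multiple of adapter_dim.
adapter_dim = 16
for alpha in (16, 32, 64):
    assert is_power_of_two(adapter_dim)
    print(f"alpha={alpha} adapter_dim={adapter_dim} scale={lora_scale(alpha, adapter_dim)}")
```

Under this assumption, doubling alpha while keeping adapter_dim fixed doubles the weight given to the adapter's update, which is one reason alpha is often picked as a multiple of adapter_dim.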
 - target_modules: List[str] | None
- None - Target specific layers in the model architecture to apply LoRA. - A subset of the layers is selected by default, but specific layers can also be selected. For example:
- linear_qkv: Apply LoRA to the fused linear layer used for query, key, and value projections in self-attention.
- linear_proj: Apply LoRA to the linear layer used for projecting the output of self-attention.
- linear_fc1: Apply LoRA to the first fully-connected layer in MLP.
- linear_fc2: Apply LoRA to the second fully-connected layer in MLP.
- *_proj: Apply LoRA to all layers used for projecting the output of self-attention.
- Target modules can also contain wildcards. For example, you can specify target_modules=['*.layers.0.*.linear_qkv', '*.layers.1.*.linear_qkv'] to add LoRA to only linear_qkv on the first two layers.
 - Our framework supports only a fused LoRA implementation; canonical LoRA is not supported.
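To make the wildcard form of target_modules more concrete, here is a minimal sketch that uses Python's standard fnmatch module to show which module paths the two patterns from the example would select. The module paths are hypothetical, and using shell-style matching is an assumption for illustration; the framework's actual matching logic is not described here and may differ.

```python
from fnmatch import fnmatchcase

# Hypothetical fully qualified module names in a transformer model.
module_names = [
    "decoder.layers.0.self_attention.linear_qkv",
    "decoder.layers.0.self_attention.linear_proj",
    "decoder.layers.1.self_attention.linear_qkv",
    "decoder.layers.2.self_attention.linear_qkv",
    "decoder.layers.2.mlp.linear_fc1",
]

# Patterns taken from the target_modules example above.
target_modules = ["*.layers.0.*.linear_qkv", "*.layers.1.*.linear_qkv"]


def select_modules(names, patterns):
    """Return the names that match at least one shell-style pattern."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


selected = select_modules(module_names, target_modules)
for name in selected:
    print(name)
```

With these hypothetical names, only the linear_qkv modules in layers 0 and 1 are selected; linear_proj in layer 0 and everything in layer 2 are left untouched.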