bridge.peft.dora#

Module Contents#

Classes#

DoRA

Implements the DoRA (Weight-Decomposed Low-Rank Adaptation) module for parameter-efficient fine-tuning.

Data#

API#

bridge.peft.dora.logger#

‘getLogger(…)’

class bridge.peft.dora.DoRA#

Bases: megatron.bridge.peft.base.PEFT, megatron.bridge.peft.module_matcher.ModuleMatcher

Implements the DoRA (Weight-Decomposed Low-Rank Adaptation) module for parameter-efficient fine-tuning.

DoRA decomposes the pre-trained weight into magnitude and direction components, and applies a low-rank projection to the directional component to adapt the weights of a pre-trained model to a new downstream task. This class facilitates the application of DoRA to specific modules within the model architecture.
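For reference, the decomposition can be written as follows; the notation follows the DoRA paper rather than this module's internals, and the α/r scaling of the low-rank update is an assumption carried over from standard LoRA practice:

$$
W' = m \odot \frac{W_0 + \frac{\alpha}{r} B A}{\lVert W_0 + \frac{\alpha}{r} B A \rVert_c}
$$

where $W_0$ is the frozen pre-trained weight, $BA$ is the learned low-rank update (rank $r$ given by dim, weighted by alpha), $m$ is the learned magnitude vector, and $\lVert \cdot \rVert_c$ is the column-wise norm.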

Parameters:
  • target_modules (List[str], optional) – A list of module names to apply DoRA to. Defaults to all linear layers ['linear_qkv', 'linear_proj', 'linear_fc1', 'linear_fc2'].
    - 'linear_qkv': Apply DoRA to the fused linear layer used for query, key, and value projections in self-attention.
    - 'linear_proj': Apply DoRA to the linear layer used for projecting the output of self-attention.
    - 'linear_fc1': Apply DoRA to the first fully-connected layer in the MLP.
    - 'linear_fc2': Apply DoRA to the second fully-connected layer in the MLP.
    Target modules can also contain wildcards. For example, specify target_modules=['*.layers.0.*.linear_qkv', '*.layers.1.*.linear_qkv'] to add DoRA only to linear_qkv on the first two layers.

  • dim (int) – Dimension of the low-rank projection space. Defaults to 32.

  • alpha (int) – Weighting factor for the low-rank projection. Defaults to 64.

  • dropout (float) – Dropout rate for the low-rank projection. Defaults to 0.0.

  • dropout_position (Literal['pre', 'post'], optional) – Position for applying dropout. Can be ‘pre’ (before the low-rank projection) or ‘post’ (after). Defaults to ‘pre’.

  • lora_A_init_method (str) – Initialization method for the low-rank matrix A. Defaults to “xavier”.

  • lora_B_init_method (str) – Initialization method for the low-rank matrix B. Defaults to “zero”.
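A minimal construction sketch, assuming the dataclass-style constructor implied by the fields documented above (the import path is taken from this page's module name):

```python
from megatron.bridge.peft.dora import DoRA

# Apply DoRA only to the fused QKV projection of the first two layers,
# using the wildcard form described under target_modules.
dora = DoRA(
    target_modules=["*.layers.0.*.linear_qkv", "*.layers.1.*.linear_qkv"],
    dim=32,                  # rank of the low-rank projection space
    alpha=64,                # weighting factor for the low-rank projection
    dropout=0.0,             # no dropout on the low-rank path
    dropout_position="pre",  # dropout (if any) applied before the projection
)
```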

target_modules: List[str]#

‘field(…)’

dim: int#

32

alpha: int#

64

dropout: float#

0.0

dropout_position: Literal['pre', 'post']#

‘pre’

lora_A_init_method: str#

‘xavier’

lora_B_init_method: str#

‘zero’

__post_init__()#

Initialize attributes from parent classes and validate configuration.

transform(m: torch.nn.Module, name: Optional[str] = None, prefix: Optional[str] = None) → torch.nn.Module#

Applies DoRA to a specific module within the model architecture.

Parameters:
  • m (nn.Module) – The module to apply DoRA to.

  • name (str, optional) – Name of the module (if applicable). Defaults to None.

  • prefix (str, optional) – Prefix for the module name (if applicable). Defaults to None.

Returns:

The modified module with DoRA applied, or the original module if not a target.

Return type:

nn.Module
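The sketch below illustrates one way transform could be driven manually over a model's submodules. In normal use the PEFT base class is expected to perform this traversal itself, so the helper apply_dora_manually and the parent-replacement step are illustrative assumptions, not part of this API:

```python
import torch.nn as nn

def apply_dora_manually(model: nn.Module, dora: DoRA) -> nn.Module:
    """Walk named submodules and hand each one to DoRA.transform."""
    for full_name, module in list(model.named_modules()):
        if not full_name:  # skip the root module itself
            continue
        prefix, _, name = full_name.rpartition(".")
        wrapped = dora.transform(module, name=name, prefix=prefix or None)
        if wrapped is not module:
            # Re-attach the wrapped module on its parent (illustrative only).
            parent = model.get_submodule(prefix) if prefix else model
            setattr(parent, name, wrapped)
    return model
```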