# bridge.recipes.common

## Module Contents

### Functions
| Function | Description |
|---|---|
| `_pretrain_common` | Create a base pre-training ConfigContainer with common defaults for any language model. |
| `_sft_common` | Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults. |
| `_peft_common` | Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults. |
## API
- `bridge.recipes.common._pretrain_common() → megatron.bridge.training.config.ConfigContainer`
Create a base pre-training ConfigContainer with common defaults for any language model.
This function returns a ConfigContainer template with sensible defaults. The caller MUST set
`cfg.model` and `cfg.tokenizer.tokenizer_model` before use.

- Returns: Base configuration template for pre-training.
- Return type: megatron.bridge.training.config.ConfigContainer
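These helpers are private (underscore-prefixed) building blocks for the model-specific recipes. A minimal usage sketch, assuming the full import path `megatron.bridge.recipes.common` and placeholder values for the two required fields:

```python
# Minimal sketch: build a pre-training config from the common template.
# The import path and placeholder values are assumptions; the two
# assignments below are the fields the docstring marks as REQUIRED.
from megatron.bridge.recipes.common import _pretrain_common

cfg = _pretrain_common()
cfg.model = ...  # REQUIRED: a model provider/config for your architecture
cfg.tokenizer.tokenizer_model = "/path/to/tokenizer.model"  # REQUIRED
```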
- `bridge.recipes.common._sft_common() → megatron.bridge.training.config.ConfigContainer`
Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults.
This function returns a ConfigContainer template with sensible defaults for full SFT (not LoRA/DoRA). The caller MUST set
`cfg.model` and `cfg.tokenizer.tokenizer_model` before use.

Key differences from pre-training:
- Uses HFDatasetConfig with SQuAD as the default dataset
- Lower learning rate (5e-6), suitable for full fine-tuning
- Fewer training iterations (1000)
- Smaller batch sizes
- Supports pretrained_checkpoint loading
- No PEFT (full-parameter training)
- Returns: Base configuration template for full SFT.
- Return type: megatron.bridge.training.config.ConfigContainer
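A minimal sketch of the SFT variant. The import path and placeholder paths are assumptions, and the exact location of the pretrained-checkpoint setting (`cfg.checkpoint.pretrained_checkpoint`) is inferred from the "pretrained_checkpoint" note above, not confirmed by this page:

```python
# Minimal sketch: SFT template plus the required fields.
from megatron.bridge.recipes.common import _sft_common

cfg = _sft_common()
cfg.model = ...  # REQUIRED: a model provider/config for your architecture
cfg.tokenizer.tokenizer_model = "/path/to/tokenizer.model"  # REQUIRED
# Field location is an assumption based on the docstring note above:
cfg.checkpoint.pretrained_checkpoint = "/path/to/pretrained/checkpoint"
```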
- `bridge.recipes.common._peft_common() → megatron.bridge.training.config.ConfigContainer`
Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults.
This function returns a ConfigContainer template with sensible defaults for PEFT using LoRA. The caller MUST set
`cfg.model` and `cfg.tokenizer.tokenizer_model` before use.

Key differences from full SFT:
- Higher learning rate (1e-4), suitable for adapter training
- LoRA enabled by default with standard settings (dim=32, alpha=32)
- Targets all linear layers: linear_qkv, linear_proj, linear_fc1, linear_fc2
- Returns: Base configuration template for PEFT with LoRA.
- Return type: megatron.bridge.training.config.ConfigContainer
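A minimal sketch of the PEFT variant, again with an assumed import path and placeholder values. The LoRA defaults listed above come pre-configured, so only the required fields need setting:

```python
# Minimal sketch: PEFT template with LoRA pre-configured (dim=32, alpha=32,
# targeting linear_qkv, linear_proj, linear_fc1, linear_fc2 by default).
from megatron.bridge.recipes.common import _peft_common

cfg = _peft_common()
cfg.model = ...  # REQUIRED: a model provider/config for your architecture
cfg.tokenizer.tokenizer_model = "/path/to/tokenizer.model"  # REQUIRED
```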