bridge.recipes.common#
Module Contents#
Functions#
| Function | Description |
|---|---|
| `_pretrain_common` | Create a base pre-training ConfigContainer with common defaults for any language model. |
| `_sft_common` | Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults. |
| `_peft_common` | Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults. |
| `_sft_common_vlm` | Create a base SFT ConfigContainer with common defaults for Vision-Language Models. |
| `_peft_common_vlm` | Create a base PEFT ConfigContainer with LoRA defaults for Vision-Language Models. |
API#
- bridge.recipes.common._pretrain_common() → megatron.bridge.training.config.ConfigContainer#
Create a base pre-training ConfigContainer with common defaults for any language model.
This function returns a ConfigContainer template with sensible defaults. The caller MUST set `cfg.model` and `cfg.tokenizer.tokenizer_model` before use.

- Returns:
  Base configuration template for pre-training.
- Return type:
  megatron.bridge.training.config.ConfigContainer
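A minimal usage sketch of the pattern described above (assuming the module resolves to `megatron.bridge.recipes.common`; the model provider is left as a placeholder since this page does not name one):

```python
from megatron.bridge.recipes.common import _pretrain_common

cfg = _pretrain_common()

# The two fields the caller MUST set before use:
cfg.model = ...  # substitute your model provider config
cfg.tokenizer.tokenizer_model = "/path/to/tokenizer.model"
```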
- bridge.recipes.common._sft_common() → megatron.bridge.training.config.ConfigContainer#
Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults.
This function returns a ConfigContainer template with sensible defaults for full SFT (not LoRA/DoRA). The caller MUST set `cfg.model` and `cfg.tokenizer.tokenizer_model` before use.

Key differences from pre-training:

- Uses HFDatasetConfig with SQuAD as the default dataset
- Lower learning rate (5e-6) suitable for full fine-tuning
- Fewer training iterations (1000)
- Smaller batch sizes
- Supports pretrained_checkpoint loading
- No PEFT (full parameter training)

- Returns:
  Base configuration template for full SFT.
- Return type:
  megatron.bridge.training.config.ConfigContainer
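A sketch of adapting the SFT template (module path as above; the `checkpoint.pretrained_checkpoint` and `optimizer.lr` attribute paths are assumptions about ConfigContainer's layout, flagged again in the comments):

```python
from megatron.bridge.recipes.common import _sft_common

cfg = _sft_common()
cfg.model = ...  # required: your model provider
cfg.tokenizer.tokenizer_model = "/path/to/tokenizer.model"  # required

# pretrained_checkpoint loading is documented as supported; the exact
# attribute path below is an assumption about ConfigContainer's layout.
cfg.checkpoint.pretrained_checkpoint = "/path/to/base/checkpoint"

# 5e-6 is the documented full-SFT default; the attribute path is
# likewise an assumption. Raise or lower as needed.
cfg.optimizer.lr = 5e-6
```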
- bridge.recipes.common._peft_common() → megatron.bridge.training.config.ConfigContainer#
Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults.
This function returns a ConfigContainer template with sensible defaults for PEFT using LoRA. The caller MUST set `cfg.model` and `cfg.tokenizer.tokenizer_model` before use.

Key differences from full SFT:

- Higher learning rate (1e-4) suitable for adapter training
- LoRA enabled by default with standard settings (dim=32, alpha=32)
- Targets all linear layers: linear_qkv, linear_proj, linear_fc1, linear_fc2

- Returns:
  Base configuration template for PEFT with LoRA.
- Return type:
  megatron.bridge.training.config.ConfigContainer
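A sketch of tweaking the LoRA defaults (dim=32, alpha=32, and the four target modules are documented above; the `cfg.peft` attribute path is an assumption):

```python
from megatron.bridge.recipes.common import _peft_common

cfg = _peft_common()
cfg.model = ...  # required: your model provider
cfg.tokenizer.tokenizer_model = "/path/to/tokenizer.model"  # required

# Documented defaults: dim=32, alpha=32, targeting linear_qkv, linear_proj,
# linear_fc1, linear_fc2. The `cfg.peft` path is an assumption.
cfg.peft.dim = 16  # e.g., a smaller adapter rank
cfg.peft.target_modules = ["linear_qkv", "linear_proj"]
```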
- bridge.recipes.common._sft_common_vlm() → megatron.bridge.training.config.ConfigContainer#
Create a base SFT ConfigContainer with common defaults for Vision-Language Models.
This function builds on `_sft_common()` and overrides VLM-specific settings. The caller MUST set `cfg.model` and `cfg.dataset.hf_processor_path` before use.

Key differences from LLM SFT (`_sft_common`):

- Uses HFDatasetConversationProvider with HuggingFace datasets (e.g., CORD-v2)
- Uses NullTokenizer (VLMs use a processor instead of a tokenizer)
- DDP config optimized for VLM training (no grad/param overlap)
- Supports freeze options for language_model, vision_model, vision_projection
- Different training defaults (train_iters=300000, GBS=32, MBS=2)
- Different RNG seed (1234)

- Returns:
  Base configuration template for VLM full SFT.
- Return type:
  megatron.bridge.training.config.ConfigContainer
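A sketch for the VLM SFT template (module path as above; note the required field is `cfg.dataset.hf_processor_path` rather than a tokenizer model, and the freeze-flag paths are assumptions):

```python
from megatron.bridge.recipes.common import _sft_common_vlm

cfg = _sft_common_vlm()
cfg.model = ...  # required: your VLM model provider
cfg.dataset.hf_processor_path = "org/model-name"  # required: HF processor id or local path

# Freeze options for language_model, vision_model, and vision_projection are
# documented as supported; the exact attribute paths are assumptions:
# cfg.model.freeze_language_model = False
# cfg.model.freeze_vision_model = True
# cfg.model.freeze_vision_projection = False
```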
- bridge.recipes.common._peft_common_vlm() → megatron.bridge.training.config.ConfigContainer#
Create a base PEFT ConfigContainer with LoRA defaults for Vision-Language Models.
This function builds on `_peft_common()` and overrides VLM-specific settings. The caller MUST set `cfg.model` and `cfg.dataset.hf_processor_path` before use.

Key differences from LLM PEFT (`_peft_common`):

- Uses HFDatasetConversationProvider with HuggingFace datasets (e.g., CORD-v2)
- Uses NullTokenizer (VLMs use a processor instead of a tokenizer)
- DDP config optimized for VLM training (no grad/param overlap)
- Supports freeze options for language_model, vision_model, vision_projection
- Different training defaults (train_iters=300000, GBS=32, MBS=2)
- Different RNG seed (1234)
- Higher LR (1e-4) for adapter training

- Returns:
  Base configuration template for VLM PEFT with LoRA.
- Return type:
  megatron.bridge.training.config.ConfigContainer
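And the VLM PEFT variant, which combines the VLM overrides with the LoRA defaults (again, any override path beyond the two required fields is an assumption):

```python
from megatron.bridge.recipes.common import _peft_common_vlm

cfg = _peft_common_vlm()
cfg.model = ...  # required: your VLM model provider
cfg.dataset.hf_processor_path = "org/model-name"  # required

# Documented defaults: LoRA dim=32/alpha=32, LR=1e-4, train_iters=300000,
# GBS=32, MBS=2, RNG seed=1234. Override paths are assumptions, e.g.:
# cfg.train.train_iters = 10_000
```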