bridge.recipes.common#

Module Contents#

Functions#

_pretrain_common

Create a base pre-training ConfigContainer with common defaults for any language model.

_sft_common

Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults.

_peft_common

Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults.

API#

bridge.recipes.common._pretrain_common() → megatron.bridge.training.config.ConfigContainer#

Create a base pre-training ConfigContainer with common defaults for any language model.

This function returns a ConfigContainer template with sensible defaults. The caller MUST set cfg.model and cfg.tokenizer.tokenizer_model before use.

Returns:

Base configuration template for pre-training.

Return type:

ConfigContainer
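A minimal usage sketch for a downstream recipe, assuming the helper is imported from this module. Only cfg.model and cfg.tokenizer.tokenizer_model are documented here; the function name my_pretrain_config and the placeholder values are illustrative:

```python
# Sketch of a recipe built on the shared pre-training template.
# Assumes Megatron Bridge is installed; the attribute names come from
# this page, everything else is a placeholder.
from megatron.bridge.recipes.common import _pretrain_common

def my_pretrain_config():
    cfg = _pretrain_common()
    # The template leaves these unset; the caller MUST provide them.
    cfg.model = ...                      # model provider/config for your LM
    cfg.tokenizer.tokenizer_model = ...  # tokenizer name or path
    return cfg
```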

bridge.recipes.common._sft_common() → megatron.bridge.training.config.ConfigContainer#

Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults.

This function returns a ConfigContainer template with sensible defaults for full SFT (not LoRA/DoRA). The caller MUST set cfg.model and cfg.tokenizer.tokenizer_model before use.

Key differences from pre-training:

  • Uses HFDatasetConfig with SQuAD as the default dataset

  • Lower learning rate (5e-6) suitable for full fine-tuning

  • Fewer training iterations (1000)

  • Smaller batch sizes

  • Supports pretrained_checkpoint loading

  • No PEFT (full parameter training)

Returns:

Base configuration template for full SFT.

Return type:

ConfigContainer
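A hedged sketch of a full-SFT recipe. The section above notes that SFT supports pretrained_checkpoint loading, but the exact attribute path (cfg.checkpoint.pretrained_checkpoint below) is an assumption; only cfg.model and cfg.tokenizer.tokenizer_model are documented here:

```python
# Sketch of a full-SFT recipe built on the shared template.
# Assumes Megatron Bridge is installed; cfg.checkpoint.pretrained_checkpoint
# is a guess at where the documented pretrained_checkpoint support lives.
from megatron.bridge.recipes.common import _sft_common

def my_sft_config(checkpoint_path):
    cfg = _sft_common()
    cfg.model = ...                      # model provider/config (required)
    cfg.tokenizer.tokenizer_model = ...  # tokenizer name or path (required)
    # SFT typically resumes from a pre-trained checkpoint.
    cfg.checkpoint.pretrained_checkpoint = checkpoint_path
    return cfg
```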

bridge.recipes.common._peft_common() → megatron.bridge.training.config.ConfigContainer#

Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults.

This function returns a ConfigContainer template with sensible defaults for PEFT using LoRA. The caller MUST set cfg.model and cfg.tokenizer.tokenizer_model before use.

Key differences from full SFT:

  • Higher learning rate (1e-4) suitable for adapter training

  • LoRA enabled by default with standard settings (dim=32, alpha=32)

  • Targets all linear layers: linear_qkv, linear_proj, linear_fc1, linear_fc2

Returns:

Base configuration template for PEFT with LoRA.

Return type:

ConfigContainer
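A hedged sketch of a PEFT recipe that overrides the LoRA defaults described above. The cfg.peft attribute and its dim/alpha field names are assumptions inferred from the stated defaults (dim=32, alpha=32), not confirmed by this page:

```python
# Sketch of a PEFT recipe that narrows the default LoRA settings.
# Assumes Megatron Bridge is installed; cfg.peft.dim / cfg.peft.alpha
# are assumed field names based on the defaults documented above.
from megatron.bridge.recipes.common import _peft_common

def my_peft_config():
    cfg = _peft_common()
    cfg.model = ...                      # model provider/config (required)
    cfg.tokenizer.tokenizer_model = ...  # tokenizer name or path (required)
    # Optionally override the LoRA defaults, e.g. a smaller adapter rank.
    cfg.peft.dim = 16
    cfg.peft.alpha = 16
    return cfg
```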