bridge.recipes.common#

Module Contents#

Functions#

_pretrain_common
  Create a base pre-training ConfigContainer with common defaults for any language model.

_sft_common
  Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults.

_peft_common
  Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults.

_sft_common_vlm
  Create a base SFT ConfigContainer with common defaults for Vision-Language Models.

_peft_common_vlm
  Create a base PEFT ConfigContainer with LoRA defaults for Vision-Language Models.

API#

bridge.recipes.common._pretrain_common() → megatron.bridge.training.config.ConfigContainer#

Create a base pre-training ConfigContainer with common defaults for any language model.

This function returns a ConfigContainer template with sensible defaults. The caller MUST set cfg.model and cfg.tokenizer.tokenizer_model before use.

Returns:

Base configuration template for pre-training.

Return type:

ConfigContainer
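A minimal sketch of how a model-specific recipe might build on this template. The provider class and tokenizer path below are illustrative assumptions, not part of the documented API; only `cfg.model` and `cfg.tokenizer.tokenizer_model` being required is stated above.

```python
# Sketch: completing the pre-training template returned by _pretrain_common().
# GPTModelProvider and the tokenizer path are hypothetical placeholders.
from megatron.bridge.recipes.common import _pretrain_common

cfg = _pretrain_common()

# Both assignments are REQUIRED before the config can be used:
cfg.model = GPTModelProvider()                    # hypothetical model provider
cfg.tokenizer.tokenizer_model = "my-org/my-tokenizer"  # hypothetical path/ID
```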

bridge.recipes.common._sft_common() → megatron.bridge.training.config.ConfigContainer#

Create a base SFT (Supervised Fine-Tuning) ConfigContainer with common defaults.

This function returns a ConfigContainer template with sensible defaults for full SFT (not LoRA/DoRA). The caller MUST set cfg.model and cfg.tokenizer.tokenizer_model before use.

Key differences from pre-training:

  • Uses HFDatasetConfig with SQuAD as default dataset

  • Lower learning rate (5e-6) suitable for full fine-tuning

  • Fewer training iterations (1000)

  • Smaller batch sizes

  • Supports pretrained_checkpoint loading

  • No PEFT (full parameter training)

Returns:

Base configuration template for full SFT.

Return type:

ConfigContainer

bridge.recipes.common._peft_common() → megatron.bridge.training.config.ConfigContainer#

Create a base PEFT (Parameter-Efficient Fine-Tuning) ConfigContainer with LoRA defaults.

This function returns a ConfigContainer template with sensible defaults for PEFT using LoRA. The caller MUST set cfg.model and cfg.tokenizer.tokenizer_model before use.

Key differences from full SFT:

  • Higher learning rate (1e-4) suitable for adapter training

  • LoRA enabled by default with standard settings (dim=32, alpha=32)

  • Targets all linear layers: linear_qkv, linear_proj, linear_fc1, linear_fc2

Returns:

Base configuration template for PEFT with LoRA.

Return type:

ConfigContainer
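A sketch of how a recipe might adopt or adjust the LoRA defaults. The attribute names on `cfg.peft` are assumptions for illustration; the documented facts are only the defaults themselves (dim=32, alpha=32, LR 1e-4, targets linear_qkv, linear_proj, linear_fc1, linear_fc2) and the two required fields.

```python
# Sketch: PEFT recipe built on _peft_common(). The cfg.peft field names
# (dim, alpha) are assumed here, not confirmed by this page.
from megatron.bridge.recipes.common import _peft_common

cfg = _peft_common()
cfg.model = GPTModelProvider()                    # hypothetical provider; required
cfg.tokenizer.tokenizer_model = "my-org/my-tokenizer"  # hypothetical; required

# Defaults per the docs: LoRA dim=32, alpha=32, LR 1e-4, targeting
# linear_qkv, linear_proj, linear_fc1, linear_fc2. A recipe could
# override them, e.g.:
cfg.peft.dim = 16     # assumed attribute name
cfg.peft.alpha = 16   # assumed attribute name
```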

bridge.recipes.common._sft_common_vlm() → megatron.bridge.training.config.ConfigContainer#

Create a base SFT ConfigContainer with common defaults for Vision-Language Models.

This function inherits from _sft_common() and overrides VLM-specific settings. The caller MUST set cfg.model and cfg.dataset.hf_processor_path before use.

Key differences from LLM SFT (_sft_common):

  • Uses HFDatasetConversationProvider with HuggingFace datasets (e.g., CORD-v2)

  • Uses NullTokenizer (VLMs use processor instead of tokenizer)

  • DDP config optimized for VLM training (no grad/param overlap)

  • Supports freeze options for language_model, vision_model, vision_projection

  • Different training defaults (train_iters=300000, GBS=32, MBS=2)

  • Different RNG seed (1234)

Returns:

Base configuration template for VLM full SFT.

Return type:

ConfigContainer
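A sketch of the VLM variant, where the processor path replaces the tokenizer model as the required dataset setting. The provider class, the HF processor ID, and the freeze attribute names are all assumptions; the page confirms only the two required fields and that freeze options exist for language_model, vision_model, and vision_projection.

```python
# Sketch: VLM SFT recipe built on _sft_common_vlm(). Names marked
# "hypothetical" are placeholders, not confirmed API.
from megatron.bridge.recipes.common import _sft_common_vlm

cfg = _sft_common_vlm()
cfg.model = MyVLMProvider()                        # hypothetical provider; required
cfg.dataset.hf_processor_path = "my-org/my-vlm"    # hypothetical HF ID; required

# The docs list freeze options for language_model, vision_model, and
# vision_projection; the exact attribute paths are assumed here:
cfg.model.freeze_vision_model = True   # assumed attribute name
```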

bridge.recipes.common._peft_common_vlm() → megatron.bridge.training.config.ConfigContainer#

Create a base PEFT ConfigContainer with LoRA defaults for Vision-Language Models.

This function inherits from _peft_common() and overrides VLM-specific settings. The caller MUST set cfg.model and cfg.dataset.hf_processor_path before use.

Key differences from LLM PEFT (_peft_common):

  • Uses HFDatasetConversationProvider with HuggingFace datasets (e.g., CORD-v2)

  • Uses NullTokenizer (VLMs use processor instead of tokenizer)

  • DDP config optimized for VLM training (no grad/param overlap)

  • Supports freeze options for language_model, vision_model, vision_projection

  • Different training defaults (train_iters=300000, GBS=32, MBS=2)

  • Different RNG seed (1234)

  • Higher LR (1e-4) for adapter training

Returns:

Base configuration template for VLM PEFT with LoRA.

Return type:

ConfigContainer