bridge.recipes.utils.finetune_utils#

Utility functions for finetuning recipes.

Module Contents#

Functions#

default_peft_config

Create default PEFT configuration matching NeMo2 exactly.

default_squad_config

Create default SQuAD dataset configuration for finetuning recipes.

default_openmathinstruct2_config

Create default OpenMathInstruct-2 dataset configuration for finetuning recipes.

default_gsm8k_config

Create default GSM8K dataset configuration for finetuning recipes.

API#

bridge.recipes.utils.finetune_utils.default_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT | None,
**kwargs,
) → megatron.bridge.peft.base.PEFT | None#

Create default PEFT configuration matching NeMo2 exactly.

Parameters:

peft_scheme – PEFT scheme: 'lora', 'dora', a PEFT instance, or None for full finetuning

Returns:

PEFT configuration or None for full finetuning
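
A minimal usage sketch, assuming the module is importable as megatron.bridge.recipes.utils.finetune_utils (the package prefix is inferred from the type annotations above):

.. code-block:: python

   from megatron.bridge.recipes.utils.finetune_utils import default_peft_config

   # LoRA adapter configuration with default hyperparameters; any extra
   # keyword arguments are forwarded to the underlying PEFT configuration.
   peft_cfg = default_peft_config("lora")

   # Passing None selects full finetuning: no PEFT configuration is created.
   assert default_peft_config(None) is None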

bridge.recipes.utils.finetune_utils.default_squad_config(
seq_length: int,
packed_sequence: bool = True,
pad_seq_to_mult: int = 1,
) → megatron.bridge.data.builders.hf_dataset.HFDatasetConfig#

Create default SQuAD dataset configuration for finetuning recipes.

Parameters:
  • seq_length – Sequence length for the dataset

  • packed_sequence – Whether to enable packed sequences for training efficiency

  • pad_seq_to_mult – Optional multiple to pad each sequence to when packing (set to 2 * context_parallel_size for THD CP runs).

Returns:

HFDatasetConfig configured for SQuAD finetuning

.. note::

Uses consistent settings across all finetuning recipes:

  • SQuAD dataset with appropriate dataloader type

  • 10% validation split

  • Seed 5678 (different from pretrain seed 1234)

  • Packed sequences, when enabled, improve training efficiency
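
For example, a recipe enabling packed sequences under context parallelism might call the helper as follows (a sketch; the CP size of 2 is illustrative):

.. code-block:: python

   from megatron.bridge.recipes.utils.finetune_utils import default_squad_config

   context_parallel_size = 2  # illustrative value

   # Packed SQuAD config; per the parameter docs, pad each packed sequence
   # to a multiple of 2 * context_parallel_size for THD-format CP runs.
   squad_cfg = default_squad_config(
       seq_length=4096,
       packed_sequence=True,
       pad_seq_to_mult=2 * context_parallel_size,
   )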

bridge.recipes.utils.finetune_utils.default_openmathinstruct2_config(
seq_length: int = 4096,
packed_sequence: bool = False,
pad_seq_to_mult: int = 1,
) → megatron.bridge.data.builders.hf_dataset.HFDatasetConfig#

Create default OpenMathInstruct-2 dataset configuration for finetuning recipes.

OpenMathInstruct-2 is a math instruction-tuning dataset released by NVIDIA. See: https://huggingface.co/datasets/nvidia/OpenMathInstruct-2

Parameters:
  • seq_length – Sequence length for the dataset (default 4096)

  • packed_sequence – Whether to enable packed sequences for training efficiency

  • pad_seq_to_mult – Optional multiple to pad each sequence to when packing (set to 2 * context_parallel_size for THD CP runs).

Returns:

HFDatasetConfig configured for OpenMathInstruct-2 finetuning
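
Usage mirrors the other dataset helpers; a sketch using the defaults (import path assumed as above):

.. code-block:: python

   from megatron.bridge.recipes.utils.finetune_utils import (
       default_openmathinstruct2_config,
   )

   # Defaults: seq_length=4096, unpacked sequences.
   math_cfg = default_openmathinstruct2_config()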

bridge.recipes.utils.finetune_utils.default_gsm8k_config(
seq_length: int = 2048,
packed_sequence: bool = False,
pad_seq_to_mult: int = 1,
) → megatron.bridge.data.builders.hf_dataset.HFDatasetConfig#

Create default GSM8K dataset configuration for finetuning recipes.

GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. See: https://huggingface.co/datasets/openai/gsm8k

Parameters:
  • seq_length – Sequence length for the dataset (default 2048, sufficient for GSM8K)

  • packed_sequence – Whether to enable packed sequences for training efficiency

  • pad_seq_to_mult – Optional multiple to pad each sequence to when packing (set to 2 * context_parallel_size for THD CP runs).

Returns:

HFDatasetConfig configured for GSM8K finetuning

.. note::

  • GSM8K has 7,473 train and 1,319 test examples

  • Loads the full DatasetDict so the published test split is used for evaluation

  • Uses 'batch' dataloader type for variable-length finetuning
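
A usage sketch with the defaults noted above (import path assumed as above):

.. code-block:: python

   from megatron.bridge.recipes.utils.finetune_utils import default_gsm8k_config

   # Defaults: seq_length=2048 (sufficient for GSM8K), unpacked sequences,
   # 'batch' dataloader type for variable-length samples.
   gsm8k_cfg = default_gsm8k_config()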