bridge.recipes.utils.finetune_utils#
Utility functions for finetuning recipes.
Module Contents#
Functions#
default_peft_config – Create default PEFT configuration matching NeMo2 exactly.
default_squad_config – Create default SQuAD dataset configuration for finetuning recipes.
default_openmathinstruct2_config – Create default OpenMathInstruct-2 dataset configuration for finetuning recipes.
default_gsm8k_config – Create default GSM8K dataset configuration for finetuning recipes.
API#
- bridge.recipes.utils.finetune_utils.default_peft_config(
- peft_scheme: str | megatron.bridge.peft.base.PEFT | None,
- **kwargs,
- )
Create default PEFT configuration matching NeMo2 exactly.
- Parameters:
peft_scheme – PEFT scheme: "lora", "dora", a PEFT instance, or None for full finetuning
- Returns:
PEFT configuration or None for full finetuning
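The scheme dispatch described above (string selects a default config, a ready-made PEFT instance passes through, None means full finetuning) can be sketched in plain Python. This is an illustrative reimplementation, not the library's code: `LoRAConfig`, `DoRAConfig`, and `peft_config_sketch` are hypothetical stand-ins for the real Megatron Bridge classes.

```python
from dataclasses import dataclass


@dataclass
class LoRAConfig:
    """Stand-in for the real LoRA PEFT config class (illustrative only)."""
    dim: int = 8
    alpha: int = 16


@dataclass
class DoRAConfig:
    """Stand-in for the real DoRA PEFT config class (illustrative only)."""
    dim: int = 8
    alpha: int = 16


def peft_config_sketch(peft_scheme, **kwargs):
    """Return a config for 'lora'/'dora', pass instances through, None -> None."""
    if peft_scheme is None:
        return None  # full finetuning: no PEFT wrapper
    if isinstance(peft_scheme, (LoRAConfig, DoRAConfig)):
        return peft_scheme  # caller supplied a ready-made config
    scheme = peft_scheme.lower()
    if scheme == "lora":
        return LoRAConfig(**kwargs)
    if scheme == "dora":
        return DoRAConfig(**kwargs)
    raise ValueError(f"Unknown PEFT scheme: {peft_scheme!r}")
```

Passing `None` for full finetuning and passing an already-built PEFT instance are the two non-string paths documented above; `**kwargs` lets a recipe override the scheme's defaults.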
- bridge.recipes.utils.finetune_utils.default_squad_config(
- seq_length: int,
- packed_sequence: bool = True,
- pad_seq_to_mult: int = 1,
- )
Create default SQuAD dataset configuration for finetuning recipes.
- Parameters:
seq_length – Sequence length for the dataset
packed_sequence – Whether to enable packed sequences for training efficiency
pad_seq_to_mult – Optional multiple to pad each sequence to when packing (set to
2 * context_parallel_size for THD CP runs).
- Returns:
HFDatasetConfig configured for SQuAD finetuning
.. note::
Uses consistent settings across all finetuning recipes:
- SQuAD dataset with an appropriate dataloader type
- 10% validation split
- Seed 5678 (different from the pretrain seed 1234)
- Packed sequences, when enabled, to improve training efficiency
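The deterministic 10% validation split with seed 5678 mentioned in the note can be sketched in plain Python. This is an illustrative sketch of the strategy, not the library's implementation; `split_train_val` is a hypothetical helper name.

```python
import random


def split_train_val(examples, val_fraction=0.10, seed=5678):
    """Shuffle deterministically, then hold out val_fraction for validation."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]
```

Using a split seed distinct from the pretraining seed (1234) avoids accidentally correlating data ordering between the two stages.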
- bridge.recipes.utils.finetune_utils.default_openmathinstruct2_config(
- seq_length: int = 4096,
- packed_sequence: bool = False,
- pad_seq_to_mult: int = 1,
- )
Create default OpenMathInstruct-2 dataset configuration for finetuning recipes.
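The `pad_seq_to_mult` parameter shared by these dataset helpers rounds each packed sequence's length up to a multiple. A minimal sketch of that arithmetic (the helper name `pad_to_multiple` is hypothetical):

```python
def pad_to_multiple(length: int, mult: int) -> int:
    """Round length up to the nearest multiple of mult (mult=1 is a no-op)."""
    # Ceiling division via negation, then scale back up to the multiple.
    return -(-length // mult) * mult


# For THD context-parallel runs, the docs above suggest
# mult = 2 * context_parallel_size (illustrative value below):
context_parallel_size = 4
mult = 2 * context_parallel_size  # -> 8
```

With `context_parallel_size=4`, a 1023-token packed sequence would be padded to 1024 so it divides evenly across context-parallel ranks.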
- bridge.recipes.utils.finetune_utils.default_gsm8k_config(
- seq_length: int = 2048,
- packed_sequence: bool = False,
- pad_seq_to_mult: int = 1,
- )
Create default GSM8K dataset configuration for finetuning recipes.
GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. See: https://huggingface.co/datasets/openai/gsm8k
- Parameters:
seq_length – Sequence length for the dataset (default 2048, sufficient for GSM8K)
packed_sequence – Whether to enable packed sequences for training efficiency
pad_seq_to_mult – Optional multiple to pad each sequence to when packing (set to
2 * context_parallel_size for THD CP runs).
- Returns:
HFDatasetConfig configured for GSM8K finetuning
.. note::
- GSM8K has 7,473 train and 1,319 test examples
- Loads the full DatasetDict so the published test split is used for evaluation
- Uses the "batch" dataloader type for variable-length finetuning
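Variable-length finetuning, as served by the "batch" dataloader type above, pads each batch to its own maximum length rather than a fixed global one. A minimal collate sketch of that idea (the `pad_batch` helper is hypothetical, not the library's collate function):

```python
def pad_batch(sequences, pad_id=0):
    """Pad variable-length token-ID lists to this batch's own max length."""
    max_len = max(len(s) for s in sequences)
    # Right-pad each sequence with pad_id up to the batch max.
    return [s + [pad_id] * (max_len - len(s)) for s in sequences]
```

Per-batch padding wastes fewer tokens than padding everything to `seq_length`, which is why unpacked (`packed_sequence=False`) GSM8K finetuning uses a batch-style loader.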