nemo_rl.data.datasets.response_datasets.openmathinstruct2#
Module Contents#
Classes#
Simple wrapper around the OpenMathInstruct2 dataset. |
API#
- class nemo_rl.data.datasets.response_datasets.openmathinstruct2.OpenMathInstruct2Dataset(
- output_key: str = 'expected_answer',
- split: str = 'train_1M',
- split_validation_size: float = 0.05,
- seed: int = 42,
- **kwargs,
Bases:
nemo_rl.data.datasets.raw_dataset.RawDatasetSimple wrapper around the OpenMathInstruct2 dataset.
- Parameters:
output_key – Key for the output text, default is “expected_answer”
split – Split name for the dataset, default is “train_1M”
split_validation_size – Size of the validation data, default is 0.05
seed – Seed for train/validation split when split_validation_size > 0, default is 42
Initialization
- format_data(data: dict[str, Any]) dict[str, Any]#