nemo_rl.data.datasets.response_datasets.gsm8k#
Module Contents#
Classes#
Simple wrapper around the GSM8K dataset. |
Functions#
API#
- nemo_rl.data.datasets.response_datasets.gsm8k._extract_hash_answer(text: str) str | None#
- class nemo_rl.data.datasets.response_datasets.gsm8k.GSM8KDataset(
- split: str = 'train',
- extract_answer: bool = True,
- system_prompt_file: str | None = None,
- **kwargs,
Bases:
nemo_rl.data.datasets.raw_dataset.RawDatasetSimple wrapper around the GSM8K dataset.
- Parameters:
split – Split name for the dataset, default is “train”
extract_answer – Whether to extract the answer from the dataset, default is True
Initialization
- format_data(data: dict[str, Any]) dict[str, Any]#