nemo_rl.data.datasets.response_datasets.gsm8k

Module Contents

Classes

GSM8KDataset

Simple wrapper around the GSM8K dataset.

Functions

API

nemo_rl.data.datasets.response_datasets.gsm8k._extract_hash_answer(text: str) → str | None
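
The function above has no docstring. GSM8K solutions conventionally end with a line of the form `#### <final answer>`, so a helper with this signature plausibly returns the text after that marker and `None` when the marker is missing. The following is a minimal sketch under that assumption, not the library's implementation; the name `extract_hash_answer` and the sample string are illustrative only.

```python
from typing import Optional


def extract_hash_answer(text: str) -> Optional[str]:
    """Return the text after the '####' marker, or None if the marker is absent.

    Hypothetical stand-in for illustration; the real helper may differ.
    """
    marker = "####"
    if marker not in text:
        return None
    # Keep only what follows the last marker and trim surrounding whitespace.
    return text.rsplit(marker, 1)[1].strip()


sample = "Natalia sold 48 clips in April and 24 in May. 48 + 24 = 72\n#### 72"
print(extract_hash_answer(sample))            # 72
print(extract_hash_answer("no marker here"))  # None
```

Returning `None` rather than an empty string lets callers distinguish "no marker found" from "marker present but answer empty".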
class nemo_rl.data.datasets.response_datasets.gsm8k.GSM8KDataset(
split: str = 'train',
extract_answer: bool = True,
system_prompt_file: str | None = None,
**kwargs,
)

Bases: nemo_rl.data.datasets.raw_dataset.RawDataset

Simple wrapper around the GSM8K dataset.

Parameters:
  • split – Name of the dataset split to load. Defaults to “train”.

  • extract_answer – Whether to extract the answer from each dataset example. Defaults to True.

Initialization

format_data(data: dict[str, Any]) → dict[str, Any]