nemo_rl.data.datasets.response_datasets.dapo_math#

Module Contents#

Classes#

Functions#

format_dapo_math_17k

prepare_dapo_math_17k_dataset

Load and split the DeepScaler dataset into train and test sets.

API#

nemo_rl.data.datasets.response_datasets.dapo_math.format_dapo_math_17k(
data: dict[str, str | float | int],
) dict[str, list[Any] | str]#
nemo_rl.data.datasets.response_datasets.dapo_math.prepare_dapo_math_17k_dataset(
seed: int = 42,
) dict[str, datasets.Dataset | None]#

Load and split the DeepScaler dataset into train and test sets.

class nemo_rl.data.datasets.response_datasets.dapo_math.DAPOMath17KDataset(seed: int = 42)#

Initialization

Initialize the DAPO Math 17K dataset with train split.

Parameters:

seed – Random seed for reproducible splitting