`nemo_rl.data.datasets.preference_datasets.binary_preference_dataset`#

Module Contents#

Classes#

BinaryPreferenceDataset

Dataset class for binary preference data which can be loaded from a JSON file.

API#

class nemo_rl.data.datasets.preference_datasets.binary_preference_dataset.BinaryPreferenceDataset(
    data_path: str,
    prompt_key: str = 'prompt',
    chosen_key: str = 'chosen',
    rejected_key: str = 'rejected',
    subset: Optional[str] = None,
    split: Optional[str] = None,
    **kwargs,
)#

Bases: nemo_rl.data.datasets.raw_dataset.RawDataset

Dataset class for binary preference data which can be loaded from a JSON file.

This class handles loading of preference data for DPO and RM training. The loaded data is converted to the PreferenceDataset format by the to_preference_data_format function.

The input JSONL files should contain valid JSON objects formatted like this:

    {
        prompt_key: str,    # The input prompt/context
        chosen_key: str,    # The preferred/winning response
        rejected_key: str,  # The non-preferred/losing response
    }

Please refer to https://github.com/NVIDIA-NeMo/RL/blob/main/docs/guides/dpo.md#datasets for more details.
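As an illustration of the record layout described above, the following minimal sketch writes a small JSONL file using the default key names and checks that every line carries the three string fields. It uses only the Python standard library. The helper names (`write_jsonl`, `read_and_check_jsonl`), the file name, and the sample records are assumptions made for this example and are not part of NeMo RL.

```python
import json
import tempfile
from pathlib import Path

# Default key names taken from the class signature; the helpers below are
# hypothetical and exist only for this illustration.
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_and_check_jsonl(path, keys=REQUIRED_KEYS):
    """Load all records and verify each has the required string fields."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            for key in keys:
                if not isinstance(record.get(key), str):
                    raise ValueError(
                        f"line {line_no}: field {key!r} is missing or not a string"
                    )
            records.append(record)
    return records


# Made-up sample data for the sketch.
examples = [
    {"prompt": "What is 2 + 2?", "chosen": "4", "rejected": "5"},
    {"prompt": "Name a prime number.", "chosen": "7", "rejected": "8"},
]

data_path = Path(tempfile.mkdtemp()) / "preferences.jsonl"
write_jsonl(data_path, examples)
loaded = read_and_check_jsonl(data_path)
```

A file produced this way could then be passed as `data_path`. Checking the records before training is a design choice of this sketch, meant to surface a missing or mistyped key early.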

Parameters:

data_path – Path to the dataset JSON file
prompt_key – Key for the input prompt/context, default is “prompt”
chosen_key – Key for the preferred/winning response, default is “chosen”
rejected_key – Key for the non-preferred/losing response, default is “rejected”
subset – Optional subset name for the dataset, used for HuggingFace datasets
split – Optional split name for the dataset, used for HuggingFace datasets
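The key parameters let a file whose fields use different names still be read as prompt, chosen, and rejected. The sketch below mimics that lookup with a hypothetical helper, `pick_preference_fields`. It is not the library's implementation and does not reproduce the output of `format_data`. It only shows how the three `*_key` arguments select fields from a record. The field names `question`, `good_answer`, and `bad_answer` are invented for the example.

```python
from typing import Any


def pick_preference_fields(
    record: dict[str, Any],
    prompt_key: str = "prompt",
    chosen_key: str = "chosen",
    rejected_key: str = "rejected",
) -> dict[str, Any]:
    """Hypothetical helper: look up the three fields by configurable key names."""
    return {
        "prompt": record[prompt_key],
        "chosen": record[chosen_key],
        "rejected": record[rejected_key],
    }


# A record that does not use the default key names (invented for this sketch).
raw = {
    "question": "What is the capital of France?",
    "good_answer": "Paris",
    "bad_answer": "Lyon",
}

fields = pick_preference_fields(
    raw,
    prompt_key="question",
    chosen_key="good_answer",
    rejected_key="bad_answer",
)
```

With the class itself, the corresponding call would pass the same three key names to the constructor alongside `data_path`, as listed in the signature.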

Initialization

format_data(data: dict[str, Any]) → dict[str, Any]#
