`nemo_rl.data.datasets.preference_datasets.binary_preference_dataset`#

Module Contents#

Classes#

BinaryPreferenceDataset

Dataset class for binary preference data which can be loaded from a JSON file.

Functions#

to_preference_data_format

API#

nemo_rl.data.datasets.preference_datasets.binary_preference_dataset.to_preference_data_format( data: dict[str, Any], prompt_key: str, chosen_key: str, rejected_key: str, ) → dict[str, list[dict[str, Any]]]#

class nemo_rl.data.datasets.preference_datasets.binary_preference_dataset.BinaryPreferenceDataset( train_data_path: str, val_data_path: Optional[str] = None, prompt_key: str = 'prompt', chosen_key: str = 'chosen', rejected_key: str = 'rejected', train_split: Optional[str] = None, val_split: Optional[str] = None, )#

Dataset class for binary preference data which can be loaded from a JSON file.

This class handles loading of preference data for DPO and RM training. It will be converted to the format of PreferenceDataset through the to_preference_data_format function.

The input JSONL files should contain valid JSON objects formatted like this: { prompt_key: str, # The input prompt/context chosen_key: str, # The preferred/winning response rejected_key: str, # The non-preferred/losing response }

Parameters:

train_data_path – Path to the JSON file containing training data
val_data_path – Path to the JSON file containing validation data
prompt_key – Key for the input prompt/context, default is “prompt”
chosen_key – Key for the preferred/winning response, default is “chosen”
rejected_key – Key for the non-preferred/losing response, default is “rejected”
train_split – Split name for the training data, used for HuggingFace datasets, default is None
val_split – Split name for the validation data, used for HuggingFace datasets, default is None

Initialization

nemo_rl.data.datasets.preference_datasets.binary_preference_dataset#

Module Contents#

Classes#

Functions#

API#

`nemo_rl.data.datasets.preference_datasets.binary_preference_dataset`#