nemo_rl.data.datasets.preference_datasets.binary_preference_dataset
Module Contents
Classes
BinaryPreferenceDataset | Dataset class for binary preference data which can be loaded from a JSON file.
Functions
API
nemo_rl.data.datasets.preference_datasets.binary_preference_dataset.to_preference_data_format(
    data: dict[str, Any],
    prompt_key: str,
    chosen_key: str,
    rejected_key: str,
)
class nemo_rl.data.datasets.preference_datasets.binary_preference_dataset.BinaryPreferenceDataset(
    train_data_path: str,
    val_data_path: Optional[str] = None,
    prompt_key: str = 'prompt',
    chosen_key: str = 'chosen',
    rejected_key: str = 'rejected',
    train_split: Optional[str] = None,
    val_split: Optional[str] = None,
)
Dataset class for binary preference data which can be loaded from a JSON file.
This class handles loading of preference data for DPO and RM training. The loaded data is converted to the PreferenceDataset format by the to_preference_data_format function.

The input JSONL files should contain valid JSON objects formatted like this:

    {
        prompt_key: str,    # The input prompt/context
        chosen_key: str,    # The preferred/winning response
        rejected_key: str,  # The non-preferred/losing response
    }
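As an illustration of the input format described above, the following sketch writes a small JSONL file with the default key names and reads it back, checking that every record carries the three expected keys. The file name train.jsonl, the sample records, and the helper load_preference_records are hypothetical and not part of NeMo RL; only the default key names come from this page.

```python
import json
import os
import tempfile

# Hypothetical sample records using the default key names.
records = [
    {"prompt": "What is 2 + 2?", "chosen": "4", "rejected": "5"},
    {"prompt": "Name a primary color.", "chosen": "Red", "rejected": "Teal"},
]


def load_preference_records(
    path, prompt_key="prompt", chosen_key="chosen", rejected_key="rejected"
):
    """Read a JSONL file and check that each record has the three required keys.

    Illustrative helper only; it is not part of NeMo RL.
    """
    required = (prompt_key, chosen_key, rejected_key)
    loaded = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            missing = [key for key in required if key not in record]
            if missing:
                raise KeyError(f"line {line_number} is missing keys: {missing}")
            loaded.append(record)
    return loaded


# Write one JSON object per line (the JSONL convention), then read it back.
tmp_dir = tempfile.mkdtemp()
train_path = os.path.join(tmp_dir, "train.jsonl")
with open(train_path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

loaded_records = load_preference_records(train_path)
print(len(loaded_records))
```

A check like this can catch a misspelled key before the file is handed to the dataset class.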
- Parameters:
train_data_path – Path to the JSON file containing training data
val_data_path – Path to the JSON file containing validation data
prompt_key – Key for the input prompt/context, default is "prompt"
chosen_key – Key for the preferred/winning response, default is "chosen"
rejected_key – Key for the non-preferred/losing response, default is "rejected"
train_split – Split name for the training data, used for HuggingFace datasets, default is None
val_split – Split name for the validation data, used for HuggingFace datasets, default is None
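The three key parameters allow data with different field names to be used without renaming its fields. The sketch below shows the idea with a hypothetical record whose fields are named question, good_answer, and bad_answer. The helper select_fields is illustrative and is not the library's implementation. The commented-out constructor call uses only parameters listed above, with a placeholder file path.

```python
# Hypothetical record with non-default field names.
record = {
    "question": "What is the capital of France?",
    "good_answer": "Paris",
    "bad_answer": "Lyon",
}

# Key names that would be passed as prompt_key, chosen_key, and rejected_key.
key_config = {
    "prompt_key": "question",
    "chosen_key": "good_answer",
    "rejected_key": "bad_answer",
}


def select_fields(data, prompt_key, chosen_key, rejected_key):
    """Return (prompt, chosen, rejected) from one record.

    Illustrative helper only; not NeMo RL's conversion function.
    """
    return data[prompt_key], data[chosen_key], data[rejected_key]


prompt, chosen, rejected = select_fields(record, **key_config)
print(prompt, "|", chosen, "|", rejected)

# With NeMo RL installed, the same key names would be passed to the class
# (the path below is a placeholder):
#
# dataset = BinaryPreferenceDataset(
#     train_data_path="path/to/train.jsonl",
#     **key_config,
# )
```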
Initialization