nemo_rl.data.datasets.response_datasets.oasst#
Module Contents#
Classes#
Simple wrapper around the OASST dataset. |
Functions#
Recusive function that returns all the sub converstaions in a list starting from node tree_obj. |
|
Data#
API#
- nemo_rl.data.datasets.response_datasets.oasst.SYSTEM_PROMPT = <Multiline-String>#
- nemo_rl.data.datasets.response_datasets.oasst.parse_conversations(tree_obj, first: bool = False)#
Recusive function that returns all the sub converstaions in a list starting from node tree_obj.
- Parameters:
tree_obj (obj) – current conversation node
- Returns:
a list of sub conversation threads including the current conversation node
- nemo_rl.data.datasets.response_datasets.oasst.get_data_records(objs, task_name: str = 'oasst')#
- class nemo_rl.data.datasets.response_datasets.oasst.OasstDataset(
- split_validation_size: float = 0.05,
- seed: int = 42,
- **kwargs,
Bases:
nemo_rl.data.datasets.raw_dataset.RawDatasetSimple wrapper around the OASST dataset.
- Parameters:
split_validation_size – Size of the validation data, default is 0.05
seed – Seed for train/validation split when split_validation_size > 0, default is 42
Initialization