nemo_rl.data.hf_datasets.oasst#

Module Contents#

Classes#

Functions#

parse_conversations

Recusive function that returns all the sub converstaions in a list starting from node tree_obj.

get_data_records

download_and_process_oasst

Data#

API#

nemo_rl.data.hf_datasets.oasst.SYSTEM_PROMPT = <Multiline-String>#
nemo_rl.data.hf_datasets.oasst.parse_conversations(tree_obj, first=False)[source]#

Recusive function that returns all the sub converstaions in a list starting from node tree_obj.

Parameters:

tree_obj (obj) – current conversation node

Returns:

a list of sub conversation threads including the current conversation node

nemo_rl.data.hf_datasets.oasst.get_data_records(objs)[source]#
nemo_rl.data.hf_datasets.oasst.download_and_process_oasst(
output_directory='.',
seed=42,
split_ratio=0.95,
)[source]#
class nemo_rl.data.hf_datasets.oasst.OasstDataset(output_dir: str = '.')[source]#

Initialization