nemo_rl.data.utils
Module Contents
Functions
setup_response_data | Set up data with environments.
setup_preference_data | Set up preference data.
API
- nemo_rl.data.utils.setup_response_data(
- tokenizer: transformers.AutoProcessor | transformers.AutoTokenizer,
- data_config: nemo_rl.data.DataConfig,
- env_configs: Optional[dict[str, Any]] = None,
- is_vlm: bool = False,
- )
Set up data with environments.
This function sets up the training and validation datasets together with their environments.
- Parameters:
tokenizer – Tokenizer or processor.
data_config – Data config.
env_configs – Environment configs. If None, no environments are created. Pass None for:
  - Algorithms such as SFT, which do not need environments.
  - Environments such as NeMo-Gym, which handle environment creation outside of this function.
is_vlm – Whether to use VLM training.
- Returns:
If env_configs is not None: a tuple of (train dataset, validation dataset, task to environment, task to validation environment).
If env_configs is None: a tuple of (train dataset, validation dataset).
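Because setup_response_data returns either two or four elements depending on env_configs, calling code may want to normalize the result before using it. The following is a minimal sketch of one way to do that. The helper name unpack_setup_result is hypothetical and not part of nemo_rl, and the sketch assumes only the tuple shapes described above.

```python
from typing import Any, Optional


def unpack_setup_result(
    result: tuple,
) -> tuple[Any, Any, Optional[dict], Optional[dict]]:
    """Normalize a 2- or 4-element result into a uniform 4-tuple.

    Hypothetical helper (not part of nemo_rl). It relies only on the
    documented shapes: (train, val) when env_configs is None, and
    (train, val, task_to_env, task_to_val_env) otherwise.
    """
    if len(result) == 4:
        train_ds, val_ds, task_to_env, task_to_val_env = result
        return train_ds, val_ds, task_to_env, task_to_val_env
    if len(result) == 2:
        train_ds, val_ds = result
        # No environments were created, so fill the env slots with None.
        return train_ds, val_ds, None, None
    raise ValueError(f"Expected 2 or 4 elements, got {len(result)}")
```

A caller would wrap the call, for example unpack_setup_result(setup_response_data(tokenizer, data_config, env_configs)), and then check whether the environment mappings are None.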
- nemo_rl.data.utils.setup_preference_data(
- tokenizer: transformers.AutoTokenizer,
- data_config: nemo_rl.data.DataConfig,
- )
Set up preference data.
This function sets up the preference data for the training and validation datasets.
- Parameters:
tokenizer – Tokenizer.
data_config – Data config for preference dataset.
- Returns:
A tuple of (train dataset, validation dataset).
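As a rough illustration of the call pattern, the sketch below unpacks the two-element result into named splits. The wrapper load_preference_datasets and its setup_fn parameter are hypothetical and exist only so the example can run without nemo_rl installed. The contents of data_config are not specified here and are left to the caller.

```python
from typing import Any, Callable, Optional


def load_preference_datasets(
    tokenizer: Any,
    data_config: Any,
    setup_fn: Optional[Callable[[Any, Any], tuple]] = None,
) -> dict[str, Any]:
    """Call setup_preference_data and return the splits by name.

    Hypothetical wrapper (not part of nemo_rl). setup_fn is an
    injection point for testing; by default the real function is
    imported lazily, which requires nemo_rl to be installed.
    """
    if setup_fn is None:
        from nemo_rl.data.utils import setup_preference_data as setup_fn
    # Documented signature: (tokenizer, data_config) -> (train, validation).
    train_dataset, val_dataset = setup_fn(tokenizer, data_config)
    return {"train": train_dataset, "validation": val_dataset}
```

In real use, tokenizer would be a transformers.AutoTokenizer instance and data_config a nemo_rl.data.DataConfig, matching the signature above.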