nemo_rl.environments.nemo_gym#
Module Contents#
Classes#
NemoGym | This environment class is not used directly for training. It is an integration wrapper around NeMo-Gym that hooks into NeMo RL's existing resource management via Ray, so NeMo RL remains the single source of truth for resource management.
Functions#
API#
- class nemo_rl.environments.nemo_gym.NemoGymConfig#
Bases:
typing.TypedDict
- model_name: str#
None
- base_urls: List[str]#
None
- initial_global_config_dict: Dict[str, Any]#
None
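The three fields above can be illustrated with a small, self-contained sketch. It uses a local stand-in class (`NemoGymConfigSketch`, a hypothetical name) instead of importing `nemo_rl`, and the model name and URL are placeholder values, not defaults taken from the library. The contents expected inside `initial_global_config_dict` are not documented here, so the sketch leaves it empty.

```python
from typing import Any, Dict, List, TypedDict


class NemoGymConfigSketch(TypedDict):
    """Local stand-in mirroring the three documented NemoGymConfig fields."""

    model_name: str
    base_urls: List[str]
    initial_global_config_dict: Dict[str, Any]


# Placeholder values for illustration only.
cfg: NemoGymConfigSketch = {
    "model_name": "example-model",
    "base_urls": ["http://localhost:8000/v1"],
    "initial_global_config_dict": {},
}

# A TypedDict is a plain dict at runtime and performs no validation,
# so a manual check for missing keys can catch mistakes early.
missing = set(NemoGymConfigSketch.__annotations__) - set(cfg)
```

Because `TypedDict` only informs static type checkers, a check like the one on `missing` (or a type checker such as mypy) is what would actually flag an incomplete config.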
- class nemo_rl.environments.nemo_gym.NemoGym(cfg: nemo_rl.environments.nemo_gym.NemoGymConfig)#
Bases:
nemo_rl.environments.interfaces.EnvironmentInterface
This environment class is not used directly for training. It is an integration wrapper around NeMo-Gym that hooks into NeMo RL's existing resource management via Ray, so NeMo RL remains the single source of truth for resource management.
Initialization
- health_check() bool#
- async run_rollouts(
- nemo_gym_examples: list[dict],
- tokenizer: transformers.PreTrainedTokenizerBase,
- timer_prefix: str,
- _postprocess_nemo_gym_to_nemo_rl_result(
- nemo_gym_result: dict,
- tokenizer: transformers.PreTrainedTokenizerBase,
- shutdown() None#
- abstractmethod step(message_log_batch, metadata)#
- abstractmethod global_post_process_and_metrics(batch)#
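The member list above suggests a wrapper pattern: the class satisfies the environment interface but exposes rollouts through its own async method. The sketch below illustrates that pattern only. Every class in it is a hypothetical stand-in, the simplified `run_rollouts` signature omits the tokenizer and timer arguments, and raising `NotImplementedError` from `step` and `global_post_process_and_metrics` is an assumption, not confirmed behaviour of `NemoGym`.

```python
import asyncio
from abc import ABC, abstractmethod


class EnvironmentInterfaceSketch(ABC):
    """Hypothetical stand-in for the real EnvironmentInterface."""

    @abstractmethod
    def step(self, message_log_batch, metadata): ...

    @abstractmethod
    def global_post_process_and_metrics(self, batch): ...


class WrapperEnvSketch(EnvironmentInterfaceSketch):
    """Hypothetical wrapper: implements the interface, serves rollouts only."""

    def health_check(self) -> bool:
        return True

    async def run_rollouts(self, examples: list[dict]) -> list[dict]:
        # Placeholder for delegating to an external rollout service.
        await asyncio.sleep(0)
        return [{"idx": i, "input": ex} for i, ex in enumerate(examples)]

    def step(self, message_log_batch, metadata):
        # Assumption: the training-time entry point is intentionally unused.
        raise NotImplementedError("not used for training")

    def global_post_process_and_metrics(self, batch):
        raise NotImplementedError("not used for training")


env = WrapperEnvSketch()
results = asyncio.run(env.run_rollouts([{"prompt": "hi"}]))
```

Overriding the abstract methods, even with bodies that raise, is what allows the wrapper to be instantiated while keeping the training-time entry points visibly unused.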
- nemo_rl.environments.nemo_gym.setup_nemo_gym_config(config, tokenizer) None#
- nemo_rl.environments.nemo_gym.nemo_gym_example_to_nemo_rl_datum_spec(
- nemo_gym_example: dict,
- idx: int,