nemo_rl.environments.penguin#
Module Contents#
Classes#
This environment class is not meant to be used for training itself. It serves as an integration wrapper around Penguin that hooks into the existing NeMo RL resource management via Ray, so that NeMo RL remains the single source of truth for resource management.
Functions#
API#
- class nemo_rl.environments.penguin.PenguinConfig#
Bases:
typing.TypedDict
- model_name: str#
None
- base_urls: List[str]#
None
- initial_global_config_dict: Dict[str, Any]#
None
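The three fields above can be illustrated with a small sketch. It defines a local stand-in `TypedDict` (named `PenguinConfigSketch` here, a hypothetical name) that mirrors the documented fields rather than importing the real class, and every value is an assumed placeholder, not a NeMo RL default.

```python
from typing import Any, Dict, List, TypedDict


class PenguinConfigSketch(TypedDict):
    """Local stand-in mirroring the three documented PenguinConfig fields."""

    model_name: str
    base_urls: List[str]
    initial_global_config_dict: Dict[str, Any]


# All values below are hypothetical placeholders, not defaults from NeMo RL.
cfg: PenguinConfigSketch = {
    "model_name": "example-org/example-model",
    "base_urls": ["http://localhost:8000/v1"],
    "initial_global_config_dict": {},
}

# A TypedDict is a plain dict at runtime, so normal dict access applies.
print(cfg["model_name"], len(cfg["base_urls"]))
```

Because `TypedDict` only adds static type information, a type checker can flag a missing or misspelled key, while at runtime the object behaves like an ordinary dictionary.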
- class nemo_rl.environments.penguin.Penguin(cfg: nemo_rl.environments.penguin.PenguinConfig)#
Bases:
nemo_rl.environments.interfaces.EnvironmentInterface
This environment class is not meant to be used for training itself. It serves as an integration wrapper around Penguin that hooks into the existing NeMo RL resource management via Ray, so that NeMo RL remains the single source of truth for resource management.
Initialization
- health_check() → bool#
- async run_rollouts(penguin_examples: list[dict]) → list[dict]#
- _postprocess_penguin_to_nemo_rl_result(penguin_result: dict) → dict#
- shutdown() → None#
- abstractmethod step(message_log_batch, metadata)#
- abstractmethod global_post_process_and_metrics(batch)#
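The call pattern suggested by the method list above (check health, await rollouts, shut down) might look roughly like the following. This is only a sketch: `StubPenguin` is a hypothetical stand-in that copies the documented method names and signatures, the `reward` field is invented for illustration, and the real class may need to be invoked through Ray handles rather than called directly.

```python
import asyncio


class StubPenguin:
    """Hypothetical stand-in exposing the documented method names only."""

    def __init__(self) -> None:
        self.closed = False

    def health_check(self) -> bool:
        return True

    async def run_rollouts(self, penguin_examples: list[dict]) -> list[dict]:
        # Placeholder behavior: echo each example with a made-up "reward" field.
        await asyncio.sleep(0)
        return [{**ex, "reward": 0.0} for ex in penguin_examples]

    def shutdown(self) -> None:
        self.closed = True


async def main() -> list[dict]:
    env = StubPenguin()
    try:
        if not env.health_check():
            raise RuntimeError("environment is not healthy")
        # run_rollouts is a coroutine, so it must be awaited.
        return await env.run_rollouts([{"prompt": "hello"}])
    finally:
        # Release resources even if the rollout raises.
        env.shutdown()


results = asyncio.run(main())
print(results)
```

Wrapping the rollout in `try`/`finally` keeps `shutdown()` on the exit path regardless of errors, which matters when the environment holds external resources.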
- nemo_rl.environments.penguin.setup_penguin_config(config, tokenizer) → None#
- nemo_rl.environments.penguin.penguin_example_to_nemo_rl_datum_spec(
- penguin_example: dict,
- idx: int,