nemo_gym.prompt

View as Markdown

Prompt configuration: YAML-based prompt templates applied at rollout time.

Prompt templates are mutually exclusive with pre-populated responses_create_params.input values. This separation enables prompt sweeps without re-preparing data.

Module Contents

Classes

NameDescription
MaterializePromptsConfigApply a prompt template to raw JSONL data, producing materialized JSONL
PromptConfigSchema for a prompt YAML file. user is required, system is optional.

Functions

NameDescription
_resolve_pathResolve a path relative to the Gym root (PARENT_DIR), consistent with config_paths resolution.
apply_prompt_to_rowApply prompt_config to a row, building responses_create_params.input.
fill_promptApply a prompt template to a data row, producing message dicts.
load_prompt_configLoad and validate a YAML prompt config file.
materialize_promptsApply a prompt template to raw JSONL data, producing materialized JSONL.
materialize_prompts_cliCLI entry point for ng_materialize_prompts.
validate_prompt_compatibilityValidate that no rows have pre-populated responses_create_params.input when a prompt_config is provided.

API

class nemo_gym.prompt.MaterializePromptsConfig()

Bases: BaseNeMoGymCLIConfig

Apply a prompt template to raw JSONL data, producing materialized JSONL with populated responses_create_params.input for RL training.

Examples:

ng_materialize_prompts \
+input_jsonl_fpath=data/my_dataset.jsonl \
+prompt_config=/path/to/my_prompt.yaml \
+output_jsonl_fpath=my_dataset_materialized.jsonl
input_jsonl_fpath
str
output_jsonl_fpath
str
prompt_config
str
class nemo_gym.prompt.PromptConfig()

Bases: BaseModel

Schema for a prompt YAML file. user is required, system is optional.

system
Optional[str] = None
user
str
nemo_gym.prompt._resolve_path(
path: str
) -> pathlib.Path

Resolve a path relative to the Gym root (PARENT_DIR), consistent with config_paths resolution.

nemo_gym.prompt.apply_prompt_to_row(
row: dict,
prompt_config: nemo_gym.prompt.PromptConfig
) -> dict

Apply prompt_config to a row, building responses_create_params.input.

Other fields in responses_create_params (tools, metadata, temperature, max_output_tokens) are preserved. Returns a new dict (does not mutate the original).

nemo_gym.prompt.fill_prompt(
prompt_config: nemo_gym.prompt.PromptConfig,
row: dict
) -> typing.List[typing.Dict[str, str]]

Apply a prompt template to a data row, producing message dicts.

Placeholders ({field_name}) are filled from the row’s top-level fields. Literal braces must be doubled ({{ / }}).

nemo_gym.prompt.load_prompt_config(
path: str
) -> nemo_gym.prompt.PromptConfig

Load and validate a YAML prompt config file.

Relative paths are resolved against the Gym root directory (PARENT_DIR), consistent with how config_paths and other Gym paths are resolved.

Returns a PromptConfig with required user and optional system fields. Each value is a string template with {placeholder} syntax. Results are cached so the same file is only parsed once.

nemo_gym.prompt.materialize_prompts(
input_jsonl: str,
prompt_config: str,
output_jsonl: str
) -> None

Apply a prompt template to raw JSONL data, producing materialized JSONL.

Reads each row from input_jsonl, validates that no row has pre-populated responses_create_params.input, applies the prompt template, and writes the result to output_jsonl.

Parameters:

input_jsonl
str

Path to raw JSONL (no responses_create_params.input).

prompt_config
str

Path to prompt YAML file.

output_jsonl
str

Path to write materialized JSONL (with responses_create_params.input).

nemo_gym.prompt.materialize_prompts_cli() -> None

CLI entry point for ng_materialize_prompts.

nemo_gym.prompt.validate_prompt_compatibility(
rows: typing.List[dict],
prompt_config: nemo_gym.prompt.PromptConfig
) -> None

Validate that no rows have pre-populated responses_create_params.input when a prompt_config is provided.

Collects all violating row indices and reports them in a single error.