nemo_gym.prompt#

Prompt configuration: YAML-based prompt templates applied at rollout time.

Prompt templates are mutually exclusive with pre-populated responses_create_params.input values. This separation enables prompt sweeps without re-preparing data.

Module Contents#

Classes#

PromptConfig

Schema for a prompt YAML file. user is required, system is optional.

MaterializePromptsConfig

Apply a prompt template to raw JSONL data, producing materialized JSONL with populated responses_create_params.input for RL training.

Functions#

_resolve_path

Resolve a path relative to the Gym root (PARENT_DIR), consistent with config_paths resolution.

load_prompt_config

Load and validate a YAML prompt config file.

fill_prompt

Apply a prompt template to a data row, producing message dicts.

validate_prompt_compatibility

Validate that no rows have pre-populated responses_create_params.input when a prompt_config is provided.

apply_prompt_to_row

Apply prompt_config to a row, building responses_create_params.input.

materialize_prompts

Apply a prompt template to raw JSONL data, producing materialized JSONL.

materialize_prompts_cli

CLI entry point for ng_materialize_prompts.

API#

class nemo_gym.prompt.PromptConfig(/, **data: typing.Any)[source]#

Bases: pydantic.BaseModel

Schema for a prompt YAML file. user is required, system is optional.

Initialization

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

user: str#

None

system: Optional[str]#

None

nemo_gym.prompt._resolve_path(path: str) pathlib.Path[source]#

Resolve a path relative to the Gym root (PARENT_DIR), consistent with config_paths resolution.

nemo_gym.prompt.load_prompt_config(path: str) nemo_gym.prompt.PromptConfig[source]#

Load and validate a YAML prompt config file.

Relative paths are resolved against the Gym root directory (PARENT_DIR), consistent with how config_paths and other Gym paths are resolved.

Returns a PromptConfig with required user and optional system fields. Each value is a string template with {placeholder} syntax. Results are cached so the same file is only parsed once.

nemo_gym.prompt.fill_prompt(
prompt_config: nemo_gym.prompt.PromptConfig,
row: dict,
) List[Dict[str, str]][source]#

Apply a prompt template to a data row, producing message dicts.

Placeholders ({field_name}) are filled from the row’s top-level fields. Literal braces must be doubled ({{ / }}).

nemo_gym.prompt.validate_prompt_compatibility(
rows: List[dict],
prompt_config: nemo_gym.prompt.PromptConfig,
) None[source]#

Validate that no rows have pre-populated responses_create_params.input when a prompt_config is provided.

Collects all violating row indices and reports them in a single error.

nemo_gym.prompt.apply_prompt_to_row(
row: dict,
prompt_config: nemo_gym.prompt.PromptConfig,
) dict[source]#

Apply prompt_config to a row, building responses_create_params.input.

Other fields in responses_create_params (tools, metadata, temperature, max_output_tokens) are preserved. Returns a new dict (does not mutate the original).

nemo_gym.prompt.materialize_prompts(
input_jsonl: str,
prompt_config: str,
output_jsonl: str,
) None[source]#

Apply a prompt template to raw JSONL data, producing materialized JSONL.

Reads each row from input_jsonl, validates that no row has pre-populated responses_create_params.input, applies the prompt template, and writes the result to output_jsonl.

Parameters:
  • input_jsonl – Path to raw JSONL (no responses_create_params.input).

  • prompt_config – Path to prompt YAML file.

  • output_jsonl – Path to write materialized JSONL (with responses_create_params.input).

class nemo_gym.prompt.MaterializePromptsConfig(/, **data: typing.Any)[source]#

Bases: nemo_gym.config_types.BaseNeMoGymCLIConfig

Apply a prompt template to raw JSONL data, producing materialized JSONL with populated responses_create_params.input for RL training.

Examples:

ng_materialize_prompts \
    +input_jsonl_fpath=data/my_dataset.jsonl \
    +prompt_config=/path/to/my_prompt.yaml \
    +output_jsonl_fpath=my_dataset_materialized.jsonl

Initialization

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

input_jsonl_fpath: str#

‘Field(…)’

prompt_config: str#

‘Field(…)’

output_jsonl_fpath: str#

‘Field(…)’

nemo_gym.prompt.materialize_prompts_cli() None[source]#

CLI entry point for ng_materialize_prompts.