Prompt Config
Apply YAML-based prompt templates at rollout time to build responses_create_params.input on the fly. This enables prompt sweeps without re-preparing JSONL data.
Goal: Use prompt configs to separate prompt templates from dataset preparation.
Time: ~5 minutes
In this guide, you will:
- Write a prompt config YAML file
- Apply it during rollout collection with
gym eval run --no-serve - Optionally materialize prompts into JSONL with
gym dataset render
Prerequisites:
- NeMo Gym installed (Installation)
- A JSONL dataset with raw fields (e.g.
question,expected_answer)
Overview
A prompt config is a YAML file with a required user field and an optional system field. Placeholders like {question} are filled from each data row’s top-level fields during rollout collection.
Prompt configs and pre-populated responses_create_params.input in the JSONL data are mutually exclusive. Use one or the other. If any row already contains responses_create_params.input when a prompt config is specified, an error is raised.
Prompt Config Format
Minimal (user message only)
With system message
Multiple fields
Literal braces must be doubled ({{ / }}). For example, \\boxed{{}} produces \boxed{} in the output.
Usage
At rollout time
Pass --prompt-config <path> to gym eval run --no-serve:
The --prompt-config path must be either an absolute path or a path relative to the Gym repository root.
The input JSONL should contain raw fields (e.g. question, expected_answer) without responses_create_params.input. The prompt config builds the input messages during rollout collection.
Standalone materialization
Use gym dataset render to write a prompt template into JSONL without running rollouts:
This produces a new JSONL file with responses_create_params.input populated from the template. This is useful for inspection or passing to other tools that expect pre-populated input.
Input Data Format
When using prompt configs, your input JSONL should have the placeholder fields at the top level:
Other fields in responses_create_params (such as tools and temperature) are preserved. Only input is built from the template.
CLI Parameters
gym eval run --no-serve
See CLI Commands for the full list of gym eval run --no-serve parameters.
gym dataset render
How It Works
- The prompt YAML is loaded and validated (must have a
userkey) - All rows are checked for conflicts. If any row already has
responses_create_params.input, an error is raised - For each row, placeholders in
systemanduserare filled from the row’s fields - The resulting messages are set as
responses_create_params.input