Prompt Config#

Apply YAML-based prompt templates at rollout time to build responses_create_params.input on the fly. This enables prompt sweeps without re-preparing JSONL data.

Goal: Use prompt configs to separate prompt templates from dataset preparation.

Time: ~5 minutes

In this guide, you will:

Write a prompt config YAML file
Apply it during rollout collection with ng_collect_rollouts
Optionally materialize prompts into JSONL with ng_materialize_prompts

Prerequisites:

NeMo Gym installed (Detailed Setup Guide)
A JSONL dataset with raw fields (e.g. question, expected_answer)

Overview#

A prompt config is a YAML file with a required user field and an optional system field. Placeholders like {question} are filled from each data row’s top-level fields during rollout collection.

Prompt configs and pre-populated responses_create_params.input in the JSONL data are mutually exclusive. Use one or the other. If any row already contains responses_create_params.input when a prompt config is specified, an error is raised.

Prompt Config Format#

Minimal (user message only)#

# The {question} placeholder is filled from each row's "question" field.
user: "{question}"

With system message#

# Math chain-of-thought prompt with system message.
# Expects rows with a "question" field.
system: "You are a helpful math assistant. Think step by step and put your final answer in \\boxed{{}}."
user: "{question}"

Multiple fields#

# Expects rows with "question" and "context" fields.
system: "Answer the question using the provided context."
user: |
  Context: {context}

  Question: {question}

Note

Literal braces must be doubled ({{ / }}). For example, \\boxed{{}} produces \boxed{} in the output.

Usage#

At rollout time#

Pass +prompt_config=<path> to ng_collect_rollouts:

ng_collect_rollouts \
    +agent_name=my_agent \
    +input_jsonl_fpath=data/raw_problems.jsonl \
    +output_jsonl_fpath=results/rollouts.jsonl \
    +prompt_config=/path/to/my_prompt.yaml \
    +num_repeats=5 \
    "+responses_create_params={max_output_tokens: 16384, temperature: 1.0}"

The +prompt_config path must be either an absolute path or a path relative to the Gym repository root.

The input JSONL should contain raw fields (e.g. question, expected_answer) without responses_create_params.input. The prompt config builds the input messages during rollout collection.

Standalone materialization#

Use ng_materialize_prompts to write a prompt template into JSONL without running rollouts:

ng_materialize_prompts \
    +input_jsonl_fpath=data/raw_problems.jsonl \
    +prompt_config=/path/to/my_prompt.yaml \
    +output_jsonl_fpath=data/materialized.jsonl

This produces a new JSONL file with responses_create_params.input populated from the template. This is useful for inspection or passing to other tools that expect pre-populated input.

Input Data Format#

When using prompt configs, your input JSONL should have the placeholder fields at the top level:

{"question": "What is 2+2?", "expected_answer": "4"}
{"question": "What is 3*5?", "expected_answer": "15"}

Other fields in responses_create_params (such as tools and temperature) are preserved. Only input is built from the template.

CLI Parameters#

`ng_collect_rollouts`#

Parameter	Required	Description
`+prompt_config`	No	Path to a prompt YAML file. Mutually exclusive with pre-populated `responses_create_params.input` in the JSONL data.

See CLI Commands for the full list of ng_collect_rollouts parameters.

`ng_materialize_prompts`#

Parameter	Required	Description
`+input_jsonl_fpath`	Yes	Raw JSONL data (no `responses_create_params.input`).
`+prompt_config`	Yes	Path to prompt YAML file to apply.
`+output_jsonl_fpath`	Yes	Output path for materialized JSONL with populated prompts.

How It Works#

The prompt YAML is loaded and validated (must have a user key)
All rows are checked for conflicts. If any row already has responses_create_params.input, an error is raised
For each row, placeholders in system and user are filled from the row’s fields
The resulting messages are set as responses_create_params.input