# Data
NeMo Gym datasets use the JSONL format for reinforcement learning (RL) training. Each dataset connects to an agent server, which orchestrates agent-environment interactions and routes requests to a resources server that provides tools and computes rewards.
## Prerequisites

- NeMo Gym installed: see the Detailed Setup Guide.
- Repository cloned (for built-in datasets):

```bash
git clone https://github.com/NVIDIA-NeMo/Gym.git
cd Gym
```
> **Note:** NeMo Gym uses OpenAI-compatible schemas for model server compatibility. No OpenAI account is required; local servers such as vLLM use the same format.
## Data Format

Each JSONL line requires a `responses_create_params` field following the OpenAI Responses API schema:

```json
{"responses_create_params": {"input": [{"role": "user", "content": "What is 2+2?"}]}}
```
Additional fields such as `expected_answer` vary by resources server, the component that provides tools and reward signals.
Source: nemo_gym/base_resources_server.py:35-36
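The shape is easiest to see by parsing one line; a minimal Python sketch (the literal row below mirrors the example above):

```python
import json

line = '{"responses_create_params": {"input": [{"role": "user", "content": "What is 2+2?"}]}}'
row = json.loads(line)

# The required field holds the Responses API request body: a list of chat-style messages.
messages = row["responses_create_params"]["input"]
assert messages[0]["role"] == "user"
print(messages[0]["content"])  # -> What is 2+2?
```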
Required Fields#
Field |
Added By |
Description |
|---|---|---|
|
User |
Input to the model during training. Contains |
|
|
Routes each row to its resource server. Auto-generated during data preparation. |
### Optional Fields

| Field | Description |
|---|---|
| `expected_answer` | Ground truth for verification (task-specific). |
| `question` | Original question text (for reference). |
| `uuid` | Tracking identifier. |
> **Tip:** Check `resources_servers/<name>/README.md` for fields required by each resources server's `verify()` method.
### The `agent_ref` Field

The `agent_ref` field maps each row to a specific resources server. A training dataset can blend multiple resources servers in a single file; `agent_ref` tells NeMo Gym which server handles each row.

```json
{
  "responses_create_params": {"input": [{"role": "user", "content": "..."}]},
  "agent_ref": {"type": "responses_api_agents", "name": "math_with_judge_simple_agent"}
}
```

You don't create `agent_ref` manually. The `ng_prepare_data` tool adds it automatically based on your config file, matching the agent type (`responses_api_agents`) with the agent name from the config.
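Conceptually, the tagging pass looks like the sketch below; the function and file paths are illustrative, not the actual `ng_prepare_data` implementation:

```python
import json

def tag_rows(in_path: str, out_path: str, agent_type: str, agent_name: str) -> None:
    """Stamp every row with the agent_ref that the config implies."""
    with open(in_path, encoding="utf-8") as src, open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            row = json.loads(line)
            row["agent_ref"] = {"type": agent_type, "name": agent_name}
            dst.write(json.dumps(row) + "\n")

# Hypothetical paths; the agent type and name match the example above.
tag_rows("raw.jsonl", "tagged.jsonl", "responses_api_agents", "math_with_judge_simple_agent")
```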
### Example Data

```jsonl
{"responses_create_params": {"input": [{"role": "user", "content": "What is 2+2?"}]}, "expected_answer": "4"}
{"responses_create_params": {"input": [{"role": "user", "content": "What is 3*5?"}]}, "expected_answer": "15"}
{"responses_create_params": {"input": [{"role": "user", "content": "What is 10/2?"}]}, "expected_answer": "5"}
```
## Quick Start

Run this command from the repository root:

```bash
config_paths="responses_api_models/vllm_model/configs/vllm_model_for_training.yaml,\
resources_servers/example_multi_step/configs/example_multi_step.yaml"

ng_prepare_data "+config_paths=[${config_paths}]" \
  +output_dirpath=data/test \
  +mode=example_validation
```

On success, the command prints a `Finished!` message and creates `data/test/example_metrics.json`.
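You can then inspect the metrics file; the exact keys depend on the resources server, so treat this as a sketch:

```python
import json

with open("data/test/example_metrics.json", encoding="utf-8") as f:
    metrics = json.load(f)

print(json.dumps(metrics, indent=2))  # key names vary by resources server
```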
Dataset Types#
Type |
Purpose |
License |
|---|---|---|
|
Testing and development |
Not required |
|
RL training data |
Required |
|
Evaluation during training |
Required |
Source: nemo_gym/config_types.py:352
## Configuration

Define datasets in your agent server's YAML config:

```yaml
datasets:
  - name: train
    type: train
    jsonl_fpath: resources_servers/workplace_assistant/data/train.jsonl
    huggingface_identifier:
      repo_id: nvidia/Nemotron-RL-agent-workplace_assistant
      artifact_fpath: train.jsonl
    license: Apache 2.0
```
Source: resources_servers/workplace_assistant/configs/workplace_assistant.yaml:21-27
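Before running the CLI, you can sanity-check a config's dataset entries with PyYAML; a sketch that assumes `datasets` sits at the top level of the file, as in the excerpt above:

```python
import yaml  # requires PyYAML

with open("resources_servers/workplace_assistant/configs/workplace_assistant.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

for ds in cfg.get("datasets", []):
    missing = [k for k in ("name", "type", "jsonl_fpath") if k not in ds]
    if missing:
        print(f"dataset {ds.get('name', '<unnamed>')}: missing {missing}")
```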
| Field | Required | Description |
|---|---|---|
| `name` | Yes | Dataset identifier |
| `type` | Yes | Dataset type (see Dataset Types above) |
| `jsonl_fpath` | Yes | Path to data file |
| `license` | Train/validation only | See valid values below |
| `huggingface_identifier` | No | Remote download location |
| `num_repeats` | No | Repeat count (default: `1`) |
### Valid Licenses

- Apache 2.0
- MIT
- GNU General Public License v3.0
- Creative Commons Attribution 4.0 International
- Creative Commons Attribution-ShareAlike 4.0 International
- TBD
- NVIDIA Internal Use Only, Do Not Distribute
Source: nemo_gym/config_types.py:363-372
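If you validate configs yourself, the license string must match one of the documented values; a sketch (exact-string matching is an assumption based on the fixed list above):

```python
# Assumption: licenses are matched as exact strings from the documented list.
VALID_LICENSES = {
    "Apache 2.0",
    "MIT",
    "GNU General Public License v3.0",
    "Creative Commons Attribution 4.0 International",
    "Creative Commons Attribution-ShareAlike 4.0 International",
    "TBD",
    "NVIDIA Internal Use Only, Do Not Distribute",
}

def check_license(value: str) -> None:
    if value not in VALID_LICENSES:
        raise ValueError(f"unknown license {value!r}; use one of the documented values")

check_license("Apache 2.0")  # passes silently
```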
## Workflow

```mermaid
flowchart LR
    A[Create JSONL] --> B[Add to config]
    B --> C[Run ng_prepare_data]
    C -->|Pass| D[Train with NeMo RL]
    C -->|Fail| E[Fix and retry]
```
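The loop is easy to script; a minimal sketch that shells out to `ng_prepare_data` with the Quick Start arguments, assuming the tool exits non-zero on failure:

```python
import subprocess

config_paths = ",".join([
    "responses_api_models/vllm_model/configs/vllm_model_for_training.yaml",
    "resources_servers/example_multi_step/configs/example_multi_step.yaml",
])

result = subprocess.run([
    "ng_prepare_data",
    f"+config_paths=[{config_paths}]",
    "+output_dirpath=data/test",
    "+mode=example_validation",
])

if result.returncode != 0:
    print("Validation failed; fix the reported rows and retry.")
```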
Validation Modes#
Mode |
Scope |
Use Case |
|---|---|---|
|
|
Format check before contributing |
|
|
Full prep for RL training |
To prepare training data with auto-download:

```bash
config_paths="responses_api_models/vllm_model/configs/vllm_model_for_training.yaml,\
resources_servers/workplace_assistant/configs/workplace_assistant.yaml"

ng_prepare_data "+config_paths=[${config_paths}]" \
  +output_dirpath=data/workplace_assistant \
  +mode=train_preparation \
  +should_download=true
```
> **Tip:** HuggingFace downloads require authentication. Set `hf_token` in `env.yaml` or export `HF_TOKEN`.
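Outside of `ng_prepare_data`, the same artifact can be fetched directly with the `huggingface_hub` client; a sketch using the repo from the config above (treating it as a dataset repo is an assumption):

```python
import os

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="nvidia/Nemotron-RL-agent-workplace_assistant",
    filename="train.jsonl",
    repo_type="dataset",  # assumption: the artifact is published as a dataset repo
    token=os.environ.get("HF_TOKEN"),
)
print(path)  # local cache path of the downloaded file
```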
## Common Errors

| Error | Cause | Fix |
|---|---|---|
| `JSONDecodeError` | Invalid JSON | Check quotes, commas, brackets at line N |
| Missing-field error | Row lacks the required field | Add `responses_create_params` to the affected row |
| License error | Missing license | Add `license` to the dataset config entry |
| `FileNotFoundError` | File not found | Check the path or add `+should_download=true` |
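For the first two rows of the table, a small script can pinpoint the offending line before you rerun the CLI (the file path is illustrative):

```python
import json

def diagnose(path: str) -> None:
    """Report the line and column of JSON errors plus rows missing the required field."""
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            try:
                row = json.loads(line)
            except json.JSONDecodeError as e:
                print(f"line {lineno}: invalid JSON at column {e.colno}: {e.msg}")
                continue
            if "responses_create_params" not in row:
                print(f"line {lineno}: missing responses_create_params")

diagnose("data/train.jsonl")  # hypothetical path
```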
## Guides

- Full data preparation workflow.
- Fetch datasets from HuggingFace Hub.
## CLI Commands

| Command | Description |
|---|---|
| `ng_prepare_data` | Validate and generate metrics |
| | Download from HuggingFace |
| `ng_viewer` | View dataset in Gradio UI |
See CLI Commands for details.
## Large Datasets

- Validation streams line-by-line (memory-efficient).
- Processing is single-threaded; datasets over 100K samples may take minutes.
- Use `num_repeats` instead of duplicating JSONL lines; see the sketch below.
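Because validation streams, memory is rarely the bottleneck; row count drives wall-clock time, and `num_repeats` multiplies effective samples without touching the file. A quick sketch (the path is illustrative):

```python
# Count rows without loading the whole file into memory.
with open("data/train.jsonl", encoding="utf-8") as f:
    n_rows = sum(1 for _ in f)

num_repeats = 4  # set in the dataset config rather than duplicating lines
print(f"{n_rows} rows on disk -> {n_rows * num_repeats} effective samples")
```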