Data

NeMo Gym datasets use the JSONL format for reinforcement learning (RL) training. Each dataset connects to an agent server, which orchestrates agent-environment interactions and routes requests to a resources server that provides tools and computes rewards.

Prerequisites

  • NeMo Gym installed: See Installation
  • Repository cloned (for built-in datasets):
    $ git clone https://github.com/NVIDIA-NeMo/Gym.git
    $ cd Gym

NeMo Gym uses OpenAI-compatible schemas for model server compatibility. No OpenAI account required—local servers like vLLM use the same format.

Data Format

Each JSONL line requires a responses_create_params field following the OpenAI Responses API schema:

{"responses_create_params": {"input": [{"role": "user", "content": "What is 2+2?"}]}}

Additional fields like expected_answer vary by resources server—the component that provides tools and reward signals.
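As a sketch, a valid row can be built and round-tripped with Python's standard json module. The expected_answer field here is illustrative; which extra fields are required depends on the resources server.

```python
import json

# Minimal row: only responses_create_params is always required.
# "expected_answer" is an illustrative extra field whose presence
# depends on the resources server's verify() method.
row = {
    "responses_create_params": {
        "input": [{"role": "user", "content": "What is 2+2?"}]
    },
    "expected_answer": "4",
}

# One dataset row == one JSONL line (no embedded newlines).
line = json.dumps(row)
parsed = json.loads(line)
assert "responses_create_params" in parsed
assert parsed["responses_create_params"]["input"][0]["role"] == "user"
```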

Required Fields

| Field | Added By | Description |
| --- | --- | --- |
| responses_create_params | User | Input to the model during training. Contains input (messages) and optional tools, temperature, etc. |
| agent_ref | ng_prepare_data | Routes each row to its resources server. Auto-generated during data preparation. |

Optional Fields

| Field | Description |
| --- | --- |
| expected_answer | Ground truth for verification (task-specific). |
| question | Original question text (for reference). |
| id | Tracking identifier. |

Check resources_servers/<name>/README.md for fields required by each resources server’s verify() method.

The agent_ref Field

The agent_ref field maps each row to a specific resources server. A training dataset can blend multiple resources servers in a single file—agent_ref tells NeMo Gym which server handles each row.

{
  "responses_create_params": {"input": [{"role": "user", "content": "..."}]},
  "agent_ref": {"type": "responses_api_agents", "name": "math_with_judge_simple_agent"}
}

You don’t create agent_ref manually. The ng_prepare_data tool adds it automatically based on your config file. The tool matches the agent type (responses_api_agents) with the agent name from the config.
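Conceptually, the preparation step does something like the following sketch. This is an illustration of the tagging behavior, not ng_prepare_data's actual implementation.

```python
import json

def attach_agent_ref(jsonl_lines, agent_type, agent_name):
    """Tag each row with an agent_ref dict. A rough sketch of what
    ng_prepare_data derives from the config file, not its real code."""
    tagged = []
    for line in jsonl_lines:
        row = json.loads(line)
        row["agent_ref"] = {"type": agent_type, "name": agent_name}
        tagged.append(json.dumps(row))
    return tagged

lines = ['{"responses_create_params": {"input": [{"role": "user", "content": "2+2?"}]}}']
out = attach_agent_ref(lines, "responses_api_agents", "math_with_judge_simple_agent")
```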

Example Data

{"responses_create_params": {"input": [{"role": "user", "content": "What is 2+2?"}]}, "expected_answer": "4"}
{"responses_create_params": {"input": [{"role": "user", "content": "What is 3*5?"}]}, "expected_answer": "15"}
{"responses_create_params": {"input": [{"role": "user", "content": "What is 10/2?"}]}, "expected_answer": "5"}
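Rows like these are easy to generate programmatically. A minimal sketch that writes the file above (the output filename is arbitrary):

```python
import json

qa_pairs = [("What is 2+2?", "4"), ("What is 3*5?", "15"), ("What is 10/2?", "5")]

# Write one JSON object per line, as the JSONL format requires.
with open("example.jsonl", "w") as f:
    for question, answer in qa_pairs:
        row = {
            "responses_create_params": {
                "input": [{"role": "user", "content": question}]
            },
            "expected_answer": answer,
        }
        f.write(json.dumps(row) + "\n")
```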

Quick Start

Run this command from the repository root:

$ config_paths="responses_api_models/vllm_model/configs/vllm_model_for_training.yaml,\
>resources_servers/example_multi_step/configs/example_multi_step.yaml"
$
$ ng_prepare_data "+config_paths=[${config_paths}]" \
>   +output_dirpath=data/test \
>   +mode=example_validation

On success, the command prints a Finished! message and creates data/test/example_metrics.json.

Dataset Types

| Type | Purpose | License |
| --- | --- | --- |
| example | Testing and development | Not required |
| train | RL training data | Required |
| validation | Evaluation during training | Required |

Configuration

Define datasets in your agent server’s YAML config:

datasets:
  - name: train
    type: train
    jsonl_fpath: resources_servers/workplace_assistant/data/train.jsonl
    huggingface_identifier:
      repo_id: nvidia/Nemotron-RL-agent-workplace_assistant
      artifact_fpath: train.jsonl
    license: Apache 2.0

| Field | Required | Description |
| --- | --- | --- |
| name | Yes | Dataset identifier |
| type | Yes | example, train, or validation |
| jsonl_fpath | Yes | Path to data file |
| license | Train/validation | See valid values below |
| huggingface_identifier | No | Remote download location |
| num_repeats | No | Repeat count (default: 1) |

Valid Licenses

Apache 2.0 · MIT · GNU General Public License v3.0 · Creative Commons Attribution 4.0 International · Creative Commons Attribution-ShareAlike 4.0 International · TBD · NVIDIA Internal Use Only, Do Not Distribute

Workflow

Validation Modes

| Mode | Scope | Use Case |
| --- | --- | --- |
| example_validation | example datasets | Format check before contributing |
| train_preparation | train + validation | Full prep for RL training |

To prepare training data with auto-download:

$ config_paths="responses_api_models/vllm_model/configs/vllm_model_for_training.yaml,\
>resources_servers/workplace_assistant/configs/workplace_assistant.yaml"
$
$ ng_prepare_data "+config_paths=[${config_paths}]" \
>   +output_dirpath=data/workplace_assistant \
>   +mode=train_preparation \
>   +should_download=true

HuggingFace downloads require authentication. Set hf_token in env.yaml or export HF_TOKEN.

Common Errors

| Error | Cause | Fix |
| --- | --- | --- |
| JSON parse error at line N | Invalid JSON | Check quotes, commas, brackets at line N |
| ValidationError: responses_create_params | Missing field | Add responses_create_params.input |
| A license is required | Missing license | Add license to dataset config |
| Missing local datasets | File not found | Check path or add +should_download=true |
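For the first two errors, a rough pre-check can pinpoint the offending line before running ng_prepare_data. This is a sketch, not the tool's own validator:

```python
import json

def find_bad_lines(lines):
    """Return (line_number, message) pairs for lines that fail to parse
    or lack responses_create_params.input. Line numbers are 1-indexed,
    matching the "JSON parse error at line N" message."""
    errors = []
    for n, line in enumerate(lines, start=1):
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((n, f"JSON parse error: {exc.msg}"))
            continue
        params = row.get("responses_create_params")
        if not isinstance(params, dict) or "input" not in params:
            errors.append((n, "missing responses_create_params.input"))
    return errors

sample = [
    '{"responses_create_params": {"input": []}}',
    '{"responses_create_params": }',   # invalid JSON
    '{"expected_answer": "4"}',        # missing required field
]
print(find_bad_lines(sample))  # errors reported for lines 2 and 3
```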

Guides

CLI Commands

| Command | Description |
| --- | --- |
| ng_prepare_data | Validate and generate metrics |
| ng_download_dataset_from_hf | Download from HuggingFace |

See CLI Commands for details.

Large Datasets

  • Validation streams line-by-line (memory-efficient)
  • Single-threaded; >100K samples may take minutes
  • Use num_repeats instead of duplicating JSONL lines
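The streaming idea can be reproduced in a few lines. This sketch (an assumed helper, not NeMo Gym's API) iterates a JSONL file num_repeats times without loading it into memory, which is why num_repeats beats duplicating lines on disk:

```python
import json

def iter_dataset(path, num_repeats=1):
    """Yield parsed rows, repeating the whole file num_repeats times.
    Streams line by line, so memory use stays constant with file size."""
    for _ in range(num_repeats):
        with open(path) as f:
            for line in f:
                if line.strip():
                    yield json.loads(line)

# Demo: write two rows, then stream them twice (4 rows total).
with open("tiny.jsonl", "w") as f:
    f.write('{"responses_create_params": {"input": []}}\n')
    f.write('{"responses_create_params": {"input": []}}\n')

rows = list(iter_dataset("tiny.jsonl", num_repeats=2))
assert len(rows) == 4
```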