
Copyright © 2026, NVIDIA Corporation.

NeMo Gym

Generating Training Data


Generate synthetic task data (user queries) for the Workplace Assistant environment using NeMo Data Designer.

The pipeline's main output is task data for the environment. It also simulates agent trajectories, but only to filter and validate the generated tasks; during rollout collection, the environment itself produces the actual model responses. The Workplace Assistant uses 27 tools across 6 databases, and NeMo Data Designer can produce realistic multi-step user queries for them at scale.

← Back to Workplace Assistant

Pipeline Overview

The data generation pipeline runs in five steps:

  1. Load tool schemas for the Workplace Assistant environment
  2. Use NeMo Data Designer to generate realistic multi-step user queries
  3. Simulate agent trajectories (step-by-step tool-call solutions)
  4. Apply dual-level LLM judge filtering to ensure data quality
  5. Export task data in NeMo Gym JSONL format
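Steps 4 and 5 can be pictured with a minimal sketch in plain Python. It does not use the NeMo Data Designer API and is not the notebook's implementation. It assumes that "dual-level" means one judge score for the user query and one for the simulated trajectory, and that a task is kept only when both clear a threshold. The `Candidate` class, the score fields, the 0.7 thresholds, and the `{"query": ...}` record shape are all hypothetical; the real NeMo Gym JSONL schema is defined by the notebook and the environment.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Candidate:
    """One generated task plus its judge scores (hypothetical shape)."""
    query: str                # generated user query
    trajectory: list          # simulated tool calls, used only for filtering
    query_score: float        # assumed judge score for the query alone
    trajectory_score: float   # assumed judge score for the simulated solution


def passes_dual_filter(c, min_query=0.7, min_trajectory=0.7):
    """Keep a candidate only if both judge levels clear their thresholds."""
    return c.query_score >= min_query and c.trajectory_score >= min_trajectory


def export_jsonl(candidates, path):
    """Write the surviving tasks as one JSON object per line; return the count."""
    kept = [c for c in candidates if passes_dual_filter(c)]
    with Path(path).open("w", encoding="utf-8") as f:
        for c in kept:
            # Only the task is exported; the simulated trajectory is dropped
            # because the environment produces real responses at rollout time.
            # The record keys here are placeholders, not the NeMo Gym schema.
            f.write(json.dumps({"query": c.query}) + "\n")
    return len(kept)
```

Calling `export_jsonl(candidates, "tasks.jsonl")` on a list of `Candidate` objects writes only those that pass both checks. Requiring both scores, rather than averaging them, keeps a well-phrased query with an unsolvable trajectory out of the dataset.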

Notebook

The tutorial is provided as a Jupyter notebook. See the notebook README for prerequisites and setup instructions.

View Notebook on GitHub


What’s Next?

After generating your task data, use it with the Workplace Assistant resources server to collect rollouts (where the environment produces model responses) and then proceed to GRPO training.
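Before handing the exported file to the resources server, a quick structural check can catch malformed lines early. The sketch below only verifies that each non-blank line parses as a JSON object. The `check_jsonl_lines` helper is hypothetical, knows nothing about the fields NeMo Gym expects, and is not a substitute for the project's own data validation.

```python
import json


def check_jsonl_lines(lines):
    """Return (valid_count, errors); errors holds (line_number, message) pairs."""
    valid, errors = 0, []
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped, not counted as errors
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((n, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            errors.append((n, "record is not a JSON object"))
            continue
        valid += 1
    return valid, errors
```

Because an open file iterates line by line, the helper can be called as `check_jsonl_lines(open("tasks.jsonl", encoding="utf-8"))`, where `tasks.jsonl` stands in for whatever path the notebook writes to.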

Continue to Resources Server Implementation →