For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
DocumentationAPI Reference
DocumentationAPI Reference
  • Documentation
    • Home
  • About
    • Concepts
      • Training Approaches
      • Environment Components
      • Configuration System
      • Architecture
      • Task Verification
      • Key Terminology
    • Ecosystem
  • Get Started
    • Quickstart
    • Detailed Setup Guide
    • Install from PyPI
    • Rollout Collection
  • Agent Server
  • Model Server
    • vLLM
  • Resources Server
  • Data
    • Prepare and Validate
    • Download from Hugging Face
    • Prompt Config
  • Environment Tutorials
    • Single-Step Environment
    • Multi-Step Environment
    • Stateful Environment
    • Real-World Environment
    • Integrate external libraries
    • Aggregate Metrics
    • LLM-as-Judge Verification
  • Benchmarks
    • Run benchmarks
    • Add a benchmark
    • Design a customer evaluation
  • Training Tutorials
    • NeMo RL
    • Unsloth
    • Multi-Environment Training
    • Offline Training (SFT/DPO)
  • Model Recipes
    • Nemotron 3 Nano
    • Nemotron 3 Super
  • Infrastructure
    • Deployment Topology
    • Engineering Notes
  • Reference
    • Configuration
    • RL Framework Compatibility
    • CLI Commands
    • FAQ
  • Troubleshooting
    • Configuration Errors
  • Contribute
    • Development Setup
    • Environments
    • Integrate RL Frameworks
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Manage My Privacy | Do Not Sell or Share My Data | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogoNeMo Gym
On this page
  • What is an Environment?
AboutConcepts

Environment Components

||View as Markdown|
Previous

Training Approaches

Next

Configuration System

New to reinforcement learning for LLMs? Start with training-approaches for context on SFT, RL, and RLVR, or refer to Key Terminology for a quick glossary.

What is an Environment?

An environment is defined by the task for the agent to accomplish, the actions the agent can take, and the state of the world the agent observes and acts upon. The environment also determines how the agent’s performance is evaluated: what constitutes success and how reward is assigned.

In NeMo Gym, these concepts map to three server components:

  • Agent servers define whether a rollout is single-step or multi-step, single-turn or multi-turn, and orchestrate the full rollout lifecycle: calling the model, routing tool calls to resources, and collecting the final reward. The Agent server does not run an LLM itself — it delegates all text generation to the Model server.
  • Model servers are stateless LLM inference endpoints. They receive a conversation and return the model’s next output (text, tool calls, or code) with no memory or orchestration logic.
  • Resources servers provide the tasks that agents solve, the tools and external state they interact with, and the verification logic that scores performance and returns reward signals for training. Each resources server manages isolated per-rollout state via session IDs.
┌──────────────────────────────────────────┐
│ Agent Server │
│ │
│ run(): │
│ 1. resources.seed_session() ─────────────► Resources Server
│ 2. multi-step/multi-turn loop: │
│ model.responses() ─────────────► Model Server
│ resources.my_tool() ─────────────► Resources Server
│ 3. resources.verify() ─────────────► Resources Server
│ → reward │
└──────────────────────────────────────────┘
┌───────────────────────────┐ ┌───────────────────────────────────┐
│ Model Server │ │ Resources Server │
│ │ │ │
│ responses(): │ │ seed_session(): init env state │
│ conversation │ │ my_tool(): execute action │
│ → text, tool calls, │ │ verify(): evaluate → reward│
│ or code │ │ │
└───────────────────────────┘ └───────────────────────────────────┘