Overview
The Resources server is the “world” the agent interacts with. It defines the task, the tools and actions available to the agent, and the verification logic that evaluates performance and returns reward signals for training.
Session Management
NeMo Gym uses a session_id to maintain isolated state for every parallel rollout. This ensures that concurrent rollouts never interfere with each other and, in multi-step environments, preserves state across steps within a single rollout.
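The isolation described above can be pictured as a store that hands each rollout its own state object, keyed by session_id. The following is a minimal sketch, not NeMo Gym's actual implementation: the `SessionState` and `SessionStore` names, their fields, and the use of UUIDs for session IDs are all assumptions made for illustration.

```python
import threading
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SessionState:
    """Hypothetical per-rollout state; a real schema depends on the environment."""
    step: int = 0
    data: Dict[str, Any] = field(default_factory=dict)


class SessionStore:
    """Hypothetical store mapping each session_id to its own isolated state."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # guards the dict under concurrent rollouts
        self._sessions: Dict[str, SessionState] = {}

    def create(self) -> str:
        # Assumption: session IDs are random UUIDs.
        session_id = uuid.uuid4().hex
        with self._lock:
            self._sessions[session_id] = SessionState()
        return session_id

    def get(self, session_id: str) -> SessionState:
        with self._lock:
            return self._sessions[session_id]


store = SessionStore()
a, b = store.create(), store.create()

# Two steps of rollout "a": state persists across steps within the rollout.
store.get(a).data["x"] = 1
store.get(a).step += 1

# Rollout "b" is untouched by anything done in rollout "a".
untouched = store.get(b)
```

Because each session_id maps to a separate object, writes made during one rollout are never visible to another, while repeated lookups with the same ID return the same state.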
Tool Implementations
Tools are exposed as HTTP endpoints that the Agent server calls during a rollout. Each tool receives the session_id to access the correct rollout state, executes an action, and returns the result as an observation back to the model. Tools may also mutate the session state (e.g., updating a database), which the verifier can later inspect to evaluate performance.
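As a rough illustration of the tool contract, the sketch below shows a handler that looks up state by session_id, mutates it, and returns an observation. It is written as a plain function to stay self-contained; in a real Resources server this logic would sit behind an HTTP endpoint. The `send_email` tool, its parameters, the `SESSIONS` dict, and the observation fields are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical in-memory state, keyed by session_id.
SESSIONS: Dict[str, Dict[str, List[Dict[str, Any]]]] = {}


def send_email(session_id: str, to: str, subject: str) -> Dict[str, Any]:
    """Hypothetical tool handler: mutate this session's state, return an observation."""
    # Look up (or lazily create) the state that belongs to this rollout only.
    state = SESSIONS.setdefault(session_id, {"sent_emails": []})
    # The mutation stays in session state so a verifier can inspect it later.
    state["sent_emails"].append({"to": to, "subject": subject})
    # The return value is the observation passed back to the model.
    return {"status": "ok", "sent_count": len(state["sent_emails"])}


observation = send_email("rollout-1", "ana@example.com", "Q3 report")
```

The returned dictionary is what the model sees, while the appended record remains in the session state for the verifier.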
Verification Logic
Every Resources server implements a verify() function that evaluates the result of a rollout and returns a reward signal for training. See Task Verification for verification approaches, patterns, and best practices.
For semantic or rubric-based scoring, verify() may call a second language model (LLM-as-a-judge). The concept is introduced in Task Verification under "What is LLM-as-a-judge?". For configuration, deployment, and implementation patterns, see LLM-as-a-Judge Verification.
Example Resources Servers
workplace_assistant — Multi-step tool calling in a workplace setting.
- Task: Execute business activities such as sending emails, scheduling meetings, and managing projects.
- Actions: 26 tools across 5 databases (email, calendar, analytics, project management, CRM). Each tool can read and mutate the database state.
- Verification: State matching: executes both the agent’s actions and the ground truth actions against fresh databases, then compares the resulting states.
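The state-matching idea can be sketched as follows: replay the agent's actions and the ground truth actions against two fresh states, then compare the results. This is an illustrative sketch under stated assumptions, not the workplace_assistant code. The two-table state, the `send_email` and `create_event` action shapes, and the binary reward are hypothetical.

```python
from typing import Any, Dict, List

State = Dict[str, List[Dict[str, Any]]]
Action = Dict[str, Any]


def fresh_state() -> State:
    """Hypothetical stand-in for a set of freshly initialized databases."""
    return {"emails": [], "events": []}


def apply_action(state: State, action: Action) -> None:
    """Apply one hypothetical tool call to the state in place."""
    if action["tool"] == "send_email":
        state["emails"].append({"to": action["to"]})
    elif action["tool"] == "create_event":
        state["events"].append({"title": action["title"]})
    else:
        raise ValueError(f"unknown tool: {action['tool']}")


def replay(actions: List[Action]) -> State:
    """Execute a list of actions against a fresh state and return the result."""
    state = fresh_state()
    for action in actions:
        apply_action(state, action)
    return state


def state_match_reward(agent_actions: List[Action], truth_actions: List[Action]) -> float:
    """Return 1.0 if both action sequences lead to the same final state, else 0.0."""
    return 1.0 if replay(agent_actions) == replay(truth_actions) else 0.0


truth = [
    {"tool": "send_email", "to": "ana@example.com"},
    {"tool": "create_event", "title": "Sync"},
]
# Same final state reached through a different ordering across tables.
agent_ok = [truth[1], truth[0]]
agent_bad = [{"tool": "send_email", "to": "bo@example.com"}]

reward_ok = state_match_reward(agent_ok, truth)
reward_bad = state_match_reward(agent_bad, truth)
```

Comparing final states rather than action sequences means an agent is not penalized for reaching the correct outcome by a different route. In this sketch, row order within a single table still matters; a real comparison may choose to normalize it.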
math_with_code — Mathematical reasoning with code execution.
- Task: Solve math problems using Python as a reasoning tool.
- Actions: execute_python() runs code in an isolated per-session process with numpy, scipy, and pandas available. State persists across steps so the agent can build on previous computations.
- Verification: Answer correctness: extracts the boxed answer from the model’s final response and compares it against the expected result.
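The answer-correctness check can be sketched as below. It assumes the final answer is written with LaTeX `\boxed{...}`, and it uses exact string comparison. The function names, and the choice of exact matching over numeric or symbolic equivalence, are assumptions for illustration; the real math_with_code verifier may normalize answers differently.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found; nested braces are kept intact.
                return "".join(chars).strip()
        chars.append(ch)
        i += 1
    return None  # braces never balanced


def answer_reward(response: str, expected: str) -> float:
    """Hypothetical reward: 1.0 on exact match with the expected answer, else 0.0."""
    answer = extract_boxed(response)
    if answer is None:
        return 0.0
    return 1.0 if answer == expected.strip() else 0.0


response = r"Adding the halves gives \boxed{\frac{1}{2}}."
reward = answer_reward(response, r"\frac{1}{2}")
```

Tracking brace depth, rather than using a simple regular expression, keeps answers such as `\frac{1}{2}` from being truncated at the first closing brace.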
Server Configuration
See Resources Server Fields for server configuration syntax and fields.