MCP Resources Server
This tutorial shows how to expose environment tools over the Model Context Protocol (MCP) so that an MCP-native agent — such as Claude Code — can discover and call them, while the Resources Server still owns verification. The pattern is: MCP tool implementations + a verify() function = a Resources Server.
Two ways to combine MCP with a Resources Server
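There are two distinct integration shapes, and they need different amounts of plumbing:
- Gym-owned MCP server. The MCP endpoint is mounted inside the Resources Server, so tool calls are bound to the same per-rollout session as /seed_session and /verify. This is the flow that needs new infrastructure.
- External MCP server. The MCP server already runs outside Gym and the agent talks to it directly; verification works off the agent’s trajectory.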
The rest of this page builds the Gym-owned flow (the one that needs new infrastructure) and then explains the external flow at the end.
Why a Gym-owned MCP server at all? Mounting the MCP endpoint inside the Resources Server lets a tool call be bound to the same per-rollout session as /seed_session and /verify. That is what makes “was this tool actually used in this episode?” a verifiable, isolated question. An external MCP server can’t offer that — Gym can’t observe its calls — so external-server verification has to work off the agent’s trajectory instead.
What You’ll Build
A weather environment with a single MCP tool, get_weather(city). The agent must call the tool and then answer with exactly the sentence the tool returned. The Resources Server rewards the rollout only if the tool was called in this session and the final answer contains the returned sentence.
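The reward rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's actual verify(): the SessionState class and score_rollout function are hypothetical names, and accepting any sentence the tool returned during the session is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionState:
    """Hypothetical per-rollout record of the sentences the tool returned."""
    tool_outputs: List[str] = field(default_factory=list)


def score_rollout(state: SessionState, final_answer: str) -> float:
    """Return 1.0 only if the tool was called in this session and the final
    answer contains a sentence the tool returned; otherwise 0.0."""
    if not state.tool_outputs:
        return 0.0  # the tool was never called in this session
    if any(sentence in final_answer for sentence in state.tool_outputs):
        return 1.0
    return 0.0  # the tool was called, but the answer does not repeat its output
```

Keeping the check on server-side state, rather than on what the agent claims, is what lets the reward depend on the tool actually having been used in this episode.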
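Episode Flow
An episode runs /seed_session → /mcp tools/call → /verify. The /seed_session response carries the MCP metadata the agent needs to connect, the agent calls get_weather(city) over /mcp and gives its final answer, and /verify then scores the rollout against the same session.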
Implementation
The base class MCPResourcesServer (in nemo_gym/base_resources_server.py) mounts the MCP endpoint and manages the per-rollout token. You write a @gym_tool method (your tool), a seed_session() that returns the MCP metadata so the agent can connect, and a verify() that scores the rollout.
File (resources_servers/example_mcp_weather/app.py):
Key Pattern
Writing a tool is just decorating a method:
- @gym_tool: mark a method and the base class auto-registers it as an MCP tool (name = method name), mounted at /mcp (Streamable HTTP). The MCP input schema is derived from the method’s typed parameters. To receive the Gym session, declare a session_id: str parameter; it is injected from the per-rollout token and hidden from the tool’s input schema (the model only sees the real args). Omit it for a stateless tool. A missing or invalid token raises MCPSessionError. Because MCP runs over JSON-RPC, FastMCP surfaces this to the client as a tool error (isError: true) on an HTTP 200 response, not as an HTTP error status. Both sync and async methods work. Tool names may not collide with reserved endpoints (verify, seed_session, aggregate_metrics, mcp), and a tool must not take a request parameter (there is no FastAPI Request on the MCP path; use session_id).
- build_mcp_session_metadata(request): call this from seed_session and return it under the response’s mcp key. It mints the one-time X-NeMo-Gym-Session-Token bound to the current session_id.
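To make the registration rules concrete, here is a self-contained sketch of how a decorator like this could work. It is not NeMo Gym's implementation: the gym_tool stand-in, input_schema, collect_tools, and the WeatherServer class are all hypothetical, and only simple parameter types are mapped.

```python
import inspect
from typing import Any, Callable, Dict

# Python annotation -> JSON Schema type name (simple cases only).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}
RESERVED = {"verify", "seed_session", "aggregate_metrics", "mcp"}


def gym_tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Stand-in decorator: tag the method so a registry can find it later."""
    fn._is_gym_tool = True
    return fn


def input_schema(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Derive a JSON-Schema-like dict from typed parameters, hiding session_id."""
    params = inspect.signature(fn).parameters
    if "request" in params:
        raise TypeError(f"{fn.__name__}: tools must not take a 'request' parameter")
    props, required = {}, []
    for name, param in params.items():
        if name in ("self", "session_id"):
            continue  # session_id is injected server-side, never shown to the model
        props[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": props, "required": required}


def collect_tools(server: object) -> Dict[str, Dict[str, Any]]:
    """Map each tagged method's name to its derived input schema."""
    tools = {}
    for name, member in inspect.getmembers(type(server), inspect.isfunction):
        if getattr(member, "_is_gym_tool", False):
            if name in RESERVED:
                raise ValueError(f"tool name {name!r} collides with a reserved endpoint")
            tools[name] = input_schema(member)
    return tools


class WeatherServer:
    @gym_tool
    def get_weather(self, city: str, session_id: str) -> str:
        return f"The weather in {city} is sunny."
```

Calling collect_tools(WeatherServer()) yields a single get_weather entry whose schema lists only city, which mirrors the rule that the model sees the real arguments and nothing else.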
Need full control (e.g. a hand-written @mcp.tool() with custom schema)? Override register_mcp_tools(self, mcp) and call super().register_mcp_tools(mcp) first to keep the auto-registered @gym_tool ones.
MCPResourcesServer disables the MCP SDK’s default DNS-rebinding protection (TransportSecuritySettings(enable_dns_rebinding_protection=False)). That protection only accepts loopback Host headers and returns HTTP 421 otherwise — which would break multi-node / use_absolute_ip=True deployments where the agent reaches the server by a routable host. The endpoint is instead protected by the per-rollout session token. You don’t need to set this yourself; the base class handles it.
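The idea of guarding the endpoint with a per-rollout token, instead of a Host-header check, can be sketched as a small lookup table. This is an illustration under assumptions, not the base class's code: TokenStore and its methods are hypothetical, MCPSessionError here is a local stand-in for the real exception, and the in-memory dict is a simplification.

```python
import secrets
from typing import Dict, Optional


class MCPSessionError(Exception):
    """Local stand-in for the error raised on a missing or invalid token."""


class TokenStore:
    """Hypothetical in-memory map from per-rollout token to Gym session_id."""

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def mint(self, session_id: str) -> str:
        """Create an unguessable token bound to one session."""
        token = secrets.token_urlsafe(32)
        self._sessions[token] = session_id
        return token

    def resolve(self, token: Optional[str]) -> str:
        """Return the bound session_id, or raise if the token is unknown."""
        if not token or token not in self._sessions:
            raise MCPSessionError("missing or invalid session token")
        return self._sessions[token]

    def revoke(self, session_id: str) -> None:
        """Drop every token bound to a finished rollout."""
        self._sessions = {t: s for t, s in self._sessions.items() if s != session_id}
```

Because the check depends on the token rather than on where the request came from, it behaves the same whether the agent reaches the server over loopback or a routable address.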
Wiring the agent (Claude Code)
The claude_code_agent reads the mcp metadata from /seed_session, writes a per-rollout gym_mcp_config.json, and launches Claude Code with --mcp-config. The generated config looks like:
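As a rough sketch of that step, the code below builds a config in the mcpServers shape that Claude Code accepts for --mcp-config and writes it to gym_mcp_config.json. The helper names, the URL, and the idea that the metadata reduces to a URL plus a token are assumptions; the file the agent actually generates may differ in detail.

```python
import json
from pathlib import Path


def build_gym_mcp_config(server_name: str, url: str, token: str) -> dict:
    """Build an mcpServers entry for one HTTP MCP server with a token header."""
    return {
        "mcpServers": {
            server_name: {
                "type": "http",
                "url": url,
                # Header name taken from the tutorial; sending it on every
                # request is an assumption about how the token is presented.
                "headers": {"X-NeMo-Gym-Session-Token": token},
            }
        }
    }


def write_config(config: dict, directory: Path) -> Path:
    """Write the per-rollout config file and return its path."""
    path = directory / "gym_mcp_config.json"
    path.write_text(json.dumps(config, indent=2))
    return path
```

Claude Code exposes MCP tools as mcp__<server>__<tool>, so naming the server example_mcp_weather is what produces the mcp__example_mcp_weather__get_weather tool name.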
A minimal config (resources_servers/example_mcp_weather/configs/example_mcp_weather.yaml) wires the server and the agent together:
Run it
Put your key in a repo-root env.yaml (the config above interpolates ${anthropic_api_key}):
Then start the servers:
Then collect rollouts against the example dataset and reward-profile as in the quickstart. A correct rollout shows Claude Code calling mcp__example_mcp_weather__get_weather and a reward of 1.0.
To watch the MCP round-trip without a full gym env start, start the Resources Server on its own and drive /seed_session → /mcp tools/call → /verify directly (a requests.Session preserves the session cookie). This is also the fastest way to confirm the endpoint is reachable from another host.
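When driving the endpoint by hand, the tools/call step is an ordinary JSON-RPC 2.0 request. The helpers below sketch the request body, the headers, and how to read the result. The function names are hypothetical, and passing the token as an X-NeMo-Gym-Session-Token header is an assumption. A real client must also complete the MCP initialize handshake before tools/call, which an MCP client library normally handles.

```python
from typing import Any, Dict


def tools_call_request(name: str, arguments: Dict[str, Any], request_id: int = 1) -> dict:
    """JSON-RPC 2.0 body for an MCP tools/call."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def mcp_headers(token: str) -> Dict[str, str]:
    """Headers for a Streamable HTTP POST; the token header is an assumption."""
    return {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        "X-NeMo-Gym-Session-Token": token,
    }


def tool_text(response: dict) -> str:
    """Return the tool's text output, raising if the result is flagged isError.

    Tool failures arrive inside a successful JSON-RPC result, so the HTTP
    status alone does not tell you whether the call worked.
    """
    result = response["result"]
    text = "".join(
        part.get("text", "")
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        raise RuntimeError(f"tool error: {text}")
    return text
```

Checking isError explicitly matters here because a rejected session token looks like a normal HTTP 200 response at the transport level.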
Pointing at an existing / external MCP server
If the MCP server already runs outside Gym, the agent talks to it directly — you do not need an MCPResourcesServer. Give the agent a static mcp_config pointing at the external server, and write a plain SimpleResourcesServer.verify() that scores the agent’s trajectory:
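A trajectory-based check might look like the sketch below, which works on Responses-API output items as plain dicts. The function names are hypothetical, and the scoring rule (the final answer must contain a tool output) is borrowed from the weather example as an assumption; a real verify() would wrap logic like this in the server's request and response types.

```python
from typing import Any, Dict, List


def tool_outputs(items: List[Dict[str, Any]], tool_name: str) -> List[str]:
    """Outputs of every completed call to tool_name, paired by call_id."""
    call_ids = {
        item["call_id"]
        for item in items
        if item.get("type") == "function_call" and item.get("name") == tool_name
    }
    return [
        item["output"]
        for item in items
        if item.get("type") == "function_call_output" and item.get("call_id") in call_ids
    ]


def final_text(items: List[Dict[str, Any]]) -> str:
    """Concatenated output_text of the last assistant message, or ''."""
    for item in reversed(items):
        if item.get("type") == "message" and item.get("role") == "assistant":
            return "".join(
                part.get("text", "")
                for part in item.get("content", [])
                if part.get("type") == "output_text"
            )
    return ""


def score_trajectory(items: List[Dict[str, Any]], tool_name: str) -> float:
    """1.0 if the tool was called and the final answer contains one of its outputs."""
    answer = final_text(items)
    outputs = tool_outputs(items, tool_name)
    return 1.0 if any(out and out in answer for out in outputs) else 0.0
```

Pairing calls and outputs by call_id avoids crediting the output of a different tool that happens to appear in the same trajectory.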
Things to know about this flow:
- No cookie/session entanglement. Gym’s session cookie flows only between the agent server and the Resources Server (/seed_session ↔ /verify). The agent-to-external-MCP connection is a separate channel with its own auth (whatever headers you put in the static config). They don’t interfere.
- Verify off the trajectory. Gym can’t observe the external server’s calls, so verify() must score the function_call / function_call_output items in the agent’s Responses-API output, not server-side session state.
- Static + per-rollout compose. When both are present, the agent merges your static mcp_config with the per-rollout Gym-owned entry, so a single rollout can use external tools and a Gym-owned MCP server at once. If a static server happens to share the same name as the Gym resources server, the per-rollout Gym entry takes precedence and overwrites it.
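The merge rule in the last point amounts to a dictionary update in which the Gym entry is applied last. The sketch below assumes both configs use the mcpServers layout; merge_mcp_configs is a hypothetical helper, not the agent's actual function.

```python
def merge_mcp_configs(static_config: dict, gym_entry: dict) -> dict:
    """Combine static servers with the per-rollout Gym entry.

    On a name clash the Gym entry wins, because it is applied last.
    Neither input is modified.
    """
    merged = dict(static_config.get("mcpServers", {}))
    merged.update(gym_entry.get("mcpServers", {}))
    return {"mcpServers": merged}
```

Giving the static servers names that differ from the Gym resources server avoids the overwrite entirely.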