Goal: Understand how NeMo Gym implements environments as composable server components.
Read first: Environments — what an environment is and how it decomposes into dataset, agent harness, verifier, and state.
NeMo Gym implements environments as composable FastAPI servers that communicate over async HTTP:
All components are composable. Use different datasets with the same resources server, the same resources server with different models, or the same model with different agent harnesses.
This gives you flexibility to integrate with your existing models and agents:
Each task attempt flows through three steps. The resulting trajectory is called a rollout:
Hosts agent harnesses that orchestrate rollouts. Use your own harness, or use built-in harnesses such as OpenHands or NeMo Gym’s native harnesses such as Simple Agent.
A stateless LLM inference endpoint that standardizes different model providers behind the Responses API. Supports local inference and inference providers.
Manages environment-specific tools, per-task state isolation, and verification:
Tools exist on a spectrum — some belong to the agent and can be used with any environment, some belong to the environment and can be used with any agent:
run_tests endpoint, a database query tool, a sandbox execution API).An agent can use both simultaneously — its own tools and the environment’s tools in the same task. NeMo Gym’s server split reflects this: agent-specific logic in the harness, environment-specific logic in the Resources Server.
Servers communicate over async HTTP (aiohttp) with:
Understand environments, evaluation, and training before diving into implementation.
Browse available environments for evaluation and training.
Explore available agent harnesses and learn how to integrate your own agent.
Improve your agent or model with RL or fine-tuning.
Create your own evaluation or training environments.