NeMo Gym Documentation
NeMo Gym is a framework for building reinforcement learning (RL) training environments for large language models (LLMs). Gym provides scaffolding for developing training environments, along with environment patterns such as multi-step, multi-turn, and user modeling scenarios.
At the core of NeMo Gym are three server concepts: Responses API Model servers expose model endpoints, Resources servers contain tool implementations and verification logic, and Responses API Agent servers orchestrate the interaction between models and resources.
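To make the division of labor concrete, here is a minimal in-process sketch of the three roles. It is an illustration only: `StubModel`, `WeatherResources`, `ToolCall`, and `run_agent` are hypothetical names, not NeMo Gym APIs, and in Gym the three roles run as separate servers rather than as plain Python objects.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical stand-ins for the three roles; none of these are NeMo Gym APIs.
@dataclass
class ToolCall:
    name: str
    argument: str


class StubModel:
    """Model role: asks for a tool first, then answers from the tool output."""

    def respond(self, conversation: list[dict]) -> ToolCall | str:
        tool_outputs = [m for m in conversation if m["role"] == "tool"]
        if not tool_outputs:
            return ToolCall(name="get_weather", argument="Seattle")
        return f"The weather is {tool_outputs[-1]['content']}."


class WeatherResources:
    """Resources role: tool implementations plus verification logic."""

    def call_tool(self, call: ToolCall) -> str:
        tools = {"get_weather": lambda city: "rainy"}  # canned tool result
        return tools[call.name](call.argument)

    def verify(self, final_answer: str) -> float:
        # Reward 1.0 only if the answer reflects the tool's result.
        return 1.0 if "rainy" in final_answer else 0.0


def run_agent(model, resources, user_message: str, max_steps: int = 4) -> dict:
    """Agent role: loops between model and resources until a final answer."""
    conversation = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        output = model.respond(conversation)
        if isinstance(output, ToolCall):
            result = resources.call_tool(output)
            conversation.append({"role": "tool", "content": result})
            continue
        return {"answer": output, "reward": resources.verify(output)}
    return {"answer": "", "reward": 0.0}  # step budget exhausted


rollout = run_agent(StubModel(), WeatherResources(), "What is the weather in Seattle?")
print(rollout)
```

The agent never inspects tool internals or scoring rules; it only routes messages. That separation is what allows models, tools, and verifiers to be swapped independently.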
Quickstart
Run a training environment and start collecting rollouts for training in under 5 minutes.
# Clone and install dependencies
git clone git@github.com:NVIDIA-NeMo/Gym.git
cd Gym
# Install UV if not already available
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
# Create a virtual environment and install Gym
uv venv --python 3.12 && source .venv/bin/activate
uv sync --extra dev --group docs
# Configure your model API access
echo "policy_base_url: https://api.openai.com/v1
policy_api_key: your-openai-api-key
policy_model_name: gpt-4.1-2025-04-14" > env.yaml
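Before starting the servers, it can help to confirm that env.yaml contains the three keys written above. The following is a rough sketch, not part of Gym: it assumes the file is a flat list of `key: value` lines (as in this quickstart) and treats all three keys as required, which is an assumption. The helper names are hypothetical.

```python
from pathlib import Path

# Keys written by the quickstart; treating all three as required is an assumption.
REQUIRED_KEYS = ("policy_base_url", "policy_api_key", "policy_model_name")


def parse_flat_yaml(text: str) -> dict[str, str]:
    """Parse flat `key: value` lines. This is not a general YAML parser."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    settings = parse_flat_yaml(text)
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


def check_env_file(path: str = "env.yaml") -> list[str]:
    """Check a file on disk; a missing file counts as all keys missing."""
    env_path = Path(path)
    if not env_path.exists():
        return list(REQUIRED_KEYS)
    return missing_keys(env_path.read_text(encoding="utf-8"))


example = """policy_base_url: https://api.openai.com/v1
policy_api_key: your-openai-api-key
policy_model_name: gpt-4.1-2025-04-14"""
print(missing_keys(example))
```

Calling `check_env_file()` from the repository root returns an empty list when all three keys are present.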
Terminal 1 (start servers):
# Start servers (this will keep running)
config_paths="resources_servers/example_simple_weather/configs/simple_weather.yaml,\
responses_api_models/openai_model/configs/openai_model.yaml"
ng_run "+config_paths=[${config_paths}]"
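The `config_paths` value is a comma-separated list of YAML files wrapped in square brackets. If you assemble that argument from a script, a small helper along these lines can catch a mistyped path before launch. This is a sketch with hypothetical helper names; it only builds the string shown above and does not call `ng_run`.

```python
from pathlib import Path


def find_missing(paths: list[str], root: str = ".") -> list[str]:
    """Return the config paths that do not exist as files under `root`."""
    return [p for p in paths if not (Path(root) / p).is_file()]


def format_config_arg(paths: list[str]) -> str:
    """Build the `+config_paths=[a,b]` argument in the form used above."""
    return "+config_paths=[" + ",".join(paths) + "]"


config_paths = [
    "resources_servers/example_simple_weather/configs/simple_weather.yaml",
    "responses_api_models/openai_model/configs/openai_model.yaml",
]
missing = find_missing(config_paths)  # empty when run from the Gym repo root
if missing:
    print("Missing config files:", missing)
print(format_config_arg(config_paths))
```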
Terminal 2 (interact with agent):
# In a NEW terminal, activate environment
source .venv/bin/activate
# Interact with your agent
python responses_api_agents/simple_agent/client.py
Terminal 2 (collect rollouts; keep servers running in Terminal 1):
# Create a simple dataset with one query
echo '{"responses_create_params":{"input":[{"role":"developer","content":"You are a helpful assistant."},{"role":"user","content":"What is the weather in Seattle?"}]}}' > weather_query.jsonl
# Collect verified rollouts
ng_collect_rollouts \
+agent_name=simple_weather_simple_agent \
+input_jsonl_fpath=weather_query.jsonl \
+output_jsonl_fpath=weather_rollouts.jsonl
# View the result
cat weather_rollouts.jsonl | python -m json.tool
Each rollout written to weather_rollouts.jsonl carries a verification score, so the file can serve as training data.
When you are finished, return to Terminal 1 and press Ctrl+C to stop the ng_run process.
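The one-query dataset above extends naturally to many queries. The sketch below writes one JSON object per line using the same `responses_create_params` shape as the `echo` command; the city list, the `weather_queries.jsonl` file name, and the helper names are hypothetical. `read_jsonl` makes no assumptions about field names, so it can also load weather_rollouts.jsonl for inspection.

```python
import json
from pathlib import Path

SYSTEM_PROMPT = "You are a helpful assistant."


def make_row(city: str) -> dict:
    """Build one input row in the shape used by the quickstart dataset."""
    return {
        "responses_create_params": {
            "input": [
                {"role": "developer", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"What is the weather in {city}?"},
            ]
        }
    }


def write_jsonl(rows: list[dict], path: str) -> None:
    """Write one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl(path: str) -> list[dict]:
    """Load a JSONL file, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


cities = ["Seattle", "Austin", "Boston"]  # hypothetical example inputs
write_jsonl([make_row(city) for city in cities], "weather_queries.jsonl")
rows = read_jsonl("weather_queries.jsonl")
print(len(rows), "rows written")
```

To collect rollouts for the larger dataset, point `+input_jsonl_fpath` at the new file.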