Gym requires an OpenAI-compatible HTTP server to handle model generations during training. This page covers the server requirements and existing implementations across popular RL frameworks.
Gym communicates with generation backends using the OpenAI HTTP API specification. Your generation server must implement endpoints compatible with one of these reference implementations:
Most RL frameworks that support policy optimization algorithms (PPO, GRPO) require online on-policy model generations. Integrating generation backends into the RL training loop introduces several challenges:
The following table shows how popular RL frameworks implement generation backends.
If your framework uses vLLM or SGLang, you can reference these implementations when adding OpenAI HTTP server support.
NeMo RL, VeRL, Slime, and OpenPIPE ART all expose OpenAI-compatible HTTP server endpoints.
If your training framework already uses vLLM or SGLang but does not expose an OpenAI-compatible HTTP server:
If your training framework does not use vLLM or SGLang as a generation backend, you may need significant refactoring to achieve proper Gym integration. Consider:
After setting up your generation backend, proceed to: