Generation Backend#

Gym requires an OpenAI-compatible HTTP server to handle model generations during training. This page covers the server requirements and existing implementations across popular RL frameworks.

OpenAI-Compatible Server Requirements#

Gym communicates with generation backends using the OpenAI HTTP API specification. Your generation server must implement OpenAI-compatible endpoints, such as the chat completions endpoint exposed by the OpenAI-compatible servers in vLLM and SGLang.
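
Concretely, a compatible server accepts a standard chat completions request and returns a response in the OpenAI schema. The sketch below exercises that contract with a plain HTTP request; the URL, port, and model name are placeholders rather than values defined by Gym.

```python
import requests

# Placeholder endpoint: substitute the host/port your generation server listens on.
url = "http://localhost:8000/v1/chat/completions"
payload = {
    "model": "my-policy-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
    "temperature": 1.0,
}

response = requests.post(url, json=payload, timeout=30)
response.raise_for_status()

# An OpenAI-compatible response carries the generation under choices[0].message.
print(response.json()["choices"][0]["message"]["content"])
```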

Generation in RL Training#

Most RL frameworks that support policy optimization algorithms (PPO, GRPO) require on-policy generations produced online from the current model weights. Integrating generation backends into the RL training loop introduces several challenges (a simplified loop sketch follows the list):

  • Refit: Synchronizing model weights between training and generation

  • Off-policyness: Ensuring generations reflect the current policy state

  • Latency: Minimizing generation overhead during training iterations
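
To make these challenges concrete, the sketch below shows where each one appears in a typical on-policy loop. The callables `generate_rollouts`, `train_step`, and `sync_weights_to_server` are hypothetical placeholders, not APIs from Gym or from any of the frameworks listed below.

```python
def rl_training_loop(policy, prompts, num_iterations,
                     generate_rollouts, train_step, sync_weights_to_server):
    """Hypothetical on-policy loop against an OpenAI-compatible generation server."""
    for _ in range(num_iterations):
        # Latency: rollout generation typically dominates per-iteration wall-clock time.
        rollouts = generate_rollouts(prompts)

        # Off-policyness: training uses only rollouts sampled from the current policy,
        # so this step waits for generation to finish before updating weights.
        policy = train_step(policy, rollouts)

        # Refit: push the updated weights to the generation server before the next
        # round of rollouts, so future samples stay on-policy.
        sync_weights_to_server(policy)
    return policy
```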

Existing Framework Implementations#

The following table shows how popular RL frameworks implement generation backends.

Tip

If your framework uses vLLM or SGLang, you can reference these implementations when adding OpenAI HTTP server support.

| Framework    | Generation Backend | Reference Implementation                     |
|--------------|--------------------|----------------------------------------------|
| NeMo RL      | vLLM               | vllm_generation.py                           |
| VeRL         | HF, vLLM, SGLang   | hf_rollout.py, vLLM rollout, SGLang rollout  |
| TRL          | vLLM, HF           | grpo_trainer.py (vLLM), grpo_trainer.py (HF) |
| Slime        | SGLang             | sglang_engine.py                             |
| OpenPIPE ART | vLLM               | vLLM module                                  |

NeMo RL, VeRL, Slime, and OpenPIPE ART all expose OpenAI-compatible HTTP server endpoints.

Integration Guidelines#

Frameworks Using vLLM or SGLang#

If your training framework already uses vLLM or SGLang but does not expose an OpenAI-compatible HTTP server:

  1. Reference the implementations listed above

  2. Add server endpoints that follow the OpenAI API specification

  3. Test your implementation using the vLLM HTTP server tests from NeMo RL (a minimal standalone smoke test is sketched below)
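
In addition to reusing those tests, a minimal standalone smoke test against a running server might look like the following. The base URL, API key, and model name are placeholder values for your own deployment, and the `openai` Python client is assumed to be installed.

```python
from openai import OpenAI

def test_chat_completions_endpoint():
    # Placeholder URL and model: substitute your server's address and model name.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    response = client.chat.completions.create(
        model="my-policy-model",
        messages=[{"role": "user", "content": "Say hi."}],
        max_tokens=16,
    )
    # The OpenAI schema returns at least one choice carrying a message payload.
    assert response.choices
    assert isinstance(response.choices[0].message.content, str)
```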

Frameworks Using Other Backends#

If your training framework does not use vLLM or SGLang as a generation backend, you may need significant refactoring to integrate with Gym. Consider the following options:

  • Migrating to vLLM or SGLang for generation

  • Implementing an adapter layer that exposes OpenAI-compatible endpoints (see the sketch after this list)

  • Evaluating the complexity of maintaining a custom generation backend
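
As a rough illustration of the adapter-layer option, the sketch below wraps a hypothetical in-process `generate()` hook behind a minimal `/v1/chat/completions` endpoint using FastAPI. It is a simplified sketch under those assumptions, not a complete implementation of the OpenAI specification: streaming, token usage accounting, and error handling are omitted.

```python
import time
import uuid

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ChatMessage(BaseModel):
    role: str
    content: str


class ChatCompletionRequest(BaseModel):
    model: str
    messages: list[ChatMessage]
    max_tokens: int = 256
    temperature: float = 1.0


def generate(prompt: str, max_tokens: int, temperature: float) -> str:
    """Hypothetical hook into your framework's in-process generation backend."""
    raise NotImplementedError


@app.post("/v1/chat/completions")
def chat_completions(request: ChatCompletionRequest) -> dict:
    # Naive prompt construction; a real adapter should apply the model's chat template.
    prompt = "\n".join(f"{m.role}: {m.content}" for m in request.messages)
    completion = generate(prompt, request.max_tokens, request.temperature)

    # Minimal OpenAI-style response payload.
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": request.model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": completion},
                "finish_reason": "stop",
            }
        ],
    }
```

Run the adapter with a standard ASGI server (for example, `uvicorn adapter:app`); once `generate()` is wired to your backend, the client examples earlier on this page should work against it unchanged.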