OpenAI
The openai_model server connects NeMo Gym to OpenAI. It forwards requests straight through with no conversion in either direction: a Responses request goes to OpenAI’s Responses API and returns a Responses object, and a Chat Completions request goes to OpenAI’s Chat Completions and returns a chat completion — OpenAI serves both natively.
Because it is a pass-through, the Responses endpoint also works with any other backend that implements the Responses API natively — point openai_base_url at it. In practice that is rare: most hosted backends only expose Chat Completions and instead go through Inference Providers, which translates between the two formats.
For training workloads that require token IDs and log probabilities, use vLLM instead. Hosted endpoints do not expose the token-level information needed for RL training.
Supported APIs
This server exposes both endpoints and forwards each directly to the upstream endpoint:
- OpenAI Responses: /v1/responses
- OpenAI Chat Completions: /v1/chat/completions
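The pass-through routing described above can be pictured with a small sketch. This is an illustration only, not NeMo Gym's actual code: the function name `upstream_url` and the constant `SUPPORTED_PATHS` are hypothetical, and the sketch assumes the configured base URL may or may not already end in `/v1`.

```python
# Hypothetical illustration of pass-through routing; not the server's real code.
SUPPORTED_PATHS = ("/v1/responses", "/v1/chat/completions")


def upstream_url(openai_base_url: str, request_path: str) -> str:
    """Map a local endpoint path to the matching upstream URL."""
    if request_path not in SUPPORTED_PATHS:
        raise ValueError(f"unsupported endpoint: {request_path}")
    base = openai_base_url.rstrip("/")
    # If the base URL already ends in /v1, drop the duplicate prefix from the path.
    suffix = request_path[len("/v1"):] if base.endswith("/v1") else request_path
    return base + suffix


if __name__ == "__main__":
    print(upstream_url("https://api.openai.com/v1", "/v1/responses"))
    print(upstream_url("https://api.openai.com/v1", "/v1/chat/completions"))
```

Each local path maps to the same path upstream, which is why no request or response conversion is needed.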
Set Your Credentials
Store your values in env.yaml in the project root (gitignored):
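The fragment below is a minimal sketch of what env.yaml might contain. `policy_base_url` and `policy_model_name` are the keys this page refers to; `policy_api_key` is an assumed key name for the credential, and all values are placeholders to replace with your own.

```yaml
# env.yaml (sketch; the policy_api_key key name is an assumption)
policy_base_url: "https://api.openai.com/v1"
policy_api_key: "<your-openai-api-key>"
policy_model_name: "<model-name>"
```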
Point policy_base_url at any OpenAI-compatible Responses endpoint to reuse this server with a different host.
Configuration Reference
The model is fixed by configuration. This server always sends the configured openai_model (from policy_model_name) to the upstream endpoint. If an incoming request carries its own model field — as standard OpenAI-compatible clients and SDKs do — that value is overwritten, so you cannot switch models on a per-request basis. To run a different model, change the config and start a new server.
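The override behavior can be sketched in a few lines. This is a hypothetical illustration of the rule stated above, not the server's implementation: the helper name `apply_configured_model` and the model names are placeholders.

```python
from typing import Any


def apply_configured_model(body: dict[str, Any], configured_model: str) -> dict[str, Any]:
    """Return a copy of the request body with `model` set to the configured value."""
    forwarded = dict(body)  # shallow copy, so the caller's dict is left untouched
    forwarded["model"] = configured_model  # any client-supplied model is overwritten
    return forwarded


if __name__ == "__main__":
    # Placeholder model names, for illustration only.
    incoming = {"model": "client-requested-model", "input": "Hello"}
    outgoing = apply_configured_model(incoming, "configured-model")
    print(outgoing["model"])  # configured-model
```

A client that sets `model` in its request therefore still reaches the configured model. Selecting a different model means editing the configuration and restarting the server.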