Azure OpenAI
The azure_openai_model server connects NeMo Gym to models hosted on Azure OpenAI. It calls Azure’s Chat Completions endpoint and converts to and from NeMo Gym’s native Responses API — so, like Inference Providers, it translates between the two formats (unlike OpenAI, which forwards Responses requests natively). Azure also requires an explicit api-version.
For training workloads that require token IDs and log probabilities, use vLLM instead. Hosted endpoints do not expose the token-level information needed for RL training.
Supported APIs
This server exposes both endpoints, calling Azure’s Chat Completions under the hood:
- OpenAI Responses —
/v1/responses - OpenAI Chat Completions —
/v1/chat/completions
Set Your Credentials
Store your values in env.yaml in the project root (gitignored). For Azure, policy_model_name is your deployment name and policy_base_url is your Azure endpoint:
Azure requires an API version (for example 2024-10-21). Set it on the default_query.api-version config field — see the usage example below.
Configuration Reference
The model is fixed by configuration. This server always sends the configured openai_model (from policy_model_name) to Azure. If an incoming request carries its own model field — as standard OpenAI-compatible clients and SDKs do — that value is overwritten, so you cannot switch models on a per-request basis. To run a different model, change the config and start a new server.
Usage Example
1. Set model and environment config
2. Start servers
Pass your Azure API version on the command line: