Integration Footprint

This page provides a reference for the components required to integrate Gym into your training framework. Each component includes links to the NeMo RL reference implementation and corresponding tests.

Integration Components

A complete Gym integration consists of five components, implemented in sequence:

Component	Implementation	Tests
1	OpenAI-Compatible HTTP Server	vllm_worker_async.py:264	test_vllm_generation.py:1107
2	On-Policy Token ID Fixes	vllm_worker_async.py:40	test_vllm_generation.py:1250
3	Gym Spinup and Integration	nemo_gym.py	test_nemo_gym.py
4	Rollout Orchestration	rollouts.py:975	test_rollouts.py:754
5	GRPO Train Loop Integration	grpo.py:1157	End-to-end tests in progress

As of December 8, 2025, end-to-end tests for GRPO train loop integration are still being implemented in the NeMo RL repository.

Component Details

1. OpenAI-Compatible HTTP Server

Purpose: Expose your generation backend as an OpenAI-compatible endpoint.

Prerequisites: vLLM or SGLang generation backend.

Reference: Refer to Generation Backend And Openai Compatible Http Server for implementation guidance.

2. On-Policy Token ID Fixes

Purpose: Prevent train-generation mismatch in multi-step and multi-turn scenarios.

Prerequisites: OpenAI-compatible HTTP server.

Reference: Refer to On-Policy Corrections for technical details.

3. Gym Spinup and Integration

Purpose: Initialize and connect to Gym training environments.

Key responsibilities:

Environment configuration loading
Connection management
State synchronization

4. Rollout Orchestration

Purpose: Coordinate rollout collection between the policy and Gym environments.

Key responsibilities:

Batch rollout management
Multi-step and multi-turn handling
Token ID tracking for on-policy corrections

5. GRPO Train Loop Integration

Purpose: Integrate Gym rollouts into the policy optimization training loop.

Key responsibilities:

Rollout scheduling within training iterations
Loss calculation with Gym-generated experiences
Weight synchronization between training and generation

Implementation Checklist

Use this checklist to track your integration progress:

OpenAI-compatible HTTP server implemented and tested
On-policy token ID fixes implemented and tested
Gym spinup and environment connection working
Rollout orchestration handling multi-step/multi-turn scenarios
GRPO (or equivalent) train loop integration complete

Gym Rl Framework Integration Success Criteria - Validate your integration
Generation Backend And Openai Compatible Http Server - Generation backend setup

Integration Components

Component Details

1. OpenAI-Compatible HTTP Server

2. On-Policy Token ID Fixes

3. Gym Spinup and Integration

4. Rollout Orchestration

5. GRPO Train Loop Integration

Implementation Checklist

Related Topics