For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
DocumentationAPI Reference
DocumentationAPI Reference
  • Documentation
    • Home
  • About
    • Concepts
    • Ecosystem
  • Get Started
    • Quickstart
    • Detailed Setup Guide
    • Install from PyPI
    • Rollout Collection
  • Agent Server
  • Model Server
    • vLLM
  • Resources Server
  • Data
    • Prepare and Validate
    • Download from Hugging Face
    • Prompt Config
  • Environment Tutorials
    • Single-Step Environment
    • Multi-Step Environment
    • Stateful Environment
    • Real-World Environment
    • Integrate external libraries
    • Aggregate Metrics
    • LLM-as-Judge Verification
  • Benchmarks
    • Run benchmarks
    • Add a benchmark
    • Design a customer evaluation
  • Training Tutorials
    • NeMo RL
    • Unsloth
    • Multi-Environment Training
    • Offline Training (SFT/DPO)
  • Model Recipes
    • Nemotron 3 Nano
    • Nemotron 3 Super
  • Infrastructure
    • Deployment Topology
    • Engineering Notes
  • Reference
    • Configuration
    • RL Framework Compatibility
    • CLI Commands
    • FAQ
  • Troubleshooting
    • Configuration Errors
  • Contribute
    • Development Setup
    • Environments
    • Integrate RL Frameworks
      • Generation Backend
      • On-Policy Corrections
      • Integration Footprint
      • Success Criteria
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Manage My Privacy | Do Not Sell or Share My Data | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogoNeMo Gym
On this page
  • Integration Components
  • Component Details
  • 1. OpenAI-Compatible HTTP Server
  • 2. On-Policy Token ID Fixes
  • 3. Gym Spinup and Integration
  • 4. Rollout Orchestration
  • 5. GRPO Train Loop Integration
  • Implementation Checklist
  • Related Topics
ContributeIntegrate RL Frameworks

Integration Footprint

||View as Markdown|
Previous

On-Policy Corrections

Next

Success Criteria

This page provides a reference for the components required to integrate Gym into your training framework. Each component includes links to the NeMo RL reference implementation and corresponding tests.

Integration Components

A complete Gym integration consists of five components, implemented in sequence:

ComponentImplementationTests
1OpenAI-Compatible HTTP Servervllm_worker_async.py:264test_vllm_generation.py:1107
2On-Policy Token ID Fixesvllm_worker_async.py:40test_vllm_generation.py:1250
3Gym Spinup and Integrationnemo_gym.pytest_nemo_gym.py
4Rollout Orchestrationrollouts.py:975test_rollouts.py:754
5GRPO Train Loop Integrationgrpo.py:1157End-to-end tests in progress

As of December 8, 2025, end-to-end tests for GRPO train loop integration are still being implemented in the NeMo RL repository.

Component Details

1. OpenAI-Compatible HTTP Server

Purpose: Expose your generation backend as an OpenAI-compatible endpoint.

Prerequisites: vLLM or SGLang generation backend.

Reference: Refer to Generation Backend And Openai Compatible Http Server for implementation guidance.

2. On-Policy Token ID Fixes

Purpose: Prevent train-generation mismatch in multi-step and multi-turn scenarios.

Prerequisites: OpenAI-compatible HTTP server.

Reference: Refer to On-Policy Corrections for technical details.

3. Gym Spinup and Integration

Purpose: Initialize and connect to Gym training environments.

Key responsibilities:

  • Environment configuration loading
  • Connection management
  • State synchronization

4. Rollout Orchestration

Purpose: Coordinate rollout collection between the policy and Gym environments.

Key responsibilities:

  • Batch rollout management
  • Multi-step and multi-turn handling
  • Token ID tracking for on-policy corrections

5. GRPO Train Loop Integration

Purpose: Integrate Gym rollouts into the policy optimization training loop.

Key responsibilities:

  • Rollout scheduling within training iterations
  • Loss calculation with Gym-generated experiences
  • Weight synchronization between training and generation

Implementation Checklist

Use this checklist to track your integration progress:

  • OpenAI-compatible HTTP server implemented and tested
  • On-policy token ID fixes implemented and tested
  • Gym spinup and environment connection working
  • Rollout orchestration handling multi-step/multi-turn scenarios
  • GRPO (or equivalent) train loop integration complete

Related Topics

  • Gym Rl Framework Integration Success Criteria - Validate your integration
  • Generation Backend And Openai Compatible Http Server - Generation backend setup