Success Criteria
Use these criteria to validate that your Gym integration is working correctly. A successful integration satisfies every item in the validation checklist, including both training benchmarks.
These success criteria may evolve as new integration challenges are discovered. Check this page for updates when troubleshooting integration issues.
Validation Checklist
1. Component Form Factor
Verify that your integration implements all required components as specified in Gym Integration Footprint And Form Factor:
- OpenAI-compatible HTTP server
- On-policy token ID fixes
- Gym spinup and integration
- Rollout orchestration
- Training loop integration
2. Environment Configuration
Verify that your integration can load and run arbitrary Gym training environments through configuration:
- Environment configuration loads from YAML
- Multiple environments can be selected at runtime
- Environment parameters are configurable without code changes
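The configuration checks above can be exercised with a small script. The following is a minimal sketch, not the Gym configuration schema: the `environments` key, the environment names, and the `max_turns` parameter are hypothetical placeholders, and `load_config` assumes the PyYAML package is installed.

```python
from typing import Any


def load_config(path: str) -> dict[str, Any]:
    """Parse a YAML file into a dict (requires the PyYAML package)."""
    import yaml  # imported lazily so the selection logic has no extra dependency

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def select_environments(
    config: dict[str, Any], names: list[str]
) -> dict[str, dict[str, Any]]:
    """Return the parameter dicts for the requested environments.

    Assumes a hypothetical layout: {"environments": {name: {param: value}}}.
    """
    available = config.get("environments", {})
    missing = [name for name in names if name not in available]
    if missing:
        raise KeyError(
            f"unknown environments: {missing}; available: {sorted(available)}"
        )
    # Copy each parameter dict so callers cannot mutate the loaded config.
    return {name: dict(available[name]) for name in names}


# In-memory stand-in for a parsed YAML file (all names are illustrative).
example_config = {
    "environments": {
        "math": {"max_turns": 1},
        "workplace_assistant": {"max_turns": 8},
    }
}
selected = select_environments(example_config, ["math", "workplace_assistant"])
```

If the names passed to `select_environments` come from a command-line flag or a launcher setting, switching environments requires no code change, which is the property this checklist item asks for.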
3. Math Reasoning Benchmark
Train on the DAPO17k math training environment and verify that the trained model scores higher on AIME24 than it did before training.
4. Workplace Assistant Benchmark
Train on the workplace assistant environment and verify that performance on its validation set improves.
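Both benchmark criteria come down to comparing a score before and after training. The sketch below shows one way to make that comparison explicit, assuming per-example pass/fail results are available as lists of booleans. The function names and the `min_gain` threshold are hypothetical; this page does not specify a required margin.

```python
from statistics import mean


def accuracy(results: list[bool]) -> float:
    """Fraction of examples answered correctly."""
    if not results:
        raise ValueError("no results to score")
    return mean(1.0 if ok else 0.0 for ok in results)


def has_improved(
    baseline: list[bool], trained: list[bool], min_gain: float = 0.0
) -> bool:
    """True if the trained model beats the baseline by more than min_gain.

    min_gain is a hypothetical margin; choose it based on evaluation noise.
    """
    return accuracy(trained) - accuracy(baseline) > min_gain


# Illustrative results only, not measurements from a real run.
baseline_results = [True, False, False, False]
trained_results = [True, True, False, False]
improved = has_improved(baseline_results, trained_results)
```

On small evaluation sets, a single extra correct answer changes the score noticeably, so averaging over several samples per problem or setting a non-zero `min_gain` may give a more reliable signal.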
Troubleshooting
If your integration fails to meet the success criteria:
- Training crashes: Check for off-policy issues; refer to On-Policy Corrections
- No improvement: Verify rollout orchestration is correctly tracking token IDs
- Environment errors: Verify OpenAI-compatible HTTP server endpoints match the specification
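For the "No improvement" case, one way to check token ID tracking is to compare the token IDs produced during generation with the token IDs the training step consumes for the same rollout. The following is a minimal sketch under the assumption that both sequences can be dumped as lists of integers; `first_token_mismatch` is a hypothetical helper, not part of any Gym API.

```python
from typing import Optional, Sequence


def first_token_mismatch(
    generated: Sequence[int], trained: Sequence[int]
) -> Optional[int]:
    """Return the index of the first differing token ID, or None if identical."""
    for index, (gen_id, train_id) in enumerate(zip(generated, trained)):
        if gen_id != train_id:
            return index
    # One sequence is a strict prefix of the other: report where it ends.
    if len(generated) != len(trained):
        return min(len(generated), len(trained))
    return None


# Illustrative token IDs only.
same = first_token_mismatch([101, 7, 42], [101, 7, 42])
drift = first_token_mismatch([101, 7, 42], [101, 9, 42])
truncated = first_token_mismatch([101, 7], [101, 7, 42])
```

A non-`None` result means the trainer is not seeing exactly the tokens the policy sampled, which is the kind of discrepancy On-Policy Corrections covers.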
Related Topics
- Gym Integration Footprint And Form Factor - Required integration components
- On-Policy Corrections - On-policy training fixes