Success Criteria

Use these criteria to validate that your Gym integration is working correctly. A successful integration must pass all validation benchmarks.

Tip

These success criteria may evolve as new integration challenges are discovered. Check this page for updates when troubleshooting integration issues.

Validation Checklist

Verify that your integration implements all required components as specified in Integration Footprint.

Verify that your integration can load and run arbitrary Gym training environments through configuration.

Train on the DAPO17k math training environment and verify model improvement on AIME24.

Parameter	Value
Training environment	DAPO17k math environment
Base model	Qwen3-4B-Instruct-2507
Minimum training steps	1,000
Validation set	AIME24 (included with training environment)
Target accuracy	≥85%
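The benchmark in the table above could be encoded as an automated check along the following lines. This is a minimal sketch, not part of Gym: the names `MathBenchmarkResult` and `passes_math_benchmark` are hypothetical, accuracy is assumed to be reported as a fraction between 0 and 1, and "minimum training steps" is read as "train for at least this many steps".

```python
from dataclasses import dataclass

# Thresholds copied from the table above.
MIN_TRAINING_STEPS = 1_000
TARGET_ACCURACY = 0.85  # 85% on AIME24


@dataclass
class MathBenchmarkResult:
    """Hypothetical container for the outcome of one training run."""
    steps_completed: int
    aime24_accuracy: float  # assumed to be a fraction in [0, 1]


def passes_math_benchmark(result: MathBenchmarkResult) -> bool:
    """Return True if the run trained long enough and met the target accuracy."""
    if not 0.0 <= result.aime24_accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction in [0, 1]")
    return (
        result.steps_completed >= MIN_TRAINING_STEPS
        and result.aime24_accuracy >= TARGET_ACCURACY
    )
```

A run would be checked with, for example, `passes_math_benchmark(MathBenchmarkResult(steps_completed=1000, aime24_accuracy=0.86))`.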

Train on the workplace assistant environment and verify validation set improvements.

Parameter	Value
Training environment	Workplace assistant environment
Base model	Qwen3-4B-Instruct-2507
Minimum training steps	100
Success criterion	Observable validation set improvement
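"Observable validation set improvement" is not quantified here, so any automated check has to pick its own definition. One possible sketch compares the mean of the first few validation scores with the mean of the last few. The function name `validation_improved`, the `window` size, and the `min_gain` threshold are all assumptions made for illustration, not values defined by Gym.

```python
from statistics import mean


def validation_improved(scores, window=3, min_gain=0.0):
    """Return True if the late validation scores beat the early ones.

    `scores` is a chronological list of validation metrics (higher is better).
    The comparison averages `window` scores at each end to reduce noise.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(scores) < 2 * window:
        raise ValueError("need at least 2 * window validation scores")
    early = mean(scores[:window])
    late = mean(scores[-window:])
    return late - early > min_gain
```

Averaging over a window rather than comparing single points makes the check less sensitive to step-to-step noise; a larger `min_gain` makes it stricter.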

If your integration fails to meet the success criteria:

Training crashes: Check for off-policy issues. Refer to On-Policy Corrections.
No improvement: Verify rollout orchestration is correctly tracking token IDs
Environment errors: Verify OpenAI-compatible HTTP server endpoints match the specification
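For the token ID item above, one possible sanity check is sketched below. It assumes a multi-turn rollout in which each turn's prompt should begin with the exact token IDs of the previous turn's prompt plus its generated tokens. The field names `prompt_token_ids` and `generation_token_ids` and the function `find_token_drift` are hypothetical; adapt them to however your orchestration records rollouts.

```python
def find_token_drift(turns):
    """Return the index of the first turn whose prompt does not extend the
    previous turn's tokens, or None if every turn is consistent.

    Each turn is a dict with two lists of ints (hypothetical field names):
    "prompt_token_ids" and "generation_token_ids".
    """
    for i in range(1, len(turns)):
        prev = turns[i - 1]
        # Tokens the model actually saw and produced in the previous turn.
        expected_prefix = prev["prompt_token_ids"] + prev["generation_token_ids"]
        actual = turns[i]["prompt_token_ids"]
        if actual[: len(expected_prefix)] != expected_prefix:
            return i
    return None
```

A non-`None` result suggests that token IDs were re-derived (for example by re-tokenizing text) rather than carried forward, which is worth investigating before looking elsewhere.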