NeMo Gym is a library for evaluating and improving models and agents using environments. It provides infrastructure for developing environments and for running evaluation and training at scale, along with a collection of popular benchmarks and training environments.
If you are scoring model outputs with a stateless check and do not need scale or training, a script is probably sufficient.
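
For that simple case, a standalone script along the following lines may be all that is needed. This is a minimal sketch and not part of NeMo Gym: the function names `exact_match` and `score_outputs` and the sample data are hypothetical, and normalized exact match is only one example of a stateless check.

```python
# Hypothetical standalone scorer; it does not use NeMo Gym.

def exact_match(output: str, expected: str) -> bool:
    """Stateless check: compare the two strings after trimming and lowercasing."""
    return output.strip().lower() == expected.strip().lower()


def score_outputs(pairs):
    """Return the fraction of (output, expected) pairs that pass the check."""
    if not pairs:
        return 0.0
    passed = sum(exact_match(out, exp) for out, exp in pairs)
    return passed / len(pairs)


if __name__ == "__main__":
    # Illustrative data only: (model output, expected answer) pairs.
    samples = [("Paris", "paris"), (" 42 ", "42"), ("blue", "green")]
    print(f"accuracy: {score_outputs(samples):.2f}")
```

Each pair is scored independently, with no state carried between samples, which is why this kind of check does not need environment infrastructure.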

NeMo Gym integrates with the broader agentic ecosystem.
Install NeMo Gym and run your first evaluation.
Browse available environments for evaluation and training.
Explore available agent harnesses and learn how to integrate your own agent.
Improve your agent or model with RL or fine-tuning.
Create your own evaluation or training environments.