# NeMo Evaluator
The Core Evaluation Engine delivers standardized, reproducible AI model evaluation through containerized benchmarks and a flexible adapter architecture.
> **Tip:** Need orchestration? For CLI workflows and multi-backend execution, use the NeMo Evaluator Launcher.
## Get Started
Run evaluations using pre-built containers directly or integrate them through the Python API.
- Ready-to-use evaluation containers with curated benchmarks and frameworks.
## Reference and Customization
- Set up interceptors to handle requests, responses, logging, caching, and custom processing.
- Configure comprehensive logging for evaluation runs, debugging, and audit trails.
- Add custom benchmarks and frameworks by defining their configuration and interfaces.
- Control and integrate evaluations programmatically through the Python API.
- Run containers and evaluations directly from the command line.
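To illustrate the interceptor idea above, here is a minimal sketch of a request/response interceptor chain in plain Python. All names (`Request`, `Response`, `logging_interceptor`, `caching_interceptor`, `build_chain`) are hypothetical stand-ins for illustration, not the actual NeMo Evaluator adapter API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical request/response types; the real adapter types differ.
@dataclass
class Request:
    payload: dict
    headers: dict = field(default_factory=dict)

@dataclass
class Response:
    payload: dict
    cached: bool = False

# An interceptor wraps the rest of the chain: it sees the request on the
# way in and may inspect or rewrite the response on the way out.
Handler = Callable[[Request], Response]

def logging_interceptor(next_handler: Handler) -> Handler:
    def handle(req: Request) -> Response:
        print(f"request: {req.payload}")
        resp = next_handler(req)
        print(f"response: {resp.payload}")
        return resp
    return handle

def caching_interceptor(next_handler: Handler) -> Handler:
    cache: dict = {}
    def handle(req: Request) -> Response:
        key = str(sorted(req.payload.items()))
        if key in cache:
            hit = cache[key]
            return Response(payload=hit.payload, cached=True)
        resp = next_handler(req)
        cache[key] = resp
        return resp
    return handle

def build_chain(endpoint: Handler, interceptors: List) -> Handler:
    # Wrap the endpoint so the first interceptor listed runs outermost.
    handler = endpoint
    for interceptor in reversed(interceptors):
        handler = interceptor(handler)
    return handler

# A stand-in model endpoint that just echoes the request payload.
def echo_endpoint(req: Request) -> Response:
    return Response(payload={"echo": req.payload})

chain = build_chain(echo_endpoint, [logging_interceptor, caching_interceptor])
first = chain(Request(payload={"prompt": "2+2?"}))
second = chain(Request(payload={"prompt": "2+2?"}))
# The second identical request is served from the cache interceptor.
print(first.cached, second.cached)  # → False True
```

The ordering convention here (first interceptor listed runs outermost) is a design choice of this sketch; consult the adapter documentation for how the real chain is composed.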