NeMo Evaluator Launcher#
The NeMo Evaluator Launcher is the orchestration layer for running AI model evaluations at scale. Its unified CLI and programmatic interfaces let you discover benchmarks, configure runs, submit jobs, monitor progress, and export results.
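For instance, discovering the available benchmarks from the command line might look like the following. This is a minimal sketch assuming the `nemo-evaluator-launcher` package and its `ls tasks` subcommand as described in the project README; verify the exact command names with `--help` in your installed version.

```bash
# Install the launcher (package name per the project README).
pip install nemo-evaluator-launcher

# Discover the benchmarks available to run.
nemo-evaluator-launcher ls tasks
```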
Tip
New to evaluation? Start with the Get Started guide below for a step-by-step walkthrough.
Get Started#
- Quickstart: step-by-step guide to install, configure, and run your first evaluation in minutes.
- Configuration: complete configuration schema, examples, and advanced patterns for all use cases.
Execution#
Execute evaluations on your local machine, an HPC cluster (Slurm), or a cloud platform (Lepton AI):
- Local: Docker-based evaluation on your workstation, well suited for development and testing.
- Slurm: HPC cluster execution with automatic resource management and job scheduling.
- Lepton AI: cloud execution with on-demand GPU provisioning and automatic scaling.
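The backend is selected by the run configuration rather than by separate tools. A hedged sketch, assuming Hydra-style example configs shipped in the repository's examples directory; the config names below are hypothetical placeholders:

```bash
# Same CLI, different backend: swap the example configuration.
# Config names here are placeholders; list the real ones in the
# repository's examples directory.
nemo-evaluator-launcher run --config-dir examples --config-name local_basic
nemo-evaluator-launcher run --config-dir examples --config-name slurm_basic
nemo-evaluator-launcher run --config-dir examples --config-name lepton_basic
```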
Export#
Export results to MLflow, Weights & Biases, Google Sheets, or local files with one command:
- MLflow: export evaluation results and metrics for experiment tracking.
- Weights & Biases: integrate for advanced visualization and collaboration.
- Google Sheets: share and analyze results easily with stakeholders.
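In CLI terms, each destination is a single command. A sketch assuming an `export` subcommand with a `--dest` flag, as suggested by the project README; the destination names are assumptions, so check `export --help` for the supported values:

```bash
# Export a finished run; <invocation_id> is printed when the run starts.
nemo-evaluator-launcher export <invocation_id> --dest mlflow   # MLflow tracking
nemo-evaluator-launcher export <invocation_id> --dest wandb    # Weights & Biases
nemo-evaluator-launcher export <invocation_id> --dest gsheets  # Google Sheets
```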
References#
- Python API: programmatic access for notebooks, automation, and custom evaluation workflows.
- CLI reference: complete command-line interface documentation with examples and usage patterns.
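The CLI surface is also discoverable from the tool itself, assuming standard help output:

```bash
# Top-level help lists the available subcommands.
nemo-evaluator-launcher --help

# Per-subcommand help documents flags and configuration overrides.
nemo-evaluator-launcher run --help
```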
Typical Workflow#
1. Choose an execution backend (local, Slurm, or Lepton AI).
2. Select an example configuration from the examples directory.
3. Point it at your model endpoint (OpenAI-compatible API).
4. Launch the evaluation via the CLI or Python API, as in the sketch after this list.
5. Monitor progress and export results to your preferred platform.
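Put together, the five steps might look like this on the local backend. This is a sketch rather than the definitive invocation: the config name is a hypothetical placeholder, and the `-o` dotted-override syntax and `target.api_endpoint.*` keys are assumptions based on the Hydra-style configs in the examples directory.

```bash
# Steps 1-4: launch against an OpenAI-compatible endpoint, starting
# from a shipped example config (name below is a placeholder).
nemo-evaluator-launcher run \
  --config-dir examples \
  --config-name local_basic \
  -o target.api_endpoint.url=https://your-host/v1/chat/completions \
  -o target.api_endpoint.model_id=your-model-id

# Step 5: monitor, then export (the invocation id is printed by `run`).
nemo-evaluator-launcher status <invocation_id>
nemo-evaluator-launcher export <invocation_id> --dest mlflow
```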
When to Use the Launcher#
Use the launcher whenever you want:
- A unified interface for running evaluations across different backends
- Multi-benchmark coordination with concurrent execution
- Turnkey reproducibility with saved configurations
- Easy result export to MLOps platforms and dashboards
- Production-ready orchestration with monitoring and lifecycle management