# NeMo Evaluator SDK Documentation
Welcome to the NeMo Evaluator SDK documentation.
## Introduction to NeMo Evaluator SDK

- Learn how NeMo Evaluator SDK works and explore its key features.
- Explore the NeMo Evaluator Core and Launcher architecture.
- Discover NeMo Evaluator SDK’s capabilities in depth.
- Master the core concepts that power NeMo Evaluator SDK.
- Review release notes for the NeMo Evaluator SDK.
## Choose a Quickstart

Select the evaluation approach that best fits your workflow and technical requirements:

- **CLI**: Orchestrate evaluations with automated container management.
- **Python API**: Get direct API access with full adapter features, custom configurations, and workflow integration.
- **Containers**: Take full control of the container environment, with volume mounting, environment variable management, and integration into Docker-based CI/CD pipelines.
## Libraries

### Launcher
Orchestrate evaluations across different execution backends with unified CLI and programmatic interfaces:

- Browse the complete configuration schema, with examples and advanced patterns.
- Run evaluations on local machines, HPC clusters (Slurm), or cloud platforms (Lepton AI).
- Export results to MLflow, Weights & Biases, Google Sheets, or local files with a single command.
- Drive evaluations programmatically from notebooks, automation, and custom workflows.
- Consult the command-line interface reference, with examples and usage patterns.
### Core

Access the core evaluation engine directly, with containerized benchmarks and a flexible adapter architecture:

- Use the evaluation engine through the Python API, containers, or programmatic workflows.
- Pull ready-to-use evaluation containers with curated benchmarks and frameworks.
- Configure request/response interceptors for logging, caching, and custom processing.
- Set up logging for evaluation runs, debugging, and audit trails.
- Add custom benchmarks and frameworks by defining configurations and interfaces.
- Consult the Python API reference for programmatic evaluation control and integration.
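The interceptor idea above can be pictured with a small, self-contained sketch. This is generic Python illustrating the pattern, not the SDK's actual API; all class and function names here are hypothetical. Each interceptor sees the outgoing request and the returning response, so cross-cutting concerns like logging stay out of the benchmark logic itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical request/response types; the real SDK defines its own.
@dataclass
class Request:
    prompt: str
    headers: Dict[str, str] = field(default_factory=dict)

@dataclass
class Response:
    text: str

class LoggingInterceptor:
    """Records every request and response that passes through the chain."""
    def __init__(self) -> None:
        self.log: List[str] = []

    def on_request(self, req: Request) -> Request:
        self.log.append(f"request: {req.prompt}")
        return req

    def on_response(self, resp: Response) -> Response:
        self.log.append(f"response: {resp.text}")
        return resp

def run_with_interceptors(
    req: Request,
    backend: Callable[[Request], Response],
    interceptors: List[LoggingInterceptor],
) -> Response:
    # Requests flow through interceptors in order; responses unwind in reverse.
    for ic in interceptors:
        req = ic.on_request(req)
    resp = backend(req)
    for ic in reversed(interceptors):
        resp = ic.on_response(resp)
    return resp

# Usage: a stub backend standing in for the model endpoint.
logger_ic = LoggingInterceptor()
result = run_with_interceptors(
    Request(prompt="2 + 2 = ?"),
    backend=lambda r: Response(text="4"),
    interceptors=[logger_ic],
)
print(result.text)    # 4
print(logger_ic.log)  # ['request: 2 + 2 = ?', 'response: 4']
```

The reverse unwinding on the response path is what lets, say, a cache interceptor placed first see the final, fully processed response last.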
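For the logging item, a plain standard-library setup conveys the shape of such a configuration (illustrative only; the SDK's logging guide documents its own knobs, and the file name here is hypothetical): console output for run progress, plus a file handler for an audit trail.

```python
import logging

# Illustrative dual-handler setup: INFO and above to the console,
# full DEBUG detail to an audit file. The file name is hypothetical.
logger = logging.getLogger("evaluator")
logger.setLevel(logging.DEBUG)

console = logging.StreamHandler()
console.setLevel(logging.INFO)
console.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(console)

audit = logging.FileHandler("evaluation_audit.log")
audit.setLevel(logging.DEBUG)
audit.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(audit)

logger.info("starting evaluation run")
logger.debug("per-request detail lands in the audit file only")
```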