Overview#

NeMo Framework is NVIDIA’s GPU-accelerated, end-to-end training framework for large language models (LLMs), multimodal models, and speech models. It enables seamless scaling of training (both pretraining and post-training) workloads from a single GPU to thousand-node clusters for both πŸ€—Hugging Face/PyTorch and Megatron models, and it includes a suite of libraries and recipe collections to help users train models from end to end. The Eval library (β€œNeMo Eval”) is a comprehensive evaluation module within NeMo Framework that provides seamless deployment and evaluation of models trained with NeMo Framework via state-of-the-art evaluation harnesses.


πŸš€ Features#

  • Multi-Backend Deployment: Support for both PyTriton and Ray Serve deployment backends.

  • Comprehensive Evaluation: State-of-the-art evaluation harnesses including reasoning benchmarks, code generation, safety testing.

  • Adapter System: Flexible adapter architecture using a chain of interceptors for customizing request/response processing.

  • Production Ready: Optimized for high-performance inference with CUDA graphs and flash decoding.

  • Multi-GPU and Multi-Node Support: Distributed inference across multiple devices and nodes.

  • OpenAI-Compatible API: RESTful endpoints compatible with OpenAI API standards.

πŸ”§ Install NeMo Eval#

Prerequisites#

  • Python 3.10 or higher

  • CUDA-compatible GPU(s) (tested on RTX A6000, A100, H100)

  • NeMo Framework container (recommended)

Use pip#

For quick exploration of NeMo Eval, we recommend installing our pip package:

pip install nemo-eval

Use Docker#

For optimal performance and user experience, use the latest version of the NeMo Framework container. Please fetch the most recent $TAG and run the following command to start a container:

docker run --rm -it -w /workdir -v $(pwd):/workdir \
  --entrypoint bash \
  --gpus all \
  nvcr.io/nvidia/nemo:${TAG}

Use uv#

To install NeMo Eval with uv, please refer to our Contribution guide.

πŸš€ Quick Start#

1. Deploy a Model#

from nemo_eval.api import deploy

# Deploy a NeMo checkpoint
deploy(
    nemo_checkpoint="/path/to/your/checkpoint",
    serving_backend="pytriton",  # or "ray"
    server_port=8080,
    num_gpus=1,
    max_input_len=4096,
    max_batch_size=8
)
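
Once the server is up, you can sanity-check the deployment by sending a request to its OpenAI-compatible completions endpoint before running an evaluation. This is a minimal sketch using the requests library; the URL and model name match the defaults used in the next step, and the payload fields follow the standard OpenAI completions schema, so adjust them to your deployment.

import requests

# Send a single completion request to the OpenAI-compatible endpoint.
# The model name ("megatron_model") is the default used throughout this guide.
response = requests.post(
    "http://0.0.0.0:8080/v1/completions/",
    json={
        "model": "megatron_model",
        "prompt": "The capital of France is",
        "max_tokens": 16,
        "temperature": 0.0,
    },
)
print(response.json())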

2. Evaluate the Model#

from nemo_eval.api import evaluate
from nemo_eval.utils.api import EvaluationTarget, EvaluationConfig, ApiEndpoint

# Configure evaluation
api_endpoint = ApiEndpoint(
    url="http://0.0.0.0:8080/v1/completions/",
    model_id="megatron_model"
)
target = EvaluationTarget(api_endpoint=api_endpoint)
config = EvaluationConfig(type="gsm8k")

# Run evaluation
results = evaluate(target_cfg=target, eval_cfg=config)
print(results)

πŸ“Š Support Matrix#

| Checkpoint Type | Inference Backend | Deployment Server | Evaluation Harnesses Supported |
|---|---|---|---|
| NeMo FW checkpoint via Megatron Core backend | Megatron Core in-framework inference engine | PyTriton (single- and multi-node model parallelism), Ray (single-node model parallelism with multi-instance evals) | lm-evaluation-harness, simple-evals, BigCode, BFCL, safety-harness, garak |

πŸ—οΈ Architecture#

Core Components#

1. Deployment Layer#

  • PyTriton Backend: Delivers high-performance inference via NVIDIA Triton Inference Server, with OpenAI API compatibility through a FastAPI interface. Supports model parallelism across both single- and multi-node setups. Note: Multi-instance evaluation is not supported.

  • Ray Backend: Enables multi-instance evaluation with model parallelism on a single node using Ray Serve, while maintaining OpenAI API compatibility. Multi-node support is coming soon.

2. Evaluation Layer#

  • NVIDIA Eval Factory: Provides standardized benchmark evaluations using packages from NVIDIA Eval Factory; bundled in the NeMo Framework container. The lm-evaluation-harness is pre-installed by default, while additional tools listed in the support matrix can be added as needed. For more information, see the docs.

  • Adapter System: Flexible request/response processing pipeline with Interceptors that provide modular processing

    • Available Interceptors: Modular components for request/response processing

      • SystemMessageInterceptor: Customize system prompts

      • RequestLoggingInterceptor: Log incoming requests

      • ResponseLoggingInterceptor: Log outgoing responses

      • ResponseReasoningInterceptor: Process reasoning outputs

      • EndpointInterceptor: Route requests to the actual model

πŸ“– Usage Examples#

Basic Deployment with PyTriton as the Serving Backend#

from nemo_eval.api import deploy

# Deploy model
deploy(
    nemo_checkpoint="/path/to/checkpoint",
    serving_backend="pytriton",
    server_port=8080,
    num_gpus=1,
    max_input_len=8192,
    max_batch_size=4
)

Basic Evaluation#

from nemo_eval.api import evaluate
from nemo_eval.utils.api import EvaluationTarget, EvaluationConfig, ApiEndpoint, ConfigParams

# Configure the endpoint
api_endpoint = ApiEndpoint(
    url="http://0.0.0.0:8080/v1/completions/",
)

# Evaluation target configuration
target = EvaluationTarget(api_endpoint=api_endpoint)

# Configure EvaluationConfig with the task type, number of samples to evaluate, etc.
config = EvaluationConfig(
    type="gsm8k",
    params=ConfigParams(limit_samples=10),
)

# Run evaluation
results = evaluate(target_cfg=target, eval_cfg=config)

Use Adapters#

The example below demonstrates how to configure an Adapter to provide a custom system prompt. Requests and responses are processed through interceptors, which are automatically selected based on the parameters defined in AdapterConfig.

from nemo_eval.utils.api import AdapterConfig

# Configure adapter for reasoning
adapter_config = AdapterConfig(
    api_url="http://0.0.0.0:8080/v1/completions/",
    use_reasoning=True,
    end_reasoning_token="</think>",
    custom_system_prompt="You are a helpful assistant that thinks step by step.",
    max_logged_requests=5,
    max_logged_responses=5
)

# Run evaluation with adapter
results = evaluate(
    target_cfg=target,
    eval_cfg=config,
    adapter_cfg=adapter_config
)
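
Broadly, each AdapterConfig field switches on one of the interceptors listed in the architecture section: custom_system_prompt enables the system-message interceptor, use_reasoning and end_reasoning_token control reasoning post-processing, and max_logged_requests/max_logged_responses bound the request and response logging interceptors.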

Deploy with Multiple GPUs#

# Deploy with tensor and/or pipeline parallelism: here the 4 GPUs are split as
# tensor_parallelism_size=4 x pipeline_parallelism_size=1
deploy(
    nemo_checkpoint="/path/to/checkpoint",
    serving_backend="pytriton",
    num_gpus=4,
    tensor_parallelism_size=4,
    pipeline_parallelism_size=1,
    max_input_len=8192,
    max_batch_size=8
)

Deploy with Ray#

# Deploy using Ray Serve
deploy(
    nemo_checkpoint="/path/to/checkpoint",
    serving_backend="ray",
    num_gpus=2,
    num_replicas=2,
    num_cpus_per_replica=8,
    server_port=8080,
    include_dashboard=True,
    cuda_visible_devices="0,1"
)
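
With num_replicas=2, Ray Serve runs two replicas of the model behind the same endpoint, which is what enables the multi-instance evaluation noted in the support matrix; cuda_visible_devices restricts the deployment to the listed GPUs.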

πŸ“ Project Structure#

Eval/
β”œβ”€β”€ src/nemo_eval/           # Main package
β”‚   β”œβ”€β”€ api.py               # Main API functions
β”‚   β”œβ”€β”€ package_info.py      # Package metadata
β”‚   β”œβ”€β”€ adapters/            # Adapter system
β”‚   β”‚   β”œβ”€β”€ server.py        # Adapter server
β”‚   β”‚   β”œβ”€β”€ utils.py         # Adapter utilities
β”‚   β”‚   └── interceptors/    # Request/response interceptors
β”‚   └── utils/               # Utility modules
β”‚       β”œβ”€β”€ api.py           # API configuration classes
β”‚       β”œβ”€β”€ base.py          # Base utilities
β”‚       └── ray_deploy.py    # Ray deployment utilities
β”œβ”€β”€ tests/                   # Test suite
β”‚   β”œβ”€β”€ unit_tests/          # Unit tests
β”‚   └── functional_tests/    # Functional tests
β”œβ”€β”€ tutorials/               # Tutorial notebooks
β”œβ”€β”€ scripts/                 # Reference nemo-run scripts
β”œβ”€β”€ docs/                    # Documentation
β”œβ”€β”€ docker/                  # Docker configuration
└── external/                # External dependencies

🀝 Contributing#

We welcome contributions! Please see our Contributing Guide for details on development setup, testing, and code style guidelines.

πŸ“„ License#

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

πŸ“ž Support#