Local Executor#

The Local executor runs evaluations on your machine using Docker. It provides a fast way to iterate if you have Docker installed, evaluating existing endpoints.

See common concepts and commands in Executors.

Prerequisites#

Docker
Python environment with the NeMo Evaluator Launcher CLI available (install the launcher by following NeMo Evaluator Launcher Quickstart)

Quick Start#

For detailed step-by-step instructions on evaluating existing endpoints, refer to the NeMo Evaluator Launcher Quickstart guide, which covers:

Choosing models and tasks
Setting up API keys (for NVIDIA APIs, see Setting up API Keys)
Creating configuration files
Running evaluations

Here’s a quick overview for the Local executor:

Run evaluation for existing endpoint#

# Run evaluation
nemo-evaluator-launcher run --config-dir examples --config-name local_llama_3_1_8b_instruct \
  -o target.api_endpoint.api_key_name=API_KEY

Environment Variables#

The Local executor supports passing environment variables from your local machine to evaluation containers:

How It Works#

The executor passes environment variables to Docker containers using docker run -e KEY=VALUE flags. The executor automatically adds $ to your variable names from the configuration env_vars (for example, OPENAI_API_KEY becomes $OPENAI_API_KEY).

Configuration#

evaluation:
  env_vars:
    API_KEY: YOUR_API_KEY_ENV_VAR_NAME
    CUSTOM_VAR: YOUR_CUSTOM_ENV_VAR_NAME
  tasks:
    - name: my_task
      env_vars:
        TASK_SPECIFIC_VAR: TASK_ENV_VAR_NAME

Secrets and API Keys#

The executor handles API keys the same way as environment variables - store them as environment variables on your machine and reference them in the env_vars configuration.

Mounting and Storage#

The Local executor uses Docker volume mounts for data persistence:

Docker Volumes#

Results Mount: Each task’s artifacts directory mounts as /results in evaluation containers
No Custom Mounts: Local executor doesn’t support custom volume mounts

Rerunning Evaluations#

The Local executor generates reusable scripts for rerunning evaluations:

Script Generation#

The Local executor automatically generates scripts:

run_all.sequential.sh: Script to run all evaluation tasks sequentially (in output directory)
run.sh: Individual scripts for each task (in each task subdirectory)
Reproducible: Scripts contain all necessary commands and configurations

Manual Rerun#

# Rerun all tasks
cd /path/to/output_dir/2024-01-15-10-30-45-abc12345/
bash run_all.sequential.sh

# Rerun specific task
cd /path/to/output_dir/2024-01-15-10-30-45-abc12345/task1/
bash run.sh

Key Features#

Docker-based execution: Isolated, reproducible runs
OpenAI-compatible endpoint support: Works with any OpenAI-compatible endpoint
Script generation: Reusable scripts for rerunning evaluations
Real-time logs: Status tracking via log files

Monitoring and Job Management#

For monitoring jobs, checking status, and managing evaluations, see Executors.