CLI Workflows#
This document explains how to use evaluation containers within NeMo Evaluator workflows, focusing on command execution and configuration.
Overview#
Evaluation containers provide consistent, reproducible environments for running AI model evaluations. For a comprehensive list of all available containers, refer to NeMo Evaluator Containers.
Basic CLI#
Using YAML Configuration#
Define your config:
```yaml
config:
  type: mmlu_pro
  output_dir: /workspace/results
  params:
    limit_samples: 10
target:
  api_endpoint:
    url: https://integrate.api.nvidia.com/v1/chat/completions
    model_id: meta/llama-3.1-8b-instruct
    type: chat
    api_key: NGC_API_KEY
```
Run evaluation:
```bash
export HF_TOKEN=hf_xxx
export NGC_API_KEY=nvapi-xxx
nemo-evaluator run_eval \
  --run_config /workspace/my_config.yml
```
Using CLI overrides#
Provide all arguments through CLI:
```bash
export HF_TOKEN=hf_xxx
export NGC_API_KEY=nvapi-xxx
nemo-evaluator run_eval \
  --eval_type mmlu_pro \
  --model_id meta/llama-3.1-8b-instruct \
  --model_url https://integrate.api.nvidia.com/v1/chat/completions \
  --model_type chat \
  --api_key_name NGC_API_KEY \
  --output_dir /workspace/results \
  --overrides 'config.params.limit_samples=10'
```
Interceptor Configuration#
The adapter system uses interceptors to modify requests and responses. Configure interceptors using the `--overrides` parameter.
For detailed interceptor configuration, refer to Interceptors.
Note
Always include the `endpoint` interceptor at the end of your custom interceptor chain.
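Conceptually, each interceptor wraps the next one: it may transform the outgoing request, pass it along, then transform the response on the way back, with the `endpoint` interceptor performing the actual upstream call. The following Python sketch illustrates this chaining pattern only; it is not the NeMo Evaluator API.

```python
from typing import Callable

# A handler takes a request dict and returns a response dict.
Handler = Callable[[dict], dict]

def logging_interceptor(next_handler: Handler) -> Handler:
    """Illustrative interceptor: log the request and response, then pass through."""
    def handle(request: dict) -> dict:
        print(f"request:  {request}")
        response = next_handler(request)
        print(f"response: {response}")
        return response
    return handle

def endpoint(request: dict) -> dict:
    """Stand-in for the terminal interceptor that calls the real model endpoint."""
    return {"echo": request}

# Build the chain: logging wraps the endpoint, which terminates the chain.
chain = logging_interceptor(endpoint)
```

Because `endpoint` is what actually produces a response, any chain that omits it has nothing to call at the end, which is why it must always terminate the list.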
Enable Request Logging#
```yaml
config:
  type: mmlu_pro
  output_dir: /workspace/results
  params:
    limit_samples: 10
target:
  api_endpoint:
    url: https://integrate.api.nvidia.com/v1/chat/completions
    model_id: meta/llama-3.1-8b-instruct
    type: chat
    api_key: NGC_API_KEY
    adapter_config:
      interceptors:
        - name: "request_logging"
          enabled: true
          config:
            max_requests: 1000
        - name: "endpoint"
          enabled: true
          config: {}
```
```bash
export HF_TOKEN=hf_xxx
export NGC_API_KEY=nvapi-xxx
nemo-evaluator run_eval \
  --run_config /workspace/my_config.yml
```
Enable Caching#
```yaml
config:
  type: mmlu_pro
  output_dir: /workspace/results
  params:
    limit_samples: 10
target:
  api_endpoint:
    url: https://integrate.api.nvidia.com/v1/chat/completions
    model_id: meta/llama-3.1-8b-instruct
    type: chat
    api_key: NGC_API_KEY
    adapter_config:
      interceptors:
        - name: "caching"
          enabled: true
          config:
            cache_dir: "./evaluation_cache"
            reuse_cached_responses: true
            save_requests: true
            save_responses: true
            max_saved_requests: 1000
            max_saved_responses: 1000
        - name: "endpoint"
          enabled: true
          config: {}
```
```bash
export HF_TOKEN=hf_xxx
export NGC_API_KEY=nvapi-xxx
nemo-evaluator run_eval \
  --run_config /workspace/my_config.yml
```
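Caching of this kind generally keys each request by a stable hash of its contents, so a repeated request (with `reuse_cached_responses: true`) can be answered from the cache instead of the endpoint. A minimal Python sketch of the idea, purely illustrative and not the interceptor's actual implementation:

```python
import hashlib
import json

def cache_key(request: dict) -> str:
    # Canonical JSON (sorted keys) makes the hash independent of dict key order.
    canonical = json.dumps(request, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def cached_call(request: dict, cache: dict, call):
    # Reuse the saved response when an identical request was seen before.
    key = cache_key(request)
    if key not in cache:
        cache[key] = call(request)
    return cache[key]
```

With `limit_samples` runs repeated during debugging, this is why a warm cache makes reruns much faster: identical requests never reach the model a second time.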
Multiple Interceptors#
```yaml
config:
  type: mmlu_pro
  output_dir: /workspace/results
  params:
    limit_samples: 10
target:
  api_endpoint:
    url: https://integrate.api.nvidia.com/v1/chat/completions
    model_id: meta/llama-3.1-8b-instruct
    type: chat
    api_key: NGC_API_KEY
    adapter_config:
      interceptors:
        - name: "caching"
          enabled: true
          config:
            cache_dir: "./evaluation_cache"
            reuse_cached_responses: true
            save_requests: true
            save_responses: true
            max_saved_requests: 1000
            max_saved_responses: 1000
        - name: "request_logging"
          enabled: true
          config:
            max_requests: 1000
        - name: "reasoning"
          config:
            start_reasoning_token: "<think>"
            end_reasoning_token: "</think>"
            add_reasoning: true
            enable_reasoning_tracking: true
        - name: "endpoint"
          enabled: true
          config: {}
```
```bash
export HF_TOKEN=hf_xxx
export NGC_API_KEY=nvapi-xxx
nemo-evaluator run_eval \
  --run_config /workspace/my_config.yml
```
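The `start_reasoning_token` and `end_reasoning_token` settings above delimit the model's hidden reasoning in its output. A minimal Python sketch of stripping such a span from a response, illustrative only (the real interceptor also handles reasoning tracking):

```python
import re

def strip_reasoning(text: str,
                    start: str = "<think>",
                    end: str = "</think>") -> str:
    # Remove everything between the reasoning delimiters, tokens included,
    # so only the final answer remains for scoring.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    return re.sub(pattern, "", text, flags=re.DOTALL).strip()
```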
Legacy Configuration Support#
Provide interceptor configuration with the `--overrides` flag:
```bash
nemo-evaluator run_eval \
  --eval_type mmlu_pro \
  --model_id meta/llama-3.1-8b-instruct \
  --model_url https://integrate.api.nvidia.com/v1/chat/completions \
  --model_type chat \
  --api_key_name NGC_API_KEY \
  --output_dir ./results \
  --overrides 'target.api_endpoint.adapter_config.use_request_logging=True,target.api_endpoint.adapter_config.max_saved_requests=1000,target.api_endpoint.adapter_config.use_caching=True,target.api_endpoint.adapter_config.caching_dir=./cache,target.api_endpoint.adapter_config.reuse_cached_responses=True'
```
Note
Legacy parameters are automatically converted to the modern interceptor-based configuration. For new projects, use the YAML interceptor configuration shown above.
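The conversion can be pictured as mapping each legacy flag onto its corresponding interceptor entry. The helper below is hypothetical and only illustrates the mapping implied by the legacy keys shown above; the real conversion happens inside NeMo Evaluator.

```python
def legacy_to_interceptors(flags: dict) -> list:
    """Hypothetical sketch: map legacy adapter_config flags to interceptor entries."""
    interceptors = []
    if flags.get("use_caching"):
        interceptors.append({
            "name": "caching",
            "enabled": True,
            "config": {
                "cache_dir": flags.get("caching_dir", "./cache"),
                "reuse_cached_responses": flags.get("reuse_cached_responses", False),
            },
        })
    if flags.get("use_request_logging"):
        interceptors.append({
            "name": "request_logging",
            "enabled": True,
            "config": {"max_requests": flags.get("max_saved_requests", 1000)},
        })
    # The endpoint interceptor always terminates the chain.
    interceptors.append({"name": "endpoint", "enabled": True, "config": {}})
    return interceptors
```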
Troubleshooting#
Port Conflicts#
If you manually specify the adapter server port, you may encounter port conflicts. Try selecting a different port:
```bash
export ADAPTER_PORT=3828
export ADAPTER_HOST=localhost
```
Note
You can also rely on NeMo Evaluator’s dynamic port binding feature.
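If you want to pick a free port yourself before setting `ADAPTER_PORT`, the standard trick is to bind to port 0 and let the operating system assign an unused one. A small Python sketch:

```python
import socket

def find_free_port(host: str = "localhost") -> int:
    # Binding to port 0 asks the OS for any unused port; report which one it chose.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]
```

Note that the port is released when the socket closes, so in principle another process could claim it before the adapter server starts; dynamic port binding avoids that race entirely.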
API Key Issues#
Verify your API key environment variable:
```bash
echo $MY_API_KEY
```
Environment Variables#
Adapter Server Configuration#
```bash
export ADAPTER_PORT=3828    # Default: 3825
export ADAPTER_HOST=localhost
```
API Key Management#
```bash
export MY_API_KEY=your_api_key_here
export HF_TOKEN=your_hf_token_here
```