Troubleshooting

This page maps common eval/model_eval failures to the config fields that usually need correction. Nemotron builds a launcher config and calls NeMo Evaluator Launcher; task execution, endpoint checks, and result writing are owned by the launcher.

Evaluator Extra Missing

Symptom:

Error: nemo-evaluator-launcher is required for evaluation
Install with: uv sync --extra evaluator

Recovery:

uv sync --extra evaluator

Then rerun the same nemotron steps run eval/model_eval command under uv run --no-sync, so that uv does not re-sync the environment before running it.
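Before rerunning, it can help to confirm that the extra actually landed in the active environment. The sketch below is a minimal check and not part of Nemotron: the helper name evaluator_available is hypothetical, and the import name nemo_evaluator_launcher is an assumption derived from the package name in the error message.

```python
import importlib.util


def evaluator_available(module_name: str = "nemo_evaluator_launcher") -> bool:
    """Return True if the named top-level module can be found in this environment.

    The default module name is an assumption derived from the package name
    nemo-evaluator-launcher; adjust it if the real import name differs.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if evaluator_available():
        print("evaluator launcher found")
    else:
        print("evaluator launcher missing; run: uv sync --extra evaluator")
```

Run the check inside the same environment as the step (for example under uv run --no-sync); otherwise it inspects a different interpreter.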

Hosted Endpoint Fails

Most hosted failures come from one of these fields:

| Field | What To Check |
| --- | --- |
| target.api_endpoint.url | Full endpoint URL, including /v1/chat/completions or /v1/completions. |
| target.api_endpoint.model_id | Exact model id returned by the endpoint’s models API or UI. |
| target.api_endpoint.api_key_name | Environment variable name, not the secret value. |
| target.api_endpoint.type | chat for chat tasks, completions for completions/logprob tasks. |

For hosted smoke tests, start with tiny_chat.yaml and target.api_endpoint.type=chat.
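The field checks above can be run as a quick pre-flight before invoking the step. The following is a minimal sketch, not launcher code: check_api_endpoint is a hypothetical helper, the dictionary keys mirror the target.api_endpoint field names, and pairing each type with one URL suffix is an assumption drawn from the descriptions above.

```python
import re
from typing import Mapping

# URL suffix assumed for each endpoint type (an assumption of this sketch).
SUFFIX_BY_TYPE = {
    "chat": "/v1/chat/completions",
    "completions": "/v1/completions",
}

# Conventional shape of an environment variable name, used here to catch
# a secret value pasted where a variable name belongs.
ENV_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_api_endpoint(endpoint: Mapping[str, str], env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    url = endpoint.get("url", "")
    kind = endpoint.get("type", "")
    key_name = endpoint.get("api_key_name", "")

    if kind not in SUFFIX_BY_TYPE:
        problems.append(f"type should be 'chat' or 'completions', got {kind!r}")
    elif not url.rstrip("/").endswith(SUFFIX_BY_TYPE[kind]):
        problems.append(f"url should end with {SUFFIX_BY_TYPE[kind]} for type={kind}")

    if not endpoint.get("model_id"):
        problems.append("model_id is empty")

    # Only inspect the key name when one is configured.
    if key_name:
        if not ENV_NAME.match(key_name):
            problems.append("api_key_name does not look like an environment variable name")
        elif key_name not in env:
            problems.append(f"environment variable {key_name} is not set")

    return problems
```

In practice, pass os.environ as env. The check cannot confirm that model_id matches what the endpoint serves; that still has to be compared against the endpoint's models API or UI.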

Wrong Task For Endpoint Type

Chat tasks need a chat endpoint. Log-probability tasks generally need a completions endpoint with logprobs support and a tokenizer.

If the launcher fails after endpoint setup, check:

tasks
target.api_endpoint.type
evaluation.nemo_evaluator_config.config.params.extra.tokenizer

Use exact task IDs from:

nemo-evaluator-launcher ls tasks
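The pairing rule described in this section can be written down as a small check. This is a sketch under stated assumptions: check_task_endpoint is a hypothetical helper, it treats the "generally need" guidance as a strict rule, and the caller must say whether a task is a log-probability task because the sketch does not look task IDs up.

```python
from typing import Optional


def check_task_endpoint(needs_logprobs: bool, endpoint_type: str,
                        tokenizer: Optional[str]) -> list[str]:
    """Return problems with pairing one task and an endpoint type.

    needs_logprobs: True for log-probability tasks, False for chat tasks.
    The caller decides this per task; nothing is looked up here.
    """
    problems = []
    if needs_logprobs:
        if endpoint_type != "completions":
            problems.append("log-probability task: expected target.api_endpoint.type=completions")
        if not tokenizer:
            problems.append("log-probability task: params.extra.tokenizer is not set")
    elif endpoint_type != "chat":
        problems.append("chat task: expected target.api_endpoint.type=chat")
    return problems
```

For example, a log-probability task configured with type chat and no tokenizer yields two problems, one for each field to correct.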

Bad Checkpoint Path

When using default.yaml, point deployment.checkpoint_path at a concrete Megatron Bridge iter_* directory, not at the parent training output directory that contains it.

deployment.checkpoint_path=/path/to/run/iter_0001000

For log-probability tasks, also verify:

evaluation.nemo_evaluator_config.config.params.extra.tokenizer=/path/to/run/iter_0001000/tokenizer
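Both paths can be checked on disk before launching. The sketch below is illustrative only: check_checkpoint is a hypothetical helper, and it verifies only the directory name and existence, not that the directory holds a valid Megatron Bridge checkpoint.

```python
import fnmatch
from pathlib import Path


def check_checkpoint(checkpoint_path: str, tokenizer_path: str = "") -> list[str]:
    """Return problems with the checkpoint and (optional) tokenizer paths."""
    problems = []
    ckpt = Path(checkpoint_path)
    if not ckpt.is_dir():
        problems.append(f"{ckpt} is not an existing directory")
    elif not fnmatch.fnmatch(ckpt.name, "iter_*"):
        # The path may be the parent run directory; list iter_* children as a hint.
        candidates = sorted(p.name for p in ckpt.glob("iter_*") if p.is_dir())
        hint = f"; found {', '.join(candidates)} inside it" if candidates else ""
        problems.append(f"{ckpt.name} is not an iter_* directory{hint}")
    # The tokenizer path is only checked when one is given.
    if tokenizer_path and not Path(tokenizer_path).exists():
        problems.append(f"tokenizer path {tokenizer_path} does not exist")
    return problems
```

When the parent run directory is passed by mistake, the message lists the iter_* directories found inside it, which makes the correct value for deployment.checkpoint_path easy to pick.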

Launcher Job State

The step prints launcher follow-up commands when the launcher returns an invocation id.

status_command: nemo-evaluator-launcher status <id>
logs_command: nemo-evaluator-launcher logs <id>

Run those commands before changing config. The launcher logs usually distinguish endpoint/authentication failures from task-schema failures.
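If the follow-up commands are run often, they can be wrapped in a short script. This is a sketch and not part of Nemotron: followup_commands and run_followups are hypothetical names, and only the two subcommands shown above (status and logs) are used.

```python
import subprocess


def followup_commands(invocation_id: str) -> dict[str, list[str]]:
    """Build the status and logs commands for one launcher invocation id."""
    return {
        "status": ["nemo-evaluator-launcher", "status", invocation_id],
        "logs": ["nemo-evaluator-launcher", "logs", invocation_id],
    }


def run_followups(invocation_id: str) -> None:
    """Run both commands and print their output; needs the launcher on PATH."""
    for name, cmd in followup_commands(invocation_id).items():
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        print(f"--- {name} (exit code {result.returncode}) ---")
        print(result.stdout or result.stderr)
```

Call run_followups with the invocation id printed by the step. The commands are passed as argument lists rather than a shell string, so the id is never interpreted by a shell.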