Pipeline Overview
The eval/model_eval step is a thin Nemotron wrapper around NeMo Evaluator Launcher.
It does not implement benchmark scoring itself.
It loads a YAML config, applies command-line overrides, saves the launcher config, and calls run_eval.
Architecture
%%{init: {'theme': 'base', 'themeVariables': { 'primaryBorderColor': '#333333', 'lineColor': '#333333', 'primaryTextColor': '#333333', 'clusterBkg': '#ffffff', 'clusterBorder': '#333333'}}}%%
flowchart LR
hosted["Hosted OpenAI-compatible endpoint"] --> cfg["target.api_endpoint"]
ckpt["Megatron Bridge iter_* checkpoint"] --> deploy["launcher deployment block"]
cfg --> step["eval/model_eval"]
deploy --> step
step --> launcher["NeMo Evaluator Launcher run_eval"]
launcher --> results["eval_results under output_dir"]
Runtime Flow
1. step.py calls run_model_eval from runtime.py.
2. The runtime loads the selected config from config/default.yaml, config/tiny_chat.yaml, or a user-supplied YAML path.
3. Hydra-style dotlist overrides are merged into the config.
4. Nemotron-only keys are removed before launcher dispatch: dry_run, output_dir, task_filters, and run.
5. output_dir is copied into execution.output_dir.
6. The resolved launcher config is saved and printed as launcher_config.
7. nemo_evaluator_launcher.api.functional.run_eval is called with the launcher config and optional task filters.
Input Artifacts
The step declares an optional checkpoint_megatron input.
Hosted endpoint runs do not consume a checkpoint artifact.
Launcher-managed checkpoint runs usually pass a concrete Megatron Bridge iter_* directory through deployment.checkpoint_path.
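
As a rough illustration of passing a concrete iter_* directory, the sketch below picks the highest-numbered iter_* subdirectory under a checkpoint root and formats it as a dotlist override for deployment.checkpoint_path. The helper names latest_iter_dir and checkpoint_override are hypothetical, and the sketch assumes directory names of the form iter_ followed by digits; the step itself may expect you to choose the directory by hand.

```python
import re
from pathlib import Path

# Assumption: checkpoint directories are named "iter_" followed by digits.
ITER_DIR = re.compile(r"^iter_(\d+)$")


def latest_iter_dir(checkpoint_root: str) -> Path:
    """Return the iter_* subdirectory with the highest iteration number."""
    candidates = []
    for child in Path(checkpoint_root).iterdir():
        match = ITER_DIR.match(child.name)
        if child.is_dir() and match:
            candidates.append((int(match.group(1)), child))
    if not candidates:
        raise FileNotFoundError(f"no iter_* directory under {checkpoint_root}")
    # Compare by the parsed iteration number, not by name.
    return max(candidates, key=lambda pair: pair[0])[1]


def checkpoint_override(checkpoint_root: str) -> str:
    """Build a dotlist override that points deployment.checkpoint_path at it."""
    return f"deployment.checkpoint_path={latest_iter_dir(checkpoint_root)}"
```

Comparing parsed integers rather than directory names keeps the selection correct even if the iteration numbers are not zero-padded to the same width.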
Output Artifact
The step produces eval_results.
The exact directory layout and files are owned by NeMo Evaluator Launcher and the selected task implementations.
For result inspection guidance, refer to Output Artifacts.
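
Because the layout is not fixed by this step, a first look at a finished run can avoid assuming any particular file names. The sketch below simply lists every file under an output directory; list_result_files is a hypothetical helper and does not depend on how NeMo Evaluator Launcher organizes its results.

```python
from pathlib import Path


def list_result_files(output_dir: str) -> list[str]:
    """Return all files under output_dir as sorted, POSIX-style relative paths."""
    root = Path(output_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
```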
What Is Owned Where
| Owned by Nemotron | Owned by NeMo Evaluator Launcher |
|---|---|
| Step discovery, config loading, dotlist overrides, launcher config saving. | Task implementations, endpoint probing, deployment orchestration, and result files. |
| The | Accepted task identifiers and version-specific task behavior. |