Download Evaluation Results#
To download the results of an evaluation job, send a GET request to the evaluation job results API.
The response is an archive containing the configuration files, logs, and evaluation results for the specified evaluation job.
v2 (Preview)#
Warning
v2 API Preview: The v2 API is available for testing and feedback but is not yet recommended for production use. Breaking changes may occur before the stable release.
The v2 API provides structured access to different result types through separate endpoints.
import os
from nemo_microservices import NeMoMicroservices
# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)
# Download job artifacts (v2 API)
job_id = "job-id"
artifacts_archive = client.v2.evaluation.jobs.results.artifacts.retrieve(job_id)
# Save the downloaded archive to a file
artifacts_archive.write_to_file('artifacts.tar.gz')
# Alternatively, download evaluation results separately
eval_results = client.v2.evaluation.jobs.results.evaluation_results.retrieve(job_id)
with open("evaluation_results.json", "w") as f:
    f.write(eval_results.model_dump_json(indent=2, exclude_none=True))
print("Download completed.")
# Download job artifacts (logs, intermediate files, etc.)
curl -X "GET" "${EVALUATOR_BASE_URL}/v2/evaluation/jobs/<job_id>/results/artifacts/download" \
-o artifacts.tar.gz
# Download evaluation results (structured results only)
curl -X "GET" "${EVALUATOR_BASE_URL}/v2/evaluation/jobs/<job_id>/results/evaluation-results/download" \
-H 'accept: application/json'
v2 Result Types#
The v2 API distinguishes between different result types:
artifacts: Complete job artifacts including logs, intermediate files, configuration files, and all outputs
evaluation-results: Structured evaluation metrics and scores only
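After downloading the artifacts archive, you typically want to unpack it. Below is a minimal sketch using only the Python standard library, assuming the tarball was saved as `artifacts.tar.gz` as in the example above; the helper name `extract_artifacts` is ours, not part of the SDK.

```python
import tarfile
from pathlib import Path

def extract_artifacts(archive_path: str, extract_dir: str) -> list:
    """Extract a downloaded artifacts tarball and list the extracted files."""
    dest = Path(extract_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # On Python 3.12+, you can pass filter="data" to extractall
        # to guard against path-traversal entries.
        tar.extractall(dest)
    return sorted(
        str(p.relative_to(dest)) for p in dest.rglob("*") if p.is_file()
    )
```

The returned list of relative paths is convenient for verifying that the expected logs and configuration files were included.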
v2 Available Results
You can list available results first:
curl -X "GET" "${EVALUATOR_BASE_URL}/v2/evaluation/jobs/<job_id>/results" \
-H 'accept: application/json'
Response:
[
  {
    "result_name": "evaluation-results",
    "job_id": "job-dq1pjj6vj5p64xaeqgvuk4",
    "created_at": "2025-09-08T19:21:43.078131",
    "artifact_url": "hf://default/job-results-job-dq1pjj6vj5p64xaeqgvuk4/evaluation-results",
    "artifact_storage_type": "nds"
  },
  {
    "result_name": "artifacts",
    "job_id": "job-dq1pjj6vj5p64xaeqgvuk4",
    "created_at": "2025-09-08T19:21:39.665664",
    "artifact_url": "hf://default/job-results-job-dq1pjj6vj5p64xaeqgvuk4/artifacts",
    "artifact_storage_type": "nds"
  }
]
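If you want to check which result types exist before downloading, a small helper for parsing that listing can be useful. This is a sketch under the assumption that the response has the shape shown above; the function name `available_result_names` is ours, and only the `result_name` field is taken from the sample payload.

```python
import json

def available_result_names(listing: list) -> set:
    """Return the set of result_name values from a results-listing response."""
    return {entry["result_name"] for entry in listing}

# Parse a trimmed-down version of the sample listing above.
sample = json.loads(
    '[{"result_name": "evaluation-results"}, {"result_name": "artifacts"}]'
)
names = available_result_names(sample)
```

You could then download `evaluation-results` only when it appears in the returned set, rather than handling a failed download.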
v1 (Current)#
Choose one of the following options to download evaluation results.
import os
from nemo_microservices import NeMoMicroservices
# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)
# Download evaluation results (v1 API)
results_zip = client.evaluation.jobs.download_results("job-id")
# Save to file
results_zip.write_to_file('result.zip')
print("Download completed.")
curl -X "GET" "${EVALUATOR_BASE_URL}/v1/evaluation/jobs/<job_id>/download-results" \
-H 'accept: application/zip' \
-o result.zip
Results#
After the download completes, the results are available in the result.zip file. To extract result.zip on Ubuntu, macOS, or other Linux distributions, run the following command.
unzip result.zip -d result
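If the `unzip` utility is not available (for example on Windows), the same extraction can be done with Python's standard library. This is a cross-platform sketch; the helper name `unzip_results` is ours.

```python
import zipfile
from pathlib import Path

def unzip_results(zip_path: str, dest: str) -> list:
    """Cross-platform equivalent of `unzip result.zip -d result`."""
    out = Path(dest)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out)
        return sorted(zf.namelist())
```

Calling `unzip_results("result.zip", "result")` reproduces the shell command above and returns the archive's member names for inspection.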
You can find the result files in the results/ folder. For example, if you run an lm-harness evaluation, the results are in automatic/lm_eval_harness/results.
The directory structure will look like this:
.
├── automatic
│   └── lm_eval_harness
│       ├── model_config_meta-llama-3_1-8b-instruct.yaml
│       ├── model_config_meta-llama-3_1-8b-instruct_inference_params.yaml
│       └── results
│           ├── README.md
│           ├── lm-harness-mmlu_str.json
│           ├── lm-harness.json
│           ├── lmharness_meta-llama-3_1-8b-instruct_aggregateresults-run.log
│           └── lmharness_meta-llama-3_1-8b-instruct_mmlu_str-run.log
└── metadata.json
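To get a quick overview of the JSON files in an extracted result directory, you can inspect their top-level structure without assuming any particular schema. This is a sketch; `summarize_result_files` is a hypothetical helper, and the actual keys inside each file vary by evaluation type.

```python
import json
from pathlib import Path

def summarize_result_files(result_dir: str) -> dict:
    """Map each JSON file under result_dir to its top-level keys.

    Only the structure is inspected; no specific metric names are assumed.
    """
    summary = {}
    for json_file in sorted(Path(result_dir).rglob("*.json")):
        data = json.loads(json_file.read_text())
        # For JSON objects record the sorted keys; for arrays, the length.
        summary[json_file.name] = (
            sorted(data) if isinstance(data, dict) else len(data)
        )
    return summary
```

Running this against the extracted `result/` directory gives a compact map of which metrics files were produced before you dig into individual scores.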