Download Evaluation Results

To download the results of an evaluation job, send a GET request to the /v1/evaluation/jobs/<job_id>/download-results endpoint, replacing <job_id> with the ID of the job. The response is a zip archive that contains the configuration files, logs, and evaluation results for that job.

Options

API

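Using curl:

curl -X "GET" "${EVALUATOR_SERVICE_URL}/v1/evaluation/jobs/<job_id>/download-results" \
-H 'accept: application/json' \
-o result.zip

Using Python:

import os

import requests

# Base URL of the Evaluator service (assumes the variable from the curl example is exported)
EVALUATOR_SERVICE_URL = os.environ["EVALUATOR_SERVICE_URL"]

# Replace <job_id> with the ID of your evaluation job
url = f"{EVALUATOR_SERVICE_URL}/v1/evaluation/jobs/<job_id>/download-results"

response = requests.get(url, headers={'accept': 'application/json'}, stream=True)
response.raise_for_status()  # stop here rather than saving an error body as result.zip

# Stream the archive to disk in 8 KiB chunks instead of the 1-byte default
with open('result.zip', 'wb') as file:
    for chunk in response.iter_content(chunk_size=8192):
        file.write(chunk)

print("Download completed.")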

After the download completes, the results are available in the result.zip file. To unzip the result.zip file on Linux (for example, Ubuntu) or macOS, run the following command.

unzip result.zip -d result
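On systems without the unzip command, the same step can be done from Python. The following is a minimal sketch using only the standard library. The function name extract_results is hypothetical and not part of the Evaluator API, and the defaults assume the result.zip and result names used above. It also checks that the file really is a zip archive, since a failed request can leave an error message saved under the result.zip name.

```python
import zipfile
from pathlib import Path


def extract_results(zip_path: str = "result.zip", dest: str = "result") -> list[str]:
    """Validate the downloaded archive and extract it into dest.

    Returns the names of all members in the archive.
    extract_results is an illustrative helper, not part of the Evaluator API.
    """
    if not zipfile.is_zipfile(zip_path):
        # A failed download can leave an error body saved under the .zip name.
        raise ValueError(f"{zip_path} is not a valid zip archive")

    # Create the destination directory if it does not exist yet.
    Path(dest).mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

Calling extract_results() with no arguments should be equivalent to the unzip command above: it reads result.zip from the current directory and writes the contents to result/.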

The extracted files are under the result/ directory, and the evaluation output is in a results/ folder inside it. For example, if you run an lm-harness evaluation, the results are in result/automatic/lm_eval_harness/results.

The directory structure will look like this:

.
├── automatic
│   └── lm_eval_harness
│       ├── model_config_meta-llama-3_1-8b-instruct.yaml
│       ├── model_config_meta-llama-3_1-8b-instruct_inference_params.yaml
│       └── results
│           ├── README.md
│           ├── lm-harness-mmlu_str.json
│           ├── lm-harness.json
│           ├── lmharness_meta-llama-3_1-8b-instruct_aggregateresults-run.log
│           └── lmharness_meta-llama-3_1-8b-instruct_mmlu_str-run.log
└── metadata.json
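To inspect the results programmatically, one option is to walk the extracted directory and parse each JSON file that sits in a results folder. The sketch below assumes the layout shown above and the result/ extraction directory. The function name load_json_results is hypothetical, and because the schema of the JSON files is not described here, the code only parses them and makes no assumption about their keys.

```python
import json
from pathlib import Path


def load_json_results(root: str = "result") -> dict[str, object]:
    """Parse every .json file located directly inside a 'results' directory under root.

    Keys are paths relative to root (POSIX style); values are the parsed JSON.
    load_json_results is an illustrative helper, not part of the Evaluator API.
    """
    base = Path(root)
    loaded: dict[str, object] = {}
    for path in sorted(base.rglob("*.json")):
        if path.parent.name != "results":
            continue  # skip files outside a results folder, such as metadata.json
        with path.open(encoding="utf-8") as handle:
            loaded[path.relative_to(base).as_posix()] = json.load(handle)
    return loaded
```

With the lm-harness layout above, the returned dictionary would be keyed by paths such as automatic/lm_eval_harness/results/lm-harness.json.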