Metric Job Management#

Note

Performance Tuning: You can improve evaluation throughput by setting job.params.parallelism, which controls the number of concurrent requests. A typical default value is 16; adjust it to match your model's capacity and rate limits.
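For example, parallelism might be configured like this. This is an illustrative sketch only: the shape of the params object and the create() call shown in the comment are assumptions that may differ in your SDK version; only the parallelism field itself comes from the note above.

```python
# Illustrative only: the params schema and the create() signature are assumptions.
job_params = {
    "parallelism": 16,  # number of concurrent evaluation requests
}

# job = client.evaluation.metric_jobs.create(
#     ...,  # evaluation config, target, etc.
#     params=job_params,
# )
```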

Monitor Job#

Poll the job status until the job leaves the in-progress states.

import time

job_status = client.evaluation.metric_jobs.get_status(name=job.name)
while job_status.status in ("active", "pending", "created"):
    time.sleep(10)
    job_status = client.evaluation.metric_jobs.get_status(name=job.name)
    print("status:", job_status.status, job_status.status_details)
print(job_status)
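The polling loop above can be factored into a reusable helper with a timeout. This is a minimal sketch: the wait_for_job helper, the choice of terminal statuses, and the stub used in the demo are illustrative assumptions, not part of the SDK; only the "active"/"pending"/"created" in-progress states come from the loop above.

```python
import time
from types import SimpleNamespace

def wait_for_job(get_status, name, poll_interval=10, timeout=3600):
    """Poll get_status(name=...) until the job leaves the in-progress states."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(name=name)
        if status.status not in ("active", "pending", "created"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {name} did not finish within {timeout}s")
        time.sleep(poll_interval)

# Demo with a stub that reports "completed" on the third poll.
_statuses = iter(["pending", "active", "completed"])
def _fake_get_status(name):
    return SimpleNamespace(status=next(_statuses))

result = wait_for_job(_fake_get_status, "job-1", poll_interval=0, timeout=5)
# result.status == "completed"
```

Against the real client you would pass client.evaluation.metric_jobs.get_status as get_status.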

Refer to Evaluator Troubleshooting for help troubleshooting job failures.

Fetch Job Logs#

Get JSON logs with pagination. Logs are available while a job is active and after it terminates.

logs_response = client.evaluation.metric_jobs.get_logs(name=job.name)
for log_entry in logs_response.data:
    print(f"[{log_entry.timestamp}] {log_entry.message.strip()}")

# Handle pagination
while logs_response.next_page:
    logs_response = client.evaluation.metric_jobs.get_logs(
        name=job.name,
        page_cursor=logs_response.next_page
    )
    for log_entry in logs_response.data:
        print(f"[{log_entry.timestamp}] {log_entry.message.strip()}")
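The pagination pattern above can be wrapped in a generator so callers iterate over all entries once. A minimal sketch: iter_log_entries and the stub responses in the demo are illustrative assumptions; only the data, next_page, and page_cursor fields are taken from the example above.

```python
from types import SimpleNamespace

def iter_log_entries(get_logs, name):
    """Yield log entries from every page, following next_page cursors."""
    response = get_logs(name=name)
    while True:
        yield from response.data
        if not getattr(response, "next_page", None):
            break
        response = get_logs(name=name, page_cursor=response.next_page)

# Demo with stubbed two-page responses.
_pages = {
    None: SimpleNamespace(data=["a", "b"], next_page="p2"),
    "p2": SimpleNamespace(data=["c"], next_page=None),
}
def _fake_get_logs(name, page_cursor=None):
    return _pages[page_cursor]

entries = list(iter_log_entries(_fake_get_logs, "job-1"))
# entries == ["a", "b", "c"]
```

Against the real client you would pass client.evaluation.metric_jobs.get_logs as get_logs.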

View Evaluation Results#

Evaluation results are available after the evaluation job successfully completes. Refer to Evaluation Results for details on how to fetch evaluation results.

Download Job Artifacts#

Files generated during job execution are available for download. Job artifacts are useful for inspecting evaluation details.

artifacts_zip = client.evaluation.metric_jobs.results.artifacts.download(
    job=job.name,
    workspace=workspace,
)
artifacts_zip.write_to_file("evaluation_artifacts.tar.gz")
print("Saved artifacts to evaluation_artifacts.tar.gz")

Extract the files from the tarball with the following command; this creates an artifacts directory.

tar -xf evaluation_artifacts.tar.gz
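If you prefer to stay in Python, the standard library's tarfile module can do the same extraction. A self-contained sketch; the extract_artifacts helper is an illustrative assumption, and the demo builds a tiny archive of its own so it can run without a downloaded file:

```python
import os
import tarfile
import tempfile

def extract_artifacts(archive_path, dest_dir):
    """Extract a downloaded artifacts tarball into dest_dir."""
    with tarfile.open(archive_path, mode="r:*") as archive:
        archive.extractall(path=dest_dir)

# Demo: build a tiny tar.gz, then extract it.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "artifacts", "results.json")
    os.makedirs(os.path.dirname(src))
    with open(src, "w") as f:
        f.write("{}")
    archive_path = os.path.join(tmp, "evaluation_artifacts.tar.gz")
    with tarfile.open(archive_path, mode="w:gz") as archive:
        archive.add(src, arcname="artifacts/results.json")

    out = os.path.join(tmp, "extracted")
    extract_artifacts(archive_path, out)
    extracted = os.path.exists(os.path.join(out, "artifacts", "results.json"))
```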