Create Evaluation Job#
To create an evaluation job, send a POST request to the evaluation jobs endpoint. The base URL of the Evaluator API depends on where you deploy the Evaluator service and how you configure it. For more information, refer to the NeMo Evaluator Deployment Guide.
Prerequisites#
Set your EVALUATOR_BASE_URL environment variable to your evaluator service endpoint:

export EVALUATOR_BASE_URL="https://your-evaluator-service-endpoint"

Ensure that you have created both an evaluation target and an evaluation configuration.
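Before creating a job, you can optionally confirm that the target and configuration exist by fetching them. The sketch below is illustrative, not confirmed API surface: the retrieve method and argument names are assumptions patterned on the jobs.create call used later on this page.

import os
from nemo_microservices import NeMoMicroservices

client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Fetch the target and config by namespace and name; each call should fail
# if the resource does not exist. Method and argument names are assumptions.
target = client.evaluation.targets.retrieve(
    target_name="<my-target-name>",
    namespace="<my-target-namespace>",
)
config = client.evaluation.configs.retrieve(
    config_name="<my-config-name>",
    namespace="<my-config-namespace>",
)
print(f"Target: {target.namespace}/{target.name}")
print(f"Config: {config.namespace}/{config.name}")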
v2 (Preview)#
Warning
v2 API Preview: The v2 API is available for testing and feedback but is not yet recommended for production use. Breaking changes may occur before the stable release.
The v2 API introduces a spec envelope at the top level.
import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create an evaluation job (v2 API)
job = client.v2.evaluation.jobs.create(
    spec={
        "target": {
            # example target
            "name": "<my-target-name>",
            "namespace": "<my-target-namespace>",
            "type": "<my-target-type>",
        },
        "config": {
            # example config
            "name": "<my-config-name>",
            "namespace": "<my-config-namespace>",
            "type": "<my-config-type>",
            "params": {},
        }
    }
)

# Get the job ID and status
job_id = job.id
print(f"Job ID: {job_id}")
print(f"Job status: {job.status}")
curl -X "POST" "${EVALUATOR_BASE_URL}/v2/evaluation/jobs" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"spec": {
"target": {
"name": "<my-target-name>",
"namespace": "<my-target-namespace>",
"type": "model",
"model": {
"api_endpoint": {
"url": "http://nemo-nim-proxy:8000/v1/chat/completions",
"model_id": "meta/llama-3.1-8b-instruct"
}
}
},
"config": {
"name": "<my-config-name>",
"namespace": "<my-config-namespace>",
"type": "bfclv3",
"params": {
"limit_samples": 10
},
"tasks": {
"task1": {
"type": "simple"
}
}
}
}
}'
v2 Example Response
{
  "id": "job-dq1pjj6vj5p64xaeqgvuk4",
  "created_at": "2025-09-08T19:20:32.655254",
  "updated_at": "2025-09-08T19:20:32.655256",
  "spec": {
    "config": {
      "type": "bfclv3",
      "params": {
        "limit_samples": 10
      },
      "tasks": {
        "task1": {
          "type": "simple"
        }
      }
    },
    "target": {
      "type": "model",
      "model": {
        "api_endpoint": {
          "url": "https://nim.int.aire.nvidia.com/v1/chat/completions",
          "model_id": "meta/llama-3.1-8b-instruct",
          "format": "nim"
        }
      }
    }
  },
  "status": "created",
  "status_details": {},
  "error_details": null,
  "ownership": null,
  "custom_fields": null
}
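A newly created job starts in the created status and runs asynchronously. The following is a minimal polling sketch; it assumes the v2 client exposes a jobs.retrieve method that mirrors jobs.create, and the in-flight status values shown are assumptions:

import os
import time
from nemo_microservices import NeMoMicroservices

client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

job_id = "job-dq1pjj6vj5p64xaeqgvuk4"  # ID returned when you created the job

# Poll until the job leaves its in-flight states. The retrieve() method and
# the exact set of status strings are assumptions, not confirmed API surface.
while True:
    job = client.v2.evaluation.jobs.retrieve(job_id)
    if job.status not in ("created", "pending", "running"):
        break
    time.sleep(10)

print(f"Final status: {job.status}")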
Key v2 Differences:

Spec envelope: Target and config are wrapped in a required spec object.

Endpoint: Uses /v2/evaluation/jobs instead of /v1/evaluation/jobs.

Response structure: Includes the new fields and the spec envelope in the response.

Secrets: To securely use API keys for jobs with the v2 API, the secrets must be defined in-line with the job definition, not referenced from v1 targets or configs. Refer to V2 Secrets.
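For illustration only, the hypothetical sketch below shows what an in-line API key might look like inside the target portion of a v2 job spec. The api_key field name and its placement under api_endpoint are assumptions; refer to V2 Secrets for the actual schema.

# Hypothetical sketch of an in-line secret in a v2 job spec. The "api_key"
# field and its placement are assumptions; see the V2 Secrets reference.
spec = {
    "target": {
        "type": "model",
        "model": {
            "api_endpoint": {
                "url": "http://nemo-nim-proxy:8000/v1/chat/completions",
                "model_id": "meta/llama-3.1-8b-instruct",
                "api_key": "<your-api-key>",  # defined in-line with the job, not referenced from a v1 target
            }
        }
    },
    "config": {
        "type": "bfclv3",
        "params": {"limit_samples": 10},
    },
}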
v1 (Current)#
import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create an evaluation job (v1 API)
job = client.evaluation.jobs.create(
    namespace="my-organization",
    target="<my-target-namespace>/<my-target-name>",
    config="<my-config-namespace>/<my-config-name>"
)

# Get the job ID and status
job_id = job.id
print(f"Job ID: {job_id}")
print(f"Job status: {job.status}")
curl -X "POST" "${EVALUATOR_BASE_URL}/v1/evaluation/jobs" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"namespace": "my-organization",
"target": "<my-target-namespace>/<my-target-name>",
"config": "<my-config-namespace>/<my-config-name>"
}'
v1 Example Response
{
  "created_at": "2025-03-19T22:50:15.684382",
  "updated_at": "2025-03-19T22:50:15.684385",
  "id": "eval-UVW123XYZ456",
  "namespace": "my-organization",
  "description": null,
  "target": {
    // target details
  },
  "config": {
    // config details
  },
  "result": null,
  "output_files_url": null,
  "status_details": {
    "message": null,
    "task_status": {},
    "progress": null
  },
  "status": "created",
  "project": null,
  "custom_fields": {},
  "ownership": null
}
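To check on the job after creation, fetch it by ID. A minimal sketch, assuming the client exposes a v1 jobs.retrieve method that mirrors jobs.create:

import os
from nemo_microservices import NeMoMicroservices

client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Fetch the job by ID; retrieve() is assumed to mirror jobs.create().
job = client.evaluation.jobs.retrieve("eval-UVW123XYZ456")
print(f"Status: {job.status}")
print(f"Status details: {job.status_details}")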
For the full response reference, refer to Evaluator API.