Create an Evaluation Configuration

To create a configuration for an evaluation, send a POST request to the /v1/evaluation/configs API endpoint. The URL of the Evaluator API depends on where you deploy the Evaluator service and how you configure it. For more information, refer to Job Target and Configuration Matrix.
Prerequisites

The examples in this documentation use {EVALUATOR_SERVICE_URL} in the code. Store the Evaluator service URL so that you can use it in your code.

Important: Replace <your evaluator service endpoint> with your address, such as https://evaluator.internal.your-company.com, before you run this code.

```shell
export EVALUATOR_SERVICE_URL="<your evaluator service endpoint>"
```

```python
import requests

EVALUATOR_SERVICE_URL = "<your evaluator service endpoint>"
```
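If you exported the URL in your shell as shown above, you can also read it from the environment in Python so the two stay in sync. This is a minimal sketch, not part of the Evaluator API; the fallback placeholder is an assumption for illustration.

```python
import os

# Read the service URL from the EVALUATOR_SERVICE_URL environment variable
# exported earlier; fall back to a placeholder if it is not set.
EVALUATOR_SERVICE_URL = os.environ.get(
    "EVALUATOR_SERVICE_URL", "<your evaluator service endpoint>"
)
```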
Options

API

Perform a POST request to the /v1/evaluation/configs endpoint.

```shell
curl -X "POST" "${EVALUATOR_SERVICE_URL}/v1/evaluation/configs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "<evaluation-type>",
    "name": "<my-configuration-name>",
    "namespace": "<my-namespace>"
    // More config details
  }'
```

```python
data = {
    "type": "<evaluation-type>",
    "name": "<my-configuration-name>",
    "namespace": "<my-namespace>",
    # More config details
}
endpoint = f"{EVALUATOR_SERVICE_URL}/v1/evaluation/configs"
response = requests.post(endpoint, json=data).json()
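The request body above requires at minimum the type, name, and namespace fields. As a sketch, you can check for these locally before posting to catch a malformed payload early; validate_config and REQUIRED_FIELDS are hypothetical helpers, not part of the Evaluator API.

```python
# Required top-level fields, taken from the request body shown above
# (an assumption for this sketch; your evaluation type may require more).
REQUIRED_FIELDS = ("type", "name", "namespace")

def validate_config(data: dict) -> list:
    """Return the names of required fields that are missing or empty."""
    return [field for field in REQUIRED_FIELDS if not data.get(field)]

# Example: a payload that omits "namespace" fails validation.
missing = validate_config({"type": "gsm8k", "name": "my-config"})
```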
Review the returned configuration.

Example Response

```json
{
  "created_at": "2025-03-19T22:50:02.206136",
  "updated_at": "2025-03-19T22:50:02.206138",
  "id": "eval-config-MNOP1234QRST5678",
  "name": "my-configuration-lm-harness-gsm8k-1",
  "namespace": "my-organization",
  "type": "gsm8k",
  "params": {
    "temperature": 0.00001,
    "top_p": 0.00001,
    "max_tokens": 256,
    "stop": ["<|eot|>"],
    "extra": {
      "num_fewshot": 8,
      "batch_size": 16,
      "bootstrap_iters": 100000,
      "dataset_seed": 42,
      "use_greedy": true,
      "top_k": 1,
      "hf_token": "<my-token>",
      "tokenizer_backend": "hf",
      "tokenizer": "meta-llama/Llama-3.1-8B-Instruct",
      "apply_chat_template": true,
      "fewshot_as_multiturn": true
    }
  },
  "custom_fields": {}
}
```
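The response includes an id field that identifies the created configuration. A minimal sketch of capturing it for later use, with the response abbreviated to the fields used here (whether a later request accepts this id directly is an assumption; check the relevant endpoint's documentation):

```python
# The example response above, abbreviated to the fields used here.
response = {
    "id": "eval-config-MNOP1234QRST5678",
    "name": "my-configuration-lm-harness-gsm8k-1",
    "namespace": "my-organization",
}

# Keep the configuration id so a subsequent request that references this
# configuration (for example, launching an evaluation job) can use it.
config_id = response["id"]
```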