Create an Evaluation Configuration#
To create a configuration for an evaluation, send a POST request to the /v1/evaluation/configs API. The URL of the evaluator API depends on where you deploy the evaluator service and how you configure it. For more information, refer to Job Target and Configuration Matrix.
Note
v2 API Availability: Evaluation configurations are always created through the v1 API. However, when you create evaluation jobs that use these configurations, you can choose between the v1 and v2 job creation APIs. The v2 API introduces a new job structure with a name and spec envelope. For details, refer to the v2 Migration Guide.
Prerequisites#
Set your EVALUATOR_BASE_URL environment variable to your evaluator service endpoint (an optional check for this variable appears after this list):

export EVALUATOR_BASE_URL="https://your-evaluator-service-endpoint"

Review the available evaluation configuration types.
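If you want to confirm the variable is set before running the examples, the following optional Python sketch uses only the standard library:

import os

# Optional prerequisite check: confirm EVALUATOR_BASE_URL is set before calling the API.
base_url = os.environ.get("EVALUATOR_BASE_URL")
if not base_url:
    raise SystemExit("EVALUATOR_BASE_URL is not set; export it before running the examples.")
print(f"Using evaluator endpoint: {base_url}")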
To Create an Evaluation Configuration#
Choose one of the following options, using either the Python SDK client or curl, to create an evaluation configuration.
import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create an evaluation config
config = client.evaluation.configs.create(
    type="gsm8k",
    name="my-configuration-lm-harness-gsm8k-1",
    namespace="my-organization",
    params={
        "temperature": 0.00001,
        "top_p": 0.00001,
        "max_tokens": 256,
        "stop": ["<|eot|>"],
        "extra": {
            "num_fewshot": 8,
            "batch_size": 16,
            "bootstrap_iters": 100000,
            "dataset_seed": 42,
            "use_greedy": True,
            "top_k": 1,
            "hf_token": "<my-token>",
            "tokenizer_backend": "hf",
            "tokenizer": "meta-llama/Llama-3.1-8B-Instruct",
            "apply_chat_template": True,
            "fewshot_as_multiturn": True
        }
    }
)
print("Evaluation config created successfully")
curl -X "POST" "${EVALUATOR_BASE_URL}/v1/evaluation/configs" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"type": "gsm8k",
"name": "my-configuration-lm-harness-gsm8k-1",
"namespace": "my-organization",
"params": {
"temperature": 0.00001,
"top_p": 0.00001,
"max_tokens": 256,
"stop": ["<|eot|>"],
"extra": {
"num_fewshot": 8,
"batch_size": 16,
"bootstrap_iters": 100000,
"dataset_seed": 42,
"use_greedy": true,
"top_k": 1,
"hf_token": "<my-token>",
"tokenizer_backend": "hf",
"tokenizer": "meta-llama/Llama-3.1-8B-Instruct",
"apply_chat_template": true,
"fewshot_as_multiturn": true
}
}
}'
Example Response
{
  "created_at": "2025-03-19T22:50:02.206136",
  "updated_at": "2025-03-19T22:50:02.206138",
  "id": "eval-config-MNOP1234QRST5678",
  "name": "my-configuration-lm-harness-gsm8k-1",
  "namespace": "my-organization",
  "type": "gsm8k",
  "params": {
    "temperature": 0.00001,
    "top_p": 0.00001,
    "max_tokens": 256,
    "stop": ["<|eot|>"],
    "extra": {
      "num_fewshot": 8,
      "batch_size": 16,
      "bootstrap_iters": 100000,
      "dataset_seed": 42,
      "use_greedy": true,
      "top_k": 1,
      "hf_token": "<my-token>",
      "tokenizer_backend": "hf",
      "tokenizer": "meta-llama/Llama-3.1-8B-Instruct",
      "apply_chat_template": true,
      "fewshot_as_multiturn": true
    }
  },
  "custom_fields": {}
}
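The namespace and name in the response identify the configuration when you create an evaluation job. The following sketch is illustrative only: it assumes a client.evaluation.jobs.create method and a previously created evaluation target named my-organization/my-target; refer to the job creation documentation for the exact request fields.

import os
from nemo_microservices import NeMoMicroservices

client = NeMoMicroservices(base_url=os.environ['EVALUATOR_BASE_URL'])

# Illustrative sketch: the jobs.create call and the target name
# "my-organization/my-target" are assumptions; check the job creation docs.
job = client.evaluation.jobs.create(
    namespace="my-organization",
    config="my-organization/my-configuration-lm-harness-gsm8k-1",
    target="my-organization/my-target",
)
print(f"Created evaluation job: {job.id}")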