Evaluation Config Schema
When you create a configuration for an evaluation, you send a JSON data structure that contains the information for your configuration.
Important
Each configuration is uniquely identified by the combination of its namespace and name, for example my-organization/my-configuration.
The following reference shows all available fields for creating evaluation configurations. This schema is automatically extracted from the authoritative OpenAPI specification, ensuring it’s always in sync with the API.
EvaluationConfigInput object
An evaluation configuration.
Properties
name string
The name of the entity. Must be unique inside the namespace. If not specified, it will be the same as the automatically generated id.
Constraints: max length: 255, pattern: ^[\w\-\+.@:]*$
Default:
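The `name` constraints can be checked on the client before a request is sent. The following is a minimal sketch, assuming the pattern must match the whole string and that Python's `re` semantics are close enough to the server's; the helper `is_valid_name` is hypothetical and not part of the API.

```python
import re

# Pattern and length limit copied from the `name` constraints above.
NAME_PATTERN = re.compile(r"^[\w\-\+.@:]*$")
MAX_NAME_LENGTH = 255


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the documented constraints (hypothetical helper)."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("my-configuration"))  # only letters and hyphens
print(is_valid_name("my configuration"))  # contains a space, which the pattern excludes
```

Note that the pattern does not allow `/`, so the combined `namespace/name` form is not itself a valid `name`. In Python, `\w` also matches non-ASCII letters; the server may be stricter.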
namespace string
The namespace of the entity. This can be missing for namespace entities or in deployments that don't use namespaces.
Default: default
description string
The description of the entity.
type * string
The type of the evaluation, e.g., 'mmlu', 'big_code'. For custom evaluations, this is set to `custom`.
Constraints: min length: 1
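As an illustration of the top-level fields described so far, the sketch below builds a small configuration in Python and serializes it to JSON. The values are placeholders; `type` is the only field marked as required, and how the payload is sent (endpoint, client library) is outside the scope of this schema.

```python
import json

# Placeholder values; only `type` is marked as required in the schema.
config = {
    "type": "mmlu",
    "name": "my-configuration",
    "namespace": "my-organization",
    "description": "Baseline MMLU configuration.",
}

# A configuration is identified by the combination of namespace and name.
identifier = f"{config['namespace']}/{config['name']}"
print(identifier)

payload = json.dumps(config, indent=2)
print(payload)
```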
params object
Additional parameters that need to be set at evaluation level.
Properties
parallelism integer
Parallelism to be used for the evaluation job. Typically, this represents the maximum number of concurrent requests made to the model.
request_timeout integer
The timeout to be used for requests made to the model.
max_retries integer
Maximum number of retries for failed requests.
limit_samples integer
Limits the number of evaluation samples.
max_tokens integer
The maximum number of tokens to generate.
temperature number
Float value between 0 and 1. A temperature of 0 corresponds to greedy decoding, where the token with the highest probability is chosen. Temperature cannot currently be set to 0.0.
top_p number
Float value between 0 and 1; limits to the top tokens within a certain probability. top_p=0 means the model will only consider the single most likely token for the next prediction.
stop string | string[]
Up to 4 sequences where the API will stop generating further tokens.
Any of:
Option 1: string
Option 2: string[]
extra object
Any other custom parameters.
Allows additional properties: Yes
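The ranges documented for `params` can be checked before submission. The sketch below is one possible client-side check, not the API's own validation: the helper `check_params` is hypothetical, the numeric values are placeholders, and the key inside `extra` is invented for illustration.

```python
def check_params(params: dict) -> list:
    """Collect violations of the documented ranges (hypothetical helper)."""
    problems = []
    temperature = params.get("temperature")
    # Documented as between 0 and 1, but 0.0 cannot currently be set.
    if temperature is not None and not (0 < temperature <= 1):
        problems.append("temperature must be greater than 0 and at most 1")
    top_p = params.get("top_p")
    if top_p is not None and not (0 <= top_p <= 1):
        problems.append("top_p must be between 0 and 1")
    stop = params.get("stop")
    # `stop` is either a single string or a list of up to 4 strings.
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")
    return problems


params = {
    "parallelism": 8,
    "request_timeout": 300,  # the schema does not state the unit
    "max_retries": 3,
    "limit_samples": 100,
    "max_tokens": 512,
    "temperature": 0.2,
    "top_p": 0.9,
    "stop": ["\n\n", "###"],
    "extra": {"my_custom_flag": True},  # hypothetical custom parameter
}

print(check_params(params))  # prints []
```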
tasks object
Evaluation tasks belonging to the evaluation.
Additional properties schema:
[key: string] object
Configuration object for a task which is part of an evaluation.
Properties
type * string
The type of the task.
params object
Additional parameters related to the task.
Allows additional properties: Yes
metrics object
Metrics to be computed for the task.
Additional properties schema:
[key: string] object
A metric that is computed as part of the evaluation.
Properties
type * string
The type of the metric.
params object
Specific parameters for the metric.
Allows additional properties: Yes
dataset string | object
Optional dataset reference. If not specified, this typically means that the task type has an implicit dataset.
Any of:
Option 1: string - A reference to Dataset.
Option 2: object - A dataset that can be used for fine-tuning or evaluation.
groups object
Evaluation groups belonging to the evaluation.
Additional properties schema:
[key: string] object
Configuration object for a group which is part of an evaluation.
Properties
tasks array
The names of the tasks that are part of the group.
Array items:
item string
groups object
Subgroups for the current group.
Allows additional properties: Yes
metrics object
The metrics that should be computed for the group.
Additional properties schema:
[key: string] object
A metric that is computed as part of the evaluation.
Properties
type * string
The type of the metric.
params object
Specific parameters for the metric.
Allows additional properties: Yes
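The `tasks` and `groups` maps can be sketched together as follows. Every task name, metric name, `type` value, and the dataset reference below is a hypothetical placeholder, since the accepted values depend on the evaluation type. The helper `unknown_group_tasks` assumes that the names listed in a group's `tasks` array refer to keys of the top-level `tasks` map; it does not descend into subgroups.

```python
# All names and `type` values below are hypothetical placeholders.
tasks = {
    "question-answering": {
        "type": "my-task-type",
        "params": {"my_task_option": "value"},
        "metrics": {
            "accuracy": {"type": "my-metric-type", "params": {}},
        },
        # String form of `dataset`; the reference format is an assumption.
        "dataset": "my-organization/my-dataset",
    },
    "summarization": {
        "type": "my-task-type",
        # No `dataset`: the task type is assumed to have an implicit one.
        "metrics": {
            "overlap": {"type": "my-metric-type", "params": {}},
        },
    },
}

groups = {
    "overall": {
        "tasks": ["question-answering", "summarization"],
        "metrics": {
            "mean-score": {"type": "my-group-metric-type", "params": {}},
        },
    },
}


def unknown_group_tasks(tasks: dict, groups: dict) -> list:
    """Return (group, task) pairs where a group names a task that is not defined."""
    return [
        (group_name, task_name)
        for group_name, group in groups.items()
        for task_name in group.get("tasks", [])
        if task_name not in tasks
    ]


print(unknown_group_tasks(tasks, groups))  # prints []
```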
project string
The URN of the project associated with this entity.
custom_fields object
A set of custom fields that the user can define and use for various purposes.
Allows additional properties: Yes
ownership object
Ownership information for the entity.
Properties
created_by string
The ID of the user that created this entity.
Default:
updated_by string
The ID of the user that last updated this entity.
access_policies object
A general object for capturing access policies, which can be used by an external service to determine ACLs.
Default: {}
Additional properties schema:
[key: string] string
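The metadata fields (`project`, `custom_fields`, `ownership`) might be filled in as shown in this sketch. All values are placeholders: the URN format for `project`, the user ID format, and the keys used in `custom_fields` and `access_policies` are assumptions, not values defined by the schema.

```python
import json

metadata = {
    "project": "my-project-urn",  # placeholder; the URN format is not specified here
    "custom_fields": {"team": "research", "priority": 2},  # free-form keys
    "ownership": {
        "created_by": "user-123",  # placeholder user ID
        "access_policies": {"read": "my-organization"},  # string-to-string map
    },
}

# `access_policies` is documented as a map whose values are strings.
policies = metadata["ownership"]["access_policies"]
policies_ok = all(
    isinstance(key, str) and isinstance(value, str)
    for key, value in policies.items()
)
print(policies_ok)

print(json.dumps(metadata, indent=2))
```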