Evaluation Config Schema

When you create a configuration for an evaluation, you send a JSON data structure that contains the information for your configuration.

Important

Each configuration is uniquely identified by the combination of its namespace and name, for example my-organization/my-configuration.

The following reference shows all available fields for creating evaluation configurations. This schema is automatically extracted from the authoritative OpenAPI specification, ensuring it’s always in sync with the API.

EvaluationConfigInput object

An evaluation configuration.

Properties

name string

The name of the entity. Must be unique inside the namespace. If not specified, it will be the same as the automatically generated id.

Constraints: max length: 255, pattern: ^[\w\-\+.@:]*$

Default:

namespace string

The namespace of the entity. This can be missing for namespace entities or in deployments that don't use namespaces.

Default: default
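
As a minimal sketch, the name constraints and the namespace/name identifier can be checked on the client before a configuration is sent. The helpers `validate_name` and `config_ref` are hypothetical and not part of any API. The pattern and length limit are copied from the constraints above, and the sketch assumes the service applies the pattern to the whole string.

```python
import re

# Pattern and length limit copied from the `name` constraints above.
NAME_PATTERN = re.compile(r"^[\w\-\+.@:]*$")
MAX_NAME_LENGTH = 255


def validate_name(name: str) -> bool:
    """Return True if `name` satisfies the documented constraints."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.fullmatch(name) is not None


def config_ref(name: str, namespace: str = "default") -> str:
    """Build the namespace/name identifier (hypothetical helper)."""
    if not validate_name(name):
        raise ValueError(f"invalid configuration name: {name!r}")
    return f"{namespace}/{name}"


print(config_ref("my-configuration", "my-organization"))
```

Names containing spaces or slashes fail the check, as do names longer than 255 characters.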

description string

The description of the entity.

type * string

The type of the evaluation, for example 'mmlu' or 'big_code'. For custom evaluations, this is set to `custom`.

Constraints: min length: 1

params object

Additional parameters that need to be set at evaluation level.

Properties

parallelism integer

Parallelism to be used for the evaluation job. Typically, this represents the maximum number of concurrent requests made to the model.

request_timeout integer

The timeout to be used for requests made to the model.

max_retries integer

Maximum number of retries for failed requests.

limit_samples integer

Limits the number of evaluation samples.

max_tokens integer

The maximum number of tokens to generate.

temperature number

Float value between 0 and 1. A temperature of 0 indicates greedy decoding, where the token with the highest probability is chosen. Temperature can't currently be set to 0.0.

top_p number

Float value between 0 and 1. Limits sampling to the most likely tokens whose cumulative probability falls within this value. top_p=0 means the model considers only the single most likely token for the next prediction.

stop string | string[]

Up to 4 sequences where the API will stop generating further tokens.

Any of:

Option 1: string

Option 2: string[]

extra object

Any other custom parameters.

Allows additional properties: Yes
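
The sketch below assembles a `params` object and checks it against the ranges stated in the field descriptions above. The helper `check_params` is hypothetical, all values are placeholders, and the key inside `extra` is invented for illustration. This reference does not state the unit of `request_timeout`, so the value shown carries no unit assumption.

```python
import json


def check_params(params: dict) -> list[str]:
    """Return a list of problems found against the documented ranges."""
    problems = []
    temperature = params.get("temperature")
    if temperature is not None and not (0 < temperature <= 1):
        # The reference states that 0.0 is not currently accepted.
        problems.append("temperature must be greater than 0 and at most 1")
    top_p = params.get("top_p")
    if top_p is not None and not (0 <= top_p <= 1):
        problems.append("top_p must be between 0 and 1")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts up to 4 sequences")
    return problems


params = {
    "parallelism": 4,
    "request_timeout": 300,
    "max_retries": 3,
    "limit_samples": 100,
    "max_tokens": 512,
    "temperature": 0.2,
    "top_p": 0.9,
    "stop": ["\n\n"],  # a single string is also allowed
    "extra": {"my_custom_flag": True},  # hypothetical custom parameter
}

assert check_params(params) == []
print(json.dumps(params, indent=2))
```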

tasks object

Evaluation tasks belonging to the evaluation.

Additional properties schema:

[key: string] object

Configuration object for a task which is part of an evaluation.

Properties

type * string

The type of the task.

params object

Additional parameters related to the task.

Allows additional properties: Yes

metrics object

Metrics to be computed for the task.

Additional properties schema:

[key: string] object

A metric that is computed as part of the evaluation.

Properties

type * string

The type of the metric.

params object

Specific parameters for the metric.

Allows additional properties: Yes

dataset string | object

Optional dataset reference. If not specified, this typically means that the task type has an implicit dataset.

Any of:

Option 1: string - A reference to Dataset.

Option 2: object - A dataset that can be used for fine-tuning or evaluation.
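
To illustrate the nesting described above, here is a hedged sketch of a `tasks` object with one task, one metric, and a dataset given as a string reference. The task and metric type values are placeholders rather than types known to exist, the keys are presumably user-chosen names, and the format of the dataset reference string is an assumption. The helper `missing_types` is hypothetical and only checks for the required `type` fields.

```python
import json

tasks = {
    # The key is presumably the task name, chosen by the user.
    "my-task": {
        "type": "my-task-type",  # required; placeholder value
        "params": {"my_task_option": "value"},  # free-form, hypothetical key
        "metrics": {
            "my-metric": {
                "type": "my-metric-type",  # required; placeholder value
                "params": {},
            }
        },
        "dataset": "my-organization/my-dataset",  # assumed reference format
    }
}


def missing_types(tasks: dict) -> list[str]:
    """List paths of tasks or metrics that lack the required `type` field."""
    missing = []
    for task_name, task in tasks.items():
        if not task.get("type"):
            missing.append(f"tasks.{task_name}")
        for metric_name, metric in task.get("metrics", {}).items():
            if not metric.get("type"):
                missing.append(f"tasks.{task_name}.metrics.{metric_name}")
    return missing


assert missing_types(tasks) == []
print(json.dumps(tasks, indent=2))
```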

groups object

Evaluation groups belonging to the evaluation.

Additional properties schema:

[key: string] object

Configuration object for a group which is part of an evaluation.

Properties

tasks array

The names of the tasks that are part of the group.

Array items:

item string

groups object

Subgroups for the current group.

Allows additional properties: Yes

metrics object

The metrics that should be computed for the group.

Additional properties schema:

[key: string] object

A metric that is computed as part of the evaluation.

Properties

type * string

The type of the metric.

params object

Specific parameters for the metric.

Allows additional properties: Yes
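
Because a group lists its tasks by name, a client-side check that every referenced name exists in `tasks` can catch typos early. The sketch below makes two assumptions that this reference does not confirm: subgroups have the same shape as top-level groups, and group task names are expected to match keys in `tasks`. The helper `unknown_group_tasks` and all type values are hypothetical.

```python
def unknown_group_tasks(tasks: dict, groups: dict) -> list[str]:
    """Return task names referenced by groups or subgroups but absent from `tasks`."""
    unknown = []
    for group in groups.values():
        for task_name in group.get("tasks", []):
            if task_name not in tasks:
                unknown.append(task_name)
        # Recurse into subgroups, assumed to share the group shape.
        unknown.extend(unknown_group_tasks(tasks, group.get("groups", {})))
    return unknown


tasks = {
    "task-a": {"type": "my-task-type"},  # placeholder type
    "task-b": {"type": "my-task-type"},
}

groups = {
    "my-group": {
        "tasks": ["task-a"],
        "groups": {
            "my-subgroup": {"tasks": ["task-b", "task-c"]},  # task-c is undefined
        },
        "metrics": {"my-metric": {"type": "my-metric-type"}},  # placeholder type
    }
}

print(unknown_group_tasks(tasks, groups))  # → ['task-c']
```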

project string

The URN of the project associated with this entity.

custom_fields object

A set of custom fields that the user can define and use for various purposes.

Allows additional properties: Yes

ownership object

Ownership information for the entity.

Properties

created_by string

The ID of the user that created this entity.

Default:

updated_by string

The ID of the user that last updated this entity.

access_policies object

A general object for capturing access policies, which an external service can use to determine ACLs.

Default: {}

Additional properties schema:

[key: string] string
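
Putting the fields together, the following sketch builds a complete configuration and serializes it to the JSON body described at the top of this page. The helper `build_config` is hypothetical, the task and metric types are placeholders, and `project` and `ownership` are left out because their value formats are not shown in this reference. Only `type` is marked as required, so that is the only field the helper enforces.

```python
import json


def build_config(eval_type: str, **fields) -> dict:
    """Assemble a configuration dict; `type` is the only required field."""
    if len(eval_type) < 1:
        raise ValueError("type must contain at least one character")
    return {"type": eval_type, **fields}


config = build_config(
    "custom",
    name="my-configuration",
    namespace="my-organization",
    description="Example evaluation configuration.",
    params={"parallelism": 2, "limit_samples": 10},
    tasks={
        "my-task": {
            "type": "my-task-type",  # placeholder task type
            "metrics": {"my-metric": {"type": "my-metric-type"}},  # placeholder
        }
    },
    groups={"my-group": {"tasks": ["my-task"]}},
    custom_fields={"team": "research"},  # free-form, user-defined
)

# JSON body to send when creating the configuration.
body = json.dumps(config, indent=2)
print(body)
```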