nat.plugins.eval.evaluator.atif_base_evaluator

Reusable ATIF-native evaluator base with concurrent orchestration.

Classes

AtifBaseEvaluator

Base class for ATIF-native custom evaluators.

Module Contents

class AtifBaseEvaluator(max_concurrency: int = 4)

Bases: abc.ABC

Base class for ATIF-native custom evaluators.

Implementers provide item-level scoring via evaluate_atif_item. This base handles bounded concurrency, gathers all items asynchronously, and computes EvalOutput.average_score from numeric per-item scores.
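The division of labor described above can be sketched without the NAT packages installed. The snippet below is an illustrative analogue, not the library's implementation: SketchItem, SketchOutput, SketchBaseEvaluator, and ExactMatch are hypothetical stand-ins for the real data models and base class, samples are plain dicts, and two behaviors are assumptions (non-numeric scores are skipped when averaging, and the average is None when no numeric score exists).

```python
import abc
import asyncio
from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical stand-ins for the real nat.plugins.eval data models.
@dataclass
class SketchItem:
    id: str
    score: Any  # may be numeric or not


@dataclass
class SketchOutput:
    average_score: Optional[float]
    items: list


class SketchBaseEvaluator(abc.ABC):
    """Illustrative analogue of AtifBaseEvaluator, not the real class."""

    def __init__(self, max_concurrency: int = 4):
        self.max_concurrency = max_concurrency
        self.semaphore = asyncio.Semaphore(max_concurrency)

    @abc.abstractmethod
    async def evaluate_item(self, sample: dict) -> SketchItem:
        """Score one sample and return a single output item."""

    async def evaluate(self, samples: list) -> SketchOutput:
        async def bounded(sample: dict) -> SketchItem:
            # At most max_concurrency item evaluations run at once.
            async with self.semaphore:
                return await self.evaluate_item(sample)

        # gather returns results in the same order as the inputs.
        items = list(await asyncio.gather(*(bounded(s) for s in samples)))

        # Assumption: only int/float scores (not bool) count toward the average.
        numeric = [
            item.score
            for item in items
            if isinstance(item.score, (int, float)) and not isinstance(item.score, bool)
        ]
        average = sum(numeric) / len(numeric) if numeric else None
        return SketchOutput(average_score=average, items=items)


class ExactMatch(SketchBaseEvaluator):
    """Hypothetical subclass: implementers only write item-level scoring."""

    async def evaluate_item(self, sample: dict) -> SketchItem:
        await asyncio.sleep(0)  # stand-in for real asynchronous work
        score = 1.0 if sample["output"] == sample["expected"] else 0.0
        return SketchItem(id=sample["id"], score=score)


async def main() -> SketchOutput:
    # Construct the evaluator inside the running loop so the semaphore
    # is bound to the same event loop on older Python versions.
    evaluator = ExactMatch(max_concurrency=2)
    samples = [
        {"id": "a", "output": "4", "expected": "4"},
        {"id": "b", "output": "5", "expected": "4"},
        {"id": "c", "output": "x", "expected": "x"},
        {"id": "d", "output": "y", "expected": "y"},
    ]
    return await evaluator.evaluate(samples)


result = asyncio.run(main())
print(result.average_score)  # 0.75 (three of four samples match)
```

In this sketch the subclass contains no orchestration code at all, which is the point of the base class: concurrency limits and aggregation live in one place, and each custom evaluator supplies only its scoring rule.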

max_concurrency = 4
semaphore
abstractmethod async evaluate_atif_item(
sample: nat.plugins.eval.evaluator.atif_evaluator.AtifEvalSample,
) -> nat.plugins.eval.data_models.evaluator_io.EvalOutputItem

Evaluate one ATIF sample and return a single output item.

async evaluate_atif_fn(
atif_samples: nat.plugins.eval.evaluator.atif_evaluator.AtifEvalSampleList,
) -> nat.plugins.eval.data_models.evaluator_io.EvalOutput

Evaluate ATIF samples concurrently with bounded concurrency.
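To make "bounded concurrency" concrete, the following standalone sketch uses only asyncio and records the peak number of evaluations in flight. It assumes the bound is enforced with an asyncio.Semaphore sized by max_concurrency, as the semaphore attribute suggests; run_bounded and score_one are hypothetical names, and the sleep stands in for real scoring work.

```python
import asyncio


async def run_bounded(num_items: int, max_concurrency: int) -> int:
    """Run num_items fake evaluations and return the peak number in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def score_one(index: int) -> float:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for an LLM call or other I/O
            in_flight -= 1
            return float(index)

    # All tasks are scheduled at once; the semaphore decides how many proceed.
    await asyncio.gather(*(score_one(i) for i in range(num_items)))
    return peak


peak = asyncio.run(run_bounded(num_items=10, max_concurrency=4))
print(peak)  # 4
```

Ten items are submitted together, yet no more than four are ever scored simultaneously, which keeps pressure on rate-limited backends predictable.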