nat.plugins.profiler.runtime_evaluator.atif_evaluate#
ATIF-native runtime evaluators for the profiler package.
Classes#

| AverageLLMLatencyAtifEvaluator | ATIF-native mean latency between LLM start and end for agent steps with metrics. |
| AverageWorkflowRuntimeAtifEvaluator | ATIF-native workflow runtime per item: max(step.timestamp) - min(step.timestamp) across all steps. |
| AverageNumberOfLLMCallsAtifEvaluator | ATIF-native count of LLM calls per item: agent steps with metrics. |
| AverageTokensPerLLMEndAtifEvaluator | ATIF-native average total tokens per LLM call: (prompt_tokens + completion_tokens) from step.metrics. |
Functions#

| _iso_to_epoch | Convert ISO 8601 timestamp to epoch seconds, or None if invalid. |
Module Contents#
- _iso_to_epoch(ts: str | None) → float | None#
Convert ISO 8601 timestamp to epoch seconds, or None if invalid.
- class AverageLLMLatencyAtifEvaluator(max_concurrency: int = 8)#
Bases: nat.plugins.eval.evaluator.atif_base_evaluator.AtifBaseEvaluator

ATIF-native mean latency between LLM start and end for agent steps with metrics.

Uses step.timestamp as end time and step.extra.get("span_event_timestamp") as start time. Steps without span_event_timestamp are skipped (see NEP-008 for ATIF profiling metadata).
- async evaluate_atif_item( ) → nat.plugins.eval.data_models.evaluator_io.EvalOutputItem#
Evaluate one ATIF sample and return a single output item.
- class AverageWorkflowRuntimeAtifEvaluator(max_concurrency: int = 8)#
Bases: nat.plugins.eval.evaluator.atif_base_evaluator.AtifBaseEvaluator

ATIF-native workflow runtime per item: max(step.timestamp) - min(step.timestamp) across all steps.
- async evaluate_atif_item( ) → nat.plugins.eval.data_models.evaluator_io.EvalOutputItem#
Evaluate one ATIF sample and return a single output item.
- class AverageNumberOfLLMCallsAtifEvaluator(max_concurrency: int = 8)#
Bases: nat.plugins.eval.evaluator.atif_base_evaluator.AtifBaseEvaluator

ATIF-native count of LLM calls per item: agent steps with metrics.
- async evaluate_atif_item( ) → nat.plugins.eval.data_models.evaluator_io.EvalOutputItem#
Evaluate one ATIF sample and return a single output item.
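Counting LLM calls as "steps with metrics" can be sketched in one line, again assuming plain step dicts rather than ATIF step objects:

```python
def llm_call_count(steps: list[dict]) -> int:
    """Count steps that carry a non-empty metrics payload (treated as LLM calls)."""
    return sum(1 for s in steps if s.get("metrics"))
```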
- class AverageTokensPerLLMEndAtifEvaluator(max_concurrency: int = 8)#
Bases: nat.plugins.eval.evaluator.atif_base_evaluator.AtifBaseEvaluator

ATIF-native average total tokens per LLM call: (prompt_tokens + completion_tokens) from step.metrics.
- async evaluate_atif_item( ) → nat.plugins.eval.data_models.evaluator_io.EvalOutputItem#
Evaluate one ATIF sample and return a single output item.