nat.plugins.eval.utils.weave_eval#
Attributes#
Classes#
WeaveEvaluationIntegration: Class to handle all Weave integration functionality.
Module Contents#
- logger#
- class WeaveEvaluationIntegration(
- eval_trace_context: nat.plugins.eval.utils.eval_trace_ctx.EvalTraceContext,
Class to handle all Weave integration functionality.
- available = False#
- client = None#
- eval_logger = None#
- pred_loggers#
- eval_trace_context#
- initialize_client()#
Initialize the Weave client if available.
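The "if available" guard can be pictured as an optional-dependency check. The following is a minimal sketch, not the module's actual implementation: the `OptionalClientHolder` class, its `package` argument, and the `client_factory` parameter are hypothetical, and only the `available` and `client` attribute names come from the listing above.

```python
import importlib.util
from typing import Any, Callable, Optional


class OptionalClientHolder:
    """Hypothetical stand-in mirroring the `available` / `client` attributes."""

    def __init__(self, package: str = "weave") -> None:
        self.package = package
        self.available = False
        self.client: Optional[Any] = None

    def initialize_client(self, client_factory: Callable[[], Any]) -> bool:
        """Create a client only when the optional package can be found."""
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(self.package) is None:
            self.available = False
            self.client = None
            return False
        # client_factory is an assumption: it stands in for whatever call
        # actually produces the client in the real integration.
        self.client = client_factory()
        self.available = self.client is not None
        return self.available
```

With this shape, callers can test `available` before every logging call and skip the work silently when the package is missing.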
- _get_prediction_inputs(item: nat.data_models.evaluator.EvalInputItem)#
Get the inputs for displaying in the UI. The following fields are excluded because they are too large to display in the UI: full_dataset_entry, expected_trajectory, and trajectory.
output_obj is also excluded because it is displayed separately.
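The exclusion rule described for `_get_prediction_inputs` amounts to filtering a mapping of fields. Below is a small sketch under stated assumptions: it operates on a plain `dict` rather than an `EvalInputItem`, the helper name `prediction_inputs` is hypothetical, and the `id` and `input_obj` keys in the example are illustrative rather than confirmed field names.

```python
from typing import Any

# Excluded field names come from the description above;
# output_obj is left out because it is displayed separately.
EXCLUDED_FIELDS = {
    "full_dataset_entry",
    "expected_trajectory",
    "trajectory",
    "output_obj",
}


def prediction_inputs(item_fields: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the item's fields without the excluded ones."""
    return {k: v for k, v in item_fields.items() if k not in EXCLUDED_FIELDS}


# Hypothetical item: only the excluded keys are taken from the docs.
example = {
    "id": 1,
    "input_obj": "What is 2 + 2?",
    "trajectory": ["step"],
    "output_obj": "4",
}
print(prediction_inputs(example))  # {'id': 1, 'input_obj': 'What is 2 + 2?'}
```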
- _get_weave_dataset(eval_input: nat.data_models.evaluator.EvalInput)#
Get the full dataset for Weave.
- initialize_logger(
- workflow_alias: str,
- eval_input: nat.data_models.evaluator.EvalInput,
- config: Any,
- job_id: str | None = None,
Initialize the Weave evaluation logger.
- log_prediction(
- item: nat.data_models.evaluator.EvalInputItem,
- output: Any,
Log a prediction to Weave.
- async log_usage_stats(
- item: nat.data_models.evaluator.EvalInputItem,
- usage_stats_item: nat.data_models.evaluate_runtime.UsageStatsItem,
Log usage stats to Weave.
- async alog_score(
- eval_output: nat.data_models.evaluator.EvalOutput,
- evaluator_name: str,
Log scores for evaluation outputs.
- async afinish_loggers()#
Finish all prediction loggers and wait for exports.
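"Finish all prediction loggers and wait for exports" suggests a fan-out over the entries in `pred_loggers`. The sketch below shows one way that could look with `asyncio.gather`. It is an assumption-laden illustration: `FakePredictionLogger`, its `finish` coroutine, and the `finish_all` helper are hypothetical, and the real prediction logger API may differ.

```python
import asyncio


class FakePredictionLogger:
    """Hypothetical logger; stands in for a real prediction logger."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.finished = False

    async def finish(self) -> None:
        await asyncio.sleep(0)  # placeholder for waiting on an export
        self.finished = True


async def finish_all(pred_loggers: dict[str, FakePredictionLogger]) -> int:
    """Finish every unfinished logger concurrently, then clear the registry."""
    pending = [lg for lg in pred_loggers.values() if not lg.finished]
    await asyncio.gather(*(lg.finish() for lg in pending))
    pred_loggers.clear()
    return len(pending)
```

Running the finish calls concurrently keeps the total wait close to the slowest export instead of the sum of all of them.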
- _log_profiler_metrics(
- profiler_results: nat.data_models.evaluate_runtime.ProfilerResults,
- usage_stats: nat.data_models.evaluate_runtime.UsageStats,
Log profiler metrics to Weave.
- log_summary(
- usage_stats: nat.data_models.evaluate_runtime.UsageStats,
- evaluation_results: list[tuple[str, nat.data_models.evaluator.EvalOutput]],
- profiler_results: nat.data_models.evaluate_runtime.ProfilerResults,
Log summary statistics to Weave.