nat.plugins.eval.red_teaming_evaluator.data_models#
Data models for red teaming evaluation output.
Classes#
ConditionEvalOutputItem | Evaluation results for a single IntermediateStep that meets the filtering condition. |
RedTeamingEvalOutputItem | Extended evaluation output item for red teaming evaluations. |
Module Contents#
- class ConditionEvalOutputItem#
Bases: nat.data_models.evaluator.EvalOutputItem
Evaluation results for a single IntermediateStep that meets the filtering condition.
- Attributes:
id: Identifier from the input item.
score: Average score across all filter conditions.
reasoning: Reasoning for the given score.
intermediate_step: IntermediateStep selected and evaluated via the reduction strategy.
error_message: Error message if any step of the evaluation has failed.
- intermediate_step: nat.data_models.intermediate_step.IntermediateStep | None = None#
- classmethod empty(id: str, error: str | None = None) → ConditionEvalOutputItem#
Create an empty ConditionEvalOutputItem.
- Returns:
Empty ConditionEvalOutputItem instance
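As a rough illustration of the factory pattern described above, the following sketch uses a stand-in dataclass rather than the real `nat` class. The name `ConditionEvalOutputItemSketch` is hypothetical, and the defaults chosen by `empty` here (a score of `0.0` and an empty `reasoning` string) are assumptions made for the example; the source only states that the method returns an empty instance.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class ConditionEvalOutputItemSketch:
    """Stand-in for ConditionEvalOutputItem; not the library class."""

    id: str
    score: float
    reasoning: str
    intermediate_step: Any | None = None  # the real field holds an IntermediateStep
    error_message: str | None = None

    @classmethod
    def empty(cls, id: str, error: str | None = None) -> ConditionEvalOutputItemSketch:
        # Assumed defaults: zero score, no reasoning, no selected step.
        return cls(id=id, score=0.0, reasoning="", intermediate_step=None, error_message=error)


# Record a failed evaluation without a selected intermediate step.
failed = ConditionEvalOutputItemSketch.empty("item-1", error="no step matched the filter")
```

A factory like this may be convenient when a filter condition matches no step or the evaluation fails, since callers still receive an item of the expected type with the error recorded.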
- class RedTeamingEvalOutputItem#
Bases: nat.data_models.evaluator.EvalOutputItem
Extended evaluation output item for red teaming evaluations.
Organizes results by filter condition name, with each condition containing its score, the evaluated output, and the single intermediate step that was selected.
- Attributes:
id: Identifier from the input item.
score: Average score across all filter conditions.
reasoning: Summary information for compatibility.
results_by_condition: Map from condition name to evaluation results.
- results_by_condition: dict[str, ConditionEvalOutputItem] = None#
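The relationship between the top-level `score` and `results_by_condition` can be sketched as follows. This is a minimal illustration built on stand-in dataclasses, not the library's implementation: the class names, the `summarize` helper, the condition names, and the choice of `0.0` for an empty mapping are all assumptions introduced for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import fmean


@dataclass
class ConditionResultSketch:
    """Stand-in for ConditionEvalOutputItem (only the fields used here)."""

    id: str
    score: float
    reasoning: str


@dataclass
class RedTeamingResultSketch:
    """Stand-in for RedTeamingEvalOutputItem; not the library class."""

    id: str
    score: float
    reasoning: str
    results_by_condition: dict[str, ConditionResultSketch] = field(default_factory=dict)


def summarize(item_id: str, results: dict[str, ConditionResultSketch]) -> RedTeamingResultSketch:
    # Average the per-condition scores; an empty mapping yields 0.0 (an assumption).
    average = fmean(r.score for r in results.values()) if results else 0.0
    return RedTeamingResultSketch(
        id=item_id,
        score=average,
        reasoning=f"Evaluated {len(results)} condition(s)",
        results_by_condition=results,
    )


# Hypothetical condition names, for illustration only.
results = {
    "tool_output": ConditionResultSketch("item-1", 0.5, "example reasoning A"),
    "final_answer": ConditionResultSketch("item-1", 1.0, "example reasoning B"),
}
summary = summarize("item-1", results)
```

Keeping the per-condition items in a dictionary keyed by condition name lets a consumer look up one condition's result directly while still reading a single aggregate score from the top-level item.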