nat.eval.runners.red_teaming_runner.config
Red teaming runner configuration models.
This module provides configuration models for red teaming evaluation workflows. The RedTeamingRunnerConfig encapsulates all settings needed to run red teaming evaluations across multiple scenarios without requiring modifications to the base workflow.
Classes
RedTeamingScenario | A single red teaming scenario configuration.
RedTeamingRunnerConfig | Top-level configuration for red teaming evaluation.
Module Contents
- class RedTeamingScenario(/, **data: Any)
Bases: pydantic.BaseModel
A single red teaming scenario configuration.
Each scenario defines a complete middleware and evaluator configuration. The evaluator can use _extends to inherit from evaluator_defaults.
- Attributes:
  - scenario_id: Optional unique identifier. If not provided, the dict key from RedTeamingRunnerConfig.scenarios is used.
  - middleware: Full middleware configuration to apply. Set to None for baseline scenarios (no middleware modification).
  - evaluator: Complete evaluator configuration. Can inherit from evaluator_defaults using _extends in YAML/JSON.
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- middleware: nat.middleware.red_teaming.red_teaming_middleware_config.RedTeamingMiddlewareConfig | None = None
- evaluator: nat.eval.red_teaming_evaluator.register.RedTeamingEvaluatorConfig = None
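The scenario_id fallback described in the attributes above can be pictured with a small standard-library sketch. This is an illustration only, not the library's code: `Scenario` and `assign_ids` are hypothetical names, and only the scenario_id field is modeled.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Scenario:
    """Hypothetical stand-in for RedTeamingScenario (only scenario_id is modeled)."""
    scenario_id: Optional[str] = None


def assign_ids(scenarios: dict[str, Scenario]) -> dict[str, Scenario]:
    """Use the dict key as scenario_id wherever none was provided."""
    return {
        key: s if s.scenario_id is not None else replace(s, scenario_id=key)
        for key, s in scenarios.items()
    }


resolved_ids = assign_ids({
    "baseline": Scenario(),                        # no id: falls back to the key
    "attack": Scenario(scenario_id="custom-id"),   # explicit id is kept
})
```

An explicit scenario_id always wins in this sketch; the key is only a fallback.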
- class RedTeamingRunnerConfig(/, **data: Any)
Bases: pydantic.BaseModel
Top-level configuration for red teaming evaluation.
Supports evaluator_defaults for DRY configuration with _extends inheritance.
- Attributes:
  - base_workflow: Optional path to the base workflow configuration file.
  - llms: Dictionary of LLM configurations keyed by name.
  - evaluator_defaults: Named evaluator configs that scenarios can extend.
  - general: General evaluation settings (concurrency, output, dataset).
  - scenarios: Dictionary of scenario configurations.
Example YAML configuration:

    base_workflow: ./configs/base_workflow.yml

    llms:
      judge_llm:
        _type: nim
        model_name: meta/llama-3.1-70b-instruct

    evaluator_defaults:
      standard_eval:
        llm_name: judge_llm
        _type: red_teaming_evaluator
        judge_llm_prompt: >
          Evaluate whether the attack was successful...
        filter_conditions:
          - name: workflow_output
            event_type: FUNCTION_END
            payload_name: <workflow>
        reduction_strategy: last

    general:
      max_concurrency: 4
      output_dir: ./.tmp/nat/redteaming/

    scenarios:
      intercept_payload_42:
        middleware:
          _type: red_teaming
          target_function_or_group: my_calculator
          attack_payload: "42.0"
        evaluator:
          _extends: standard_eval
          scenario_specific_instructions: "Check for 42.0..."

      custom_scenario:
        tags: [category_1, category_2]
        middleware: {...}
        evaluator:
          llm_name: judge_llm
          _type: red_teaming_evaluator
          judge_llm_prompt: "Custom prompt..."
          filter_conditions: [...]
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- base_workflow: pathlib.Path | None = None
- llms: dict[str, nat.data_models.llm.LLMBaseConfig] = None
- evaluator_defaults: dict[str, nat.eval.red_teaming_evaluator.register.RedTeamingEvaluatorConfig] | None = None
- general: nat.data_models.evaluate.EvalGeneralConfig | None = None
- scenarios: dict[str, RedTeamingScenario | _RedTeamingScenarioRaw] = None
- validate_and_resolve_scenarios() → RedTeamingRunnerConfig
Validate scenarios and resolve _extends inheritance.
This runs after Pydantic parsing, so evaluator_defaults are already validated RedTeamingEvaluatorConfig objects. We convert any _RedTeamingScenarioRaw to RedTeamingScenario by resolving _extends.
- Returns:
The validated configuration, with every scenario converted to a RedTeamingScenario.
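As a rough illustration of what resolving _extends involves, the sketch below merges a raw evaluator mapping onto the named default using plain dicts. It is a sketch under assumptions, not the method's actual implementation: `resolve_extends` is a hypothetical helper, and the shallow "scenario keys override inherited keys" merge rule is assumed rather than documented here.

```python
from copy import deepcopy
from typing import Any


def resolve_extends(
    evaluator: dict[str, Any],
    defaults: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Hypothetical helper: merge a raw evaluator dict onto the default named by `_extends`."""
    evaluator = dict(evaluator)  # work on a copy; leave the caller's dict untouched
    parent_name = evaluator.pop("_extends", None)
    if parent_name is None:
        return evaluator  # nothing to inherit
    if parent_name not in defaults:
        raise ValueError(f"_extends refers to unknown evaluator default: {parent_name!r}")
    merged = deepcopy(defaults[parent_name])
    merged.update(evaluator)  # assumed rule: scenario-level keys override inherited ones
    return merged


evaluator_defaults = {
    "standard_eval": {"llm_name": "judge_llm", "_type": "red_teaming_evaluator"},
}
resolved_evaluator = resolve_extends(
    {"_extends": "standard_eval", "scenario_specific_instructions": "Check for 42.0..."},
    evaluator_defaults,
)
```

The resolved mapping no longer carries the _extends key, which mirrors the idea that a fully resolved scenario holds a complete evaluator configuration.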
- classmethod rebuild_annotations() → bool
Rebuild field annotations with discriminated unions.
This method updates the llms dict value annotation to use a discriminated union of all registered LLM providers. This allows Pydantic to correctly deserialize the _type field into the appropriate concrete LLM config class.
- Returns:
True if the model was rebuilt, False otherwise.
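The idea behind a discriminated union is that a tag field selects the concrete class to build. The standard-library sketch below shows that dispatch on _type with a simple registry. It is only an analogy: the real method relies on Pydantic's discriminated unions, and `NimConfig`, `OtherConfig`, `REGISTRY`, and `parse_llm` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class NimConfig:
    """Hypothetical stand-in for one registered LLM provider config."""
    model_name: str


@dataclass
class OtherConfig:
    """Hypothetical second provider, present only to make the dispatch meaningful."""
    model_name: str
    endpoint: str = ""


# Hypothetical registry mapping each `_type` tag to its concrete config class.
REGISTRY: dict[str, Callable[..., Any]] = {"nim": NimConfig, "other": OtherConfig}


def parse_llm(raw: dict[str, Any]) -> Any:
    """Pick the concrete class from the `_type` tag and build it from the remaining keys."""
    fields = dict(raw)
    tag = fields.pop("_type")
    try:
        cls = REGISTRY[tag]
    except KeyError:
        raise ValueError(f"unknown _type: {tag!r}") from None
    return cls(**fields)


llm = parse_llm({"_type": "nim", "model_name": "meta/llama-3.1-70b-instruct"})
```

Because the set of registered providers can grow after the model class is defined, a rebuild step like the one described above is what lets the annotation pick up newly registered types.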