nat.middleware.defense.defense_middleware_data_models#

Data models for defense middleware output.

Classes#

PIIAnalysisResult: Result of PII analysis using Presidio.

GuardResponseResult: Result of parsing guard model response.

ContentAnalysisResult: Result of content safety analysis with guard models.

OutputVerificationResult: Result of output verification using LLM.
Module Contents#

class PIIAnalysisResult(/, **data: Any)#

Bases: pydantic.BaseModel

Result of PII analysis using Presidio.

Attributes:

pii_detected: Whether PII was detected in the analyzed text.
entities: Dictionary mapping entity types to lists of detection metadata (score, start, end).
anonymized_text: Text with PII replaced by entity type placeholders (e.g., <EMAIL_ADDRESS>).
original_text: The unmodified original text that was analyzed.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

pii_detected: bool#
entities: dict[str, list[dict[str, Any]]]#
anonymized_text: str#
original_text: str#
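A minimal usage sketch, assuming the import path matches the module name above; the entity metadata and sample text are purely illustrative:

from nat.middleware.defense.defense_middleware_data_models import PIIAnalysisResult

# Hypothetical detection result: one email address found and anonymized.
result = PIIAnalysisResult(
    pii_detected=True,
    entities={"EMAIL_ADDRESS": [{"score": 0.99, "start": 14, "end": 34}]},
    anonymized_text="Contact me at <EMAIL_ADDRESS>",
    original_text="Contact me at jane.doe@example.com",
)

if result.pii_detected:
    print(result.anonymized_text)  # safe to log or forward downstream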
class GuardResponseResult(/, **data: Any)#

Bases: pydantic.BaseModel

Result of parsing guard model response.

Attributes:

is_safe: Whether the content is classified as safe by the guard model.
categories: List of unsafe content categories detected (empty if safe).
raw_response: The unprocessed response text from the guard model.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

is_safe: bool#
categories: list[str]#
raw_response: str#
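For illustration, a parsed guard verdict might be represented as below; the category label and raw response text are assumptions, not the actual guard model output format:

from nat.middleware.defense.defense_middleware_data_models import GuardResponseResult

# Hypothetical parse of an "unsafe" verdict; the label and raw text are made up.
parsed = GuardResponseResult(
    is_safe=False,
    categories=["S1"],
    raw_response="unsafe\nS1",
)

if not parsed.is_safe:
    print(f"Blocked categories: {parsed.categories}")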
class ContentAnalysisResult(/, **data: Any)#

Bases: pydantic.BaseModel

Result of content safety analysis with guard models.

Attributes:

is_safe: Whether the content is classified as safe by the guard model.
categories: List of unsafe content categories detected (empty if safe).
raw_response: The unprocessed response text from the guard model.
should_refuse: Whether the content should be refused based on the analysis.
error: Whether an error occurred during analysis.
error_message: Error message if an error occurred, otherwise None.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

is_safe: bool#
categories: list[str]#
raw_response: str#
should_refuse: bool#
error: bool = False#
error_message: str | None = None#
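A sketch of two possible analysis results, one relying on the default error fields and one reporting a failure; category names and messages are illustrative only:

from nat.middleware.defense.defense_middleware_data_models import ContentAnalysisResult

# Unsafe content: refusal recommended. error/error_message keep their defaults.
analysis = ContentAnalysisResult(
    is_safe=False,
    categories=["S1"],
    raw_response="unsafe\nS1",
    should_refuse=True,
)

# Analysis failure: mark the error explicitly so callers can fall back safely.
failed = ContentAnalysisResult(
    is_safe=True,
    categories=[],
    raw_response="",
    should_refuse=False,
    error=True,
    error_message="guard model request timed out",
)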
class OutputVerificationResult(/, **data: Any)#

Bases: pydantic.BaseModel

Result of output verification using LLM.

Attributes:

threat_detected: Whether a threat (incorrect or manipulated output) was detected.
confidence: Confidence score (0.0-1.0) for the threat detection.
reason: Explanation for the detection result.
correct_answer: The correct output value if a threat was detected, otherwise None.
content_type: Type of content analyzed ('input' or 'output').
should_refuse: Whether the content should be refused based on the configured threshold.
error: Whether an error occurred during verification.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

threat_detected: bool#
confidence: float#
reason: str#
correct_answer: Any | None#
content_type: str#
should_refuse: bool#
error: bool = False#
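A minimal sketch of a verification result flagging a manipulated output; the confidence, reason, and answer values are hypothetical:

from nat.middleware.defense.defense_middleware_data_models import OutputVerificationResult

# Hypothetical verdict: the LLM verifier judged the output to be manipulated.
verification = OutputVerificationResult(
    threat_detected=True,
    confidence=0.87,
    reason="The reported total does not match the retrieved figures.",
    correct_answer="42",
    content_type="output",
    should_refuse=True,  # confidence exceeds the configured threshold
)

if verification.should_refuse:
    print(f"Refusing output: {verification.reason}")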