nat.middleware.defense.defense_middleware_output_verifier#

Output Verifier Defense Middleware.

This middleware uses an LLM to verify function outputs for correctness and security. It can detect incorrect results and malicious content, and it can apply corrections automatically.

Attributes#

logger

Classes#

OutputVerifierMiddlewareConfig

Configuration for Output Verifier middleware.

OutputVerifierMiddleware

Verification middleware using an LLM for correctness and security.

Module Contents#

logger#
class OutputVerifierMiddlewareConfig(/, **data: Any)#

Bases: nat.middleware.defense.defense_middleware.DefenseMiddlewareConfig

Configuration for Output Verifier middleware.

This middleware analyzes function outputs using an LLM to verify correctness, detect security threats, and provide corrections when needed.

Actions:

  • 'partial_compliance': Detect and log threats, but allow content to pass through

  • 'refusal': Block output if threat detected (hard stop)

  • 'redirection': Replace incorrect output with correct answer from LLM

Note: Only output analysis is currently supported (target_location='output').

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

llm_name: str = None#
threshold: float = None#
tool_description: str | None = None#
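A minimal construction sketch, assuming the documented fields above; the values are illustrative, and the action selecting among the behaviors listed earlier is assumed to live on the DefenseMiddlewareConfig base class (not shown on this page):

    from nat.middleware.defense.defense_middleware_output_verifier import (
        OutputVerifierMiddlewareConfig,
    )

    # Illustrative values only; `threshold` as a confidence cutoff is an
    # assumption about its semantics.
    config = OutputVerifierMiddlewareConfig(
        llm_name="verifier_llm",  # name of an LLM registered with the builder
        threshold=0.8,            # assumed: minimum confidence to flag a threat
        tool_description="Adds two integers and returns their sum.",
    )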
class OutputVerifierMiddleware(
config: OutputVerifierMiddlewareConfig,
builder,
)#

Bases: nat.middleware.defense.defense_middleware.DefenseMiddleware

Verification middleware using an LLM for correctness and security.

This middleware uses NAT’s LLM system to verify function outputs for:

  • Correctness and reasonableness

  • Security validation (detecting malicious content and manipulated values)

  • Providing automatic corrections when errors are detected

Only output analysis is currently supported (target_location='output').

Streaming Behavior:

For 'refusal' and 'redirection' actions, chunks are buffered and checked before yielding to prevent incorrect content from being streamed to clients. For the 'partial_compliance' action, chunks are yielded immediately; violations are logged but content passes through.
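A minimal sketch of this buffering strategy, assuming string chunks and a hypothetical verify callable; this is not the library's implementation:

    from collections.abc import AsyncIterator, Awaitable, Callable

    async def stream_with_verification(
        chunks: AsyncIterator[str],
        action: str,
        verify: Callable[[str], Awaitable[tuple[bool, str | None]]],
    ) -> AsyncIterator[str]:
        if action in ("refusal", "redirection"):
            # Buffer everything so nothing reaches the client unverified.
            buffered = [chunk async for chunk in chunks]
            is_threat, correction = await verify("".join(buffered))
            if is_threat and action == "refusal":
                raise RuntimeError("Output blocked by verifier")  # hard stop
            if is_threat and action == "redirection" and correction is not None:
                yield correction  # emit the corrected content instead
                return
            for chunk in buffered:
                yield chunk
        else:
            # 'partial_compliance': yield immediately, verify afterwards.
            collected: list[str] = []
            async for chunk in chunks:
                collected.append(chunk)
                yield chunk
            is_threat, _ = await verify("".join(collected))
            if is_threat:
                print("verifier: violation logged; content already streamed")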

Initialize output verifier middleware.

Args:

config: Configuration for output verifier middleware.
builder: Builder instance for loading LLMs.

config: OutputVerifierMiddlewareConfig#
_llm = None#
async _get_llm()#

Lazy load the LLM when first needed.
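The lazy-loading pattern this describes, as a generic sketch; the actual builder call used to load the LLM is not shown on this page:

    from collections.abc import Awaitable, Callable
    from typing import Any

    class LazyLLM:
        """Load an expensive client on first use, then cache it."""

        def __init__(self, loader: Callable[[], Awaitable[Any]]) -> None:
            self._llm: Any = None
            self._loader = loader

        async def get(self) -> Any:
            if self._llm is None:      # only the first call pays the load cost
                self._llm = await self._loader()
            return self._llm           # later calls reuse the cached client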

_extract_json_from_response(response_text: str) → str#

Extract JSON from LLM response, handling markdown code blocks.

Args:

response_text: Raw response from LLM

Returns:

Extracted JSON string
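One plausible implementation of this step, as a sketch; the exact fence handling in the library is an assumption:

    import re

    def extract_json_from_response(response_text: str) -> str:
        # Prefer the body of a ```json ... ``` (or bare ```) fence if present;
        # otherwise treat the whole reply as the JSON payload.
        match = re.search(r"```(?:json)?\s*(.*?)\s*```", response_text, re.DOTALL)
        return match.group(1).strip() if match else response_text.strip()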

async _analyze_content(
content: Any,
content_type: str,
inputs: Any = None,
function_name: str | None = None,
) → nat.middleware.defense.defense_middleware_data_models.OutputVerificationResult#

Check content for threats using the configured LLM.

Args:

content: The content to analyze.
content_type: Either 'input' or 'output' (for logging only).
inputs: Optional function inputs for context (helps the LLM calculate correct answers).
function_name: Name of the function being verified (for context).

Returns:

OutputVerificationResult with threat detection info and should_refuse flag.
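An illustrative sketch of this analysis flow; the prompt wording and the reason/corrected_output fields are assumptions (only the should_refuse flag is documented here), and extract_json_from_response mirrors the sketch earlier on this page:

    import json
    from typing import Any

    async def analyze_content(llm_call, content: Any, inputs: Any, function_name: str) -> dict:
        prompt = (
            f"Function: {function_name}\n"
            f"Inputs: {inputs!r}\n"
            f"Output: {content!r}\n"
            'Reply with JSON: {"should_refuse": bool, "reason": str, '
            '"corrected_output": str or null}'
        )
        raw = await llm_call(prompt)  # the LLM's raw text reply
        verdict = json.loads(extract_json_from_response(raw))
        return verdict  # dict stand-in for OutputVerificationResult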

async _handle_threat(
content: Any,
analysis_result: nat.middleware.defense.defense_middleware_data_models.OutputVerificationResult,
context: nat.middleware.middleware.FunctionMiddlewareContext,
) → Any#

Handle detected threat based on configured action.

Args:

content: The threatening content.
analysis_result: Detection result from LLM.
context: Function context.

Returns:

Handled content (blocked, sanitized/corrected, or original)
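A sketch of dispatching on the three configured actions described earlier; the exception type and the verdict fields are assumptions:

    from typing import Any

    async def handle_threat(content: Any, verdict: dict, action: str) -> Any:
        if action == "refusal":
            # Hard stop: nothing is returned to the caller.
            raise RuntimeError(f"Blocked by output verifier: {verdict['reason']}")
        if action == "redirection":
            # Replace the incorrect output with the LLM's correction.
            return verdict.get("corrected_output", content)
        # 'partial_compliance': record the violation, pass content through.
        print(f"verifier warning: {verdict['reason']}")
        return content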

async _process_output_verification(
value: Any,
location: str,
context: nat.middleware.middleware.FunctionMiddlewareContext,
inputs: Any = None,
) → Any#

Process output verification and handling for a given value.

This is a common helper method that handles:

  • Field extraction (if target_field is specified)

  • Output verification analysis

  • Threat handling (refusal, redirection, partial_compliance)

  • Applying the corrected value back to the original structure

Args:

value: The value to analyze (input or output).
location: Either "input" or "output" (for logging).
context: Function context metadata.
inputs: Original function inputs (for analysis context).

Returns:

The value after output verification handling; it may be unchanged or corrected, or an exception may be raised (e.g., on refusal).
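A sketch of the field extraction and write-back steps for dict-shaped values; target_field is the config option named above, and the dotted-path convention is an assumption:

    from typing import Any

    def extract_field(value: dict[str, Any], target_field: str) -> Any:
        node: Any = value
        for part in target_field.split("."):  # e.g. "result.answer"
            node = node[part]
        return node

    def apply_field(value: dict[str, Any], target_field: str, corrected: Any) -> None:
        parts = target_field.split(".")
        node: Any = value
        for part in parts[:-1]:
            node = node[part]
        node[parts[-1]] = corrected  # write the corrected value back in place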

async function_middleware_invoke(
*args: Any,
call_next: nat.middleware.function_middleware.CallNext,
context: nat.middleware.middleware.FunctionMiddlewareContext,
**kwargs: Any,
) → Any#

Apply output verifier to function invocation.

Analyzes function outputs for correctness and security, with auto-correction.

Args:

args: Positional arguments passed to the function (the first is typically the input value).
call_next: Next middleware/function to call.
context: Function metadata.
kwargs: Keyword arguments passed to the function.

Returns:

Function output (potentially corrected, blocked, or sanitized).
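The invocation flow this describes, reduced to a sketch; helper names mirror the private methods above but are illustrative:

    from collections.abc import Awaitable, Callable
    from typing import Any

    async def verify_invoke(
        value: Any,
        call_next: Callable[[Any], Awaitable[Any]],
        analyze: Callable[..., Awaitable[Any]],
        handle_threat: Callable[..., Awaitable[Any]],
    ) -> Any:
        output = await call_next(value)  # run the wrapped function first
        verdict = await analyze(output, "output", inputs=value)
        if verdict.should_refuse:        # flag documented on OutputVerificationResult
            return await handle_threat(output, verdict)
        return output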

async function_middleware_stream(
*args: Any,
call_next: nat.middleware.function_middleware.CallNextStream,
context: nat.middleware.middleware.FunctionMiddlewareContext,
**kwargs: Any,
) → collections.abc.AsyncIterator[Any]#

Apply output verifier to streaming function.

For ‘refusal’ and ‘redirection’ actions: Chunks are buffered and checked before yielding. For ‘partial_compliance’ action: Chunks are yielded immediately; violations are logged.

Args:

args: Positional arguments passed to the function (the first is typically the input value).
call_next: Next middleware/function to call.
context: Function metadata.
kwargs: Keyword arguments passed to the function.

Yields:

Function output chunks (potentially corrected, blocked, or sanitized).