nat.middleware.defense.defense_middleware_output_verifier#
Output Verifier Defense Middleware.
This middleware uses an LLM to verify function outputs for correctness and security. It can detect incorrect results and malicious content, and apply corrections automatically.
Attributes#
| `logger` | |
Classes#
| `OutputVerifierMiddlewareConfig` | Configuration for Output Verifier middleware. |
| `OutputVerifierMiddleware` | Verification middleware using an LLM for correctness and security. |
Module Contents#
- logger#
- class OutputVerifierMiddlewareConfig(/, **data: Any)#
Bases: `nat.middleware.defense.defense_middleware.DefenseMiddlewareConfig`
Configuration for Output Verifier middleware.
This middleware analyzes function outputs using an LLM to verify correctness, detect security threats, and provide corrections when needed.
Actions:
- `partial_compliance`: Detect and log threats, but allow content to pass through
- `refusal`: Block output if a threat is detected (hard stop)
- `redirection`: Replace incorrect output with the correct answer from the LLM
Note: Only output analysis is currently supported (`target_location='output'`).
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
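As a quick illustration, a config might be constructed along these lines. This is a minimal sketch: only `target_location` is confirmed by this page; `action` and `llm_name` are assumed field names, so check `DefenseMiddlewareConfig` for the real ones.

```python
# Hypothetical sketch: field names `action` and `llm_name` are assumptions.
from nat.middleware.defense.defense_middleware_output_verifier import (
    OutputVerifierMiddlewareConfig,
)

config = OutputVerifierMiddlewareConfig(
    action="redirection",      # assumed field; value taken from the Actions list
    llm_name="verifier_llm",   # assumed field: which LLM performs verification
    target_location="output",  # documented: only 'output' is supported
)
```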
- class OutputVerifierMiddleware(
- config: OutputVerifierMiddlewareConfig,
- builder,
- )#
Bases: `nat.middleware.defense.defense_middleware.DefenseMiddleware`
Verification middleware using an LLM for correctness and security.
This middleware uses NAT’s LLM system to verify function outputs for:
Correctness and reasonableness
Security validation (detecting malicious content and manipulated values)
Providing automatic corrections when errors are detected
Only output analysis is currently supported (`target_location='output'`).
- Streaming Behavior:
For the `refusal` and `redirection` actions, chunks are buffered and checked before yielding, so incorrect content is never streamed to clients. For the `partial_compliance` action, chunks are yielded immediately; violations are logged but content passes through.
Initialize output verifier middleware.
- Args:
config: Configuration for the output verifier middleware.
builder: Builder instance for loading LLMs.
- config: OutputVerifierMiddlewareConfig#
- _llm = None#
- async _get_llm()#
Lazy load the LLM when first needed.
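The pattern behind this method is plain lazy initialization, sketched below. This is a generic sketch: `get_llm` and `llm_name` are assumed Builder/config names, not confirmed by this page.

```python
class LazyLLMHolder:
    """Generic sketch of lazy LLM loading; not the module's actual code."""

    def __init__(self, builder, config):
        self.builder = builder
        self.config = config
        self._llm = None  # mirrors the _llm = None attribute above

    async def _get_llm(self):
        # Construct the LLM only on first use, then reuse the cached instance.
        if self._llm is None:
            self._llm = await self.builder.get_llm(self.config.llm_name)  # assumed API
        return self._llm
```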
- _extract_json_from_response(response_text: str) → str#
Extract JSON from LLM response, handling markdown code blocks.
- Args:
response_text: Raw response from LLM
- Returns:
Extracted JSON string
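A plausible shape for this helper, as a sketch: this is a generic pattern for stripping markdown fences, not the module's verified implementation.

```python
import re

def extract_json_from_response(response_text: str) -> str:
    """Pull a JSON payload out of an LLM reply, tolerating markdown code fences."""
    # Prefer the contents of a fenced block (three backticks, optionally
    # tagged "json") if one is present.
    match = re.search(r"`{3}(?:json)?\s*(.*?)\s*`{3}", response_text, re.DOTALL)
    if match:
        return match.group(1)
    # Otherwise fall back to the text between the outermost braces.
    start, end = response_text.find("{"), response_text.rfind("}")
    if start != -1 and end > start:
        return response_text[start:end + 1]
    return response_text.strip()
```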
- async _analyze_content(...) → nat.middleware.defense.defense_middleware_data_models.OutputVerificationResult#
Check content for threats using the configured LLM.
- Args:
content: The content to analyze.
content_type: Either 'input' or 'output' (for logging only).
inputs: Optional function inputs for context (helps the LLM calculate correct answers).
function_name: Name of the function being verified (for context).
- Returns:
OutputVerificationResult with threat detection info and should_refuse flag.
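For orientation, the result can be pictured as follows. Only `should_refuse` is documented here, so the remaining fields are assumptions inferred from the behavior described on this page.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VerificationResultSketch:
    """Illustrative stand-in for OutputVerificationResult (not the real model)."""
    should_refuse: bool                     # documented: drives the 'refusal' hard stop
    threat_detected: bool = False           # assumed: whether the LLM flagged the content
    corrected_output: Optional[str] = None  # assumed: replacement used by 'redirection'
    reasoning: str = ""                     # assumed: the LLM's explanation, for logging
```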
- async _handle_threat(
- content: Any,
- analysis_result: nat.middleware.defense.defense_middleware_data_models.OutputVerificationResult,
- context: nat.middleware.middleware.FunctionMiddlewareContext,
- )#
Handle detected threat based on configured action.
- Args:
content: The threatening content.
analysis_result: Detection result from the LLM.
context: Function context.
- Returns:
Handled content (blocked, sanitized/corrected, or original)
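The dispatch implied by the three actions might look like this; a sketch reusing the assumed result fields from above, with names that are illustrative rather than the module's code.

```python
import logging

logger = logging.getLogger(__name__)

async def handle_threat_sketch(content, result, action: str):
    # 'partial_compliance': log the finding but let the content through.
    if action == "partial_compliance":
        logger.warning("Verifier flagged output: %s", result.reasoning)
        return content
    # 'refusal': hard stop; the output never reaches the caller.
    if action == "refusal" and result.should_refuse:
        raise RuntimeError("Output blocked by verifier")
    # 'redirection': substitute the LLM's corrected answer.
    if action == "redirection" and result.corrected_output is not None:
        return result.corrected_output
    return content
```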
- async _process_output_verification(
- value: Any,
- location: str,
- context: nat.middleware.middleware.FunctionMiddlewareContext,
- inputs: Any = None,
- )#
Process output verification and handling for a given value.
This is a common helper method that handles:
- Field extraction (if target_field is specified)
- Output verification analysis
- Threat handling (refusal, redirection, partial_compliance)
- Applying the corrected value back to the original structure
- Args:
value: The value to analyze (input or output).
location: Either "input" or "output" (for logging).
context: Function context metadata.
inputs: Original function inputs (for analysis context).
- Returns:
The value after output verification handling (may be unchanged or corrected; a refusal raises an exception instead).
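The field extraction and write-back steps listed above could be sketched in isolation like this. The dict/attribute handling is an assumption; the real target_field semantics may differ.

```python
from typing import Any, Optional

def extract_field(value: Any, target_field: Optional[str]):
    """Pull the configured field out of a structured output, if any."""
    if target_field is None:
        return value
    if isinstance(value, dict):
        return value.get(target_field, value)
    return getattr(value, target_field, value)

def apply_corrected(value: Any, target_field: Optional[str], corrected: Any):
    """Write a corrected field back into the original structure."""
    if target_field is None:
        return corrected
    if isinstance(value, dict):
        return {**value, target_field: corrected}
    setattr(value, target_field, corrected)
    return value
```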
- async function_middleware_invoke(
- *args: Any,
- call_next: nat.middleware.function_middleware.CallNext,
- context: nat.middleware.middleware.FunctionMiddlewareContext,
- **kwargs: Any,
- )#
Apply output verifier to function invocation.
Analyzes function outputs for correctness and security, with auto-correction.
- Args:
args: Positional arguments passed to the function (the first is typically the input value).
call_next: Next middleware/function to call.
context: Function metadata.
kwargs: Keyword arguments passed to the function.
- Returns:
Function output (potentially corrected, blocked, or sanitized).
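Reduced to its essentials, the invoke path is "call the function, then verify the result". A simplified sketch of that flow, with `verify_output` standing in for `_process_output_verification` above:

```python
async def invoke_with_verification(*args, call_next, context, verify_output, **kwargs):
    # Run the wrapped function (or the next middleware in the chain).
    output = await call_next(*args, **kwargs)
    # Forward the original input (typically the first positional argument)
    # so the verifier LLM can judge correctness in context.
    inputs = args[0] if args else None
    return await verify_output(output, "output", context, inputs)
```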
- async function_middleware_stream(
- *args: Any,
- call_next: nat.middleware.function_middleware.CallNextStream,
- context: nat.middleware.middleware.FunctionMiddlewareContext,
- **kwargs: Any,
- )#
Apply output verifier to streaming function.
For the `refusal` and `redirection` actions, chunks are buffered and checked before yielding. For the `partial_compliance` action, chunks are yielded immediately and violations are logged.
- Args:
args: Positional arguments passed to the function (the first is typically the input value).
call_next: Next middleware/function to call.
context: Function metadata.
kwargs: Keyword arguments passed to the function.
- Yields:
Function output chunks (potentially corrected, blocked, or sanitized).
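The buffering strategy described above can be sketched as follows. This is a simplified sketch: `verify` is a stand-in for the LLM analysis and its assumed `(ok, corrected)` return shape is not the module's API.

```python
import logging
from collections.abc import AsyncIterator

logger = logging.getLogger(__name__)

async def stream_with_verification(
    chunks: AsyncIterator[str],
    action: str,
    verify,  # stand-in: async callable returning (ok: bool, corrected: str)
) -> AsyncIterator[str]:
    if action in ("refusal", "redirection"):
        # Buffer the full output so bad content never reaches the client.
        buffered = [chunk async for chunk in chunks]
        ok, corrected = await verify("".join(buffered))
        if not ok:
            if action == "refusal":
                raise RuntimeError("Output blocked by verifier")
            yield corrected  # 'redirection': emit the corrected answer instead
            return
        for chunk in buffered:
            yield chunk
    else:
        # 'partial_compliance': yield immediately, log violations afterwards.
        collected = []
        async for chunk in chunks:
            collected.append(chunk)
            yield chunk
        ok, _ = await verify("".join(collected))
        if not ok:
            logger.warning("Verifier flagged content that was already streamed")
```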