nat.middleware.defense.defense_middleware_output_verifier#

Output Verifier Defense Middleware.

This middleware uses an LLM to verify function outputs for correctness and security. It can detect incorrect results and malicious content, and provide corrections automatically.

Attributes#

logger

Classes#

OutputVerifierMiddlewareConfig

Configuration for Output Verifier middleware.

OutputVerifierMiddleware

Verification middleware using an LLM for correctness and security.

Module Contents#

logger#
class OutputVerifierMiddlewareConfig#

Bases: nat.middleware.defense.defense_middleware.DefenseMiddlewareConfig

Configuration for Output Verifier middleware.

This middleware analyzes function outputs using an LLM to verify correctness, detect security threats, and provide corrections when needed.

Actions:

  • 'partial_compliance': Detect and log threats, but allow content to pass through

  • 'refusal': Block function output if threat detected (hard stop)

  • 'redirection': Replace incorrect function output with correct answer from LLM

llm_name: str = None#
threshold: float = None#
tool_description: str | None = None#
class OutputVerifierMiddleware(
config: OutputVerifierMiddlewareConfig,
builder,
)#

Bases: nat.middleware.defense.defense_middleware.DefenseMiddleware

Verification middleware using an LLM for correctness and security.

This middleware uses NAT’s LLM system to verify function outputs for:

  • Correctness and reasonableness

  • Security validation (detecting malicious content and manipulated values)

  • Providing automatic corrections when errors are detected

Streaming Behavior:

For 'refusal' and 'redirection' actions, chunks are buffered and checked before yielding to prevent incorrect content from being streamed to clients. For 'partial_compliance', chunks are yielded immediately; violations are logged but content passes through.
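The buffering strategy described above can be sketched as a plain async generator. This is an illustrative sketch, not the actual implementation: `verified_stream` and `is_safe` are hypothetical names standing in for the middleware's buffering and LLM verification steps.

```python
import asyncio
from collections.abc import AsyncIterator, Callable

async def verified_stream(
    chunks: AsyncIterator[str],
    action: str,
    is_safe: Callable[[str], bool],
) -> AsyncIterator[str]:
    # 'partial_compliance': yield immediately; verification would only log.
    if action == "partial_compliance":
        async for chunk in chunks:
            yield chunk
        return
    # 'refusal'/'redirection': buffer the full output, verify once, then yield.
    buffered = [chunk async for chunk in chunks]
    if not is_safe("".join(buffered)):
        if action == "refusal":
            raise RuntimeError("output blocked by verifier")
        return  # 'redirection' would yield a corrected answer here (elided)
    for chunk in buffered:
        yield chunk

async def _demo() -> list[str]:
    async def chunks():
        for c in ("Hel", "lo"):
            yield c
    return [c async for c in verified_stream(chunks(), "refusal", lambda s: True)]

result = asyncio.run(_demo())
```

The key trade-off this illustrates: blocking actions must see the complete output before releasing anything, so time-to-first-chunk grows, while the log-only action keeps streaming latency unchanged.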

Initialize output verifier middleware.

Args:

config: Configuration for output verifier middleware.
builder: Builder instance for loading LLMs.

config: OutputVerifierMiddlewareConfig#
_llm = None#
async _get_llm()#

Lazy load the LLM when first needed.

_extract_json_from_response(response_text: str) → str#

Extract JSON from LLM response, handling markdown code blocks.

Args:

response_text: Raw response from LLM

Returns:

Extracted JSON string
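A plausible self-contained re-implementation of this behavior; the real method may handle edge cases differently, and `extract_json_from_response` here is a standalone stand-in, not the actual private method:

```python
import json
import re

def extract_json_from_response(response_text: str) -> str:
    """Strip an optional markdown code fence and return the JSON payload."""
    # Match a fenced block such as ```json ... ``` (or a bare ``` ... ```).
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", response_text, re.DOTALL)
    if match:
        return match.group(1)
    # No fence: assume the response is already bare JSON.
    return response_text.strip()

raw = '```json\n{"is_threat": false, "confidence": 0.1}\n```'
payload = json.loads(extract_json_from_response(raw))
```

This kind of post-processing is needed because many LLMs wrap structured output in markdown fences even when asked for raw JSON.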

async _analyze_content(
content: Any,
content_type: nat.middleware.common.TargetLocation,
inputs: Any = None,
function_name: str | None = None,
) → nat.middleware.defense.defense_middleware_data_models.OutputVerificationResult#

Check content for threats using the configured LLM.

Args:

content: The content to analyze.
content_type: TargetLocation used in the LLM prompt and result model.
inputs: Optional function inputs for context (helps LLM calculate correct answers).
function_name: Name of the function being verified (for context).

Returns:

OutputVerificationResult with threat detection info and should_refuse flag.

async _handle_threat(
content: Any,
analysis_result: nat.middleware.defense.defense_middleware_data_models.OutputVerificationResult,
context: nat.middleware.middleware.FunctionMiddlewareContext,
) → Any#

Handle detected threat based on configured action.

Args:

content: The threatening content.
analysis_result: Detection result from LLM.
context: Function context.

Returns:

Handled content (blocked, sanitized/corrected, or original)

async _process_output_verification(
value: Any,
context: nat.middleware.middleware.FunctionMiddlewareContext,
inputs: Any = None,
) → Any#

Process output verification and handling for a given value.

This is a common helper method that handles:

  • Field extraction (if target_field is specified)

  • Output verification analysis

  • Threat handling (refusal, redirection, partial_compliance)

  • Applying the corrected value back to the original structure

Args:

value: The value to analyze.
context: Function context metadata.
inputs: Original function inputs (for analysis context).

Returns:

The value after output verification handling; it may be unchanged or corrected, or an exception may be raised.

async post_invoke(
context: nat.middleware.middleware.InvocationContext,
) → nat.middleware.middleware.InvocationContext | None#

Analyze function output for correctness and security after execution.

Args:

context: Invocation context with function metadata and output.

Returns:

Modified context if output was processed, None to pass through.

async function_middleware_stream(
*args: Any,
call_next: nat.middleware.function_middleware.CallNextStream,
context: nat.middleware.middleware.FunctionMiddlewareContext,
**kwargs: Any,
) → collections.abc.AsyncIterator[Any]#

Apply output verifier to streaming function.

For 'refusal' and 'redirection' actions: Chunks are buffered and checked before yielding. For 'partial_compliance': Chunks are yielded immediately; violations are logged.

Args:

args: Positional arguments passed to the function (first arg is typically the input value).
call_next: Next middleware/function to call.
context: Function metadata.
kwargs: Keyword arguments passed to the function.

Yields:

Function output chunks (potentially corrected, blocked, or sanitized).