nat.middleware.cache.cache_middleware#

Cache middleware for function memoization with similarity matching.

This module provides a cache middleware that memoizes function calls based on input similarity. It demonstrates the middleware pattern by:

  1. Preprocessing: Serializing and checking the cache for similar inputs

  2. Calling next: Delegating to the next middleware/function if no cache hit

  3. Postprocessing: Caching the result for future use

  4. Continuing: Returning the result (cached or fresh)

The cache supports exact matching for maximum performance and fuzzy matching using Python’s built-in difflib for similarity computation.

Attributes#

Classes#

CacheMiddleware

Cache middleware that memoizes function outputs based on input similarity.

Module Contents#

logger#
class CacheMiddleware(*, enabled_mode: str, similarity_threshold: float)#

Bases: nat.middleware.function_middleware.FunctionMiddleware

Cache middleware that memoizes function outputs based on input similarity.

This middleware demonstrates the four-phase middleware pattern:

  1. Preprocess: Serialize input and check cache for similar entries

  2. Call Next: Delegate to next middleware/function if cache miss

  3. Postprocess: Store the result in cache for future use

  4. Continue: Return the result (from cache or fresh)

The cache serializes function inputs to strings and performs similarity matching against previously seen inputs. If a similar input is found above the configured threshold, it returns the cached output without calling the next middleware or function.

Args:
enabled_mode: Either “always” to always cache, or “eval” to only

cache when Context.is_evaluating is True.

similarity_threshold: Float between 0 and 1. If 1.0, performs

exact string matching. Otherwise uses difflib for similarity computation.

Initialize the cache middleware.

Args:
enabled_mode: Either “always” or “eval”. If “eval”, only caches

when Context.is_evaluating is True.

similarity_threshold: Similarity threshold between 0 and 1.

If 1.0, performs exact matching. Otherwise uses fuzzy matching.

_enabled_mode#
_similarity_threshold#
_cache: dict[str, Any]#
property enabled: bool#

Middleware always enabled.

async pre_invoke(
context: nat.middleware.middleware.InvocationContext,
) nat.middleware.middleware.InvocationContext | None#

Not used - CacheMiddleware overrides function_middleware_invoke.

async post_invoke(
context: nat.middleware.middleware.InvocationContext,
) nat.middleware.middleware.InvocationContext | None#

Not used - CacheMiddleware overrides function_middleware_invoke.

_should_cache() bool#

Check if caching should be enabled based on the current context.

_serialize_input(value: Any) str | None#

Serialize the input value to a string for caching.

Args:

value: The input value to serialize.

Returns:

String representation of the input, or None if serialization fails.

_find_similar_key(input_str: str) str | None#

Find a cached key that is similar to the input string.

Args:

input_str: The serialized input string to match.

Returns:

The most similar cached key if above threshold, None otherwise.

async function_middleware_invoke(
*args: Any,
call_next: nat.middleware.function_middleware.CallNext,
context: nat.middleware.function_middleware.FunctionMiddlewareContext,
\*\*kwargs: Any,
) Any#

Cache middleware for single-output invocations.

Implements the four-phase middleware pattern:

  1. Preprocess: Check if caching is enabled and serialize input

  2. Call Next: Delegate to next middleware/function if cache miss

  3. Postprocess: Store the result in cache

  4. Continue: Return the result (cached or fresh)

Args:

args: The positional arguments to process call_next: Callable to invoke the next middleware or function context: Metadata about the function being wrapped kwargs: Additional function arguments

Returns:

The cached output if found, otherwise the fresh output

async function_middleware_stream(
*args: Any,
call_next: nat.middleware.function_middleware.CallNextStream,
context: nat.middleware.function_middleware.FunctionMiddlewareContext,
\*\*kwargs: Any,
) collections.abc.AsyncIterator[Any]#

Cache middleware for streaming invocations - bypasses caching.

Streaming results are not cached as they would need to be buffered entirely in memory, which would defeat the purpose of streaming.

This method demonstrates the middleware pattern for streams:

  1. Preprocess: Log that we’re bypassing cache

  2. Call Next: Get stream from next middleware/function

  3. Process Chunks: Yield each chunk as it arrives

  4. Continue: Complete the stream

Args:

args: The positional arguments to process call_next: Callable to invoke the next middleware or function stream context: Metadata about the function being wrapped kwargs: Additional function arguments

Yields:

Chunks from the stream (unmodified)