nat.middleware.cache.cache_middleware
Cache middleware for function memoization with similarity matching.
This module provides a cache middleware that memoizes function calls based on input similarity. It demonstrates the middleware pattern by:
1. Preprocessing: Serializing and checking the cache for similar inputs
2. Calling next: Delegating to the next middleware/function if no cache hit
3. Postprocessing: Caching the result for future use
4. Continuing: Returning the result (cached or fresh)
The cache supports exact matching for maximum performance and fuzzy matching using Python’s built-in difflib for similarity computation.
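To illustrate the difference between exact and fuzzy matching, the snippet below compares a few strings with `difflib.SequenceMatcher` from the standard library. It is a minimal sketch: the `similarity` helper and the sample strings are hypothetical, and the middleware may compute similarity differently in detail.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


question = "What is the capital of France?"

exact = similarity(question, "What is the capital of France?")  # identical text
close = similarity(question, "What's the capital of France?")   # minor rewording
far = similarity(question, "List three prime numbers.")         # unrelated text
```

With a threshold of 1.0 only the identical string would count as a hit; a lower threshold would also accept the reworded question while still rejecting the unrelated one.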
Attributes
logger |
Classes
CacheMiddleware | Cache middleware that memoizes function outputs based on input similarity.
Module Contents
- logger
- class CacheMiddleware(*, enabled_mode: str, similarity_threshold: float)
Bases: nat.middleware.function_middleware.FunctionMiddleware
Cache middleware that memoizes function outputs based on input similarity.
This middleware demonstrates the four-phase middleware pattern:
1. Preprocess: Serialize input and check cache for similar entries
2. Call Next: Delegate to next middleware/function if cache miss
3. Postprocess: Store the result in cache for future use
4. Continue: Return the result (from cache or fresh)
The cache serializes function inputs to strings and performs similarity matching against previously seen inputs. If a similar input is found above the configured threshold, it returns the cached output without calling the next middleware or function.
- Args:
- enabled_mode: Either “always” to always cache, or “eval” to cache only when Context.is_evaluating is True.
- similarity_threshold: Float between 0 and 1. If 1.0, performs exact string matching. Otherwise uses difflib for similarity computation.
Initialize the cache middleware.
- Args:
- enabled_mode: Either “always” or “eval”. If “eval”, only caches when Context.is_evaluating is True.
- similarity_threshold: Similarity threshold between 0 and 1. If 1.0, performs exact matching. Otherwise uses fuzzy matching.
- _enabled_mode
- _similarity_threshold
- async pre_invoke( ) → nat.middleware.middleware.InvocationContext | None
Not used - CacheMiddleware overrides function_middleware_invoke.
- async post_invoke( ) → nat.middleware.middleware.InvocationContext | None
Not used - CacheMiddleware overrides function_middleware_invoke.
- _serialize_input(value: Any) → str | None
Serialize the input value to a string for caching.
- Args:
value: The input value to serialize.
- Returns:
String representation of the input, or None if serialization fails.
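One way such a serializer could behave is sketched below. This is an assumption for illustration, not the library's implementation: the function name `serialize_input`, the choice of JSON, and the `sort_keys`/`default=str` options are all hypothetical.

```python
import json
from typing import Any, Optional


def serialize_input(value: Any) -> Optional[str]:
    """Hypothetical stand-in for _serialize_input; the real format may differ."""
    try:
        # sort_keys makes equal dicts serialize identically regardless of
        # insertion order; default=str falls back to str() for objects that
        # JSON cannot encode natively.
        return json.dumps(value, sort_keys=True, default=str)
    except (TypeError, ValueError):
        # Examples: circular references, or dict keys of mixed types that
        # cannot be sorted. Returning None signals "do not cache this call".
        return None


ordered = serialize_input({"b": 1, "a": 2})

circular: list = []
circular.append(circular)
failed = serialize_input(circular)
```

Returning `None` rather than raising lets the caller skip the cache and fall through to the wrapped function.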
- _find_similar_key(input_str: str) → str | None
Find a cached key that is similar to the input string.
- Args:
input_str: The serialized input string to match.
- Returns:
The most similar cached key if above threshold, None otherwise.
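The lookup described above can be sketched as follows. This is a simplified illustration under stated assumptions: the standalone function `find_similar_key`, its explicit `cache` and `threshold` parameters, and the use of `>=` for the threshold comparison are hypothetical and may not match the actual method.

```python
from difflib import SequenceMatcher
from typing import Any, Mapping, Optional


def find_similar_key(
    input_str: str, cache: Mapping[str, Any], threshold: float
) -> Optional[str]:
    """Hypothetical sketch of a best-match search over cached keys."""
    if threshold >= 1.0:
        # Exact matching: a single dictionary lookup, no string comparison.
        return input_str if input_str in cache else None

    best_key: Optional[str] = None
    best_ratio = 0.0
    for key in cache:
        ratio = SequenceMatcher(None, input_str, key).ratio()
        # Assumption: a ratio equal to the threshold counts as a match.
        if ratio >= threshold and ratio > best_ratio:
            best_key, best_ratio = key, ratio
    return best_key


cache = {"hello world": 1, "goodbye": 2}
fuzzy_hit = find_similar_key("hello world!", cache, 0.8)
exact_miss = find_similar_key("hello world!", cache, 1.0)
```

Note that the fuzzy path in this sketch compares the input against every cached key, so its cost grows with cache size, whereas the exact path stays a constant-time lookup.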
- async function_middleware_invoke(*args: Any, call_next: nat.middleware.function_middleware.CallNext, context: nat.middleware.function_middleware.FunctionMiddlewareContext, **kwargs: Any)
Cache middleware for single-output invocations.
Implements the four-phase middleware pattern:
1. Preprocess: Check if caching is enabled and serialize input
2. Call Next: Delegate to next middleware/function if cache miss
3. Postprocess: Store the result in cache
4. Continue: Return the result (cached or fresh)
- Args:
args: The positional arguments to process
call_next: Callable to invoke the next middleware or function
context: Metadata about the function being wrapped
kwargs: Additional function arguments
- Returns:
The cached output if found, otherwise the fresh output
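The four phases can be sketched with a toy class that does not depend on the toolkit. Everything here is hypothetical and simplified: `ToyCache`, its `enabled` flag, the use of `repr()` as the serializer, and exact-match lookup stand in for the real middleware, which takes `context` and `**kwargs` and supports fuzzy matching.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple


class ToyCache:
    """Hypothetical, simplified stand-in; not the CacheMiddleware implementation."""

    def __init__(self, enabled: bool = True) -> None:
        self.enabled = enabled
        self._cache: Dict[str, Any] = {}

    async def invoke(
        self, value: Any, call_next: Callable[[Any], Awaitable[Any]]
    ) -> Any:
        # Phase 1, preprocess: skip caching when disabled, else build a key.
        if not self.enabled:
            return await call_next(value)
        key = repr(value)
        if key in self._cache:
            # Cache hit: return early without calling downstream.
            return self._cache[key]
        # Phase 2, call next: delegate to the wrapped function.
        result = await call_next(value)
        # Phase 3, postprocess: remember the result.
        self._cache[key] = result
        # Phase 4, continue: hand the fresh result back to the caller.
        return result


calls = 0


async def square(x: int) -> int:
    global calls
    calls += 1  # Count how often the wrapped function actually runs.
    return x * x


async def main() -> Tuple[int, int]:
    middleware = ToyCache()
    first = await middleware.invoke(4, square)   # miss: runs square
    second = await middleware.invoke(4, square)  # hit: served from cache
    return first, second


first, second = asyncio.run(main())
```

In this sketch the second call returns the same value as the first while `square` runs only once, which is the behavior the Returns entry above describes.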
- async function_middleware_stream(*args: Any, call_next: nat.middleware.function_middleware.CallNextStream, context: nat.middleware.function_middleware.FunctionMiddlewareContext, **kwargs: Any)
Cache middleware for streaming invocations - bypasses caching.
Streaming results are not cached as they would need to be buffered entirely in memory, which would defeat the purpose of streaming.
This method demonstrates the middleware pattern for streams:
1. Preprocess: Log that we’re bypassing cache
2. Call Next: Get stream from next middleware/function
3. Process Chunks: Yield each chunk as it arrives
4. Continue: Complete the stream
- Args:
args: The positional arguments to process
call_next: Callable to invoke the next middleware or function stream
context: Metadata about the function being wrapped
kwargs: Additional function arguments
- Yields:
Chunks from the stream (unmodified)
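A pass-through stream of this kind can be sketched with a plain async generator. The names `passthrough_stream` and `count_to` are hypothetical, and the sketch omits the `context` and `**kwargs` parameters of the real method.

```python
import asyncio
from typing import Any, AsyncIterator, Callable, List


async def passthrough_stream(
    value: Any, call_next: Callable[[Any], AsyncIterator[Any]]
) -> AsyncIterator[Any]:
    # No cache lookup and no buffering: each chunk is forwarded as soon as
    # the downstream generator produces it.
    async for chunk in call_next(value):
        yield chunk


async def count_to(n: int) -> AsyncIterator[int]:
    """Stand-in for a streaming function further down the chain."""
    for i in range(n):
        yield i


async def collect() -> List[int]:
    return [chunk async for chunk in passthrough_stream(3, count_to)]


chunks = asyncio.run(collect())
```

Because nothing is accumulated inside `passthrough_stream`, memory use stays independent of stream length, which is the reason given above for bypassing the cache.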