aiq.profiler.inference_optimization.prompt_caching

Functions

build_prefix_trie(→ dict)

Build a trie from a list of strings.

collect_prefixes_iterative(→ list[dict])

Iteratively traverse the trie to collect prefix statistics, avoiding recursion depth limits.

get_common_prefixes(...)

Given a pandas DataFrame with columns 'framework', 'llm_name', and 'llm_text_input', return common prefix statistics keyed by LLM name.

Module Contents

build_prefix_trie(strings: list[str]) → dict

Build a trie from a list of strings.

Returns a nested dictionary with:

{
    'count': int,         # number of strings passing through this node
    'children': dict[str, TrieNode]
}
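
The node layout above can be illustrated with a minimal sketch. It assumes a character-level trie and that the root's count is incremented once per string; neither detail is confirmed by this page, and `build_trie_sketch` is a hypothetical name, not the module's function.

```python
def build_trie_sketch(strings: list[str]) -> dict:
    """Hypothetical stand-in: character-level trie with 'count' and 'children'."""
    root = {"count": 0, "children": {}}
    for s in strings:
        node = root
        node["count"] += 1  # assumption: every string passes through the root
        for ch in s:
            # Create the child node on first visit, then descend into it.
            node = node["children"].setdefault(ch, {"count": 0, "children": {}})
            node["count"] += 1
    return root


trie = build_trie_sketch(["abc", "abd", "x"])
# The node reached via "a" is shared by "abc" and "abd".
print(trie["children"]["a"]["count"])
```

Each node's count is the number of input strings that share the prefix leading to that node, which is what makes the trie suitable for prefix-frequency statistics.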

collect_prefixes_iterative(root: dict, total_calls: int) → list[dict]

Iteratively traverse the trie to collect prefix statistics, avoiding recursion depth limits.

Parameters:
  • root – Trie node (a dict) with 'count' and 'children' keys

  • total_calls – Number of total calls in this group (denominator for percentages)

Returns:

A list of dicts, each dict containing prefix info
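
One way to traverse a trie without recursion is to keep an explicit stack of pending nodes. The sketch below is illustrative only: the function name and the output keys (`prefix`, `prefix_length`, `calls_count`, `calls_percentage`) are assumptions, and the percentage is expressed as a fraction between 0 and 1.

```python
def collect_prefixes_sketch(root: dict, total_calls: int) -> list[dict]:
    """Hypothetical stand-in: walk the trie with an explicit stack."""
    results = []
    # Each stack entry pairs a node with the prefix that leads to it.
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if prefix:  # the root represents the empty prefix, so skip it
            results.append({
                "prefix": prefix,
                "prefix_length": len(prefix),
                "calls_count": node["count"],
                "calls_percentage": node["count"] / total_calls,
            })
        for ch, child in node["children"].items():
            stack.append((child, prefix + ch))
    return results


# Hand-built trie for the two strings "ab" and "ac".
root = {
    "count": 2,
    "children": {
        "a": {
            "count": 2,
            "children": {
                "b": {"count": 1, "children": {}},
                "c": {"count": 1, "children": {}},
            },
        }
    },
}
stats = collect_prefixes_sketch(root, total_calls=2)
by_prefix = {s["prefix"]: s for s in stats}
print(by_prefix["a"]["calls_percentage"], by_prefix["ab"]["calls_percentage"])
```

Because the pending work lives in a list rather than on the call stack, very long prompts (deep tries) do not run into Python's recursion limit.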

get_common_prefixes(
all_steps: list[list[aiq.data_models.intermediate_step.IntermediateStep]],
min_call_percentage: float = 0.0,
) → aiq.profiler.inference_optimization.data_models.CommonPrefixesOutput

Given the intermediate steps in all_steps, treated as a pandas DataFrame with columns 'framework', 'llm_name', and 'llm_text_input', return a Pydantic-validated RootModel keyed by "<llm_name>", where each value is a sorted list of common prefix statistics.

  1. Only includes prefixes with calls_percentage >= min_call_percentage.

  2. Excludes any prefix that is a substring of another (longer) prefix that already meets the threshold and is retained.

  3. Optionally writes the resulting dictionary to JSON if output_path is provided.

Parameters:
  • all_steps – Intermediate steps, given as a list of lists of IntermediateStep objects

  • min_call_percentage – Exclude prefixes that appear in fewer than this fraction of total calls. The default, 0.0, applies no filtering.

Sorting: primarily by prefix length (descending), secondarily by frequency (descending).
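
The threshold filter, the substring exclusion, and the sort order described above can be sketched on plain dicts. This is a minimal illustration under stated assumptions, not the module's implementation: `select_prefixes_sketch` and the dict keys are hypothetical, and the real function returns a CommonPrefixesOutput model rather than a list.

```python
def select_prefixes_sketch(stats: list[dict],
                           min_call_percentage: float = 0.0) -> list[dict]:
    """Hypothetical stand-in for the filtering and sorting rules."""
    # Rule 1: keep only prefixes that meet the threshold.
    kept = [s for s in stats if s["calls_percentage"] >= min_call_percentage]
    # Sort by prefix length (descending), then by frequency (descending).
    kept.sort(key=lambda s: (-len(s["prefix"]), -s["calls_percentage"]))
    # Rule 2: drop a prefix contained in a longer prefix already retained.
    retained = []
    for s in kept:
        if not any(s["prefix"] in r["prefix"] for r in retained):
            retained.append(s)
    return retained


stats = [
    {"prefix": "You are", "calls_percentage": 0.9},
    {"prefix": "You are a helpful", "calls_percentage": 0.6},
    {"prefix": "Hi", "calls_percentage": 0.1},
]
print([s["prefix"] for s in select_prefixes_sketch(stats, 0.5)])
print([s["prefix"] for s in select_prefixes_sketch(stats, 0.8)])
```

With a threshold of 0.5, "You are" is dropped because the longer "You are a helpful" also qualifies and contains it. With a threshold of 0.8, the longer prefix no longer qualifies, so "You are" is retained.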