aiq.profiler.inference_optimization.experimental.prefix_span_analysis#

An advanced script that:

  1. Builds chronological call sequences (LLM or TOOL) from a DataFrame of events.

  2. Incorporates llm_text_input for LLM calls into the token used by PrefixSpan.

  3. Runs PrefixSpan to discover frequent sub-sequences (patterns) across examples.

  4. Computes coverage (fraction of examples containing each pattern) and average sub-sequence duration.

  5. Returns a Pydantic model with the top patterns plus a textual report.

Main use case:

  • Identify recurring sequences of calls + repeated LLM text inputs, which can help with caching or further optimization (deduplicate repeated calls or pre-load certain tokens).

Attributes#

Functions#

parse_op_type(→ str | None)

Map event_type => 'LLM' or 'TOOL' if it starts with those prefixes.

get_op_name(→ str)

Pick the operation_name from either llm_name or tool_name based on op_type.

build_call_sequence_for_example(...)

For a single example's events, pair START/END calls and build a chronological list of PrefixCallNodes,

build_sequences(→ dict[int, ...)

Group events by example_number, build a chronological list of PrefixCallNode for each example,

build_token(→ str)

Construct a token for prefixspan from a PrefixCallNode.

convert_sequences_for_prefixspan(→ list[list[str]])

Convert each example's list of PrefixCallNode into a list of tokens. Return a list-of-lists

run_prefixspan(→ list[tuple[list[str], int]])

find_contiguous_matches(→ list[tuple[int, int]])

Look for contiguous matches of 'pattern' in 'seq' by naive scanning.

compute_coverage_and_duration(...)

For each pattern from prefixspan, compute:

prefixspan_subworkflow_with_text(...)

Module Contents#

logger#
parse_op_type(evt: str) → str | None#

Map event_type => ‘LLM’ or ‘TOOL’ if it starts with those prefixes.

get_op_name(row: pandas.Series, op_type: str) → str#

Pick the operation_name from either llm_name or tool_name based on op_type.

build_call_sequence_for_example(
example_df: pandas.DataFrame,
) → list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]#

For a single example’s events, pair START/END calls and build a chronological list of PrefixCallNodes, storing llm_text_input if op_type=LLM and it’s available at START or END.
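The START/END pairing can be sketched like this (the `uuid` and `ts` field names are illustrative assumptions; the real DataFrame columns may differ):

```python
# Toy event rows: each call emits a START and an END sharing a UUID.
events = [
    {"event_type": "LLM_START", "uuid": "u1", "ts": 0.0},
    {"event_type": "TOOL_START", "uuid": "u2", "ts": 0.5},
    {"event_type": "TOOL_END", "uuid": "u2", "ts": 0.9},
    {"event_type": "LLM_END", "uuid": "u1", "ts": 1.4},
]

starts: dict[str, dict] = {}
calls: list[dict] = []
for evt in sorted(events, key=lambda e: e["ts"]):
    if evt["event_type"].endswith("START"):
        starts[evt["uuid"]] = evt
    elif evt["event_type"].endswith("END") and evt["uuid"] in starts:
        start = starts.pop(evt["uuid"])
        calls.append({
            "op_type": "LLM" if evt["event_type"].startswith("LLM") else "TOOL",
            "start": start["ts"],
            "duration": evt["ts"] - start["ts"],
        })

calls.sort(key=lambda c: c["start"])  # chronological order by start time
```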

build_sequences(
df: pandas.DataFrame,
) → dict[int, list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]]#

Group events by example_number, build a chronological list of PrefixCallNode for each example, including the LLM text input if present.

build_token(
call: aiq.profiler.inference_optimization.data_models.PrefixCallNode,
max_text_len: int = 20,
prefix_list: list[str] = None,
) → str#

Construct a token for prefixspan from a PrefixCallNode.

  • We do “LLM:{operation_name}|{text}” if it’s an LLM call and text is available.

  • We optionally truncate or hash the text for length; here we just do naive truncation.

  • For a tool call, we do “TOOL:{operation_name}”.

convert_sequences_for_prefixspan(
sequences_map: dict[int, list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]],
max_text_len: int = 20,
prefix_list: list[str] = None,
) → list[list[str]]#

Convert each example’s list of PrefixCallNode into a list of tokens. Return a list-of-lists suitable for prefixspan. E.g.:

[
  ["LLM:llama-3|Hello", "TOOL:internet-search", "LLM:llama-3|How are you?"],
  ["LLM:davinci|some prompt", "TOOL:vector-db"],
  ...
]

run_prefixspan(
sequences_map: dict[int, list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]],
min_support: int | float,
max_text_len: int = 20,
prefix_list: list[str] = None,
) → list[tuple[list[str], int]]#

  1. Convert all example sequences => tokens

  2. Run prefixspan with min_support

  3. Return (pattern, freq) list
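These three steps can be sketched with a minimal recursive PrefixSpan miner (a simplified stand-in for the actual prefixspan dependency; the token strings below are hypothetical):

```python
def mine_frequent(db: list[list[str]], min_support: int) -> list[tuple[list[str], int]]:
    """Find all (not necessarily contiguous) subsequence patterns with support >= min_support."""
    results: list[tuple[list[str], int]] = []

    def mine(pattern: list[str], projections: list[tuple[int, int]]) -> None:
        # projections: (sequence index, start offset of the projected suffix)
        occurs: dict[str, list[tuple[int, int]]] = {}
        for i, start in projections:
            seen: set[str] = set()
            for j in range(start, len(db[i])):
                item = db[i][j]
                if item not in seen:  # keep only the first occurrence per sequence
                    seen.add(item)
                    occurs.setdefault(item, []).append((i, j + 1))
        for item, proj in occurs.items():
            if len(proj) >= min_support:
                grown = pattern + [item]
                results.append((grown, len(proj)))
                mine(grown, proj)

    mine([], [(i, 0) for i in range(len(db))])
    return results

token_lists = [
    ["LLM:llama-3|Hello", "TOOL:internet-search", "LLM:llama-3|How are you?"],
    ["LLM:llama-3|Hello", "TOOL:internet-search"],
    ["LLM:davinci|some prompt", "TOOL:vector-db"],
]
patterns = mine_frequent(token_lists, min_support=2)
```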

find_contiguous_matches(
pattern: list[str],
seq: list[str],
) → list[tuple[int, int]]#

Look for contiguous matches of ‘pattern’ in ‘seq’ by naive scanning. e.g. pattern=[“LLM:llama-3|Hello”, “TOOL:internet-search”], seq=… Return list of (start_idx, end_idx).
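The naive scan can be sketched as follows (an inclusive end index is assumed; the real convention may differ):

```python
def find_contiguous_matches(pattern: list[str], seq: list[str]) -> list[tuple[int, int]]:
    # Slide a window of len(pattern) over seq and record matching positions.
    m = len(pattern)
    return [(i, i + m - 1)
            for i in range(len(seq) - m + 1)
            if seq[i:i + m] == pattern]
```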

compute_coverage_and_duration(
sequences_map: dict[int, list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]],
prefixspan_patterns: list[tuple[list[str], int]],
top_k: int,
min_coverage: float = 0.0,
max_text_len: int = 20,
) → list[aiq.profiler.inference_optimization.data_models.FrequentPattern]#

For each pattern from prefixspan, compute:

  • coverage: fraction of examples that contain it

  • average_duration: sum of durations of calls in sub-sequence / total occurrences

Then filter by min_coverage and pick the top_k patterns, sorted descending by frequency, then coverage, then average duration.
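The coverage/duration computation can be sketched as below, assuming each call carries a duration; the (token, duration) pairing here is illustrative:

```python
def coverage_and_avg_duration(
    sequences: dict[int, list[tuple[str, float]]],  # example -> [(token, duration), ...]
    pattern: list[str],
) -> tuple[float, float]:
    m = len(pattern)
    covered = 0
    occurrences = 0
    total_duration = 0.0
    for calls in sequences.values():
        tokens = [tok for tok, _ in calls]
        hits = [i for i in range(len(tokens) - m + 1) if tokens[i:i + m] == pattern]
        if hits:
            covered += 1  # this example contains the pattern at least once
        for i in hits:
            occurrences += 1
            total_duration += sum(dur for _, dur in calls[i:i + m])
    coverage = covered / len(sequences) if sequences else 0.0
    avg_duration = total_duration / occurrences if occurrences else 0.0
    return coverage, avg_duration
```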

prefixspan_subworkflow_with_text(
all_steps: list[list[aiq.data_models.intermediate_step.IntermediateStep]],
min_support: int | float = 2,
top_k: int = 10,
min_coverage: float = 0.0,
max_text_len: int = 700,
prefix_list: list[str] = None,
) → aiq.profiler.inference_optimization.data_models.PrefixSpanSubworkflowResult#

  1. Build sequences of calls for each example (with llm_text_input).

  2. Convert to token lists, run PrefixSpan with min_support.

  3. Compute coverage & average duration for each pattern, filter by min_coverage, pick top_k.

  4. Return Pydantic model with final patterns & textual report.

Parameters:
  • all_steps – Intermediate steps

  • min_support – minimum number of occurrences (int) or fraction of examples (float) for prefixspan

  • top_k – how many patterns to keep

  • min_coverage – discard patterns that appear in fewer than this fraction of examples

  • max_text_len – how many chars of llm_text_input to incorporate in the token

  • prefix_list – list of prefixes to filter on and exclude from pattern matching