aiq.profiler.inference_optimization.experimental.prefix_span_analysis#
An advanced script that:

- Builds chronological call sequences (LLM or TOOL) from a DataFrame of events.
- Incorporates llm_text_input for LLM calls into the token used by PrefixSpan.
- Runs PrefixSpan to discover frequent sub-sequences (patterns) across examples.
- Computes coverage (the fraction of examples containing each pattern) and average sub-sequence duration.
- Returns a Pydantic model with the top patterns plus a textual report.

Main use case: identify recurring sequences of calls and repeated LLM text inputs, which can help with caching or further optimization (deduplicating repeated calls or pre-loading certain tokens).
Attributes#
Functions#
- parse_op_type: Map event_type => 'LLM' or 'TOOL' if it starts with those prefixes.
- get_op_name: Pick the operation_name from either llm_name or tool_name based on op_type.
- build_call_sequence_for_example: For a single example's events, pair START/END calls and build a chronological list of PrefixCallNodes.
- build_sequences: Group events by example_number, build a chronological list of PrefixCallNode for each example.
- build_token: Construct a token for prefixspan from a PrefixCallNode.
- convert_sequences_for_prefixspan: Convert each example's list of PrefixCallNode into a list of tokens. Return a list-of-lists.
- run_prefixspan: Convert all example sequences to tokens, run prefixspan with min_support, and return a (pattern, frequency) list.
- find_contiguous_matches: Look for contiguous matches of 'pattern' in 'seq' by naive scanning.
- compute_coverage_and_duration: For each pattern from prefixspan, compute coverage and average duration.
- prefixspan_subworkflow_with_text: Build sequences, run PrefixSpan, compute coverage and duration, and return a Pydantic model with the top patterns and a textual report.
Module Contents#
- logger#
- parse_op_type(evt: str) → str | None #
Map event_type => ‘LLM’ or ‘TOOL’ if it starts with those prefixes.
- get_op_name(row: pandas.Series, op_type: str) → str #
Pick the operation_name from either llm_name or tool_name based on op_type.
- build_call_sequence_for_example(
- example_df: pandas.DataFrame,
For a single example’s events, pair START/END calls and build a chronological list of PrefixCallNodes, storing llm_text_input if op_type=LLM and it’s available at START or END.
- build_sequences(
- df: pandas.DataFrame,
Group events by example_number, build a chronological list of PrefixCallNode for each example, including the LLM text input if present.
- build_token(
- call: aiq.profiler.inference_optimization.data_models.PrefixCallNode,
- max_text_len: int = 20,
- prefix_list: list[str] = None,
Construct a token for prefixspan from a PrefixCallNode.
- If it is an LLM call and text is available, we produce "LLM:{operation_name}|{text}".
- The text can be truncated or hashed to bound its length; here we use naive truncation.
- For a tool call, we produce "TOOL:{operation_name}".
- convert_sequences_for_prefixspan(
- sequences_map: dict[int, list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]],
- max_text_len: int = 20,
- prefix_list: list[str] = None,
Convert each example’s list of PrefixCallNode into a list of tokens. Return a list-of-lists suitable for prefixspan. E.g.:
[ ["LLM:llama-3|Hello", "TOOL:internet-search", "LLM:llama-3|How are you?"], ["LLM:davinci|some prompt", "TOOL:vector-db"] ... ]
- run_prefixspan(
- sequences_map: dict[int, list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]],
- min_support: int | float,
- max_text_len: int = 20,
- prefix_list: list[str] = None,
1. Convert all example sequences to tokens.
2. Run prefixspan with min_support.
3. Return a (pattern, frequency) list.
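For intuition, here is a tiny pure-Python PrefixSpan-style miner over token lists. The module delegates to a real PrefixSpan implementation; this is only an illustrative sketch of the projection-based mining idea:

```python
def mine_patterns(db: list[list[str]], min_support: int) -> list[tuple[list[str], int]]:
    # Recursively grow patterns, projecting each sequence onto the
    # suffix after the first occurrence of the extension item.
    results: list[tuple[list[str], int]] = []

    def mine(pattern: list[str], projected: list[list[str]]) -> None:
        counts: dict[str, int] = {}
        for seq in projected:
            for item in set(seq):  # count each item at most once per sequence
                counts[item] = counts.get(item, 0) + 1
        for item, cnt in counts.items():
            if cnt < min_support:
                continue
            results.append((pattern + [item], cnt))
            suffixes = [s[s.index(item) + 1:] for s in projected if item in s]
            mine(pattern + [item], suffixes)

    mine([], db)
    return results
```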
- find_contiguous_matches( ) → list[tuple[int, int]] #
Look for contiguous matches of 'pattern' in 'seq' by naive scanning, e.g. pattern=["LLM:llama-3|Hello", "TOOL:internet-search"], seq=… Return a list of (start_idx, end_idx) pairs.
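The naive scan can be sketched as follows, assuming inclusive (start_idx, end_idx) bounds:

```python
def find_contiguous_matches(seq: list[str], pattern: list[str]) -> list[tuple[int, int]]:
    # Slide a window of len(pattern) across seq and compare slices.
    m = len(pattern)
    if m == 0 or m > len(seq):
        return []
    return [(i, i + m - 1)
            for i in range(len(seq) - m + 1)
            if seq[i:i + m] == pattern]
```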
- compute_coverage_and_duration(
- sequences_map: dict[int, list[aiq.profiler.inference_optimization.data_models.PrefixCallNode]],
- prefixspan_patterns: list[tuple[list[str], int]],
- top_k: int,
- min_coverage: float = 0.0,
- max_text_len: int = 20,
For each pattern from prefixspan, compute:
- coverage: the fraction of examples that contain it
- average_duration: the summed duration of calls in the matched sub-sequence divided by the total number of occurrences
Then filter by min_coverage and pick the top_k, sorted by frequency, coverage, and average duration (descending).
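The two statistics can be sketched for a single pattern over token sequences with per-call durations. This is a simplified stand-in for the real signature, which operates on PrefixCallNode objects:

```python
def coverage_and_avg_duration(sequences: dict[int, list[str]],
                              durations: dict[int, list[float]],
                              pattern: list[str]) -> tuple[float, float]:
    m = len(pattern)
    examples_with = 0   # examples containing at least one contiguous match
    total_dur = 0.0     # summed duration of all matched calls
    occurrences = 0     # total contiguous matches across all examples
    for ex_id, seq in sequences.items():
        matched = False
        for i in range(len(seq) - m + 1):
            if seq[i:i + m] == pattern:
                matched = True
                occurrences += 1
                total_dur += sum(durations[ex_id][i:i + m])
        if matched:
            examples_with += 1
    coverage = examples_with / len(sequences) if sequences else 0.0
    avg_duration = total_dur / occurrences if occurrences else 0.0
    return coverage, avg_duration
```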
- prefixspan_subworkflow_with_text(
- all_steps: list[list[aiq.data_models.intermediate_step.IntermediateStep]],
- min_support: int | float = 2,
- top_k: int = 10,
- min_coverage: float = 0.0,
- max_text_len: int = 700,
- prefix_list: list[str] = None,
1. Build sequences of calls for each example (with llm_text_input).
2. Convert to token lists and run PrefixSpan with min_support.
3. Compute coverage and average duration for each pattern, filter by min_coverage, and pick the top_k.
4. Return a Pydantic model with the final patterns and a textual report.
- Parameters:
all_steps – Intermediate steps
min_support – minimum number of occurrences (int) or fraction of examples (float) required by prefixspan
top_k – how many patterns to keep
min_coverage – discard patterns that appear in fewer than this fraction of examples
max_text_len – how many characters of llm_text_input to incorporate in the token
prefix_list – list of prefixes to filter on and exclude from pattern matching