aiq.profiler.inference_optimization.data_models#

Classes#

PrefixInfo

Stores metadata about a particular prefix observed in the LLM text input.

FrameworkLLMPrefixData

Metadata for a single (framework, llm_name) group, including total calls and all prefix statistics.

CommonPrefixesOutput

A root model storing a dictionary keyed by '<framework>-<llm>', where each value is a FrameworkLLMPrefixData instance.

LLMUniquenessMetrics

Stores p90, p95, and p99 for the 'new words' metric.

LLMUniquenessMetricsByLLM

A RootModel containing a dictionary where each key is an LLM name and each value is the LLMUniquenessMetrics for that LLM.

WorkflowRuntimeMetrics

Stores p90, p95, and p99 for workflow runtimes across all examples.

SimpleOperationStats

Statistics for a particular operation name (LLM or tool), capturing concurrency, duration, usage, etc.

SimpleBottleneckReport

A container for all operation stats keyed by 'operation_type:operation_name', plus a textual summary highlighting top bottlenecks.

CallNode

A single call (LLM or TOOL) in a nested call tree.

NodeMetrics

Metrics for a single node: self time, subtree time, optional concurrency midpoint, and bottleneck score.

ConcurrencyDistribution

Overall concurrency distribution info: timeline segments plus p50, p90, p95, and p99 concurrency.

NestedCallProfilingResult

The final Pydantic model returned by 'multi_example_call_profiling'.

ConcurrencyCallNode

A single call in the nested call tree for one example.

ConcurrencySpikeInfo

Info about one concurrency spike interval: start and end of the spike, concurrency level, and the calls that overlap it.

ConcurrencyCorrelationStats

Simple container for correlation / summarized stats of calls overlapping concurrency spikes.

ConcurrencyAnalysisResult

The final Pydantic model returned by concurrency_spike_analysis(...).

PrefixCallNode

Represents a single call in an example's workflow.

FrequentPattern

Frequent sub-sequence discovered by PrefixSpan, with coverage and average duration data.

PrefixSpanSubworkflowResult

Pydantic model for the final outcome: a list of frequent patterns and a textual summary.

Module Contents#

class PrefixInfo(/, **data: Any)#

Bases: pydantic.BaseModel

Stores metadata about a particular prefix observed in the LLM text input.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

prefix: str#
prefix_length: int#
calls_count: int#
calls_percentage: float = None#
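The statistics captured by PrefixInfo can be sketched with a small stand-alone helper. `prefix_stats` is a hypothetical name, not part of this module; it counts word-level prefixes across a batch of LLM text inputs and reports the same fields (`prefix`, `prefix_length`, `calls_count`, `calls_percentage`, with `prefix_length` assumed to be the character length):

```python
from collections import Counter

def prefix_stats(inputs: list[str], max_words: int = 3) -> list[dict]:
    """Count word-level prefixes across LLM text inputs (illustrative sketch)."""
    counts: Counter[str] = Counter()
    for text in inputs:
        words = text.split()
        # Record every prefix of up to max_words leading words.
        for n in range(1, min(max_words, len(words)) + 1):
            counts[" ".join(words[:n])] += 1
    total = len(inputs)
    return [
        {
            "prefix": p,
            "prefix_length": len(p),          # character length (assumption)
            "calls_count": c,
            "calls_percentage": c / total,    # fraction of all calls
        }
        for p, c in counts.most_common()
    ]
```

A shared system-prompt prefix would surface here with a high `calls_percentage`, which is what makes it a caching candidate.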
class FrameworkLLMPrefixData(/, **data: Any)#

Bases: pydantic.BaseModel

Metadata for a single (framework, llm_name) group, including total calls and all prefix statistics.

total_calls: int#
prefix_info: list[PrefixInfo]#
class CommonPrefixesOutput#

Bases: pydantic.RootModel[dict[str, FrameworkLLMPrefixData]]

A root model storing a dictionary keyed by ‘<framework>-<llm>’, each value is a FrameworkLLMPrefixData instance.

to_dict() dict[str, FrameworkLLMPrefixData]#

Return the raw dictionary of data, discarding the ‘root’ wrapper.

class LLMUniquenessMetrics(/, **data: Any)#

Bases: pydantic.BaseModel

Stores p90, p95, and p99 for the ‘new words’ metric.

p90: float#
p95: float#
p99: float#
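One plausible reading of the 'new words' metric — the exact definition is not spelled out here, so this is an assumption — counts the words in each LLM input that no earlier input in the run contained; p90/p95/p99 would then be percentiles over these per-call counts:

```python
def new_words_per_call(inputs: list[str]) -> list[int]:
    """Per-call count of words not seen in any earlier input (assumed definition)."""
    seen: set[str] = set()
    counts: list[int] = []
    for text in inputs:
        words = set(text.split())
        counts.append(len(words - seen))  # words unique to this call so far
        seen |= words
    return counts
```

Low tail values suggest highly repetitive inputs, i.e. good prefix-caching potential.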
class LLMUniquenessMetricsByLLM#

Bases: pydantic.RootModel[dict[str, LLMUniquenessMetrics]]

A RootModel containing a dictionary where each key is an LLM name and each value is the LLMUniquenessMetrics for that LLM.

to_dict() dict[str, Any]#

Return the raw dictionary of data, discarding the ‘root’ wrapper.

class WorkflowRuntimeMetrics(/, **data: Any)#

Bases: pydantic.BaseModel

Stores p90, p95, and p99 for workflow runtimes across all examples.

p90: float#
p95: float#
p99: float#
class SimpleOperationStats(/, **data: Any)#

Bases: pydantic.BaseModel

Statistics for a particular operation name (LLM or tool), capturing concurrency, duration, usage, etc.

operation_type: str#
operation_name: str#
usage_count: int#
avg_duration: float#
p95_duration: float#
p99_duration: float#
max_concurrency: int#
bottleneck_score: float = None#
class SimpleBottleneckReport(/, **data: Any)#

Bases: pydantic.BaseModel

A container for all operation stats keyed by ‘operation_type:operation_name’, plus a textual summary that highlights top bottlenecks.

stats: dict[str, SimpleOperationStats]#
summary: str#
class CallNode(/, **data: Any)#

Bases: pydantic.BaseModel

A single call (LLM or TOOL) in a nested call tree.

Attributes#

uuid: str

Unique ID tying together START/END events.

operation_type: str

e.g. ‘LLM’ or ‘TOOL’.

operation_name: str

e.g. ‘llama-3’, ‘bing-search’, …

start_time: float

Time when the call started.

end_time: float

Time when the call ended.

duration: float

end_time - start_time

children: list[“CallNode”]

List of nested calls inside this call’s time window.

parent: “CallNode” | None

Reference to the parent call in the tree (None if top-level).

model_config#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

uuid: str#
operation_type: str#
operation_name: str#
start_time: float#
end_time: float#
duration: float = None#
children: list[CallNode] = None#
parent: CallNode | None = None#
compute_self_time() float#

‘Self time’ = duration minus the union of child intervals. Overlapping child intervals are merged so we don’t double-count them.

compute_subtree_time() float#

Recursively compute the sum of self_time + children’s subtree_time. This ensures no overlap double-counting among children.

_repr(level: int) str#
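The interval arithmetic described by compute_self_time and compute_subtree_time can be illustrated with a minimal stand-in. `Node` below is a sketch, not the actual CallNode implementation; it merges overlapping child intervals before subtracting them, so overlap is never double-counted:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Minimal stand-in for CallNode, for illustration only."""
    start_time: float
    end_time: float
    children: list["Node"] = field(default_factory=list)

    def self_time(self) -> float:
        # Merge overlapping child intervals so they are not double-counted.
        intervals = sorted((c.start_time, c.end_time) for c in self.children)
        covered, cur_start, cur_end = 0.0, None, None
        for s, e in intervals:
            if cur_end is None or s > cur_end:
                if cur_end is not None:
                    covered += cur_end - cur_start
                cur_start, cur_end = s, e
            else:
                cur_end = max(cur_end, e)
        if cur_end is not None:
            covered += cur_end - cur_start
        return (self.end_time - self.start_time) - covered

    def subtree_time(self) -> float:
        # Self time plus each child's subtree time, recursively.
        return self.self_time() + sum(c.subtree_time() for c in self.children)
```

For example, a 10s call whose children cover [1, 4], [3, 6], and [8, 9] has 6s of covered time (the [3, 4] overlap counted once), hence 4s of self time.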
class NodeMetrics(/, **data: Any)#

Bases: pydantic.BaseModel

Metrics for a single node:
  • self_time

  • subtree_time

  • concurrency_midpoint (optional)

  • bottleneck_score (example: subtree_time)

uuid: str#
operation_type: str#
operation_name: str#
start_time: float#
end_time: float#
duration: float#
self_time: float#
subtree_time: float#
concurrency_midpoint: float | None = None#
bottleneck_score: float#
class ConcurrencyDistribution(/, **data: Any)#

Bases: pydantic.BaseModel

Overall concurrency distribution info:
  • timeline_segments: List of (start, end, concurrency)

  • p50, p90, p95, p99 concurrency

timeline_segments: list[tuple[float, float, int]]#
p50: float#
p90: float#
p95: float#
p99: float#
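One way to derive timeline_segments and the percentile fields is a sweep over call start/end events; the helpers below are an illustrative sketch, not the module's implementation (the sketch reports percentiles as integer concurrency levels, weighted by time):

```python
def concurrency_segments(calls: list[tuple[float, float]]) -> list[tuple[float, float, int]]:
    """Sweep start/end events to build (start, end, concurrency) segments."""
    events = sorted([(s, 1) for s, _ in calls] + [(e, -1) for _, e in calls])
    segments, active, prev = [], 0, None
    for t, delta in events:
        # Close the current segment when time advances and calls are active.
        if prev is not None and t > prev and active > 0:
            segments.append((prev, t, active))
        active += delta
        prev = t
    return segments

def weighted_percentile(segments, q: float) -> int:
    """Time-weighted concurrency percentile over the segments."""
    spans = sorted(segments, key=lambda s: s[2])
    total = sum(e - s for s, e, _ in spans)
    acc = 0.0
    for s, e, level in spans:
        acc += e - s
        if acc >= q * total:
            return level
    return spans[-1][2]
```

Three calls spanning (0, 4), (1, 3), and (2, 5) peak at concurrency 3 during [2, 3], which dominates the high percentiles.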
class NestedCallProfilingResult(/, **data: Any)#

Bases: pydantic.BaseModel

The final Pydantic model returned by ‘multi_example_call_profiling’.

Contains:
  • concurrency: ConcurrencyDistribution

  • node_metrics: dict[uuid, NodeMetrics]

  • top_bottlenecks: The top calls by bottleneck_score

  • textual_report: A multiline string summarizing everything

concurrency: ConcurrencyDistribution#
node_metrics: dict[str, NodeMetrics]#
top_bottlenecks: list[NodeMetrics]#
textual_report: str#
class ConcurrencyCallNode(/, **data: Any)#

Bases: CallNode

A single call in the nested call tree for one example. Each call is matched by a UUID with a *_START and *_END event.

Because fields like prompt_tokens, completion_tokens, total_tokens may only exist at the END event, we store them only after seeing *_END.

example_number: int#
prompt_tokens: int | None = None#
completion_tokens: int | None = None#
total_tokens: int | None = None#
tool_outputs: str | None = None#
llm_text_output: str | None = None#
class ConcurrencySpikeInfo(/, **data: Any)#

Bases: pydantic.BaseModel

Info about one concurrency spike interval:
  • start, end of the spike

  • concurrency level

  • list of calls that overlap

start_time: float#
end_time: float#
concurrency: int#
active_uuids: list[str] = None#
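Spike detection over a concurrency timeline can be sketched as follows. `find_spikes` is a hypothetical helper, not part of this module; it keeps the segments at or above a threshold and collects the UUIDs of calls whose intervals overlap each one:

```python
def find_spikes(
    segments: list[tuple[float, float, int]],
    calls: dict[str, tuple[float, float]],
    threshold: int,
) -> list[dict]:
    """Return intervals meeting the concurrency threshold plus overlapping calls."""
    spikes = []
    for start, end, level in segments:
        if level >= threshold:
            # A call overlaps the spike if the two intervals intersect.
            active = [u for u, (s, e) in calls.items() if s < end and e > start]
            spikes.append({"start_time": start, "end_time": end,
                           "concurrency": level, "active_uuids": active})
    return spikes
```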
class ConcurrencyCorrelationStats(/, **data: Any)#

Bases: pydantic.BaseModel

Simple container for correlation / summarized stats of calls overlapping concurrency spikes.

avg_prompt_tokens: float#
avg_total_tokens: float#
class ConcurrencyAnalysisResult(/, **data: Any)#

Bases: pydantic.BaseModel

The final Pydantic model returned by concurrency_spike_analysis(…).

Contains:
  • concurrency_distribution: concurrency_level => total_time

  • p50_concurrency, p90_concurrency, p95_concurrency, p99_concurrency

  • spike_threshold, spike_intervals

  • correlation_stats

  • textual_report

concurrency_distribution: dict[int, float]#
p50_concurrency: float#
p90_concurrency: float#
p95_concurrency: float#
p99_concurrency: float#
spike_threshold: int#
spike_intervals: list[ConcurrencySpikeInfo]#
correlation_stats: ConcurrencyCorrelationStats#
average_latency_by_concurrency: dict[int, float]#
textual_report: str#
class PrefixCallNode(/, **data: Any)#

Bases: pydantic.BaseModel

Represents a single call in an example’s workflow.
  • For LLM calls, we also store llm_text_input if available so we can incorporate it into the token.

uuid: str#
example_number: int#
operation_type: str#
operation_name: str#
start_time: float#
end_time: float#
duration: float#
llm_text_input: str | None = None#
class FrequentPattern(/, **data: Any)#

Bases: pydantic.BaseModel

Frequent sub-sequence discovered by PrefixSpan, with coverage and average duration data.

pattern: list[str]#
frequency: int#
coverage: float#
average_duration: float#
examples_containing: list[int]#
class PrefixSpanSubworkflowResult(/, **data: Any)#

Bases: pydantic.BaseModel

Pydantic model for the final outcome:
  • A list of frequent patterns

  • A textual summary

patterns: list[FrequentPattern]#
textual_report: str#