aiq.profiler.inference_optimization.data_models
Classes
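| Class | Description |
| --- | --- |
| PrefixInfo | Stores metadata about a particular prefix observed in the LLM text input. |
| FrameworkLLMPrefixData | Metadata for a single (framework, llm_name) group, including total calls and all prefix statistics. |
| CommonPrefixesOutput | A root model storing a dictionary keyed by '<framework>-<llm>'; each value is a FrameworkLLMPrefixData instance. |
| LLMUniquenessMetrics | Stores p90, p95, and p99 for the 'new words' metric. |
| LLMUniquenessMetricsByLLM | A RootModel containing a dictionary where each key is an LLM name and each value is the LLMUniquenessMetrics for that LLM. |
| WorkflowRuntimeMetrics | Stores p90, p95, and p99 for workflow runtimes across all examples. |
| SimpleOperationStats | Statistics for a particular operation name (LLM or tool), capturing concurrency, duration, usage, etc. |
| SimpleBottleneckReport | A container for all operation stats keyed by 'operation_type:operation_name', plus a textual summary that highlights top bottlenecks. |
| CallNode | A single call (LLM or TOOL) in a nested call tree. |
| NodeMetrics | Metrics for a single node: self_time, subtree_time, concurrency_midpoint (optional), bottleneck_score. |
| ConcurrencyDistribution | Overall concurrency distribution info: timeline segments and p50, p90, p95, p99 concurrency. |
| NestedCallProfilingResult | The final Pydantic model returned by 'multi_example_call_profiling'. |
| ConcurrencyCallNode | A single call in the nested call tree for one example. |
| ConcurrencySpikeInfo | Info about one concurrency spike interval: start, end, concurrency level, and overlapping calls. |
| ConcurrencyCorrelationStats | Simple container for correlation / summarized stats of calls overlapping concurrency spikes. |
| ConcurrencyAnalysisResult | The final Pydantic model returned by concurrency_spike_analysis(...). |
| PrefixCallNode | Represents a single call in an example's workflow. |
| FrequentPattern | Frequent sub-sequence discovered by PrefixSpan, with coverage and average duration data. |
| PrefixSpanSubworkflowResult | Pydantic model for the final outcome: a list of frequent patterns and a textual summary. |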
Module Contents
- class PrefixInfo(/, **data: Any)
Bases: pydantic.BaseModel
Stores metadata about a particular prefix observed in the LLM text input.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- class FrameworkLLMPrefixData(/, **data: Any)
Bases: pydantic.BaseModel
Metadata for a single (framework, llm_name) group, including total calls and all prefix statistics.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- prefix_info: list[PrefixInfo]
- class CommonPrefixesOutput
Bases: pydantic.RootModel[dict[str, FrameworkLLMPrefixData]]
A root model storing a dictionary keyed by ‘<framework>-<llm>’; each value is a FrameworkLLMPrefixData instance.
- to_dict() -> dict[str, FrameworkLLMPrefixData]
Return the raw dictionary of data, discarding the ‘root’ wrapper.
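As a rough illustration of the ‘<framework>-<llm>’ keying described above, the following standard-library sketch groups prompts by that key and reports how many calls each group saw and what prefix they share. The record layout and the `total_calls` / `common_prefix` names are assumptions made for this example, not the fields of the actual models.

```python
import os
from collections import defaultdict

# Hypothetical call records: (framework, llm_name, llm_text_input).
calls = [
    ("langchain", "llama-3", "You are a helpful agent. Task: A"),
    ("langchain", "llama-3", "You are a helpful agent. Task: B"),
    ("llama_index", "llama-3", "Summarize: foo"),
]


def group_prefixes(records):
    """Group prompts by '<framework>-<llm>' and find each group's shared prefix."""
    grouped = defaultdict(list)
    for framework, llm_name, prompt in records:
        grouped[f"{framework}-{llm_name}"].append(prompt)
    return {
        key: {
            "total_calls": len(prompts),
            # commonprefix compares character by character, so it works on any strings.
            "common_prefix": os.path.commonprefix(prompts),
        }
        for key, prompts in grouped.items()
    }


result = group_prefixes(calls)
```

A plain dictionary of this shape is roughly what `to_dict()` is described as returning once the ‘root’ wrapper is discarded.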
- class LLMUniquenessMetrics(/, **data: Any)
Bases: pydantic.BaseModel
Stores p90, p95, and p99 for the ‘new words’ metric.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- class LLMUniquenessMetricsByLLM
Bases: pydantic.RootModel[dict[str, LLMUniquenessMetrics]]
A RootModel containing a dictionary where each key is an LLM name and each value is the LLMUniquenessMetrics for that LLM.
- class WorkflowRuntimeMetrics(/, **data: Any)
Bases: pydantic.BaseModel
Stores p90, p95, and p99 for workflow runtimes across all examples.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
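This page does not say how the p90, p95, and p99 values are computed. As one possible reading, the sketch below uses the nearest-rank method on a list of runtimes; the function name and the sample data are hypothetical, and the profiler may interpolate differently.

```python
def nearest_rank(values, pct):
    """Return the nearest-rank percentile (pct is an integer in 1..100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling division avoids floating-point rounding in the rank.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical workflow runtimes in seconds: 1.0, 2.0, ..., 100.0.
runtimes = [float(i) for i in range(1, 101)]
summary = {f"p{p}": nearest_rank(runtimes, p) for p in (90, 95, 99)}
```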
- class SimpleOperationStats(/, **data: Any)
Bases: pydantic.BaseModel
Statistics for a particular operation name (LLM or tool), capturing concurrency, duration, usage, etc.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- class SimpleBottleneckReport(/, **data: Any)
Bases: pydantic.BaseModel
A container for all operation stats keyed by ‘operation_type:operation_name’, plus a textual summary that highlights top bottlenecks.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- stats: dict[str, SimpleOperationStats]
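To make the ‘operation_type:operation_name’ keying concrete, here is a small sketch that aggregates call durations under that key and names the operation with the largest total time. The statistic names (`usage_count`, `avg_duration`, `total_duration`) and the ranking rule are assumptions for illustration; the real SimpleOperationStats fields are not listed on this page.

```python
from collections import defaultdict

# Hypothetical (operation_type, operation_name, duration_seconds) records.
calls = [
    ("LLM", "llama-3", 2.0),
    ("TOOL", "bing-search", 0.5),
    ("LLM", "llama-3", 4.0),
]


def summarize(records):
    """Aggregate durations per 'operation_type:operation_name' key."""
    durations = defaultdict(list)
    for op_type, op_name, duration in records:
        durations[f"{op_type}:{op_name}"].append(duration)
    stats = {
        key: {
            "usage_count": len(values),
            "avg_duration": sum(values) / len(values),
            "total_duration": sum(values),
        }
        for key, values in durations.items()
    }
    # Rank operations by total time spent; the largest is reported first.
    ranked = sorted(stats, key=lambda k: stats[k]["total_duration"], reverse=True)
    summary = f"Top bottleneck: {ranked[0]}" if ranked else "No calls recorded"
    return stats, summary


stats, summary = summarize(calls)
```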
- class CallNode(/, **data: Any)
Bases: pydantic.BaseModel
A single call (LLM or TOOL) in a nested call tree.
Attributes
- uuid: str
Unique ID tying together START/END events.
- operation_type: str
e.g. ‘LLM’ or ‘TOOL’.
- operation_name: str
e.g. ‘llama-3’, ‘bing-search’, …
- start_time: float
Time when the call started.
- end_time: float
Time when the call ended.
- duration: float
end_time - start_time
- children: list["CallNode"]
List of nested calls inside this call’s time window.
- parent: "CallNode" | None
Reference to the parent call in the tree (None if top-level).
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- model_config
Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict].
- compute_self_time() -> float
‘Self time’ = duration minus the union of child intervals. Overlapping child intervals are merged so we don’t double-count them.
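The interval-merging idea behind self time can be sketched with a plain dataclass. This is an illustrative stand-in rather than the CallNode implementation; in particular, clipping child intervals to the parent's window is an assumption made here.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """Hypothetical stand-in for a call node with nested children."""
    start_time: float
    end_time: float
    children: list = field(default_factory=list)

    @property
    def duration(self) -> float:
        return self.end_time - self.start_time

    def compute_self_time(self) -> float:
        # Clip each child to the parent's window (assumption), then sort by start.
        intervals = sorted(
            (max(c.start_time, self.start_time), min(c.end_time, self.end_time))
            for c in self.children
        )
        covered = 0.0
        cur_start = cur_end = None
        for start, end in intervals:
            if end <= start:
                continue  # child lies entirely outside the parent window
            if cur_end is None or start > cur_end:
                # Gap before this interval: close the previous merged run.
                if cur_end is not None:
                    covered += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                # Overlap: extend the current merged run instead of double-counting.
                cur_end = max(cur_end, end)
        if cur_end is not None:
            covered += cur_end - cur_start
        return max(0.0, self.duration - covered)


parent = Call(0.0, 10.0, [Call(1.0, 4.0), Call(3.0, 6.0), Call(8.0, 9.0)])
self_time = parent.compute_self_time()
```

Here the children cover [1, 6] and [8, 9], six seconds in total, so four of the parent's ten seconds count as self time.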
- class NodeMetrics(/, **data: Any)
Bases: pydantic.BaseModel
Metrics for a single node:
  - self_time
  - subtree_time
  - concurrency_midpoint (optional)
  - bottleneck_score (example: subtree_time)
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- class ConcurrencyDistribution(/, **data: Any)
Bases: pydantic.BaseModel
Overall concurrency distribution info:
  - timeline_segments: List of (start, end, concurrency)
  - p50, p90, p95, p99 concurrency
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
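One way to obtain (start, end, concurrency) segments like those described above is a sweep over call start and end times. The sketch below is an assumption about how such a timeline could be built, and the time-weighted percentile it computes is only one plausible definition of "p90 concurrency"; neither is confirmed by this page.

```python
def timeline_segments(calls):
    """calls: list of (start, end). Returns a list of (start, end, concurrency)."""
    events = []
    for start, end in calls:
        events.append((start, 1))
        events.append((end, -1))
    # At equal timestamps, process ends (-1) before starts (+1).
    events.sort()
    segments = []
    active = 0
    prev_time = None
    for time, delta in events:
        if prev_time is not None and time > prev_time and active > 0:
            segments.append((prev_time, time, active))
        active += delta
        prev_time = time
    return segments


def time_weighted_percentile(segments, pct):
    """Smallest concurrency level covering at least pct% of the busy time."""
    total = sum(end - start for start, end, _ in segments)
    if total == 0:
        return 0
    accumulated = 0.0
    for start, end, level in sorted(segments, key=lambda seg: seg[2]):
        accumulated += end - start
        if accumulated >= pct / 100 * total:
            return level
    return max(level for _, _, level in segments)


# Hypothetical call windows in seconds.
segments = timeline_segments([(0, 4), (2, 6), (3, 5)])
p50 = time_weighted_percentile(segments, 50)
p90 = time_weighted_percentile(segments, 90)
```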
- class NestedCallProfilingResult(/, **data: Any)
Bases: pydantic.BaseModel
The final Pydantic model returned by ‘multi_example_call_profiling’.
Contains:
  - concurrency: ConcurrencyDistribution
  - node_metrics: dict[uuid, NodeMetrics]
  - top_bottlenecks: The top calls by bottleneck_score
  - textual_report: A multiline string summarizing everything
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- concurrency: ConcurrencyDistribution
- node_metrics: dict[str, NodeMetrics]
- top_bottlenecks: list[NodeMetrics]
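Selecting the top calls by bottleneck score can be sketched as a top-k pick over the per-node metrics. The dictionary layout below is hypothetical, and subtree_time is used as the score only because the NodeMetrics description offers it as an example.

```python
import heapq

# Hypothetical per-node metrics keyed by call UUID.
node_metrics = {
    "u1": {"self_time": 1.0, "subtree_time": 6.0},
    "u2": {"self_time": 2.5, "subtree_time": 2.5},
    "u3": {"self_time": 0.5, "subtree_time": 4.0},
}


def top_bottlenecks(metrics, k=2):
    """Return the UUIDs of the k nodes with the highest score (subtree_time here)."""
    scored = ((m["subtree_time"], uuid) for uuid, m in metrics.items())
    return [uuid for _, uuid in heapq.nlargest(k, scored)]


top = top_bottlenecks(node_metrics)
```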
- class ConcurrencyCallNode(/, **data: Any)
Bases: CallNode
A single call in the nested call tree for one example. Each call is matched by a UUID with a `*_START` and `*_END` event.
Because fields like prompt_tokens, completion_tokens, and total_tokens may only exist at the END event, we store them only after seeing `*_END`.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
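The START/END matching described above can be illustrated with plain dictionaries. The event layout (`type`, `uuid`, `name`, `ts`) is invented for this sketch and is not the profiler's actual event schema; the point is only that token counts are attached when the matching END event arrives.

```python
# Hypothetical event stream; field names are assumptions for illustration.
events = [
    {"type": "LLM_START", "uuid": "u1", "name": "llama-3", "ts": 0.0},
    {"type": "TOOL_START", "uuid": "u2", "name": "bing-search", "ts": 1.0},
    {"type": "TOOL_END", "uuid": "u2", "ts": 2.5},
    {"type": "LLM_END", "uuid": "u1", "ts": 4.0, "total_tokens": 120},
]


def pair_events(stream):
    """Match *_START and *_END events by UUID and return the completed calls."""
    open_calls = {}
    finished = []
    for event in stream:
        kind, _, phase = event["type"].rpartition("_")
        if phase == "START":
            open_calls[event["uuid"]] = {
                "uuid": event["uuid"],
                "operation_type": kind,
                "operation_name": event["name"],
                "start_time": event["ts"],
            }
        elif phase == "END":
            call = open_calls.pop(event["uuid"], None)
            if call is None:
                continue  # END without a matching START; skipped in this sketch
            call["end_time"] = event["ts"]
            call["duration"] = event["ts"] - call["start_time"]
            # Token counts are only known once the END event has been seen.
            call["total_tokens"] = event.get("total_tokens")
            finished.append(call)
    return finished


completed = pair_events(events)
```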
- class ConcurrencySpikeInfo(/, **data: Any)
Bases: pydantic.BaseModel
Info about one concurrency spike interval:
  - start, end of the spike
  - concurrency level
  - list of calls that overlap
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
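A spike interval of the kind described above could be derived by filtering timeline segments against a threshold and collecting the calls that overlap each one. The input shapes, the key names, and the decision not to merge adjacent spike segments are all assumptions of this sketch.

```python
def find_spikes(segments, calls, threshold):
    """segments: (start, end, concurrency) tuples; calls: {uuid: (start, end)}."""
    spikes = []
    for start, end, level in segments:
        if level < threshold:
            continue
        # A call overlaps the spike if the two half-open intervals intersect.
        overlapping = sorted(
            uuid for uuid, (c_start, c_end) in calls.items()
            if c_start < end and c_end > start
        )
        spikes.append({
            "start": start,
            "end": end,
            "concurrency": level,
            "call_uuids": overlapping,
        })
    return spikes


# Hypothetical inputs: three calls and the timeline they produce.
calls = {"a": (0, 4), "b": (2, 6), "c": (3, 5)}
segments = [(0, 2, 1), (2, 3, 2), (3, 4, 3), (4, 5, 2), (5, 6, 1)]
spikes = find_spikes(segments, calls, threshold=3)
```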
- class ConcurrencyCorrelationStats(/, **data: Any)
Bases: pydantic.BaseModel
Simple container for correlation / summarized stats of calls overlapping concurrency spikes.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- class ConcurrencyAnalysisResult(/, **data: Any)
Bases: pydantic.BaseModel
The final Pydantic model returned by concurrency_spike_analysis(…). Contains:
  - concurrency_distribution: concurrency_level => total_time
  - p50_concurrency, p90_concurrency, p95_concurrency, p99_concurrency
  - spike_threshold, spike_intervals
  - correlation_stats
  - textual_report
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- spike_intervals: list[ConcurrencySpikeInfo]
- correlation_stats: ConcurrencyCorrelationStats
- class PrefixCallNode(/, **data: Any)
Bases: pydantic.BaseModel
Represents a single call in an example’s workflow. For LLM calls, we also store llm_text_input if available so we can incorporate it into the token.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- class FrequentPattern(/, **data: Any)
Bases: pydantic.BaseModel
Frequent sub-sequence discovered by PrefixSpan, with coverage and average duration data.
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
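For readers unfamiliar with PrefixSpan, the following is a simplified, single-item version of the algorithm applied to workflows expressed as sequences of call tokens. It is a teaching sketch, not the profiler's implementation: the token format, the `min_support` parameter, and the coverage formula (support divided by the number of workflows) are assumptions, and average duration is left out.

```python
def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for sub-sequences seen in >= min_support sequences."""
    patterns = {}

    def mine(prefix, projected):
        # projected holds (sequence index, position after the last matched item).
        counts = {}
        for seq_idx, pos in projected:
            for item in set(sequences[seq_idx][pos:]):
                counts[item] = counts.get(item, 0) + 1
        for item in sorted(counts):
            if counts[item] < min_support:
                continue
            pattern = prefix + (item,)
            patterns[pattern] = counts[item]
            # Project each sequence past the first occurrence of the item.
            next_projected = []
            for seq_idx, pos in projected:
                seq = sequences[seq_idx]
                if item in seq[pos:]:
                    next_projected.append((seq_idx, seq.index(item, pos) + 1))
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return patterns


# Hypothetical workflows, one token per call.
workflows = [
    ["LLM:llama-3", "TOOL:bing-search", "LLM:llama-3"],
    ["LLM:llama-3", "TOOL:bing-search"],
    ["TOOL:bing-search", "LLM:llama-3"],
]
patterns = prefixspan(workflows, min_support=2)
coverage = {p: support / len(workflows) for p, support in patterns.items()}
```

Sub-sequences here need not be contiguous: a pattern matches any workflow that contains its tokens in the same order.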
- class PrefixSpanSubworkflowResult(/, **data: Any)
Bases: pydantic.BaseModel
Pydantic model for the final outcome:
  - A list of frequent patterns
  - A textual summary
Create a new model by parsing and validating input data from keyword arguments.
Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- patterns: list[FrequentPattern]