nat.profiler.prediction_trie#

Classes#

`LLMCallPrediction`	Predictions for an LLM call at a given position in the call hierarchy.
`PredictionMetrics`	Aggregated statistics for a single metric from profiler data.
`PredictionTrieNode`	A node in the prediction trie representing a function in the call hierarchy.
`PredictionTrieBuilder`	Builds a prediction trie from profiler execution traces.

Functions#

`load_prediction_trie`(...)	Load a prediction trie from a JSON file.
`save_prediction_trie`(→ None)	Save a prediction trie to a JSON file.

Package Contents#

class LLMCallPrediction(/, **data: Any)#

Bases: pydantic.BaseModel

Predictions for an LLM call at a given position in the call hierarchy.

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

remaining_calls: PredictionMetrics = None#

interarrival_ms: PredictionMetrics = None#

output_tokens: PredictionMetrics = None#

latency_sensitivity: int | None = None#

class PredictionMetrics(/, **data: Any)#

Bases: pydantic.BaseModel

Aggregated statistics for a single metric from profiler data.

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

sample_count: int = None#

mean: float = None#

p50: float = None#

p90: float = None#

p95: float = None#
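To make the five fields concrete, the sketch below computes the same statistics from a list of raw samples. It uses a stand-in dataclass (`MetricsSketch`, a hypothetical name) rather than the real pydantic model, and the nearest-rank percentile rule is an assumption; the profiler may interpolate between samples instead.

```python
import math
from dataclasses import dataclass


@dataclass
class MetricsSketch:
    """Stand-in mirroring the PredictionMetrics fields (not the real model)."""
    sample_count: int
    mean: float
    p50: float
    p90: float
    p95: float


def nearest_rank(ordered: list[float], pct: int) -> float:
    """Nearest-rank percentile over an ascending list (assumed rule)."""
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples: list[float]) -> MetricsSketch:
    """Aggregate raw samples into the five summary statistics."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    return MetricsSketch(
        sample_count=len(ordered),
        mean=sum(ordered) / len(ordered),
        p50=nearest_rank(ordered, 50),
        p90=nearest_rank(ordered, 90),
        p95=nearest_rank(ordered, 95),
    )


metrics = summarize([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
```

With ten evenly spaced samples, the nearest-rank rule picks the 5th, 9th, and 10th values for `p50`, `p90`, and `p95` respectively.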

class PredictionTrieNode(/, **data: Any)#

Bases: pydantic.BaseModel

A node in the prediction trie representing a function in the call hierarchy.

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

name: str = None#

children: dict[str, PredictionTrieNode] = None#

predictions_by_call_index: dict[int, LLMCallPrediction] = None#

predictions_any_index: LLMCallPrediction | None = None#
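The field layout suggests how a consumer might query the trie: walk `children` along a function path, prefer an entry in `predictions_by_call_index`, and fall back to `predictions_any_index`. The sketch below illustrates that idea with a stand-in dataclass and string placeholders for predictions. The `lookup` function, its fallback order, and the convention that the path is relative to the root are all assumptions, not the library's documented behavior.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class NodeSketch:
    """Stand-in mirroring the PredictionTrieNode fields (not the real model)."""
    name: str
    children: dict[str, NodeSketch] = field(default_factory=dict)
    predictions_by_call_index: dict[int, str] = field(default_factory=dict)
    predictions_any_index: str | None = None


def _pick(node: NodeSketch, call_index: int) -> str | None:
    """Prefer the exact call index, else the node's aggregate prediction."""
    exact = node.predictions_by_call_index.get(call_index)
    return exact if exact is not None else node.predictions_any_index


def lookup(root: NodeSketch, path: list[str], call_index: int) -> str | None:
    """Hypothetical lookup: the deepest node on the path with a prediction wins."""
    node = root
    best = _pick(root, call_index)
    for name in path:
        node = node.children.get(name)
        if node is None:
            break  # unseen path segment: keep the deepest match so far
        found = _pick(node, call_index)
        if found is not None:
            best = found
    return best


root = NodeSketch("root", predictions_any_index="root-any")
root.children["agent"] = NodeSketch(
    "agent",
    predictions_by_call_index={1: "agent-1"},
    predictions_any_index="agent-any",
)
```

Falling back to shallower nodes means an unseen path still yields a prediction as long as some ancestor was profiled.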

load_prediction_trie( path: pathlib.Path, ) → nat.profiler.prediction_trie.data_models.PredictionTrieNode#

Load a prediction trie from a JSON file.

Args:
    path: Path to the JSON file

Returns:
    The deserialized prediction trie root node

save_prediction_trie( trie: nat.profiler.prediction_trie.data_models.PredictionTrieNode, path: pathlib.Path, workflow_name: str = 'unknown', ) → None#

Save a prediction trie to a JSON file.

Args:
    trie: The prediction trie root node
    path: Path to save the JSON file
    workflow_name: Name of the workflow this trie was built from

class PredictionTrieBuilder( sensitivity_config: SensitivityConfig | None = None, )#

Builds a prediction trie from profiler execution traces.

_node_accumulators: dict[tuple[str, ...], _NodeAccumulators]#

_sensitivity_config = None#

add_trace( steps: list[nat.data_models.intermediate_step.IntermediateStep], ) → None#

Process a single execution trace and update accumulators.

_extract_llm_contexts( steps: list[nat.data_models.intermediate_step.IntermediateStep], ) → list[LLMCallContext]#

Extract LLM call contexts from a trace.

_compute_sensitivity_scores(contexts: list[LLMCallContext]) → None#

Compute composite sensitivity scores for each call in the trace.

Parallel siblings are detected via temporal overlap and assigned the same logical position so that the U-shaped position signal and fan-out signal treat them as a single workflow step rather than spreading them across sequential indices.

After computing raw weighted scores, the values are min-max normalized across all calls in the trace so the full 0–1 range is used. This ensures the most-sensitive call in a trace maps to the top of the scale and the least-sensitive call maps to the bottom.
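The normalization step can be sketched as follows. This is a minimal illustration of min-max scaling, not the builder's actual code; in particular, returning 0.5 when every score is equal is an assumption, since the text above does not say how that degenerate case is handled.

```python
def min_max_normalize(scores: list[float]) -> list[float]:
    """Rescale scores so the smallest maps to 0.0 and the largest to 1.0."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: with no spread, place every call at mid-scale.
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


normalized = min_max_normalize([2.0, 4.0, 6.0])
```

Because the scaling is per trace, a score is only comparable to other scores from the same trace.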

static _compute_logical_positions( contexts: list[LLMCallContext], ) → list[int]#

Assign a logical position to each call, collapsing parallel siblings.

Uses standard interval-merging: contexts are sorted by span start time, and any call whose start is before the current group’s latest end time joins the group (capturing transitive overlaps). The resulting group indices are then mapped back to the original LLM_END ordering.

All calls in a parallel group share the same logical position index, so the U-shaped position signal and fan-out signal treat them as occupying a single workflow step.
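The interval-merging approach described above can be sketched as below. Each call is reduced to a `(start, end)` tuple listed in its original ordering; that representation and the function name are assumptions, since the real method operates on `LLMCallContext` objects.

```python
def logical_positions(spans: list[tuple[float, float]]) -> list[int]:
    """Return one logical position per span, in the input ordering.

    spans[i] is the (start, end) time of call i.
    """
    positions = [0] * len(spans)
    # Visit calls by start time, remembering where each one came from.
    by_start = sorted(range(len(spans)), key=lambda i: spans[i][0])
    group = -1
    group_end = float("-inf")
    for i in by_start:
        start, end = spans[i]
        if start < group_end:
            # Overlaps the current group (possibly transitively): join it.
            group_end = max(group_end, end)
        else:
            # Starts at or after the group's latest end: open a new group.
            group += 1
            group_end = end
        positions[i] = group  # map back to the original index
    return positions


positions = logical_positions([(0, 5), (7, 9), (6, 10), (9.5, 12), (20, 21)])
```

In this example the second, third, and fourth calls form one parallel group: the fourth overlaps only the third, but joins the group because the group's latest end time is tracked rather than each member's own.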

static _build_sibling_map( steps: list[nat.data_models.intermediate_step.IntermediateStep], ) → dict[str, list[_SiblingSpan]]#

Pair START/END events by UUID, then group by parent_id.

Only considers LLM, TOOL, FUNCTION, and SPAN event types. Returns a mapping from parent_id to all completed sibling spans under that parent.
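A rough sketch of the pairing logic follows. `EventSketch` and `SpanSketch` are hypothetical stand-ins with a simplified shape; the real `IntermediateStep` and `_SiblingSpan` types carry different fields, and the string values used for event kinds and phases here are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass

TRACKED_KINDS = {"LLM", "TOOL", "FUNCTION", "SPAN"}


@dataclass
class EventSketch:
    """Hypothetical, simplified trace event."""
    uuid: str
    parent_id: str
    kind: str       # e.g. "LLM", "TOOL"
    phase: str      # "START" or "END"
    timestamp: float


@dataclass
class SpanSketch:
    """Hypothetical completed span."""
    uuid: str
    start: float
    end: float


def build_sibling_map(events: list[EventSketch]) -> dict[str, list[SpanSketch]]:
    """Pair START/END events by UUID and group completed spans by parent."""
    open_starts: dict[str, EventSketch] = {}
    siblings: dict[str, list[SpanSketch]] = defaultdict(list)
    for ev in events:
        if ev.kind not in TRACKED_KINDS:
            continue
        if ev.phase == "START":
            open_starts[ev.uuid] = ev
        elif ev.phase == "END":
            begin = open_starts.pop(ev.uuid, None)
            if begin is not None:  # ignore END events with no matching START
                siblings[begin.parent_id].append(
                    SpanSketch(ev.uuid, begin.timestamp, ev.timestamp)
                )
    return dict(siblings)
```

Spans that never receive an END event stay in `open_starts` and are left out of the result, which matches the "completed sibling spans" wording above.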

static _compute_parallel_slack( llm_uuid: str, llm_start: float, llm_end: float, siblings: list[_SiblingSpan], ) → float#

Compute the parallel slack ratio for an LLM call relative to its siblings.

slack = max(0, 1 - llm_duration / max_overlapping_sibling_duration)

Returns 0.0 when the LLM call is the longest overlapping sibling, and approaches 1.0 when a much longer sibling runs in parallel.
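The formula can be worked through in a short sketch. `SpanSketch` is a hypothetical stand-in for `_SiblingSpan`; the strict overlap test and the choice to return 0.0 when no sibling overlaps are assumptions that go beyond what the formula above states.

```python
from dataclasses import dataclass


@dataclass
class SpanSketch:
    """Hypothetical completed span."""
    uuid: str
    start: float
    end: float


def parallel_slack(
    llm_uuid: str,
    llm_start: float,
    llm_end: float,
    siblings: list[SpanSketch],
) -> float:
    """slack = max(0, 1 - llm_duration / max_overlapping_sibling_duration)."""
    llm_duration = llm_end - llm_start
    overlapping = [
        s.end - s.start
        for s in siblings
        if s.uuid != llm_uuid and s.start < llm_end and s.end > llm_start
    ]
    longest = max(overlapping, default=0.0)
    if longest <= 0.0:
        return 0.0  # assumption: no overlapping sibling means no slack
    return max(0.0, 1.0 - llm_duration / longest)


slack = parallel_slack("llm", 0.0, 2.0, [SpanSketch("tool", 0.0, 8.0)])
```

Here a 2-second LLM call runs alongside an 8-second tool call, so the slack is `1 - 2/8 = 0.75`: the LLM call could take considerably longer without delaying the parent step.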

_build_path( step: nat.data_models.intermediate_step.IntermediateStep, ) → list[str]#

Build the function path from ancestry.

_update_accumulators(ctx: LLMCallContext) → None#

Update accumulators at every node along the path.

_add_to_accumulators( path_key: tuple[str, ...], ctx: LLMCallContext, ) → None#

Add context data to accumulators for a specific path.

build() → nat.profiler.prediction_trie.data_models.PredictionTrieNode#

Build the final prediction trie from accumulated data.

_get_or_create_node( root: nat.profiler.prediction_trie.data_models.PredictionTrieNode, path_key: tuple[str, ...], ) → nat.profiler.prediction_trie.data_models.PredictionTrieNode#

Navigate to or create a node at the given path.

_populate_node_predictions( node: nat.profiler.prediction_trie.data_models.PredictionTrieNode, accs: _NodeAccumulators, ) → None#

Populate a node with computed predictions from accumulators.

_score_to_sensitivity( acc: nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator | None, ) → int | None#

Convert accumulated sensitivity scores to a clamped integer.