nat.profiler.prediction_trie.trie_builder#

Classes#

`_SiblingSpan`	A paired START/END span used for parallel sibling overlap detection.
`SensitivityConfig`	Configuration for auto-sensitivity scoring.
`LLMCallContext`	Context for a single LLM call extracted from a trace.
`_NodeAccumulators`	Accumulators for a single trie node.
`PredictionTrieBuilder`	Builds a prediction trie from profiler execution traces.

Module Contents#

class _SiblingSpan#

A paired START/END span used for parallel sibling overlap detection.

uuid: str#

parent_id: str#

start_time: float#

end_time: float#

is_llm: bool#

class SensitivityConfig#

Configuration for auto-sensitivity scoring.

sensitivity_scale: int = 5#

w_critical: float = 0.5#

w_fanout: float = 0.3#

w_position: float = 0.2#

w_parallel: float = 0.0#
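
The weights above suggest a weighted combination of four per-call signals. The sketch below is an illustration only: `Weights` and `raw_score` are hypothetical names, the signals are assumed to already lie in the 0 to 1 range, and the way the builder actually combines them is not documented on this page.

```python
from dataclasses import dataclass


@dataclass
class Weights:
    """Hypothetical mirror of the SensitivityConfig weight fields."""
    w_critical: float = 0.5
    w_fanout: float = 0.3
    w_position: float = 0.2
    w_parallel: float = 0.0


def raw_score(w: Weights, critical: float, fanout: float,
              position: float, parallel: float) -> float:
    """Assumed combination rule: a plain weighted sum of the four signals."""
    return (w.w_critical * critical
            + w.w_fanout * fanout
            + w.w_position * position
            + w.w_parallel * parallel)


# With the default weights, the parallel signal has no effect on the score.
score = raw_score(Weights(), critical=1.0, fanout=0.0, position=1.0, parallel=1.0)
```

Because `w_parallel` defaults to 0.0, the parallel signal only contributes once that weight is raised explicitly.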

class LLMCallContext#

Context for a single LLM call extracted from a trace.

path: list[str]#

call_index: int#

remaining_calls: int#

time_to_next_ms: float | None#

output_tokens: int#

call_duration_s: float = 0.0#

workflow_duration_s: float = 0.0#

parallel_slack_ratio: float = 0.0#

sensitivity_score: float = 0.0#

span_start_time: float = 0.0#

span_end_time: float = 0.0#

class _NodeAccumulators#

Accumulators for a single trie node.

remaining_calls: dict[int, nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator]#

interarrival_ms: dict[int, nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator]#

output_tokens: dict[int, nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator]#

all_remaining_calls: nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator#

all_interarrival_ms: nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator#

all_output_tokens: nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator#

sensitivity: dict[int, nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator]#

all_sensitivity: nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator#

class PredictionTrieBuilder( sensitivity_config: SensitivityConfig | None = None, )#

Builds a prediction trie from profiler execution traces.

_node_accumulators: dict[tuple[str, Ellipsis], _NodeAccumulators]#

_sensitivity_config = None#

add_trace( steps: list[nat.data_models.intermediate_step.IntermediateStep], ) → None#: Process a single execution trace and update accumulators.

_extract_llm_contexts( steps: list[nat.data_models.intermediate_step.IntermediateStep], ) → list[LLMCallContext]#: Extract LLM call contexts from a trace.

_compute_sensitivity_scores(contexts: list[LLMCallContext]) → None#

Compute composite sensitivity scores for each call in the trace.

Parallel siblings are detected via temporal overlap and assigned the same logical position. As a result, the U-shaped position signal and the fan-out signal treat them as a single workflow step instead of spreading them across sequential indices.

After computing raw weighted scores, the values are min-max normalized across all calls in the trace so the full 0–1 range is used. This ensures the most-sensitive call in a trace maps to the top of the scale and the least-sensitive call maps to the bottom.
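
The min-max normalization described above can be sketched as follows. The function name is hypothetical, and the handling of a trace whose calls all have the same raw score (mapped to 0.5 here) is an assumption, since this page does not say what the builder does in that case.

```python
def min_max_normalize(scores: list[float]) -> list[float]:
    """Rescale scores so the smallest maps to 0.0 and the largest to 1.0."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: with no spread, place every call at the midpoint.
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


normalized = min_max_normalize([1.0, 2.0, 3.0])
```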

static _compute_logical_positions( contexts: list[LLMCallContext], ) → list[int]#

Assign a logical position to each call, collapsing parallel siblings.

Uses standard interval merging: contexts are sorted by span start time, and any call that starts before the current group's latest end time joins that group, which also captures transitive overlaps. The resulting group indices are then mapped back to the original LLM_END ordering.

All calls in a parallel group share the same logical position index, so the U-shaped position signal and fan-out signal treat them as occupying a single workflow step.
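
A minimal sketch of the interval-merging step, assuming each call is reduced to a `(start, end)` tuple given in the original call order. The function name and the tuple representation are illustrative and not part of the module.

```python
def logical_positions(spans: list[tuple[float, float]]) -> list[int]:
    """Return one group index per span, in the same order as the input."""
    # Visit spans by start time, but remember their original indices.
    order = sorted(range(len(spans)), key=lambda i: spans[i][0])
    positions = [0] * len(spans)
    group = -1
    group_end = float("-inf")
    for i in order:
        start, end = spans[i]
        if start < group_end:
            # Overlaps the current group: join it and extend its end time.
            group_end = max(group_end, end)
        else:
            # Starts after the group has ended: open a new group.
            group += 1
            group_end = end
        positions[i] = group
    return positions


# The middle three calls overlap transitively and share position 1.
positions = logical_positions([(0, 1), (2, 5), (3, 4), (4.5, 6), (7, 8)])
```

Tracking the latest end time of the group, rather than the end of the previous span, is what lets `(4.5, 6)` join the group even though it does not overlap `(3, 4)` directly.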

static _build_sibling_map( steps: list[nat.data_models.intermediate_step.IntermediateStep], ) → dict[str, list[_SiblingSpan]]#

Pair START/END events by UUID, then group by parent_id.

Only considers LLM, TOOL, FUNCTION, and SPAN event types. Returns a mapping from parent_id to all completed sibling spans under that parent.
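
The pairing and grouping can be sketched as below. `Event` is a hypothetical stand-in for `IntermediateStep`, whose real fields are not shown on this page. `SiblingSpan` mirrors the fields of `_SiblingSpan`, and the string event kinds are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical simplified trace event."""
    uuid: str
    parent_id: str
    kind: str        # e.g. "LLM", "TOOL", "FUNCTION", "SPAN"
    phase: str       # "START" or "END"
    timestamp: float


@dataclass
class SiblingSpan:
    uuid: str
    parent_id: str
    start_time: float
    end_time: float
    is_llm: bool


TRACKED_KINDS = {"LLM", "TOOL", "FUNCTION", "SPAN"}


def build_sibling_map(events: list[Event]) -> dict[str, list[SiblingSpan]]:
    open_starts: dict[str, Event] = {}
    by_parent: dict[str, list[SiblingSpan]] = defaultdict(list)
    for ev in events:
        if ev.kind not in TRACKED_KINDS:
            continue
        if ev.phase == "START":
            open_starts[ev.uuid] = ev
        elif ev.phase == "END" and ev.uuid in open_starts:
            # Only spans with both a START and an END are recorded.
            start = open_starts.pop(ev.uuid)
            by_parent[start.parent_id].append(
                SiblingSpan(ev.uuid, start.parent_id, start.timestamp,
                            ev.timestamp, is_llm=(ev.kind == "LLM")))
    return dict(by_parent)
```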

static _compute_parallel_slack( llm_uuid: str, llm_start: float, llm_end: float, siblings: list[_SiblingSpan], ) → float#

Compute the parallel slack ratio for an LLM call relative to its siblings.

slack = max(0, 1 - llm_duration / max_overlapping_sibling_duration)

Returns 0.0 when the LLM call is the longest overlapping sibling, and approaches 1.0 when a much longer sibling runs in parallel.
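
The formula can be sketched as follows, using a local `SiblingSpan` dataclass that mirrors `_SiblingSpan`. Two details are assumptions rather than documented behavior: a sibling counts as overlapping when its interval strictly intersects the LLM call's interval, and the result is 0.0 when no other sibling overlaps.

```python
from dataclasses import dataclass


@dataclass
class SiblingSpan:
    uuid: str
    parent_id: str
    start_time: float
    end_time: float
    is_llm: bool


def parallel_slack(llm_uuid: str, llm_start: float, llm_end: float,
                   siblings: list[SiblingSpan]) -> float:
    llm_duration = llm_end - llm_start
    # Durations of the other siblings whose spans intersect the LLM call.
    overlapping = [
        s.end_time - s.start_time
        for s in siblings
        if s.uuid != llm_uuid
        and s.start_time < llm_end
        and s.end_time > llm_start
    ]
    if not overlapping or max(overlapping) <= 0:
        return 0.0  # assumption: no parallel sibling means no slack
    return max(0.0, 1.0 - llm_duration / max(overlapping))


siblings = [
    SiblingSpan("llm-1", "p", 0.0, 2.0, True),
    SiblingSpan("tool-1", "p", 0.0, 10.0, False),
    SiblingSpan("tool-2", "p", 20.0, 100.0, False),  # does not overlap llm-1
]
slack = parallel_slack("llm-1", 0.0, 2.0, siblings)  # 1 - 2/10
```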

_build_path( step: nat.data_models.intermediate_step.IntermediateStep, ) → list[str]#: Build the function path from ancestry.

_update_accumulators(ctx: LLMCallContext) → None#: Update accumulators at every node along the path.
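
One way to read "every node along the path" is that a call is recorded under each prefix of its path. The sketch below assumes this, and also assumes the empty prefix (the root) is included. `path_prefixes` and `record` are hypothetical helpers, and a plain list stands in for `MetricsAccumulator`.

```python
def path_prefixes(path: list[str]) -> list[tuple[str, ...]]:
    """All prefixes of a path as tuple keys, from the root to the full path."""
    return [tuple(path[:n]) for n in range(len(path) + 1)]


def record(accumulators: dict[tuple[str, ...], list[int]],
           path: list[str], output_tokens: int) -> None:
    """Append one sample under every prefix key of the path."""
    for key in path_prefixes(path):
        accumulators.setdefault(key, []).append(output_tokens)


accs: dict[tuple[str, ...], list[int]] = {}
record(accs, ["workflow", "agent"], 120)
record(accs, ["workflow", "summarizer"], 40)
```

Under this reading, a shared ancestor such as `("workflow",)` aggregates samples from all of its descendants, while leaf keys only see their own calls.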

_add_to_accumulators( path_key: tuple[str, Ellipsis], ctx: LLMCallContext, ) → None#: Add context data to accumulators for a specific path.

build() → nat.profiler.prediction_trie.data_models.PredictionTrieNode#: Build the final prediction trie from accumulated data.

_get_or_create_node( root: nat.profiler.prediction_trie.data_models.PredictionTrieNode, path_key: tuple[str, Ellipsis], ) → nat.profiler.prediction_trie.data_models.PredictionTrieNode#: Navigate to or create a node at the given path.

_populate_node_predictions( node: nat.profiler.prediction_trie.data_models.PredictionTrieNode, accs: _NodeAccumulators, ) → None#: Populate a node with computed predictions from accumulators.

_score_to_sensitivity( acc: nat.profiler.prediction_trie.metrics_accumulator.MetricsAccumulator | None, ) → int | None#: Convert accumulated sensitivity scores to a clamped integer.