Prefix Synthesis API Reference
Complete API documentation for the prefix synthesis module.
Module: aiperf.dataset.synthesis
Classes
RollingHasher
Converts sequences of text blocks into globally unique hash IDs using a rolling hash.
Constructor:
Parameters:
block_size (int): Number of tokens per block for hashing (default: 512)
Methods:
hash_blocks(blocks: Sequence[str]) -> list[int]
- Convert a sequence of text blocks to hash IDs
- Parameters:
blocks: Sequence of text strings representing blocks
- Returns: List of unique hash IDs
hash_token_blocks(blocks: Sequence[Sequence[int]]) -> list[int]
- Convert a sequence of token blocks to hash IDs
- Parameters:
blocks: Sequence of token blocks (each block is a sequence of token IDs)
- Returns: List of unique hash IDs
reset() -> None
- Reset the hasher state for hashing new sequences
- Note: Maintains global uniqueness across sequences (keeps _hash_to_id and _id_counter)
get_stats() -> dict[str, int]
- Get statistics about the hasher
- Returns: Dictionary with 'total_hashes' and 'max_id'
Example:
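The original example is not preserved here; the following is a minimal, self-contained sketch of the semantics described above (the class name and hashing internals are illustrative stand-ins, not the aiperf implementation):

```python
# Illustrative sketch of RollingHasher semantics (not the aiperf code):
# each block's ID is derived from the entire prefix up to that block, and
# IDs stay globally unique across sequences.
class RollingHasherSketch:
    def __init__(self, block_size: int = 512):
        self.block_size = block_size
        self._hash_to_id: dict[tuple, int] = {}
        self._id_counter = 0

    def hash_blocks(self, blocks):
        ids, prefix = [], ()
        for block in blocks:
            prefix = prefix + (block,)  # rolling: ID depends on the whole prefix
            if prefix not in self._hash_to_id:
                self._hash_to_id[prefix] = self._id_counter
                self._id_counter += 1
            ids.append(self._hash_to_id[prefix])
        return ids

hasher = RollingHasherSketch()
a = hasher.hash_blocks(["alpha", "beta"])   # [0, 1]
b = hasher.hash_blocks(["alpha", "gamma"])  # [0, 2] -- shared first block reuses ID 0
```

Because the first block is shared, both sequences map it to the same ID, which is what lets downstream analysis detect prefix sharing.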
RadixTree
Compact representation of prefix patterns using a radix tree data structure.
Constructor:
Methods:
add_path(path: list[int]) -> RadixNode
- Add a path to the tree from root
- Parameters:
path: List of edge labels (hash IDs or token counts)
- Returns: The leaf RadixNode at the end of the path
- Side effects: Increments visit_count for all nodes in the path
get_node(node_id: int) -> RadixNode | None
- Get node by ID
- Returns: RadixNode or None if not found
get_all_nodes() -> list[RadixNode]
- Get all nodes in the tree
- Returns: List of all RadixNode instances
get_stats() -> RadixTreeStats
- Get statistics about tree structure
- Returns:
RadixTreeStats object with 'num_nodes', 'num_leaves', 'total_visits', 'max_depth'
Properties:
root: RadixNode
- The root node of the tree (read-only)
Example:
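The original example is not preserved; below is a minimal sketch of the add_path and visit-counting behavior (an illustrative stand-in, not the aiperf implementation):

```python
# Illustrative radix-tree sketch: add_path creates missing nodes and bumps
# visit_count along the way, so shared prefixes accumulate visits.
class NodeSketch:
    def __init__(self, node_id, label=None, parent=None):
        self.node_id = node_id
        self.label = label
        self.parent = parent
        self.visit_count = 0
        self.children = {}

class RadixTreeSketch:
    def __init__(self):
        self._next_id = 0
        self.root = self._new_node()

    def _new_node(self, label=None, parent=None):
        node = NodeSketch(self._next_id, label, parent)
        self._next_id += 1
        return node

    def add_path(self, path):
        node = self.root
        node.visit_count += 1
        for label in path:
            if label not in node.children:
                node.children[label] = self._new_node(label, node)
            node = node.children[label]
            node.visit_count += 1
        return node  # leaf node at the end of the path

tree = RadixTreeSketch()
tree.add_path([0, 1])
tree.add_path([0, 2])
assert tree.root.children[0].visit_count == 2  # first edge shared by both paths
```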
RadixNode
A node in the radix tree representing a prefix path.
Constructor:
Attributes:
node_id: Unique node identifier
label: Edge label (token count)
visit_count: Number of times this node is visited
children: Dictionary of child nodes by edge label
parent: Parent node reference
Methods:
add_child(label: int, child: RadixNode) -> None
- Add a child node with the given edge label
get_child(label: int) -> RadixNode | None
- Get child with given label
is_leaf() -> bool
- Check if this node is a leaf (no children)
EmpiricalSampler
Samples values from an empirical distribution learned from data.
Constructor:
Parameters:
data: List of observed values to learn distribution from
Methods:
sample() -> int | float
- Draw a single sample from the learned distribution
- Returns: A sampled value
sample_batch(size: int) -> list[int | float]
- Draw multiple samples from the learned distribution
- Parameters:
size: Number of samples to draw
- Returns: List of sampled values
get_stats() -> EmpiricalSamplerStats
- Get statistics about the learned distribution
- Returns:
EmpiricalSamplerStats object with 'min', 'max', 'mean', 'median', 'num_unique'
Example:
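The original example is not preserved; the sketch below shows the idea of empirical sampling (illustrative stand-in, not the aiperf implementation): draw uniformly from the observed data so samples follow the empirical distribution.

```python
import random

# Illustrative sketch of an empirical sampler: sampling uniformly from the
# observed values reproduces their empirical distribution.
class EmpiricalSamplerSketch:
    def __init__(self, data):
        if not data:
            raise ValueError("data must be non-empty")
        self._data = list(data)

    def sample(self):
        return random.choice(self._data)

    def sample_batch(self, size):
        return [self.sample() for _ in range(size)]

sampler = EmpiricalSamplerSketch([128, 256, 256, 512])
batch = sampler.sample_batch(100)
assert all(v in (128, 256, 512) for v in batch)
```

Note that 256 appears twice in the data, so it is drawn roughly twice as often as the other values.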
PrefixAnalyzer
Analyzes traces to extract ISL/OSL statistics and prefix patterns.
Constructor:
Parameters:
block_size (int): Number of tokens per block for analysis (default: 512)
Methods:
analyze_file(trace_file: Path | str) -> AnalysisStats
- Analyze a mooncake trace file
- Parameters:
trace_file: Path to JSONL trace file
- Returns: AnalysisStats object
analyze_traces(traces: list[dict]) -> AnalysisStats
- Analyze a list of trace dictionaries
- Parameters:
traces: List of trace dictionaries
- Returns: AnalysisStats object
Example:
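The original example is not preserved; as a stand-in, the snippet below shows the shape of mooncake trace dicts (the field names input_length, output_length, and hash_ids are assumptions, not confirmed by this reference) and the ISL/OSL means that AnalysisStats reports as avg_isl and avg_osl:

```python
# Hypothetical mooncake trace records (field names are assumptions): each
# request carries an input length, an output length, and the hash IDs of
# its prefix blocks.
traces = [
    {"input_length": 1024, "output_length": 128, "hash_ids": [0, 1]},
    {"input_length": 2048, "output_length": 256, "hash_ids": [0, 2]},
]

# avg_isl / avg_osl in AnalysisStats correspond to these means.
avg_isl = sum(t["input_length"] for t in traces) / len(traces)
avg_osl = sum(t["output_length"] for t in traces) / len(traces)
assert (avg_isl, avg_osl) == (1536.0, 192.0)
```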
Synthesizer
Generates synthetic traces preserving prefix-sharing patterns.
Constructor:
Parameters:
params (SynthesisParams): Generation configuration (optional; uses defaults if None)
Methods:
synthesize_from_file(trace_file: Path | str) -> list[dict]
- Synthesize traces from an input trace file
- Parameters:
trace_file: Path to input JSONL trace file
- Returns: List of synthetic trace dictionaries
synthesize_traces(traces: list[dict]) -> list[dict]
- Synthesize traces from a list of trace dictionaries
- Parameters:
traces: List of input trace dictionaries
- Returns: List of synthetic trace dictionaries
synthesize_grouped_traces(data: dict[str, list[dict]]) -> dict[str, list[dict]]
- Synthesize traces while preserving session grouping
- Parameters:
data: Dictionary mapping session_id to list of trace dicts
- Returns: Dictionary mapping session_id to list of synthesized trace dicts
get_stats() -> dict[str, Any]
- Get synthesizer statistics
- Returns: Dictionary with 'tree_nodes', 'tree_depth', 'params'
Example:
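The original example is not preserved; the sketch below illustrates the contract of synthesize_grouped_traces (illustrative stand-in, not the aiperf implementation): session keys and per-session trace counts are preserved while each trace is regenerated.

```python
# Sketch of the grouped-synthesis contract: the output dict keeps the same
# session_ids and the same number of traces per session.
def synthesize_grouped_sketch(data, synthesize_one):
    return {sid: [synthesize_one(t) for t in ts] for sid, ts in data.items()}

data = {
    "sess-a": [{"input_length": 1024}],
    "sess-b": [{"input_length": 2048}, {"input_length": 4096}],
}
out = synthesize_grouped_sketch(data, lambda t: dict(t))
assert set(out) == {"sess-a", "sess-b"}
assert [len(v) for v in out.values()] == [1, 2]
```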
Data Models
AnalysisStats
Statistics extracted from trace analysis.
Fields:
total_requests: int - Total number of requests in trace
unique_prefixes: int - Number of unique prefix patterns (all prefix subsequences)
num_prefix_groups: int - Number of distinct shared first blocks (prefix groups)
cache_hit_rate: float - Theoretical cache hit rate (0.0 to 1.0)
min_isl: int - Minimum input sequence length
max_isl: int - Maximum input sequence length
avg_isl: float - Average input sequence length
min_osl: int - Minimum output sequence length
max_osl: int - Maximum output sequence length
avg_osl: float - Average output sequence length
prefix_reuse_ratio: float - Ratio of reused prefixes (0.0 to 1.0)
isl_stats: MetricStats | None - Full statistics for input sequence length
osl_stats: MetricStats | None - Full statistics for output sequence length
context_length_stats: MetricStats | None - Full statistics for context (shared prefix) length
unique_prompt_length_stats: MetricStats | None - Full statistics for unique prompt length
hit_rate_stats: MetricStats | None - Full statistics for per-request cache hit rates
MetricStats
Statistics for a single metric with percentiles.
Fields:
mean: float - Mean value
std_dev: float - Standard deviation
min: float - Minimum value
p25: float - 25th percentile
median: float - Median (50th percentile)
p75: float - 75th percentile
max: float - Maximum value
SynthesisParams
Parameters for synthetic trace generation.
Fields:
speedup_ratio: float = 1.0 - Timestamp scaling multiplier (ge 0.0)
prefix_len_multiplier: float = 1.0 - Core prefix length multiplier (ge 0.0)
prefix_root_multiplier: int = 1 - Number of independent trees to distribute traces across (ge 1)
prompt_len_multiplier: float = 1.0 - Leaf prompt length multiplier (ge 0.0)
max_isl: int | None = None - Maximum input sequence length filter
block_size: int = 512 - KV cache page size (ge 1)
Class Methods:
from_synthesis_config(config: SynthesisConfig, block_size: int = 512) -> SynthesisParams - Create from SynthesisConfig
RadixTreeStats
Statistics about radix tree structure.
Fields:
num_nodes: int - Total number of nodes in tree
num_leaves: int - Number of leaf nodes (nodes with no children)
total_visits: int - Number of paths added to tree
max_depth: int - Maximum depth from root to leaf
EmpiricalSamplerStats
Statistics about learned empirical distribution.
Fields:
min: float - Minimum value in original data
max: float - Maximum value in original data
mean: float - Mean of original data
median: float - Median of original data
num_unique: int - Number of unique values in distribution
Utility Functions
aiperf.dataset.synthesis.graph_utils
Graph manipulation utilities for radix tree operations.
validate_tree(tree: RadixTree) -> bool
- Validate tree structure consistency
- Checks parent-child relationships and reachability
remove_leaves(tree: RadixTree, visit_threshold: int = 1) -> None
- Remove leaf nodes visited visit_threshold times or fewer
- Parameters:
tree: RadixTree to prune
visit_threshold: Minimum visit count to keep
merge_unary_chains(tree: RadixTree) -> None
- Merge unary chains (runs of nodes with a single child) into compressed edges
- Modifies tree in-place
compute_transition_cdfs(tree: RadixTree) -> dict[int, np.ndarray]
- Compute cumulative distribution functions for outgoing transitions
- Returns: Dictionary mapping node IDs to CDF arrays
get_tree_stats(tree: RadixTree) -> dict[str, Any]
- Get comprehensive statistics about tree structure
- Returns: Dictionary with tree metrics (includes RadixTreeStats fields plus 'internal_nodes' and 'branching_factor')
aiperf.dataset.synthesis.rolling_hasher
Utility functions for converting between texts and hash IDs.
texts_to_hashes(tokenizer: Tokenizer, texts: list[str], block_size: int = 512) -> list[list[int]]
- Convert a list of texts to hash ID sequences
- Tokenizes texts, splits into blocks, and generates consecutive hash IDs
- Parameters:
tokenizer: Tokenizer for encoding texts
texts: List of input text strings
block_size: Number of tokens per block
- Returns: List of hash ID sequences, one per input text
hashes_to_texts(prompt_generator: PromptGenerator, hash_ids_list: list[list[int]], input_lengths: list[int], block_size: int = 512) -> list[str]
- Convert hash ID sequences back to text strings
- Uses the PromptGenerator's cache to ensure the same hash ID always produces the same token block
- Parameters:
prompt_generator: PromptGenerator instance for generating text from hash_ids
hash_ids_list: List of hash ID sequences
input_lengths: Target input lengths (in tokens) for each sequence
block_size: Number of tokens per block
- Returns: List of text strings, one per hash ID sequence
- Raises:
ValueError if len(hash_ids) * block_size < input_length for any sequence
CLI Commands
aiperf analyze-trace
Analyze a mooncake trace file for ISL/OSL distributions and cache hit rates.
Usage:
aiperf analyze-trace INPUT_FILE [OPTIONS]
Arguments:
INPUT_FILE: Path to input mooncake trace JSONL file
Options:
--block-size INT (default: 512) - KV cache block size
--output-file PATH - Optional output path for analysis report (JSON)
Example:
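The original example is not preserved; a plausible invocation based on the argument and options above (the trace file path is hypothetical):

```bash
# Hypothetical trace path; --block-size and --output-file as documented above.
aiperf analyze-trace traces/mooncake_trace.jsonl \
  --block-size 512 \
  --output-file analysis_report.json
```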
Configuration
Synthesis parameters can be configured via CLI or programmatically.
Via CLI Options (with aiperf profile):
Synthesis is applied automatically when running aiperf profile with mooncake traces and synthesis parameters:
Via SynthesisConfig:
Direct Instantiation:
Integration with AIPerf Benchmark
Synthesis is applied automatically during benchmark when using mooncake traces with synthesis parameters:
Synthesis is triggered automatically when any --synthesis-* parameter differs from its default value.
Performance Considerations
- Memory: Radix tree construction can be memory-intensive for large traces (>100k requests)
- Time: Analysis is O(n), where n is the number of requests
- Synthesis: O(n) to generate synthetic traces from analyzed data
- Optimization: Use remove_leaves() to prune infrequent paths for memory savings
Error Handling
All components raise standard Python exceptions; for example, hashes_to_texts raises ValueError when a hash ID sequence is too short for its target input length.