AIPerf Server Metrics Parquet Export Schema
Schema reference for the server_metrics_export.parquet file. Optimized for SQL analytics with DuckDB, pandas, and Polars.
Overview
The Parquet export provides raw time-series data with cumulative delta calculations applied at each timestamp. It uses a normalized schema in which histogram buckets are stored as separate rows rather than wide columns, producing files roughly 50% smaller than an equivalent wide-column layout.
Enable Parquet Export
Delta Calculations
All values are deltas from a reference point (the last sample taken before the profiling period):
Negative deltas (counter resets) are clamped to 0.
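A minimal sketch of the rule (the sample values are invented for illustration): each exported value is the raw sample minus the reference sample, and a counter reset makes that difference negative, which is clamped to 0.

```python
# Raw cumulative counter readings; the first is the reference sample
# taken just before the profiling period begins.
samples = [100, 150, 230, 40, 90]   # the drop to 40 is a counter reset
reference = samples[0]

# Delta from the reference at each later timestamp, clamped at 0.
deltas = [max(v - reference, 0) for v in samples[1:]]
print(deltas)  # [50, 130, 0, 0] -- the reset samples clamp to 0
```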
Schema Definition
Fixed Columns
Value Columns
Dynamic Label Columns
Prometheus labels become individual columns (alphabetically sorted):
Label columns vary by endpoint/model. Use union_by_name=true for cross-file queries.
Note: Prometheus labels that conflict with reserved column names (endpoint_url, metric_name, metric_type, unit, description, timestamp_ns, value, sum, count, bucket_le, bucket_count) are silently excluded.
Row Structure by Metric Type
Column order: fixed columns → label columns (alphabetically) → value columns.
Gauge/Counter: One Row per Timestamp
Histogram: N Rows per Timestamp (One per Bucket)
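To make the normalized layout concrete, the sketch below builds a toy frame (metric name, timestamp, and counts are invented) showing how one histogram observation at one timestamp becomes one row per bucket, all sharing the same fixed-column values.

```python
import pandas as pd

# One histogram sample at a single timestamp: four buckets -> four rows.
# bucket_count is cumulative, Prometheus-style (each bucket includes all
# observations with value <= bucket_le).
rows = pd.DataFrame({
    "metric_name": ["request_duration_seconds"] * 4,
    "timestamp_ns": [1_700_000_000_000_000_000] * 4,
    "bucket_le": [0.1, 0.5, 1.0, float("inf")],
    "bucket_count": [4, 9, 12, 12],
})

# Every row for the timestamp repeats the fixed columns; only the
# bucket_le / bucket_count pair varies.
rows_per_ts = rows.groupby("timestamp_ns").size()
print(rows_per_ts)
```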
File Metadata
Parquet file metadata (accessible via pq.read_metadata()) includes:
Compression: Snappy (good compression ratio with fast decompression)
Example Queries
DuckDB
pandas
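The same shape of query in pandas. The in-memory frame below stands in for the file (metric names and values invented); in practice you would load it with `pd.read_parquet("server_metrics_export.parquet")`.

```python
import pandas as pd

# Stand-in for pd.read_parquet("server_metrics_export.parquet").
df = pd.DataFrame({
    "metric_name": ["gpu_power_watts"] * 3 + ["requests_total"],
    "metric_type": ["gauge"] * 3 + ["counter"],
    "timestamp_ns": [1, 2, 3, 1],
    "value": [250.0, 300.0, 275.0, 42.0],
})

# Per-metric summary statistics over the gauge rows only.
gauges = df[df["metric_type"] == "gauge"]
stats = gauges.groupby("metric_name")["value"].agg(["mean", "max"])
print(stats)
```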
Polars
Reading Metadata
Best Practices
Cross-File Analysis
Label columns vary by endpoint and model. Always use union_by_name:
Histogram Percentile Estimation
Reconstruct percentiles from bucket data. Note that bucket_count values are cumulative (each bucket includes all observations with value <= bucket_le), matching Prometheus histogram semantics:
Memory-Efficient Processing
For large files, use lazy evaluation:
Schema Version History
For aggregated statistics, see JSON Schema. For metric definitions, see Server Metrics Reference. For usage examples, see the Server Metrics Tutorial.