This guide demonstrates how to programmatically work with AIPerf benchmark output files using the native Pydantic data models.
AIPerf generates multiple output formats after each benchmark run, each optimized for different analysis workflows:
inputs.json - Complete input dataset with formatted payloads for each requestprofile_export.jsonl - Per-request metric records in JSON Lines format with one record per lineprofile_export_aiperf.json - Aggregated statistics and user configuration as a single JSON objectprofile_export_aiperf.csv - Aggregated statistics in CSV formatAIPerf uses Pydantic models for type-safe parsing and validation of all benchmark output files. These models ensure data integrity and provide IDE autocompletion support.
File: artifacts/my-run/inputs.json
A structured representation of all input datasets converted to the payload format used by the endpoint.
Structure:
Key fields:
session_id: Unique identifier for the conversation. This can be used to correlate inputs with results.payloads: Array of formatted request payloads (one per turn in multi-turn conversations)File: artifacts/my-run/profile_export.jsonl
The JSONL output contains one record per line, for each request sent during the benchmark. Each record includes request metadata, computed metrics, and error information if the request failed.
Metadata Fields:
session_num: Sequential request number across the entire benchmark (0-indexed).
x_request_id: Unique identifier for this specific request. This is sent to the endpoint as the X-Request-ID header.x_correlation_id: Unique identifier for the user session. This is the same for all requests in the same user session for multi-turn conversations. This is sent to the endpoint as the X-Correlation-ID header.conversation_id: ID of the input dataset conversation. This can be used to correlate inputs with results.turn_index: Position within a multi-turn conversation (0-indexed), or 0 for single-turn conversations.request_start_ns: Epoch time in nanoseconds when request was initiated by AIPerf.request_ack_ns: Epoch time in nanoseconds when server acknowledged the request. This is only applicable to streaming requests.request_end_ns: Epoch time in nanoseconds when the last response was received from the endpoint.worker_id: ID of the AIPerf worker that executed the request against the endpoint.record_processor_id: ID of the AIPerf record processor that processed the results from the server.benchmark_phase: Phase of the benchmark. Currently only profiling is supported.was_cancelled: Whether the request was cancelled during execution (such as when --request-cancellation-rate is enabled).cancellation_time_ns: Epoch time in nanoseconds when the request was cancelled (if applicable).Metrics: See the Complete Metrics Reference page for a list of all metrics and their descriptions. Will always be null for failed requests.
Error Fields:
code: HTTP status code or custom error codetype: Classification of the error (e.g., timeout, cancellation, server error). Typically the python exception class name.message: Human-readable error descriptionFile: artifacts/my-run/profile_export_aiperf.json
A single JSON object containing statistical summaries (min, max, mean, percentiles) for all metrics across the entire benchmark run, as well as the user configuration used for the benchmark.
File: artifacts/my-run/profile_export_aiperf.csv
Contains the same aggregated statistics as the JSON format, but in a spreadsheet-friendly structure with one metric per row.
AIPerf output files can be parsed using the native Pydantic models for type-safe data handling and analysis.
For large benchmark runs with thousands of requests, use async file I/O for better performance:
Load and analyze the inputs.json file to understand what data was sent during the benchmark:
Combine artifacts/my-run/inputs.json with artifacts/my-run/profile_export.jsonl for deeper analysis:
MetricRecordInfo, MetricRecordMetadata, and MetricValue model definitionsInputsFile and SessionPayloads model definitions