
Copyright © 2026, NVIDIA Corporation.

profile_export_aiperf.json Schema

After every aiperf profile run, AIPerf writes a summary JSON file (default name profile_export_aiperf.json) under the artifact directory. Each top-level metric entry holds a stats block; this page documents which fields appear in that block, when they appear, and how the schema is versioned.

The on-disk shape is produced by JsonMetricResult in src/aiperf/common/models/export_models.py. Fields that are unset are omitted from the JSON output (exclude_none=True), so the field set per metric varies by metric type — this page is the source of truth for which fields to expect where.

Per-metric stats fields

| Field | Type | Always present? | Notes |
|---|---|---|---|
| unit | string | yes | Display unit, e.g. "ms", "requests/sec", "tokens". |
| avg | float | record metrics with observations; derived/aggregate metrics | For derived/aggregate scalar metrics, avg carries the single computed value. |
| min | number | record metrics with a distribution | Smallest observation. |
| max | number | record metrics with a distribution | Largest observation. |
| p1, p5, p10, p25, p50, p75, p90, p95, p99 | float | record metrics with a distribution | Percentiles. Omitted for derived/aggregate metrics that have no distribution. |
| std | float | record metrics with a distribution | Sample standard deviation. |
| count | int | record metrics only | Number of records contributing to the distribution. Intentionally omitted for derived/aggregate scalar metrics where it would trivially be 1 and risks being misread as the request count. |
| sum | number | record metrics with a distribution sum | Sum of all observations. Absent for derived metrics whose value is itself a computed rate or total. |

The metric type (record / aggregate / derived) is documented per-metric in Metrics Reference. At a glance: latencies and per-request lengths are record; counts and timestamps are aggregate; throughputs and run-level totals are derived.

Example

A run with 20 requests against a streaming chat endpoint produces entries shaped like this:

{
  "schema_version": "1.3",
  "request_latency": {
    "unit": "ms",
    "avg": 2620.71,
    "min": 2145.06,
    "max": 3411.10,
    "p50": 2568.73,
    "p99": 3371.24,
    "std": 297.93,
    "count": 20,
    "sum": 52414.29
  },
  "request_throughput": {
    "unit": "requests/sec",
    "avg": 1.45
  },
  "request_count": {
    "unit": "requests",
    "avg": 20.0
  }
}

Note that request_throughput (derived) and request_count (aggregate) carry only unit + avg — no count, no sum, no percentiles. request_latency (record) carries the full set.
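
As a minimal sketch of how a reader might consume these blocks, the Python below parses the example document and tells full-distribution entries apart from scalar ones by the presence of count. The helper name describe_metric, the inlined sample, and the "has a unit key" filter for picking out metric entries are illustrative assumptions, not part of AIPerf.

```python
import json

# Inlined copy of the example above; a real reader would load
# profile_export_aiperf.json from the artifact directory instead.
SAMPLE = """
{
  "schema_version": "1.3",
  "request_latency": {"unit": "ms", "avg": 2620.71, "min": 2145.06,
                      "max": 3411.10, "p50": 2568.73, "p99": 3371.24,
                      "std": 297.93, "count": 20, "sum": 52414.29},
  "request_throughput": {"unit": "requests/sec", "avg": 1.45},
  "request_count": {"unit": "requests", "avg": 20.0}
}
"""


def describe_metric(tag, block):
    """Hypothetical helper: summarize one stats block in a single line."""
    if "count" in block:
        # Record metric: a distribution over `count` observations.
        return f"{tag}: avg={block['avg']} {block['unit']} over {block['count']} records"
    # Derived/aggregate metric: avg carries the single computed value.
    return f"{tag}: {block['avg']} {block['unit']}"


doc = json.loads(SAMPLE)

# Assumption: metric entries are the top-level objects that carry a "unit"
# key, which skips provenance fields such as schema_version.
metrics = {k: v for k, v in doc.items() if isinstance(v, dict) and "unit" in v}

lines = [describe_metric(tag, block) for tag, block in metrics.items()]
for line in lines:
    print(line)
```

Branching on the presence of count, rather than on a hard-coded list of metric names, keeps the reader working when new metrics are added.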

Top-level fields

In addition to the per-metric stats blocks, profile_export_aiperf.json includes top-level provenance:

| Field | Type | Notes |
|---|---|---|
| schema_version | string | This document’s schema version. |
| aiperf_version | string | AIPerf version that produced this export. |
| benchmark_id | string | Per-run unique identifier. |
| start_time, end_time | datetime | UTC. |
| was_cancelled | bool | True if the run was interrupted. |
| input_config | object | Resolved BenchmarkConfig body (does NOT carry envelope-level random_seed, sweep, multi_run, or variables). |
| run_info | object | Per-run reproducibility block; see below. Schema 1.2+. |
| telemetry_data | object | GPU telemetry summaries when telemetry collection was active. |
| error_summary | array | Per-error counts collected during the run. |

run_info

Schema 1.2 introduced run_info to surface the seed and sweep coordinates needed to reproduce a run from the JSON file alone, without consulting the internal run_config.json handoff file. Schema 1.3 extends it with identifiers and the redacted CLI command.

| Field | Type | Notes |
|---|---|---|
| benchmark_id | string | Per-run unique identifier (BenchmarkRun.benchmark_id). Duplicates the top-level benchmark_id so run_info is a self-contained reproducibility block. |
| sweep_id | string / null | UUID4 of the outer sweep this run belongs to (BenchmarkPlan.sweep_id). Stable across every variation and trial of one plan; lets readers join all per-run JSON exports from the same sweep. Null for runs constructed outside the multi-run orchestrator. |
| random_seed | int / null | Resolved per-run seed. Null when the user opted out of consistent seeding and --random-seed was not set. For grid/zip/scenario sweeps this is base_seed + variation_index; for adaptive iterations beyond the plan-time list it is SHA-256 derived from (envelope_seed, variation.label). |
| trial | int | Zero-based trial index within this variation. |
| run_label | string | Human-readable run label (run_0001, concurrency_10, etc.). |
| variation_label | string | Sweep variation label, or base for non-sweep runs. |
| variation_index | int | Sweep variation index (0 for non-sweep / first cell). |
| variation_values | object | Sweep parameter point as {path: value}. Empty for non-sweep runs. |
| cli_command | string / null | Redacted command line used to launch the run. Secrets such as API keys are removed before export. Null when the run was constructed without a CLI command. |

Example for variation 2 of a concurrency grid sweep with --random-seed 42:

"run_info": {
  "benchmark_id": "abc123def456",
  "sweep_id": "8c4f9a2e-1234-4567-89ab-0123456789ab",
  "random_seed": 44,
  "trial": 0,
  "run_label": "run_0001",
  "variation_label": "concurrency_40",
  "variation_index": 2,
  "variation_values": {"phases.profiling.concurrency": 40},
  "cli_command": "aiperf profile --model meta-llama/Llama-3.1-8B-Instruct --url http://localhost:8000 --request-count 500"
}
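
The seed value in this example follows from the grid-sweep rule stated in the table (base_seed + variation_index, so 42 + 2 = 44). The sketch below checks that arithmetic and shows one way a reader might join per-run exports on sweep_id. The function names and the sample run_info dictionaries are hypothetical; only the seed rule and the field names come from this page. The SHA-256 derivation for adaptive iterations is not sketched, since its exact construction is not specified here.

```python
from collections import defaultdict

SWEEP = "8c4f9a2e-1234-4567-89ab-0123456789ab"


def expected_grid_seed(base_seed, variation_index):
    """Seed rule stated above for grid/zip/scenario sweeps."""
    return base_seed + variation_index


def group_by_sweep(run_infos):
    """Group run_info blocks by sweep_id; runs with a null sweep_id are skipped."""
    groups = defaultdict(list)
    for info in run_infos:
        sweep_id = info.get("sweep_id")
        if sweep_id is not None:
            groups[sweep_id].append(info)
    return dict(groups)


# Hypothetical run_info blocks; the second mirrors the example above, and the
# third stands in for a run constructed outside the multi-run orchestrator.
runs = [
    {"sweep_id": SWEEP, "random_seed": 42, "variation_index": 0},
    {"sweep_id": SWEEP, "random_seed": 44, "variation_index": 2},
    {"sweep_id": None, "random_seed": None, "variation_index": 0},
]

by_sweep = group_by_sweep(runs)

# Every run in the sweep should match the grid rule for --random-seed 42.
seeds_consistent = all(
    info["random_seed"] == expected_grid_seed(42, info["variation_index"])
    for info in by_sweep[SWEEP]
)
print(len(by_sweep[SWEEP]), seeds_consistent)
```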

Schema versions

The current schema version is exported as the top-level schema_version field on the JSON document. Additive changes bump the minor version; any field rename or removal requires a coordinated major bump.

| Version | Change |
|---|---|
| 1.0 | Initial shape: unit, avg, min, max, std, p1–p99. |
| 1.1 | Added count and sum to per-metric stats blocks. Backward-compatible for readers that ignore unknown fields; the new fields are present only on record-type metrics, omitted on derived/aggregate. |
| 1.2 | Added top-level run_info block (random_seed, trial, run_label, variation_label, variation_index, variation_values). Backward-compatible: readers that don’t need reproducibility can ignore the field. |
| 1.3 | Added benchmark_id, sweep_id, and cli_command to run_info. benchmark_id duplicates the top-level field so run_info is self-contained; sweep_id (UUID4 of the outer sweep) lets readers join all per-run exports from one plan without consulting the parent multi-run artifact directory; cli_command records the redacted command line when available. Backward-compatible: nullable fields default to null when unavailable. |
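
A reader that needs to know whether a given field can be expected might gate on schema_version as sketched below. The feature labels and function names are hypothetical; the minimum versions are taken from the table above. Versions are compared as integer tuples because a plain string comparison would rank "1.10" below "1.2".

```python
def parse_schema_version(text):
    """Turn "1.3" into (1, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Minimum schema version at which each feature appears, per the table above.
# The keys are illustrative labels, not identifiers defined by AIPerf.
FEATURE_MIN_VERSION = {
    "count_and_sum": (1, 1),
    "run_info": (1, 2),
    "run_info.sweep_id": (1, 3),
    "run_info.cli_command": (1, 3),
}


def supports(schema_version, feature):
    """True if a document at `schema_version` can be expected to carry `feature`."""
    return parse_schema_version(schema_version) >= FEATURE_MIN_VERSION[feature]


print(supports("1.3", "run_info"))           # True
print(supports("1.1", "run_info"))           # False
print(supports("1.10", "run_info.sweep_id")) # True
```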

Other JSON exports use independent schema versions

AIPerf writes additional JSON files when --num-profile-runs >= 2:

  • profile_export_aiperf_aggregate.json — confidence aggregation across runs. Per-metric blocks have a different shape (mean, std, cv, se, ci_low, ci_high, t_critical, unit) and own their own schema_version (AggregateConfidenceJsonExporter.SCHEMA_VERSION, currently "1.0").
  • profile_export_aiperf_collated.json — pools per-request values from all runs into a single population, then emits combined percentiles (mean, std, p50, p90, p95, p99, count) under a combined key plus a per_run list of run-level summaries. Uses its own schema_version ("1.0.0").

The schema_version documented on this page applies only to profile_export_aiperf.json. The other files evolve on their own cadence.

For downstream parsers

  • Treat absent fields as “not applicable to this metric type,” not “data missing.” A derived-metric block with no count is normal; a record-metric block with no count indicates a bug.
  • Do not assume the field set is closed. Future minor schema bumps may add fields. Use schema_version to detect compatibility, and ignore unknown fields.
  • unit is authoritative for the value’s interpretation. Do not infer units from the metric tag.
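
The guidance above can be sketched as a small sanity check. This is one possible approach, not AIPerf code: check_stats_block is a hypothetical name, and because the JSON does not carry the metric type, the check infers "record metric" from the presence of distribution fields, which is a heuristic. It assumes schema 1.1 or later, where record metrics carry count.

```python
PERCENTILES = ("p1", "p5", "p10", "p25", "p50", "p75", "p90", "p95", "p99")
DISTRIBUTION_FIELDS = {"min", "max", "std", *PERCENTILES}


def check_stats_block(tag, block):
    """Return a list of warnings for one stats block (empty list means OK)."""
    warnings = []
    if "unit" not in block:
        # unit is documented as always present.
        warnings.append(f"{tag}: missing unit")
    has_distribution = any(field in block for field in DISTRIBUTION_FIELDS)
    if has_distribution and "count" not in block:
        # Heuristic: distribution fields imply a record metric, and a
        # record-metric block with no count indicates a bug.
        warnings.append(f"{tag}: distribution fields without count")
    # Unknown fields are deliberately ignored so that future additive
    # schema bumps do not break this reader.
    return warnings


# A derived-metric block with no count is normal.
print(check_stats_block("request_throughput", {"unit": "requests/sec", "avg": 1.45}))
# An unknown field on a record metric is tolerated.
print(check_stats_block("request_latency",
                        {"unit": "ms", "avg": 1.0, "p50": 1.0, "count": 20, "future_field": 1}))
# Distribution fields without count are flagged.
print(check_stats_block("broken_metric", {"unit": "ms", "avg": 1.0, "p50": 1.0}))
```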