Weights & Biases Export
What You Will Learn
This tutorial walks through AIPerf’s Weights & Biases (wandb) integration:
- Post-run results table: upload the final benchmark results as a wandb Table that mirrors the console metrics view, one row per metric with avg/min/max/p99/p90/p50/std columns.
- Artifact upload: attach the generated output files (inputs.json, profile_export.jsonl, CSV/JSON summaries) to the run so it is fully reproducible from the wandb UI.
- Cross-run comparison: use the run config (model, concurrency, request rate) to filter and group benchmark runs in a wandb workspace.
Prerequisites
Install AIPerf with the optional wandb extra:
Authenticate with wandb if you have not already:
Run a Profile with wandb Export Enabled
Flag breakdown
The equivalent config-v2 YAML:
The exporter runs after profiling completes, once the local exporters have written their files. The wandb client’s local state is written under the run’s artifact directory, not your working directory.
What Gets Uploaded
Results table
The run’s workspace contains a summary_metrics table with the same
metrics, ordering, and labels as the console results table:
Stats a metric does not produce (for example percentiles on count-style
metrics) appear as empty cells, matching the console’s N/A.
Table panels paginate by default. Use the page-size selector in the panel footer (10/25/50/100) to fit the whole table on one page.
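The table shape described above (one row per metric, a fixed set of stat columns, empty cells for stats a metric does not produce) can be sketched with the public wandb.Table API. This is an illustrative sketch, not AIPerf's exporter code: the "metric" column name, the sample values, and the "aiperf-demo" project name are assumptions.

```python
from __future__ import annotations

# Stat columns in the order listed in this tutorial; the leading "metric"
# column name is an assumption made for this sketch.
STAT_COLUMNS = ["avg", "min", "max", "p99", "p90", "p50", "std"]
COLUMNS = ["metric"] + STAT_COLUMNS


def build_rows(metrics: dict[str, dict[str, float]]) -> list[list[object]]:
    """Build one row per metric; stats a metric does not produce become None."""
    rows = []
    for name, stats in metrics.items():
        rows.append([name] + [stats.get(col) for col in STAT_COLUMNS])
    return rows


def log_table(rows: list[list[object]], project: str = "aiperf-demo") -> None:
    """Log the rows as a wandb Table (needs wandb installed and a logged-in account)."""
    import wandb  # imported lazily so build_rows works without wandb

    run = wandb.init(project=project)
    run.log({"summary_metrics": wandb.Table(columns=COLUMNS, data=rows)})
    run.finish()


# Hypothetical sample values, for illustration only.
sample = {
    "Request Latency (ms)": {
        "avg": 120.5, "min": 80.0, "max": 310.2,
        "p99": 290.0, "p90": 200.1, "p50": 115.0, "std": 35.4,
    },
    "Request Count": {"avg": 100.0},  # count-style metric: no percentiles
}
rows = build_rows(sample)
```

Cells left as None are the sketch's stand-in for the empty cells the exporter produces where the console prints N/A.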
Artifact bundle
Each run logs one aiperf-run artifact containing the files from the
run’s artifact directory: inputs.json, profile_export.jsonl (when
record-level export is enabled), profile_export_aiperf.csv,
profile_export_aiperf.json, plus any parquet, timeslice, or plot
files. Download them from the run’s Artifacts tab to reproduce or
re-analyze the benchmark.
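Besides the Artifacts tab, the bundle can likely be fetched programmatically through wandb's public API. The sketch below assumes that "aiperf-run" is the artifact's name and that the default "latest" alias points at the version you want; the entity, project, and destination directory are placeholders.

```python
def artifact_path(entity: str, project: str,
                  name: str = "aiperf-run", alias: str = "latest") -> str:
    """Build the 'entity/project/name:alias' path that wandb's public API expects."""
    return f"{entity}/{project}/{name}:{alias}"


def download_run_files(entity: str, project: str,
                       dest: str = "./aiperf-artifacts") -> str:
    """Download the artifact's files and return the local directory.

    Needs wandb installed, network access, and read access to the project.
    """
    import wandb  # imported lazily so artifact_path works without wandb

    artifact = wandb.Api().artifact(artifact_path(entity, project))
    return artifact.download(root=dest)
```

Calling download_run_files("my-team", "my-project") would then place inputs.json and the profile_export files under ./aiperf-artifacts, provided the naming assumptions above hold for your run.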
Run config and tags
The run config records the endpoint type, model names, redacted URLs,
load generator settings (concurrency, request rate, request count,
duration), and the redacted CLI command. Tags include the AIPerf
version and a benchmark-<id> tag. In a project workspace you can
group or filter runs by any config key — for example, group a
concurrency sweep by phases.0.concurrency.
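The same grouping can be approximated offline. The sketch below resolves a dotted key such as phases.0.concurrency against nested run configs and buckets run names by its value. The config shape and run names are hypothetical; fetch_configs uses wandb's public Api().runs() call and assumes the config is stored as nested dicts and lists.

```python
from collections import defaultdict


def get_by_path(config, dotted_key, default=None):
    """Resolve a key like 'phases.0.concurrency' against nested dicts and lists."""
    node = config
    for part in dotted_key.split("."):
        if isinstance(node, dict) and part in node:
            node = node[part]
        elif isinstance(node, list) and part.isdigit() and int(part) < len(node):
            node = node[int(part)]
        else:
            return default
    return node


def group_runs(configs, dotted_key):
    """Map each distinct value of the key to the names of the runs that used it."""
    groups = defaultdict(list)
    for run_name, config in configs.items():
        groups[get_by_path(config, dotted_key)].append(run_name)
    return dict(groups)


def fetch_configs(entity_project):
    """Fetch {run name: config} for a project (needs wandb and network access)."""
    import wandb  # imported lazily so the helpers above work without wandb

    return {run.name: dict(run.config) for run in wandb.Api().runs(entity_project)}


# Hypothetical configs shaped like a concurrency sweep.
configs = {
    "c8": {"phases": [{"concurrency": 8}]},
    "c16": {"phases": [{"concurrency": 16}]},
    "c8-rerun": {"phases": [{"concurrency": 8}]},
}
groups = group_runs(configs, "phases.0.concurrency")
```

Runs that lack the key fall into a None bucket, which makes it easy to spot runs logged with a different config layout.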
Comparing Runs
Run the same benchmark at several settings, giving each run a distinguishing name:
In the wandb project, open each run to read its full results table, or use the runs table to compare the same metric across runs side by side.