Search History API Reference

Schema reference for search_history.json — the on-disk trajectory log of an AIPerf adaptive Bayesian-Optimization (BO) run. The file is produced by src/aiperf/exporters/search_history.py (write_search_history) and is rewritten in place after every BO iteration, so a partial trajectory survives a crash or cancellation. Each entry captures what the planner proposed, what the resulting benchmark measured, and (on terminal calls) why the loop stopped. For algorithm semantics see Bayesian Optimization.

Overview

search_history.json is the canonical artifact for post-run BO audit and dashboarding. It complements (it does not replace) sweep_aggregate/profile_export_aiperf_sweep.{json,csv}, which carries the post-hoc grouping of all iterations by variation_values. The trajectory log is unique in that it preserves iteration order and convergence-reason metadata.

Use it to:

  • Recover the order in which the planner proposed configurations.
  • Identify the best observed point(s) and how many iterations it took to find them. For multi-objective runs (len(config.objectives) > 1) best_trials is the Pareto front rather than a single argmax/argmin.
  • Determine why the run terminated (budget exhaustion, no-improvement patience, plateau, or — for the Optuna terminator — posterior-regret bound).
  • Reproduce the original search-space specification (including objectives and outcome constraints) for a follow-up run.

File Location

The exporter writes to <base_dir>/search_history.json where base_dir is the controlling artifact directory. The companion sweep aggregate is under <base_dir>/sweep_aggregate/ for single-trial and independent multi-run layouts, and under <base_dir>/aggregate/sweep_aggregate/ for repeated multi-run layouts.

In-process (aiperf profile --search-space ...):

```text
artifacts/
  {benchmark_name}/
    search_history.json            # next to sweep_aggregate/, NOT inside it
    sweep_aggregate/
      profile_export_aiperf_sweep.json
      profile_export_aiperf_sweep.csv
```

JSON Schema

Top-Level Structure

```json
{
  "config": { ... },
  "iterations": [ ... ],
  "best_trials": [ ... ] | null,
  "boundary_summary": { ... } | null,
  "recipe": "max-concurrency-under-sla" | null,
  "convergence_reason": "max_iterations" | "improvement_patience" | "plateau_cv" | "posterior_regret_bound" | "emmr" | "unknown" | "smooth_isotonic_precision_reached" | "monotonic_precision_reached" | ... | null
}
```

Top-Level Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| config | object | yes | Frozen subset of the BO configuration used to drive this run, including the full objectives list and any outcome_constraints. |
| iterations | array&lt;object&gt; | yes | Per-iteration trajectory entries, in the order the planner proposed them. May be empty on the first write. |
| best_trials | array&lt;object&gt; \| null | yes | Argmax/argmin (single-objective: len(config.objectives) == 1) or the Pareto front (multi-objective: len(config.objectives) > 1). null until at least one iteration has produced a usable objective. See Interpreting best_trials. |
| boundary_summary | object \| null | yes | Empirical SLA-feasibility boundary along the swept axis. null for multi-dim search spaces, and on empty history. See boundary_summary. |
| recipe | string \| null | yes | Name of the search recipe (AdaptiveSearchSweep.recipe_name) that authored this configuration, e.g. "max-concurrency-under-sla". null when the configuration was built ad-hoc rather than via a recipe. |
| convergence_reason | string \| null | yes | Why the loop stopped. null only on mid-loop writes or abnormal exit (cancellation, crash). A clean terminal exit always writes a non-null string — either the planner's own reason or the literal "unknown" fallback when the planner returned no reason. See Convergence Reasons. |

config Section

A snapshot of the adaptive-search configuration fields that the v1 writer persists from AdaptiveSearchSweep (src/aiperf/config/sweep/config.py). It includes the planner name, objectives, outcome constraints, iteration budget, initial-point count, random seed, convergence knobs, search-space dimensions, and SLA filters. It does not serialize every planner knob yet (for example optuna_sampler, optuna_acquisition, optuna_terminator, objective_pooling, and smooth-isotonic replicate/warmup settings are omitted), so use it as an audit trail for the trajectory rather than a complete round-trip config. The optimization target is recorded as a list under objectives (length-1 for single-objective runs, length-N for Pareto BO); outcome_constraints is the parallel list of feasibility gates that BoTorch’s acquisition masks against.

```json
{
  "config": {
    "planner": "optuna",
    "objectives": [
      {"metric": "output_token_throughput", "stat": "avg", "direction": "MAXIMIZE", "threshold": null},
      {"metric": "time_to_first_token", "stat": "p99", "direction": "MINIMIZE", "threshold": 250.0}
    ],
    "outcome_constraints": [
      {"metric": "request_error_rate", "op": "<=", "bound": 0.01}
    ],
    "max_iterations": 30,
    "n_initial_points": 5,
    "random_seed": 42,
    "improvement_patience": 10,
    "plateau_window": 8,
    "plateau_threshold": 0.01,
    "search_space": [
      {"path": "phases.profiling.concurrency", "lo": 1, "hi": 1000, "kind": "int"}
    ],
    "sla_filters": [
      {"metric_tag": "time_to_first_token", "stat": "p95", "op": "lt", "threshold": 200.0}
    ]
  }
}
```

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| planner | string | yes | Search planner plugin name. One of bayesian, optuna, monotonic_sla, smooth_isotonic. Multi-objective is available through the curated bayesian planner or through optuna with optuna_sampler=botorch and a multi-objective acquisition such as qlognehvi. |
| objectives | array&lt;object&gt; | yes | Optimization targets. Length 1 = single-objective BO. Length > 1 = multi-objective Pareto BO. Min length 1. |
| objectives[].metric | string | yes | Metric tag (e.g. "output_token_throughput"). Matches a key in RunResult.summary_metrics. |
| objectives[].stat | string | yes | Statistic on the metric: one of "avg", "p50", "p90", "p95", "p99". |
| objectives[].direction | string | yes | Either "MAXIMIZE" or "MINIMIZE" (uppercase; serialized from OptimizationDirection.name, locked by tests/unit/exporters/test_search_history_multi_objective.py). |
| objectives[].threshold | float \| null | yes | Pareto reference point for hypervolume computation (multi-objective only). Trials worse than this on this objective do not contribute to hypervolume. null = auto-derive from the worst observed value among Sobol initial points. Ignored for single-objective runs. |
| outcome_constraints | array&lt;object&gt; | yes | Feasibility gates on metrics the optimizer is not optimizing. Empty list = no constraints. Distinct from objectives[].threshold (Pareto reference point) and from sla_filters (post-hoc benchmark eligibility): outcome constraints down-weight infeasible candidates inside BoTorch's acquisition function. |
| outcome_constraints[].metric | string | yes | Metric tag to constrain. |
| outcome_constraints[].op | string | yes | Comparison operator. One of "<=", ">=", "==". Distinct from sla_filters[].op, which uses the lowercase mnemonics lt/le/gt/ge — outcome constraints feed BoTorch's acquisition mask, SLA filters feed post-hoc feasibility ranking. |
| outcome_constraints[].bound | float | yes | Threshold value. |
| max_iterations | int | yes | Iteration budget. The loop also stops earlier on convergence. |
| n_initial_points | int | yes | Sobol-random points before BoTorch fits the GP. Validator enforces n_initial_points < max_iterations for bayesian and optuna planners only; monotonic_sla / smooth_isotonic planners drive their own probe sequence and ignore this field. |
| random_seed | int \| null | yes | Reproducibility seed passed to the planner backend. null when the run was unseeded. |
| improvement_patience | int | yes | Stop after this many consecutive iterations with no improvement over the running best objective (single-objective) or no hypervolume gain (multi-objective). Drives the "improvement_patience" convergence reason. |
| plateau_window | int | yes | Number of recent iterations inspected for plateau detection. |
| plateau_threshold | float | yes | Coefficient-of-variation threshold (relative; scale-free) for the plateau test. Drives the "plateau_cv" convergence reason. |
| search_space | array&lt;object&gt; | yes | Original search-space spec, one entry per dimension. Min length 1. |
| sla_filters | array&lt;object&gt; | yes | Post-hoc feasibility gates applied to per-iteration verdicts. Empty list = no filters. Drives the feasible flag on iterations[] and best_trials[], the 1D boundary_summary block, and feasibility-first lexicographic best-trial selection. Distinct from outcome_constraints (BO acquisition-mask gates fed into BoTorch). |
| sla_filters[].metric_tag | string | yes | Metric tag to filter on; matches a key in RunResult.summary_metrics. |
| sla_filters[].stat | string | yes | Statistic on the metric: one of "avg", "p50", "p90", "p95", "p99". |
| sla_filters[].op | string | yes | Comparison operator. One of "lt", "le", "gt", "ge" (lowercase mnemonics). Distinct from outcome_constraints[].op, which uses "<=" / ">=" / "==". |
| sla_filters[].threshold | float | yes | Numeric threshold the metric statistic is compared against. Finite (NaN/inf rejected at config time). |
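
Downstream tooling sometimes needs to re-evaluate an sla_filters verdict against its own metrics. A minimal sketch of the lowercase-mnemonic comparison, assuming a nested {metric_tag: {stat: value}} metrics mapping (the real RunResult layout may differ, and sla_filter_passes is a hypothetical helper, not an AIPerf API):

```python
import operator

# Lowercase mnemonics used by sla_filters[].op (distinct from
# outcome_constraints[].op, which uses "<=" / ">=" / "==").
_SLA_OPS = {
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}


def sla_filter_passes(filter_spec: dict, summary_metrics: dict) -> bool:
    """Re-evaluate one sla_filters entry against a metrics mapping.

    Assumed shape: {"time_to_first_token": {"p95": 213.4, ...}, ...}.
    Illustrative only — not the planner's actual verdict code path.
    """
    observed = summary_metrics[filter_spec["metric_tag"]][filter_spec["stat"]]
    return _SLA_OPS[filter_spec["op"]](observed, filter_spec["threshold"])


flt = {"metric_tag": "time_to_first_token", "stat": "p95", "op": "lt", "threshold": 200.0}
print(sla_filter_passes(flt, {"time_to_first_token": {"p95": 187.2}}))  # True
print(sla_filter_passes(flt, {"time_to_first_token": {"p95": 213.4}}))  # False
```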

search_space Element Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| path | string | yes | Dotted path into BenchmarkConfig (e.g. "phases.profiling.concurrency"). |
| lo | float | yes | Inclusive lower bound. |
| hi | float | yes | Inclusive upper bound. Always > lo. |
| kind | string | yes | Either "int" (integer-valued; suggestions are coerced via int()) or "real" (float). |

iterations Section

One entry per BO iteration, in submission order. iteration_idx is dense and zero-based. Mid-run writes leave the array open-ended; readers must tolerate any non-negative length, including zero.

```json
{
  "iterations": [
    {
      "iteration_idx": 0,
      "variation_values": {"phases.profiling.concurrency": 142},
      "objective_values": [8421.7],
      "feasible": true,
      "non_monotonic_warning": false
    },
    {
      "iteration_idx": 1,
      "variation_values": {"phases.profiling.concurrency": 256},
      "objective_values": [9512.3],
      "feasible": true,
      "non_monotonic_warning": false
    },
    {
      "iteration_idx": 2,
      "variation_values": {"phases.profiling.concurrency": 64},
      "objective_values": null,
      "feasible": true,
      "non_monotonic_warning": false
    }
  ]
}
```

A multi-objective iteration carries one entry per config.objectives[i], in the same order:

```json
{
  "iteration_idx": 7,
  "variation_values": {"phases.profiling.concurrency": 256},
  "objective_values": [9512.3, 187.4],
  "feasible": true,
  "non_monotonic_warning": false
}
```

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| iteration_idx | int | yes | Zero-based, dense iteration counter. Matches SweepVariation.index for the iteration. |
| variation_values | object | yes | Map of dotted path to proposed value (one entry per search_space dimension). Values are plain Python int or float per dimension kind. |
| objective_values | array&lt;float&gt; \| null | yes | One entry per config.objectives[i], in the same order. Each entry is one aggregate value for the planner-proposed point: the mean of finite per-trial values for objectives[i].metric/objectives[i].stat, or the pooled percentile when percentile pooling is configured. The whole field is null (not a list of nulls) when every trial failed or any configured metric/stat was missing — in that case the planner internally tells BoTorch a fallback loss to keep ask/tell pairing consistent, but the fallback is NOT persisted here. For length-1 objectives, this is a length-1 list. |
| feasible | bool | yes | Whether at least one trial at this iteration satisfied every configured sla_filters entry. Computed by the planner's tell(). Defaults to true when no SLA filters are configured, so non-SLA runs degenerate to plain ranking unchanged. |
| non_monotonic_warning | bool | yes | true iff the verdict at this iteration violated the monotonicity assumption — a feasible point appeared at a swept value at-or-above the latched infeasible_min, or an infeasible point at-or-below feasible_max. Set only by MonotonicSLASearchPlanner and SmoothIsotonicSLAPlanner; always false for BO planners. |

Note: objective_values[i] is one aggregate vector per search point/iteration: by default the mean of finite trial-level objective values, or the pooled percentile when percentile pooling is configured. The GP/Optuna planner observes that aggregate vector, not every per-trial value separately. The SearchIteration.results per-trial list held in memory by the planner is intentionally NOT serialized — read the per-trial profile_export_aiperf.json files under each iteration’s variation directory if you need the spread.

Interpreting best_trials

best_trials is the post-hoc winner set over iterations whose objective_values is non-null. The shape adapts to the number of objectives:

  • Single-objective (len(config.objectives) == 1). best_trials is a length-1 list containing the global argmax (when direction == "MAXIMIZE") or argmin (when "MINIMIZE"). Single-objective is treated as the length-1 special case of the multi-objective shape — there is no separate scalar-best block.
  • Multi-objective (len(config.objectives) > 1). best_trials is the Pareto front: the set of iterations that are not dominated by any other iteration on every objective simultaneously. A trial A dominates B iff A is at least as good on every objective and strictly better on at least one. The front itself is unranked; if you want a tie-breaking order, sort by pareto_rank (always 0 for trials on the front) then by hypervolume contribution (not persisted here — recompute downstream if needed).

best_trials is null until at least one iteration has produced a usable objective. Readers MUST tolerate the null state during early-run reads (and any read where every scored iteration’s objective_values is None).

```json
{
  "best_trials": [
    {
      "iteration_idx": 1,
      "objective_values": [9512.3],
      "variation_values": {"phases.profiling.concurrency": 256},
      "feasible": true,
      "feasible_count": 5,
      "pareto_rank": 0
    }
  ]
}
```

A multi-objective Pareto front:

```json
{
  "best_trials": [
    {
      "iteration_idx": 7,
      "objective_values": [9800.1, 215.4],
      "variation_values": {"phases.profiling.concurrency": 280},
      "feasible": true,
      "feasible_count": 18,
      "pareto_rank": 0
    },
    {
      "iteration_idx": 13,
      "objective_values": [9512.3, 187.4],
      "variation_values": {"phases.profiling.concurrency": 256},
      "feasible": true,
      "feasible_count": 18,
      "pareto_rank": 0
    },
    {
      "iteration_idx": 22,
      "objective_values": [8910.0, 162.7],
      "variation_values": {"phases.profiling.concurrency": 224},
      "feasible": true,
      "feasible_count": 18,
      "pareto_rank": 0
    }
  ]
}
```

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| iteration_idx | int | yes | Index of the winning iteration. |
| objective_values | array&lt;float&gt; | yes | The objective tuple at the winner, one entry per config.objectives[i] (always non-null for entries in best_trials). For length-1 objectives, a length-1 list. |
| variation_values | object | yes | Proposed values that produced the winner. Same shape as iterations[i].variation_values. |
| feasible | bool | yes | Whether this iteration satisfied every configured sla_filters entry. Lexicographic feasibility-first selection means a feasible iteration is preferred over an infeasible one even if the latter has a better objective. |
| feasible_count | int | yes | Number of feasible iterations across the whole run among those with a non-null objective_values (an iteration with feasible == true but objective_values is None does NOT count). 0 flags the "no iteration was both feasible and scored — we fell back to ranking the full scored pool" case so the reader can distinguish it from the normal feasible-front case. |
| pareto_rank | int | yes | Always 0 in v1 — every entry of best_trials (single-objective argmax/argmin, or any member of the multi-objective non-dominated set) is emitted with pareto_rank == 0. Reserved for future non-dominated sorting (NSGA-II style) that would emit 0, 1, … for successive fronts; until then, do not branch on this field. |

Caveat: best_trials is “best of observed iterations,” not “true Pareto front of the search space.” Early termination (any convergence_reason) means the planner stopped before exhausting the budget; better trade-offs may exist outside the explored region.
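
The dominance rule defined above maps directly to code. A sketch of recomputing a Pareto front from objective tuples, normalizing every objective to "larger is better" via config.objectives[i].direction (pareto_front and dominates are illustrative helpers, not AIPerf APIs):

```python
def dominates(a: list[float], b: list[float], directions: list[str]) -> bool:
    """True iff trial a dominates b: at least as good on every objective
    and strictly better on at least one. directions[i] is
    config.objectives[i].direction ("MAXIMIZE" or "MINIMIZE")."""
    signs = [1.0 if d == "MAXIMIZE" else -1.0 for d in directions]
    sa = [s * v for s, v in zip(signs, a)]
    sb = [s * v for s, v in zip(signs, b)]
    return all(x >= y for x, y in zip(sa, sb)) and any(x > y for x, y in zip(sa, sb))


def pareto_front(scored: list[list[float]], directions: list[str]) -> list[list[float]]:
    """Non-dominated subset of scored objective tuples, in input order."""
    return [
        a for i, a in enumerate(scored)
        if not any(dominates(b, a, directions) for j, b in enumerate(scored) if j != i)
    ]


dirs = ["MAXIMIZE", "MINIMIZE"]  # e.g. throughput up, TTFT down
pts = [[9800.1, 215.4], [9512.3, 187.4], [9000.0, 250.0]]
print(pareto_front(pts, dirs))  # [[9800.1, 215.4], [9512.3, 187.4]]
```

The third point is dominated by the second (lower throughput AND higher latency), so only the first two survive as the front.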

Convergence Reasons

convergence_reason takes one of the values below. The shared BO-set (everything except the monotonic_* and smooth_isotonic_* strings) is defined on OptunaSearchPlanner.convergence_reason() in src/aiperf/orchestrator/search_planner/optuna_planner.py; BayesianSearchPlanner inherits this implementation without override. The Optuna-terminator reasons (posterior_regret_bound, emmr) fire only when --optuna-terminator is set. The 1D-SLA planners (MonotonicSLASearchPlanner, SmoothIsotonicSLAPlanner) emit their own algorithm-specific strings — see the table below and the Bayesian Optimization — 1D SLA saturation guide.

| Value | Meaning |
| --- | --- |
| null | Mid-loop write (run still in progress), OR terminated abnormally (cancelled, crashed, or aborted before the orchestrator's terminal write_search_history call). After a clean terminal exit this is never null — see "unknown". |
| "unknown" | Clean terminal exit fallback: the orchestrator wrote planner.convergence_reason() or "unknown", and the planner returned None. Indicates planner.ask() returned None but the planner did not record a structured reason. |
| "max_iterations" | Budget exhausted: the loop ran config.max_iterations iterations. Emitted by every planner family. |
| "improvement_patience" | No improvement-over-best for improvement_patience consecutive iterations (single-objective: improvement = better objective; multi-objective: improvement = positive hypervolume delta). BO planners only. |
| "plateau_cv" | Coefficient of variation (sample stddev / abs(mean)) on the last plateau_window iterations fell below plateau_threshold. Single-objective on the scalar objective; multi-objective on the hypervolume time series. BO planners only. |
| "posterior_regret_bound" | Optuna terminator: RegretBoundEvaluator (Makarova 2022) signalled that the high-probability bound on simple regret has fallen below the user-supplied threshold. Only fires under --optuna-terminator regret. |
| "emmr" | Optuna terminator: EMMREvaluator (Ishibashi 2023). Only fires under --optuna-terminator emmr. |
| "monotonic_precision_reached" | MonotonicSLASearchPlanner: bracket (infeasible_min - feasible_max) / infeasible_min fell below SLA_PRECISION_DEFAULT. |
| "monotonic_no_pass_in_range" | MonotonicSLASearchPlanner: even the lowest swept value violates SLA — no feasible point exists in the configured range. |
| "monotonic_no_failure_in_range" | MonotonicSLASearchPlanner: even the highest swept value satisfies SLA — no infeasible point exists in the configured range. |
| "smooth_isotonic_precision_reached" | SmoothIsotonicSLAPlanner: PCHIP-fitted boundary converged within the configured precision; boundary_type == "smooth". |
| "smooth_isotonic_cliff_precision_reached" | SmoothIsotonicSLAPlanner: cliff guard tripped — PAVA residual exceeded 3·σ_local AND bracket gap exceeded precision threshold. Planner emits an honest bracket; boundary_type == "cliff". |
| "smooth_isotonic_no_pass_in_range" | SmoothIsotonicSLAPlanner: counterpart to monotonic_no_pass_in_range. |
| "smooth_isotonic_no_failure_in_range" | SmoothIsotonicSLAPlanner: counterpart to monotonic_no_failure_in_range. |
| "smooth_isotonic_pchip_fallback_bisection" | SmoothIsotonicSLAPlanner: PCHIP fit failed prerequisites (insufficient bracketing samples, monotonicity violations); planner fell back to monotonic bisection and reached its precision target there. |
The first signal to fire wins; later iterations are not run. See the BO guide’s convergence section for tuning advice and the Bayesian Optimization — 1D SLA saturation guide for the SLA-planner termination semantics.

boundary_summary

Top-level block. Emitted (non-null) when the search has exactly one dimension AND at least one iteration was recorded; null for multi-dim searches or empty history. Records the empirical feasibility boundary along the swept axis — most meaningful when at least one SLAFilter was configured (the max-concurrency-under-sla recipe is the canonical user), but the exporter does NOT gate on filter presence: with no filters every iteration’s feasible flag defaults to true, so feasible_max tracks the highest swept value and infeasible_min is null.

```json
{
  "boundary_summary": {
    "swept_dim_path": "phases.profiling.concurrency",
    "feasible_max": {"value": 256, "iteration_idx": 3, "objective_value": 4172.3},
    "infeasible_min": {
      "value": 320, "iteration_idx": 4,
      "first_breach": {
        "metric_tag": "time_to_first_token", "stat": "p95",
        "op": "lt", "threshold": 200.0, "observed": 213.4
      }
    },
    "boundary_type": "smooth",
    "binding_constraint": "time_to_first_token:p95",
    "boundary_ci": {"lo": 248.7, "hi": 264.2}
  }
}
```

Base fields (written by MonotonicSLASearchPlanner, SmoothIsotonicSLAPlanner, and the BO post-hoc derivation):

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| swept_dim_path | string | yes | Dotted path of the (single) swept dimension. Matches config.search_space[0].path. |
| feasible_max | object \| null | yes | Highest swept value observed to pass every SLA filter. null when no probe passed. |
| feasible_max.value | number | yes | The swept value (int when kind=int). |
| feasible_max.iteration_idx | int | yes | Index into iterations[] of the probe that observed this value. |
| feasible_max.objective_value | number \| null | yes | Objective at the same probe (when present), for context. |
| infeasible_min | object \| null | yes | Lowest swept value observed to violate at least one SLA filter. null when no probe failed. |
| infeasible_min.value | number | yes | The swept value. |
| infeasible_min.iteration_idx | int | yes | Index into iterations[] of the breaching probe. |
| infeasible_min.first_breach | object | yes | Identity of the SLA filter that triggered first at this point: metric_tag, stat, op, threshold, and the observed value. |

Smooth-isotonic-only optional fields (written by SmoothIsotonicSLAPlanner when applicable; absent — not null — when produced by other planners or when the relevant phase did not run):

| Field | Type | When present | Description |
| --- | --- | --- | --- |
| boundary_type | "smooth" \| "cliff" | smooth_isotonic only | Cliff-guard verdict. "smooth" means the PAVA-residual at the most-recent probe was within 3·σ_local and the planner is confident the curve is smooth at the boundary. "cliff" means the residual exceeded that threshold AND the bracket gap exceeded precision · x_hi — the planner is reporting an honest bracket [feasible_max.value, infeasible_min.value] instead of a single boundary point on a discontinuity. Catches the prefill-prioritizing-server pattern (Sarathi-Serve fig. 8). |
| binding_constraint | string | smooth_isotonic only, after at least one Phase-2 fit | The SLA filter key (&lt;metric_tag&gt;:&lt;stat&gt;) whose σ-normalized margin is tightest at termination — i.e. the constraint that defines the boundary in this run. When several SLAs are configured, only this one is replicated and CI'd in Phase 3, because it dominates the final boundary location. |
| boundary_ci | object | smooth_isotonic only, when Phase-3 replicates ran | Bootstrap CI on the binding margin at the candidate boundary x*, computed via _replicate_budget.boundary_ci over per-replicate margins. Object shape: {"lo": float, "hi": float}. When the CI brackets zero, the planner expands to nearby points and refits before terminating; a written CI that brackets zero therefore only appears when the planner exited via --search-max-iterations. |

For full algorithm context (when each phase runs, the cliff-detection threshold, how the binding constraint is selected) see Bayesian Optimization — 1D SLA saturation (smooth_isotonic).


Lifecycle and Consistency Guarantees

  • Rewritten after every iteration. The orchestrator calls write_search_history(...) after each successful tell() AND once more on terminal exit (when ask() returns None). Readers MUST tolerate the partial state: a mid-run snapshot contains only the iterations completed so far, with convergence_reason: null and possibly a null best_trials. Each write replaces the whole file via a single Path.write_bytes(...), so a snapshot that parses is internally consistent.
  • NOT atomic. The current writer issues one Path.write_bytes call without a temp-file-then-rename. Concurrent readers may observe a torn write (zero bytes, partial JSON) on a slow filesystem; in practice the payload is small (a few KB up to ~100 KB for a 200-iteration run) and the race window is short. Treat a parse failure as “retry in a moment,” not as a corrupted run.
  • Iteration order is submission order. iterations[i].iteration_idx == i (dense, zero-based). The planner-internal _iter counter increments on every tell(), regardless of trial success.
  • Final write carries convergence_reason. All earlier (mid-loop) writes carry convergence_reason: null. After a clean terminal exit (i.e. planner.ask() returned None), the orchestrator rewrites the file with planner.convergence_reason() or "unknown" — so a clean terminal exit always lands a non-null string, even when the planner did not record a structured reason. null in a finalized-looking file therefore implies abnormal termination (cancellation, crash, or hard process kill).
  • Crash semantics. On controller-pod restart, cancellation, or a hard process kill, the last entry in iterations is the most recently-completed iteration, and convergence_reason will be null. The BO loop does NOT resume from the file in v1 — a restarted run begins with iteration 0.
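
The "treat a parse failure as retry in a moment" guidance can be made concrete with a bounded-retry reader. A sketch assuming the non-atomic single-write behavior above (read_search_history is an illustrative helper, not part of AIPerf):

```python
import json
import time
from pathlib import Path


def read_search_history(path: Path, retries: int = 3, delay_s: float = 0.25) -> dict:
    """Read search_history.json, tolerating a torn (non-atomic) write.

    A missing file, empty file, or JSON parse failure is treated as
    "retry in a moment"; only after `retries` failures do we give up.
    """
    last_err: Exception | None = None
    for _ in range(retries):
        try:
            raw = path.read_bytes()
            if raw:
                return json.loads(raw)
            last_err = ValueError("empty file (torn write?)")
        except (FileNotFoundError, json.JSONDecodeError) as err:
            last_err = err
        time.sleep(delay_s)  # short race window: the payload is small
    raise RuntimeError(f"search_history.json unreadable after {retries} tries") from last_err
```

On a healthy file the first attempt succeeds and no sleep occurs; the retry path only engages when a reader races the writer.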

Programmatic Consumption

```python
from pathlib import Path

import orjson

artifact_dir = Path("artifacts/my_benchmark")
history = orjson.loads((artifact_dir / "search_history.json").read_bytes())

# Detect run state.
if history["convergence_reason"] is None:
    if history["iterations"]:
        last = history["iterations"][-1]
        print(f"Run in progress; last completed iter={last['iteration_idx']}")
    else:
        print("Run started but no iterations have completed yet")
else:
    print(f"Run terminated: {history['convergence_reason']}")

# Pull the best observed configuration(s).
best_trials = history["best_trials"]  # list[dict] or None
objectives = history["config"]["objectives"]

if not best_trials:  # None or []
    print("No successful iteration yet")
elif len(objectives) == 1:
    # Single-objective: best_trials is length-1 by construction.
    best = best_trials[0]
    best_concurrency = best["variation_values"]["phases.profiling.concurrency"]
    best_throughput = best["objective_values"][0]
    print(f"Best: concurrency={best_concurrency} -> {best_throughput:.1f} tokens/s "
          f"(iter {best['iteration_idx']} of {len(history['iterations'])})")
else:
    # Multi-objective: best_trials is the Pareto front.
    print(f"Pareto front ({len(best_trials)} non-dominated trials):")
    for trial in best_trials:
        values = ", ".join(
            f"{obj['metric']}/{obj['stat']}={v:.2f}"
            for obj, v in zip(objectives, trial["objective_values"])
        )
        print(f"  iter={trial['iteration_idx']:>3}  {values} "
              f"vars={trial['variation_values']}")
```

To compute summary statistics across the trajectory (e.g. learning curves), iterate history["iterations"] and skip entries where objective_values is None. For multi-objective hypervolume tracking, collect the per-objective series [it["objective_values"][i] for it in history["iterations"] if it["objective_values"] is not None] and pair each series with config["objectives"][i]["direction"].
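
A single-objective learning curve (best-so-far after each scored iteration) can be folded in a few lines. A sketch under the schema above — running_best is an illustrative helper, not an AIPerf API:

```python
def running_best(history: dict) -> list[float]:
    """Single-objective learning curve: best-so-far after each scored iteration.

    Entries with objective_values == null are skipped, per the schema.
    Assumes len(config.objectives) == 1.
    """
    direction = history["config"]["objectives"][0]["direction"]
    better = max if direction == "MAXIMIZE" else min
    curve: list[float] = []
    for it in history["iterations"]:
        if it["objective_values"] is None:
            continue  # failed iteration: no usable objective
        value = it["objective_values"][0]
        curve.append(better(curve[-1], value) if curve else value)
    return curve


history = {
    "config": {"objectives": [{"direction": "MAXIMIZE"}]},
    "iterations": [
        {"objective_values": [8421.7]},
        {"objective_values": [9512.3]},
        {"objective_values": None},
        {"objective_values": [9100.0]},
    ],
}
print(running_best(history))  # [8421.7, 9512.3, 9512.3]
```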


Caveats

  • Schema is not yet stable across versions. v1 emits the subset above; future releases may add fields (e.g. per-iteration timestamps, GP posterior summaries, hypervolume time-series). Pin your aiperf version when building dashboards or downstream tooling against this artifact.
  • objective_values[i] is the arithmetic mean across trials. It is the GP/Optuna planner’s observed aggregate vector for the point: the mean of finite trial values by default, or a pooled percentile when percentile pooling is enabled. If you need per-trial spread, read the per-trial profile_export_aiperf.json files at <base_dir>/search_iter_NNNN/profile_runs/run_NNNN/ — adaptive-search runs use a flat search_iter_NNNN per BO iteration (each holding profile_runs/run_NNNN/ for that iteration’s trials), distinct from grid sweeps’ {leaf}_{value} layout. See Sweep Aggregate API Reference for the full layout table.
  • convergence_reason: "plateau_cv" can fire as early as iteration plateau_window. When the random-Sobol initial points happen to land in a flat region of the (scalar or hypervolume) objective, the coefficient-of-variation test trips immediately. This is correct, not a bug — increase plateau_window or tighten plateau_threshold if the run terminates too eagerly.
  • config.search_space is the original spec, not what the planner sampled. The planner may explore the dimension’s range non-uniformly (Sobol initial points, then GP-driven exploitation). Use iterations[i].variation_values to see the actual samples; use config.search_space only to reproduce the original CLI/CRD invocation.
  • best_trials is orthogonal to sweep_aggregate/’s best_configurations and pareto_optimal. Those belong to the SweepAnalyzer exporter, are computed across the whole RunResult set (including failed iterations), and may include points the BO planner never saw a finite objective for. Use best_trials for “what the BO loop converged on”; use sweep_aggregate/profile_export_aiperf_sweep.json for “what the post-hoc analyzer thinks is best across every cell.”

See Also