# Sweep Aggregate API Reference


Complete API documentation for parameter sweep aggregate outputs, including JSON schema, CSV format, and programmatic analysis examples.

## Overview

When running parameter sweeps with AIPerf (e.g., --concurrency 10,20,30), the system generates sweep aggregate files that summarize performance across all parameter combinations. These aggregates enable:

  • Comparison of performance across parameter combinations
  • Identification of optimal configurations
  • Pareto frontier analysis for multi-objective optimization
  • Statistical analysis with confidence intervals (when using --num-profile-runs > 1)

## Output Files

Sweep aggregates are written to different locations depending on the sweep mode:

Independent Mode (sweep-only, no --num-profile-runs):

```
artifacts/
  {benchmark_name}/
    sweep_aggregate/
      profile_export_aiperf_sweep.json   # Structured data for programmatic analysis
      profile_export_aiperf_sweep.csv    # Tabular format for spreadsheet analysis
```

Repeated Mode (sweep with --num-profile-runs > 1):

```
artifacts/
  {benchmark_name}/
    aggregate/
      concurrency_10/                    # Per-value confidence aggregates
        profile_export_aiperf_aggregate.json
        profile_export_aiperf_aggregate.csv
      concurrency_20/
        ...
    sweep_aggregate/                     # Cross-value sweep analysis
      profile_export_aiperf_sweep.json
      profile_export_aiperf_sweep.csv
```

The sweep aggregate files contain cross-value analysis including best configurations and Pareto optimal points.
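
Since the sweep files have fixed names in both modes, they can be located without hard-coding the layout. A minimal sketch, assuming a hypothetical benchmark name my_benchmark:

```python
from pathlib import Path

artifact_root = Path("artifacts/my_benchmark")  # hypothetical benchmark name

# profile_export_aiperf_sweep.json has the same name in both modes,
# so a recursive search finds it wherever sweep_aggregate/ lives.
sweep_json = next(artifact_root.rglob("profile_export_aiperf_sweep.json"))
print(f"Sweep aggregate: {sweep_json}")
```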


## JSON Schema

### Top-Level Structure

```json
{
  "aggregation_type": "sweep",
  "num_profile_runs": 12,
  "num_successful_runs": 12,
  "failed_runs": [],
  "metadata": { ... },
  "per_combination_metrics": [ ... ],
  "best_configurations": { ... },
  "pareto_optimal": [ ... ]
}
```

Top-Level Fields:

| Field | Type | Description |
|-------|------|-------------|
| `aggregation_type` | string | Always `"sweep"` for sweep aggregates |
| `num_profile_runs` | int | Total number of profile runs executed |
| `num_successful_runs` | int | Number of successful profile runs |
| `failed_runs` | array | List of failed runs with error details (empty if all succeeded) |
| `metadata` | object | Sweep configuration and execution metadata |
| `per_combination_metrics` | array | List of metrics for each parameter combination |
| `best_configurations` | object | Best parameter combinations for key metrics |
| `pareto_optimal` | array | List of Pareto optimal parameter combinations |
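
Because failed_runs is empty only when every run succeeded, it is worth checking before further analysis. A minimal sketch, assuming sweep_data is the parsed JSON (see Example 1 below); the per-entry structure is not documented here, so this only counts entries:

```python
# Warn if any profile runs failed before trusting the aggregates
failed = sweep_data.get("failed_runs", [])
if failed:
    print(f"Warning: {len(failed)} of {sweep_data['num_profile_runs']} runs failed")
```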

### Metadata Section

Contains information about the sweep configuration.

```json
{
  "metadata": {
    "sweep_parameters": [
      {
        "name": "concurrency",
        "values": [10, 20, 30, 40]
      }
    ],
    "num_combinations": 4
  }
}
```

Fields:

| Field | Type | Description |
|-------|------|-------------|
| `sweep_parameters` | array | List of parameter definitions (name and values) |
| `num_combinations` | int | Total number of parameter combinations tested |

Sweep Parameters Structure:

Each parameter definition contains:

  • name: Parameter name (e.g., "concurrency", "request_rate")
  • values: List of values tested for this parameter
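
If a simple lookup is more convenient than the list form, the definitions can be flattened into a mapping; a small sketch, assuming sweep_data is the parsed JSON (as in Example 1 below):

```python
# Map each swept parameter name to the list of values it took
param_values = {p["name"]: p["values"] for p in sweep_data["metadata"]["sweep_parameters"]}
# e.g., {"concurrency": [10, 20, 30, 40]}
```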

### Per-Combination Metrics Section

Contains aggregated metrics for each parameter combination. This is a list where each entry represents one combination.

```json
{
  "per_combination_metrics": [
    {
      "parameters": {
        "concurrency": 10
      },
      "metrics": {
        "request_throughput_avg": {
          "mean": 100.5,
          "std": 5.2,
          "min": 95.0,
          "max": 108.0,
          "cv": 0.052,
          "se": 2.3,
          "ci_low": 94.3,
          "ci_high": 106.7,
          "t_critical": 2.776,
          "unit": "requests/sec"
        },
        "ttft_p99_ms": {
          "mean": 120.5,
          "std": 8.1,
          "min": 110.2,
          "max": 132.8,
          "cv": 0.067,
          "se": 3.6,
          "ci_low": 111.5,
          "ci_high": 129.5,
          "t_critical": 2.776,
          "unit": "ms"
        }
      }
    },
    {
      "parameters": {
        "concurrency": 20
      },
      "metrics": { ... }
    }
  ]
}
```

Combination Entry Fields:

| Field | Type | Description |
|-------|------|-------------|
| `parameters` | object | Dictionary of parameter names to values for this combination |
| `metrics` | object | Dictionary of metric names to statistics |

Metric Statistics Fields:

| Field | Type | Description |
|-------|------|-------------|
| `mean` | float | Mean value across trials |
| `std` | float | Standard deviation across trials |
| `min` | float | Minimum value observed |
| `max` | float | Maximum value observed |
| `cv` | float | Coefficient of variation (std/mean) |
| `se` | float | Standard error of the mean |
| `ci_low` | float | Lower bound of the confidence interval |
| `ci_high` | float | Upper bound of the confidence interval |
| `t_critical` | float | Critical t-value used for the confidence interval |
| `unit` | string | Unit of measurement |

Note: For single-trial sweeps (--num-profile-runs 1), only mean and unit fields are present.
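
Analysis code that should work for both single-trial and repeated sweeps therefore needs to treat the statistical fields as optional. A small defensive sketch (metric_summary is a hypothetical helper, not part of the output):

```python
def metric_summary(stats: dict) -> str:
    """Format a metric entry; CI fields exist only for multi-trial sweeps."""
    mean, unit = stats["mean"], stats["unit"]
    if "ci_low" in stats and "ci_high" in stats:
        return f"{mean:.2f} {unit} (95% CI: {stats['ci_low']:.2f} to {stats['ci_high']:.2f})"
    return f"{mean:.2f} {unit}"  # single-trial sweep: only mean and unit
```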

### Best Configurations Section

Identifies the parameter combinations that achieved the best performance for key metrics.

```json
{
  "best_configurations": {
    "best_throughput": {
      "parameters": {
        "concurrency": 40
      },
      "metric": 350.2,
      "unit": "requests/sec"
    },
    "best_latency_p99": {
      "parameters": {
        "concurrency": 10
      },
      "metric": 120.5,
      "unit": "ms"
    }
  }
}
```

Configuration Fields:

| Field | Type | Description |
|-------|------|-------------|
| `parameters` | object | Parameter combination that achieved the best performance |
| `metric` | float | The metric value achieved |
| `unit` | string | Unit of measurement |

Available Configurations:

  • best_throughput: Highest request_throughput_avg
  • best_latency_p99: Lowest ttft_p99_ms (or request_latency_p99 as fallback)

### Pareto Optimal Section

Lists parameter combinations that are Pareto optimal: configurations where no other configuration is strictly better on all objectives simultaneously.

```json
{
  "pareto_optimal": [
    {"concurrency": 10},
    {"concurrency": 30},
    {"concurrency": 40}
  ]
}
```

Default Objectives:

  • Maximize: request_throughput_avg (throughput)
  • Minimize: ttft_p99_ms (latency)

A configuration is Pareto optimal if:

  • No other configuration has both higher throughput AND lower latency
  • It represents a valid trade-off point on the efficiency frontier
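
The frontier reported in pareto_optimal can be reproduced from per_combination_metrics with a straightforward dominance check. A minimal sketch under the default objectives (illustrative, not necessarily the exact implementation AIPerf uses):

```python
def pareto_frontier(combos: list[dict]) -> list[dict]:
    """Return parameter dicts not dominated on (max throughput, min TTFT p99)."""
    points = [
        (
            c["parameters"],
            c["metrics"]["request_throughput_avg"]["mean"],  # maximize
            c["metrics"]["ttft_p99_ms"]["mean"],             # minimize
        )
        for c in combos
    ]
    frontier = []
    for params, tp, lat in points:
        # Dominated: another point is at least as good on both objectives
        # and strictly better on at least one.
        dominated = any(
            o_tp >= tp and o_lat <= lat and (o_tp > tp or o_lat < lat)
            for _, o_tp, o_lat in points
        )
        if not dominated:
            frontier.append(params)
    return frontier

# pareto_frontier(sweep_data["per_combination_metrics"]) should match
# sweep_data["pareto_optimal"] up to ordering.
```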

Example Interpretation:

  • Concurrency 10: Low latency, moderate throughput (latency-optimized)
  • Concurrency 30: Balanced latency and throughput
  • Concurrency 40: High throughput, higher latency (throughput-optimized)

Multi-Parameter Sweeps:

For sweeps with multiple parameters (e.g., --concurrency 10,20 --request-rate 5,10), each Pareto optimal entry contains all parameter values:

```json
{
  "pareto_optimal": [
    {"concurrency": 10, "request_rate": 5},
    {"concurrency": 20, "request_rate": 10}
  ]
}
```

## CSV Format

The CSV export provides a tabular view optimized for spreadsheet analysis and plotting.

### Structure

The CSV file contains multiple sections separated by blank lines:

  1. Per-Combination Metrics Table (main data)
  2. Best Configurations
  3. Pareto Optimal Points
  4. Metadata

### Per-Combination Metrics Table

The first section is a wide-format table with one row per parameter combination:

```csv
concurrency,request_throughput_avg_mean,request_throughput_avg_std,request_throughput_avg_min,request_throughput_avg_max,request_throughput_avg_cv,ttft_p99_ms_mean,ttft_p99_ms_std,ttft_p99_ms_min,ttft_p99_ms_max,ttft_p99_ms_cv
10,100.50,5.20,95.00,108.00,0.0520,120.50,8.10,110.20,132.80,0.0672
20,180.30,8.50,170.00,195.00,0.0471,135.20,9.30,125.00,148.00,0.0688
30,270.80,12.10,255.00,290.00,0.0447,155.80,11.20,142.00,172.00,0.0719
40,285.50,15.30,265.00,310.00,0.0536,180.30,13.50,165.00,200.00,0.0749
```

Columns:

  • Parameter columns (e.g., concurrency, request_rate)
  • For each metric: {metric}_mean, {metric}_std, {metric}_min, {metric}_max, {metric}_cv

Multi-Parameter Example:

```csv
concurrency,request_rate,request_throughput_avg_mean,request_throughput_avg_std,...
10,5,50.25,2.10,...
10,10,95.30,4.50,...
20,5,98.40,3.20,...
20,10,185.60,7.80,...
```

### Best Configurations Section

```csv
Best Configurations
Configuration,concurrency,Metric,Unit
Best Throughput,40,285.50,requests/sec
Best Latency P99,10,120.50,ms
```

For multi-parameter sweeps:

```csv
Best Configurations
Configuration,concurrency,request_rate,Metric,Unit
Best Throughput,40,10,350.20,requests/sec
Best Latency P99,10,5,95.30,ms
```

### Pareto Optimal Section

```csv
Pareto Optimal Points
concurrency
10
30
40
```

For multi-parameter sweeps:

```csv
Pareto Optimal Points
concurrency,request_rate
10,5
20,10
40,10
```

### Metadata Section

```csv
Metadata
Field,Value
Aggregation Type,sweep
Sweep Parameters,concurrency
Number of Combinations,4
Number of Profile Runs,12
Number of Successful Runs,12
```
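
Because the file holds several blank-line-separated sections, it cannot be read with a single pd.read_csv call over the whole file. A sketch that splits the sections and loads the main table (the path is hypothetical; see the artifact layouts below):

```python
import io

import pandas as pd

csv_path = "artifacts/my_benchmark/sweep_aggregate/profile_export_aiperf_sweep.csv"

with open(csv_path) as f:
    # Sections are separated by blank lines; the first is the
    # per-combination metrics table.
    sections = f.read().split("\n\n")

main_table = pd.read_csv(io.StringIO(sections[0]))
print(main_table.head())
```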

## Artifact Directory Structure

### Repeated Mode (`--parameter-sweep-mode repeated`)

Default mode where the full sweep is executed N times:

```
artifacts/
  {benchmark_name}/
    profile_runs/
      trial_0001/
        concurrency_10/
          profile_export_aiperf.json
          profile_export.jsonl
        concurrency_20/
          profile_export_aiperf.json
          profile_export.jsonl
        concurrency_30/
          profile_export_aiperf.json
          profile_export.jsonl
      trial_0002/
        concurrency_10/
        concurrency_20/
        concurrency_30/
      trial_0003/
        concurrency_10/
        concurrency_20/
        concurrency_30/
    aggregate/
      concurrency_10/
        profile_export_aiperf_aggregate.json
        profile_export_aiperf_aggregate.csv
      concurrency_20/
        profile_export_aiperf_aggregate.json
        profile_export_aiperf_aggregate.csv
      concurrency_30/
        profile_export_aiperf_aggregate.json
        profile_export_aiperf_aggregate.csv
    sweep_aggregate/
      profile_export_aiperf_sweep.json
      profile_export_aiperf_sweep.csv
```

Execution Pattern:

```
Trial 1: [10 → 20 → 30]
Trial 2: [10 → 20 → 30]
Trial 3: [10 → 20 → 30]
```

### Independent Mode (`--parameter-sweep-mode independent`)

Runs all trials at each parameter value before moving to the next:

```
artifacts/
  {benchmark_name}/
    concurrency_10/
      profile_runs/
        trial_0001/
          profile_export_aiperf.json
          profile_export.jsonl
        trial_0002/
        trial_0003/
      aggregate/
        profile_export_aiperf_aggregate.json
        profile_export_aiperf_aggregate.csv
    concurrency_20/
      profile_runs/
        trial_0001/
        trial_0002/
        trial_0003/
      aggregate/
        profile_export_aiperf_aggregate.json
        profile_export_aiperf_aggregate.csv
    concurrency_30/
      profile_runs/
        trial_0001/
        trial_0002/
        trial_0003/
      aggregate/
        profile_export_aiperf_aggregate.json
        profile_export_aiperf_aggregate.csv
    sweep_aggregate/
      profile_export_aiperf_sweep.json
      profile_export_aiperf_sweep.csv
```

Execution Pattern:

```
Concurrency 10: [trial1, trial2, trial3]
Concurrency 20: [trial1, trial2, trial3]
Concurrency 30: [trial1, trial2, trial3]
```

### Single-Trial Sweep

When --num-profile-runs 1 (or omitted), no trial directories are created:

```
artifacts/
  {benchmark_name}/
    concurrency_10/
      profile_export_aiperf.json
      profile_export.jsonl
    concurrency_20/
      profile_export_aiperf.json
      profile_export.jsonl
    concurrency_30/
      profile_export_aiperf.json
      profile_export.jsonl
    sweep_aggregate/
      profile_export_aiperf_sweep.json
      profile_export_aiperf_sweep.csv
```

## Programmatic Analysis Examples

### Example 1: Load and Inspect Sweep Results

```python
import json
from pathlib import Path

# Load sweep aggregate
sweep_file = Path("artifacts/my_benchmark/sweep_aggregate/profile_export_aiperf_sweep.json")
with open(sweep_file) as f:
    sweep_data = json.load(f)

# Inspect metadata
metadata = sweep_data["metadata"]
sweep_params = metadata["sweep_parameters"]
print(f"Sweep parameters: {[p['name'] for p in sweep_params]}")
print(f"Total combinations: {metadata['num_combinations']}")
print(f"Total runs: {sweep_data['num_profile_runs']}")
```

### Example 2: Find Optimal Configuration

```python
# Get best configurations
best_configs = sweep_data["best_configurations"]

best_throughput = best_configs["best_throughput"]
print(f"Best throughput: {best_throughput['metric']:.2f} {best_throughput['unit']}")
print(f"  Parameters: {best_throughput['parameters']}")

best_latency = best_configs["best_latency_p99"]
print(f"Best latency: {best_latency['metric']:.2f} {best_latency['unit']}")
print(f"  Parameters: {best_latency['parameters']}")
```

### Example 3: Analyze Pareto Frontier

```python
# Get Pareto optimal points
pareto_optimal = sweep_data["pareto_optimal"]
print(f"Found {len(pareto_optimal)} Pareto optimal configurations")

# Extract metrics for Pareto points
per_combination_metrics = sweep_data["per_combination_metrics"]

print("\nPareto Frontier:")
for combo in per_combination_metrics:
    params = combo["parameters"]
    # Check if this combination is Pareto optimal
    if params in pareto_optimal:
        metrics = combo["metrics"]
        throughput = metrics["request_throughput_avg"]["mean"]
        latency = metrics["ttft_p99_ms"]["mean"]
        print(f"  {params}: {throughput:.1f} req/s, {latency:.1f} ms p99")
```

### Example 4: Compare Confidence Intervals

```python
import matplotlib.pyplot as plt

# Extract data for a single-parameter sweep
combinations = sweep_data["per_combination_metrics"]

# Assuming a single parameter (e.g., concurrency); sort by its value
# so the plotted line is ordered along the x-axis
param_name = sweep_data["metadata"]["sweep_parameters"][0]["name"]
combinations = sorted(combinations, key=lambda c: c["parameters"][param_name])

param_values = []
throughputs = []
ci_lows = []
ci_highs = []

for combo in combinations:
    param_value = combo["parameters"][param_name]
    tp = combo["metrics"]["request_throughput_avg"]

    param_values.append(param_value)
    throughputs.append(tp["mean"])
    # Fall back to the mean when CI fields are absent (single-trial sweeps)
    ci_lows.append(tp.get("ci_low", tp["mean"]))
    ci_highs.append(tp.get("ci_high", tp["mean"]))

# Plot with confidence intervals
plt.figure(figsize=(10, 6))
plt.plot(param_values, throughputs, 'o-', label='Mean Throughput')
plt.fill_between(param_values, ci_lows, ci_highs, alpha=0.3, label='95% CI')
plt.xlabel(param_name.title())
plt.ylabel('Throughput (requests/sec)')
plt.title(f'Throughput vs {param_name.title()}')
plt.legend()
plt.grid(True)
plt.savefig('throughput_sweep.png')
```

### Example 5: Export to Pandas DataFrame

```python
import pandas as pd

# Convert per-combination metrics to a DataFrame
rows = []
for combo in sweep_data["per_combination_metrics"]:
    row = combo["parameters"].copy()

    # Add metrics
    for metric_name, metric_data in combo["metrics"].items():
        if isinstance(metric_data, dict):
            row[f"{metric_name}_mean"] = metric_data.get("mean")
            row[f"{metric_name}_std"] = metric_data.get("std")
            row[f"{metric_name}_cv"] = metric_data.get("cv")
        else:
            row[metric_name] = metric_data
    rows.append(row)

df = pd.DataFrame(rows)

# Sort by parameter values
param_names = [p["name"] for p in sweep_data["metadata"]["sweep_parameters"]]
df = df.sort_values(param_names)

# Analyze
print(df[[*param_names, "request_throughput_avg_mean", "ttft_p99_ms_mean"]])

# Export
df.to_csv("sweep_analysis.csv", index=False)
```

### Example 6: Multi-Parameter Sweep Analysis

```python
# For sweeps with multiple parameters
sweep_params = sweep_data["metadata"]["sweep_parameters"]
param_names = [p["name"] for p in sweep_params]

print(f"Multi-parameter sweep: {', '.join(param_names)}")

# Find the best combination for each parameter individually
for param_name in param_names:
    # Group combinations by this parameter's value
    param_groups = {}
    for combo in sweep_data["per_combination_metrics"]:
        param_value = combo["parameters"][param_name]
        if param_value not in param_groups:
            param_groups[param_value] = []
        param_groups[param_value].append(combo)

    # Find the best throughput for each value of this parameter
    print(f"\nBest throughput for each {param_name}:")
    for value, combos in sorted(param_groups.items()):
        best_combo = max(combos,
                         key=lambda c: c["metrics"]["request_throughput_avg"]["mean"])
        throughput = best_combo["metrics"]["request_throughput_avg"]["mean"]
        print(f"  {param_name}={value}: {throughput:.1f} req/s")
        print(f"    Full config: {best_combo['parameters']}")
```

### Example 7: Identify Diminishing Returns

```python
# For single-parameter sweeps, calculate efficiency
# (throughput per unit of the swept parameter)
combinations = sweep_data["per_combination_metrics"]
param_name = sweep_data["metadata"]["sweep_parameters"][0]["name"]

# Sort by parameter value
combinations_sorted = sorted(combinations,
                             key=lambda c: c["parameters"][param_name])

efficiencies = []
for combo in combinations_sorted:
    param_value = combo["parameters"][param_name]
    throughput = combo["metrics"]["request_throughput_avg"]["mean"]
    efficiency = throughput / param_value
    efficiencies.append((param_value, efficiency))

# Find the point of diminishing returns (where efficiency drops significantly)
threshold = 0.8  # flag a >20% drop between consecutive values
for i in range(1, len(efficiencies)):
    if efficiencies[i][1] < threshold * efficiencies[i - 1][1]:
        print(f"Diminishing returns detected at {param_name}={efficiencies[i][0]}")
        print(f"  Efficiency dropped from {efficiencies[i-1][1]:.2f} to {efficiencies[i][1]:.2f}")
        break
```

### Example 8: Multi-Objective Decision Making

```python
# Score configurations based on weighted objectives
weights = {
    "throughput": 0.6,  # 60% weight on throughput
    "latency": 0.4,     # 40% weight on latency
}

# Extract all throughputs and latencies
combinations = sweep_data["per_combination_metrics"]
throughputs = [c["metrics"]["request_throughput_avg"]["mean"] for c in combinations]
latencies = [c["metrics"]["ttft_p99_ms"]["mean"] for c in combinations]

max_tp = max(throughputs)
min_lat = min(latencies)
max_lat = max(latencies)

scores = []
for combo in combinations:
    tp = combo["metrics"]["request_throughput_avg"]["mean"]
    lat = combo["metrics"]["ttft_p99_ms"]["mean"]

    # Normalize so that higher is better for both objectives
    tp_score = tp / max_tp
    lat_score = 1 - (lat - min_lat) / (max_lat - min_lat) if max_lat > min_lat else 1.0

    # Weighted combination
    score = weights["throughput"] * tp_score + weights["latency"] * lat_score
    scores.append((combo["parameters"], score))

# Find the best configuration
best_params, best_score = max(scores, key=lambda x: x[1])
print(f"Best configuration for given weights: {best_params}")
print(f"  Score: {best_score:.3f}")
```

## See Also