# Inline Datasets


Embed your benchmark dataset directly in the YAML config — no separate JSONL file required.

## When to inline vs. when to keep a file

| Scenario | Inline (`records:`) | File (`path:`) |
|---|---|---|
| Few records (< ~100) | Recommended | OK |
| Many records (> ~500) | Discouraged (warning emitted) | Recommended |
| Single-file deployment unit (k8s ConfigMap) | Recommended | Requires sidecar mount |
| Shareable repro for a colleague | Recommended | Two files to ship |
| Records updated independently of the config | Discouraged | Recommended |

The schema is the same: each inline record matches one line of the equivalent JSONL file.

## Single-turn

```yaml
schemaVersion: "2.0"
benchmark:
  model: meta-llama/Llama-3.1-8B-Instruct
  endpoint:
    url: http://localhost:8000/v1/chat/completions
  dataset:
    type: file
    format: single_turn
    records:
      - {text: "What is machine learning?"}
      - {text: "Explain GANs in two sentences.", output_length: 200}
      - {text: "Define reinforcement learning."}
  phases:
    type: concurrency
    concurrency: 2
    requests: 100
```
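Because each inline record matches one line of the equivalent JSONL file, the `records:` above correspond to a three-line dataset file. A quick way to see (or generate) that equivalence:

```python
import json

# The inline records from the YAML above, as Python dicts.
records = [
    {"text": "What is machine learning?"},
    {"text": "Explain GANs in two sentences.", "output_length": 200},
    {"text": "Define reinforcement learning."},
]

# One JSON object per line is exactly the equivalent JSONL file.
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)
```

Point `path:` at a file with that content and the benchmark behaves identically.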

## Multi-turn

```yaml
benchmark:
  dataset:
    type: file
    format: multi_turn
    records:
      - session_id: chat_1
        turns:
          - {text: "What is machine learning?"}
          - {text: "Can you give me an example?"}
      - session_id: chat_2
        turns:
          - {text: "Explain neural networks."}
          - {text: "How do they differ from traditional algorithms?"}
          - {text: "Which architecture for image classification?"}
```
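To build intuition for what a multi-turn session implies on the wire, here is a sketch of expanding one session into per-turn chat-completions requests. This assumes each turn is sent with the prior conversation as context (the usual multi-turn semantics); the helper and the `<reply>` placeholder are hypothetical, not AIPerf internals:

```python
# Hypothetical helper: expand one session's turns into the message lists
# that would be sent, one request per turn, carrying conversation history.
def session_requests(turns):
    messages, requests = [], []
    for turn in turns:
        messages.append({"role": "user", "content": turn["text"]})
        requests.append(list(messages))  # snapshot sent for this turn
        # Placeholder: a real run would append the model's actual reply.
        messages.append({"role": "assistant", "content": "<reply>"})
    return requests

reqs = session_requests([
    {"text": "What is machine learning?"},
    {"text": "Can you give me an example?"},
])
print(len(reqs))     # 2 requests for a 2-turn session
print(len(reqs[1]))  # 3 messages: user, assistant, user
```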

## Random pool

Single-pool inline:

```yaml
benchmark:
  dataset:
    type: file
    format: random_pool
    sampling: random
    records:
      - {text: "Common query", type: random_pool}
      - {text: "Less common query", type: random_pool}
      - {text: "Rare query", type: random_pool}
```
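Conceptually, `sampling: random` means each request draws a prompt from the pool rather than walking the records in order. A minimal sketch of that behavior, assuming uniform sampling with replacement (AIPerf's exact sampling semantics may differ):

```python
import random

# The three prompts from the single-pool example above.
pool = ["Common query", "Less common query", "Rare query"]

# Seeded RNG so the sketch is reproducible; each request draws one prompt.
rng = random.Random(0)
prompts = [rng.choice(pool) for _ in range(5)]
print(prompts)
```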

Multi-pool inline (mirrors a directory-of-JSONLs file layout):

```yaml
benchmark:
  dataset:
    type: file
    format: random_pool
    records:
      queries:
        - {text: "What is your refund policy?", type: random_pool}
        - {text: "How do I reset my password?", type: random_pool}
      passages:
        - {text: "Refunds are processed within 5 business days.", type: random_pool}
        - {text: "Click 'Forgot password' on the login page.", type: random_pool}
```

## Trace replay (mooncake_trace)

```yaml
benchmark:
  dataset:
    type: file
    format: mooncake_trace
    synthesis:
      speedup_ratio: 2.0  # replay 2x faster (1.0 = real-time, 0.5 = 2x slower)
    records:
      - {timestamp: 0, input_length: 512, output_length: 128, hash_ids: [1, 2, 3]}
      - {timestamp: 100, input_length: 1024, output_length: 256, hash_ids: [4, 5]}
      - {timestamp: 250, input_length: 256, output_length: 64, hash_ids: [1, 2]}
```
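The `speedup_ratio` rescales the gaps between trace timestamps: 2.0 halves every gap, 0.5 doubles it. A sketch of that mapping, assuming timestamps are offsets in milliseconds (the helper name is ours, not AIPerf's):

```python
def send_offsets_ms(timestamps_ms, speedup_ratio):
    """Map trace timestamps to replay send offsets.

    speedup_ratio > 1 compresses gaps (faster replay);
    speedup_ratio < 1 stretches them (slower replay).
    """
    return [t / speedup_ratio for t in timestamps_ms]

# The timestamps from the records above, replayed at 2x speed.
offsets = send_offsets_ms([0, 100, 250], speedup_ratio=2.0)
print(offsets)  # → [0.0, 50.0, 125.0]
```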

## Mutual exclusion

`path:` and `records:` are mutually exclusive. Setting both, or neither, raises a Pydantic `ValidationError` at config load with this message:

```
FileDataset requires exactly one source: set either `path:` (load from disk) or `records:` (embed in YAML), not both. Got path=<...>, records=<...>.
```
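The constraint is "exactly one of two fields set". A minimal stand-alone sketch of that check, using a plain dataclass rather than AIPerf's actual Pydantic model (field names mirror the config keys; everything else is illustrative):

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FileDataset:
    # Hypothetical stand-in for the real Pydantic model.
    path: Optional[str] = None
    records: Optional[list[Any]] = None

    def __post_init__(self) -> None:
        # Both None or both set -> the two "is None" checks agree -> invalid.
        if (self.path is None) == (self.records is None):
            raise ValueError(
                "FileDataset requires exactly one source: set either `path:` "
                "(load from disk) or `records:` (embed in YAML), not both. "
                f"Got path={self.path!r}, records={self.records!r}."
            )


FileDataset(path="data.jsonl")            # ok: file source
FileDataset(records=[{"text": "hi"}])     # ok: inline source
```

In the real model, Pydantic wraps this kind of check and surfaces it as a `ValidationError` at config load.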

## Soft size limit

If you inline more than 500 records, AIPerf logs a warning recommending a file. There is no hard cap; you can keep going if you have a good reason, but reading a 5,000-line YAML in code review is rough. The threshold is configurable via `AIPERF_DATASET_INLINE_RECORDS_WARN_THRESHOLD`.
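A sketch of how such a soft limit behaves: default threshold 500, overridable through the environment variable, warning only (no error). The function is illustrative, not AIPerf's actual code:

```python
import logging
import os

logger = logging.getLogger("aiperf.dataset")


def warn_if_too_large(records: list) -> bool:
    """Log a warning and return True when the inline list exceeds the threshold."""
    threshold = int(
        os.environ.get("AIPERF_DATASET_INLINE_RECORDS_WARN_THRESHOLD", "500")
    )
    if len(records) > threshold:
        logger.warning(
            "Inline dataset has %d records (threshold %d); consider moving it "
            "to a JSONL file referenced via `path:`.",
            len(records),
            threshold,
        )
        return True
    return False


print(warn_if_too_large([{"text": "q"}] * 501))  # → True (warning logged)
print(warn_if_too_large([{"text": "q"}] * 3))    # → False
```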

## Tutorial template

A bundled template demonstrates the formats above:

```shell
aiperf config init --template inline_dataset --output bench.yaml
```