Inline Datasets | NVIDIA AIPerf Documentation

Embed your benchmark dataset directly in the YAML config; no separate JSONL file is required.

When to inline vs. when to keep a file

Scenario	Inline (`records:`)	File (`path:`)
Few records (< ~100)	Recommended	OK
Many records (> ~500)	Discouraged (warning emitted)	Recommended
Single-file deployment unit (k8s ConfigMap)	Recommended	Requires sidecar mount
Shareable repro for a colleague	Recommended	Two files to ship
Records updated independently of the config	Discouraged	Recommended

The schema is the same: each inline record matches one line of the equivalent JSONL file.
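Because a JSON object is also a valid YAML flow mapping, an existing JSONL file can be converted into list items for a `records:` block mechanically. The following is a minimal sketch using only the Python standard library. The helper name `jsonl_to_inline_records` is hypothetical and is not part of AIPerf.

```python
import json


def jsonl_to_inline_records(jsonl_text: str, indent: int = 6) -> str:
    """Render JSONL text as YAML list items for a `records:` block.

    Hypothetical helper, not an AIPerf API. Each non-blank line is parsed
    to confirm it is a JSON object, then re-emitted as a YAML flow mapping.
    """
    pad = " " * indent
    items = []
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        items.append(f"{pad}- {json.dumps(record, ensure_ascii=False)}")
    return "\n".join(items)


sample = (
    '{"text": "What is machine learning?"}\n'
    '{"text": "Explain GANs in two sentences.", "output_length": 200}\n'
)
print(jsonl_to_inline_records(sample))
```

The printed lines can be pasted under `records:` at the matching indentation. Keys and strings come out double-quoted, which YAML accepts alongside the unquoted style used in the examples on this page.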

Single-turn

schemaVersion: "2.0"
benchmark:
  model: meta-llama/Llama-3.1-8B-Instruct
  endpoint:
    url: http://localhost:8000/v1/chat/completions
  dataset:
    type: file
    format: single_turn
    records:
      - {text: "What is machine learning?"}
      - {text: "Explain GANs in two sentences.", output_length: 200}
      - {text: "Define reinforcement learning."}
  phases:
    type: concurrency
    concurrency: 2
    requests: 100

Multi-turn

benchmark:
  dataset:
    type: file
    format: multi_turn
    records:
      - session_id: chat_1
        turns:
          - {text: "What is machine learning?"}
          - {text: "Can you give me an example?"}
      - session_id: chat_2
        turns:
          - {text: "Explain neural networks."}
          - {text: "How do they differ from traditional algorithms?"}
          - {text: "Which architecture for image classification?"}

Random pool

Single-pool inline:

benchmark:
  dataset:
    type: file
    format: random_pool
    sampling: random
    records:
      - {text: "Common query", type: random_pool}
      - {text: "Less common query", type: random_pool}
      - {text: "Rare query", type: random_pool}

Multi-pool inline (mirrors a directory-of-JSONLs file layout):

benchmark:
  dataset:
    type: file
    format: random_pool
    records:
      queries:
        - {text: "What is your refund policy?", type: random_pool}
        - {text: "How do I reset my password?", type: random_pool}
      passages:
        - {text: "Refunds are processed within 5 business days.", type: random_pool}
        - {text: "Click 'Forgot password' on the login page.", type: random_pool}

Trace replay (mooncake_trace)

benchmark:
  dataset:
    type: file
    format: mooncake_trace
    synthesis:
      speedup_ratio: 2.0    # replay 2x faster (1.0 = real-time, 0.5 = 2x slower)
    records:
      - {timestamp: 0,    input_length: 512,  output_length: 128, hash_ids: [1, 2, 3]}
      - {timestamp: 100,  input_length: 1024, output_length: 256, hash_ids: [4, 5]}
      - {timestamp: 250,  input_length: 256,  output_length: 64,  hash_ids: [1, 2]}
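To make the effect of `speedup_ratio` concrete, here is a minimal sketch. It assumes the replay offset is the recorded `timestamp` divided by the ratio, which matches the inline comment (2.0 replays twice as fast, 0.5 twice as slow). The function name is hypothetical, and AIPerf's actual scheduling may differ.

```python
def scaled_timestamps(records, speedup_ratio=1.0):
    """Return replay offsets under the assumed rule: timestamp / speedup_ratio.

    Illustrative only; not an AIPerf API.
    """
    if speedup_ratio <= 0:
        raise ValueError("speedup_ratio must be positive")
    return [record["timestamp"] / speedup_ratio for record in records]


records = [
    {"timestamp": 0, "input_length": 512, "output_length": 128},
    {"timestamp": 100, "input_length": 1024, "output_length": 256},
    {"timestamp": 250, "input_length": 256, "output_length": 64},
]
print(scaled_timestamps(records, speedup_ratio=2.0))  # [0.0, 50.0, 125.0]
```

Under this assumption the three requests above are issued at offsets 0, 50, and 125 instead of 0, 100, and 250, in the same units as the trace.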

Mutual exclusion

`path:` and `records:` are mutually exclusive, and exactly one of them is required. Setting both, or neither, raises a Pydantic ValidationError at config load with this message:

FileDataset requires exactly one source: set either `path:` (load from disk) or `records:` (embed in YAML), not both. Got path=<...>, records=<...>.
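The exactly-one-source rule can be illustrated with a small stand-in. This sketch uses a standard-library dataclass and raises `ValueError` rather than a Pydantic `ValidationError`. The class name `FileDatasetSketch` is hypothetical, the message is abbreviated, and none of this is AIPerf's actual model code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FileDatasetSketch:
    """Stand-in for the real config model; only the source check is modeled."""

    path: Optional[str] = None
    records: Optional[Any] = None

    def __post_init__(self) -> None:
        # Reject the two invalid cases: both sources set, or neither set.
        if (self.path is None) == (self.records is None):
            raise ValueError(
                "FileDataset requires exactly one source: set either `path:` "
                "or `records:`, not both."
            )


FileDatasetSketch(records=[{"text": "hello"}])  # accepted: one source

for kwargs in ({}, {"path": "data.jsonl", "records": [{"text": "hello"}]}):
    try:
        FileDatasetSketch(**kwargs)
    except ValueError as exc:
        print("rejected:", sorted(kwargs), "->", type(exc).__name__)
```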

Soft size limit

If you inline more than 500 records, AIPerf logs a warning recommending a file. There is no hard cap, so you can keep going if you have a good reason, but a 5,000-line YAML file is hard to read in code review. The threshold is configurable via AIPERF_DATASET_INLINE_RECORDS_WARN_THRESHOLD.
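The soft limit can be pictured as a simple comparison against a configurable threshold. The sketch below assumes that `AIPERF_DATASET_INLINE_RECORDS_WARN_THRESHOLD` is read as an integer environment variable and that the warning fires only when the count is strictly greater than the threshold. Both are assumptions, and the function and logger names are hypothetical.

```python
import logging
import os

DEFAULT_THRESHOLD = 500  # default taken from the text above
ENV_VAR = "AIPERF_DATASET_INLINE_RECORDS_WARN_THRESHOLD"

logger = logging.getLogger("inline_dataset_sketch")


def exceeds_inline_threshold(record_count, environ=None):
    """Log a warning and return True when record_count exceeds the threshold.

    Illustrative only; assumes the threshold comes from an environment variable.
    """
    env = os.environ if environ is None else environ
    threshold = int(env.get(ENV_VAR, DEFAULT_THRESHOLD))
    if record_count > threshold:
        logger.warning(
            "%d inline records exceed the soft limit of %d; consider `path:`.",
            record_count,
            threshold,
        )
        return True
    return False


print(exceeds_inline_threshold(500, environ={}))               # False
print(exceeds_inline_threshold(501, environ={}))               # True
print(exceeds_inline_threshold(50, environ={ENV_VAR: "10"}))   # True
```

Nothing is rejected in any of these cases; the config still loads, and only the log output changes.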

Tutorial template

A bundled template demonstrates all three formats:

$ aiperf config init --template inline_dataset --output bench.yaml