Conversation context mode controls how prior turns are accumulated when building multi-turn chat requests. Different dataset formats imply different accumulation strategies, and AIPerf automatically selects the right one based on your data.
Two dimensions determine the mode:
DELTAS (incremental per-turn content) vs MESSAGE_ARRAY (each turn carries its complete message list)WITH_RESPONSES (pre-canned assistant turns in dataset) vs WITHOUT_RESPONSES (only user content; live responses captured at runtime)deltas_without_responsesStandard multi-turn chat. Each dataset turn is a user-only delta. AIPerf accumulates turns and threads live inference responses into the history.
Dataset:
Replay:
Default for:
hash_idsdeltas_with_responsesDelta-compressed prompts. Each dataset turn only contains the new messages since the previous turn. AIPerf accumulates these deltas to reconstruct the full conversation. The live inference response is only used for measurement and discarded — the pre-canned assistant responses in the dataset are used instead.
Dataset (each turn is a delta):
Replay (deltas accumulated):
Default for:
message_array_with_responsesSelf-contained prompts. Each turn already contains its full context (including assistant responses). No session accumulation.
Dataset:
Replay:
Each turn is sent exactly as it appears in the dataset.
Default for:
messages arraysmessage_array_without_responsesReserved for future use. Each turn would carry a complete user-only message array, requiring live response merging between turns. Not yet implemented.
Context mode is resolved through a priority chain:
context_modedeltas_without_responsesThis means most users never need to think about context mode. The loader picks the right default, and individual conversations can override it when needed.