Profile with BurstGPT Traces
AIPerf supports benchmarking with BurstGPT, a dataset of real-world LLM serving traces. Each trace captures bursty request arrival patterns along with per-request input and output token counts.
This guide covers replaying BurstGPT traces to reproduce real-world traffic patterns against your inference server.
Start a vLLM Server
Launch a vLLM server with a chat model:
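A minimal launch might look like the following; the model name and port are assumptions for illustration, not requirements, so substitute any chat model you have access to:

```shell
# Model name and port are illustrative -- use any chat-capable model.
vllm serve Qwen/Qwen2.5-1.5B-Instruct \
  --host 0.0.0.0 \
  --port 8000
```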
Verify the server is ready:
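Assuming the default port from the launch step, one way to check readiness is vLLM's health endpoint, or the OpenAI-compatible model listing:

```shell
# Returns HTTP 200 once the server is ready (port assumed from the launch step).
curl http://localhost:8000/health

# Alternatively, list the served models via the OpenAI-compatible API.
curl http://localhost:8000/v1/models
```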
BurstGPT Trace Format
BurstGPT traces are CSV files where each row represents a single independent request.
Example rows:
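The rows below illustrate the general shape of the published BurstGPT CSV format; the specific values are made up for illustration:

```csv
Timestamp,Model,Request tokens,Response tokens,Total tokens,Log Type
0,ChatGPT,472,209,681,Conversation log
6,ChatGPT,334,155,489,Conversation log
11,GPT-4,105,446,551,Conversation log
```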
Each row is treated as an independent single-turn request. Because the trace stores no actual prompt text, AIPerf synthesizes prompts matching the prescribed input token counts and schedules each request at its recorded timestamp.
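Conceptually, the replay step boils down to reading each row, building a synthetic prompt of the requested length, and scheduling it at the recorded offset. The sketch below is not AIPerf's actual implementation; it approximates "tokens" as whitespace-separated words purely to show the idea:

```python
import csv
import io

# Conceptual sketch only -- NOT AIPerf's implementation. "Tokens" are
# approximated as whitespace-separated words; a real replayer would use
# the model's tokenizer.

TRACE = """Timestamp,Model,Request tokens,Response tokens,Total tokens,Log Type
0,ChatGPT,5,12,17,Conversation log
6,ChatGPT,3,8,11,Conversation log
"""

def synthesize_prompt(num_tokens: int) -> str:
    """Build a filler prompt of roughly num_tokens whitespace-delimited tokens."""
    return " ".join(f"tok{i}" for i in range(num_tokens))

def replay_plan(trace_csv: str):
    """Yield (send_at_seconds, prompt, max_output_tokens) per trace row."""
    for row in csv.DictReader(io.StringIO(trace_csv)):
        yield (
            float(row["Timestamp"]),          # when to send, relative to start
            synthesize_prompt(int(row["Request tokens"])),
            int(row["Response tokens"]),      # cap on generated tokens
        )

plan = list(replay_plan(TRACE))
print(len(plan))                # 2 requests in this tiny trace
print(len(plan[0][1].split()))  # 5 tokens in the first synthetic prompt
```

A real replayer would then fire each request at its `send_at_seconds` offset, preserving the burstiness of the original trace rather than smoothing it into a fixed request rate.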
Download and Profile
Download a trace file from the BurstGPT repository and run a benchmark:
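A sketch of this step is shown below. The release URL and the `aiperf` flag names are assumptions: check the BurstGPT repository's releases page for the actual trace files, and `aiperf profile --help` for the exact option names in your version.

```shell
# Trace URL is illustrative -- see the BurstGPT releases page for real assets.
wget https://github.com/HPMLL/BurstGPT/releases/download/v1.1/BurstGPT_1.csv

# Flag names are illustrative -- verify against `aiperf profile --help`.
aiperf profile \
  --model Qwen/Qwen2.5-1.5B-Instruct \
  --url http://localhost:8000 \
  --endpoint-type chat \
  --input-file BurstGPT_1.csv
```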
Sample Output (Successful Run):
Related Tutorials
- Bailian Traces - Bailian production trace replay
- Fixed Schedule - Precise timestamp-based execution for any dataset
- Prefix Synthesis - KV cache testing with hash-based prefix data
- Multi-Turn Conversations - Multi-turn conversation benchmarking