# Profile with Blazedit Dataset
AIPerf supports benchmarking using the Blazedit datasets (`vdaita/edit_5k_char` and
`vdaita/edit_10k_char`), which contain code change requests paired with code files of varying
lengths. These datasets are useful for measuring model throughput and latency under long-context
code editing workloads.
Two variants are available:

- `blazedit_5k`: ~5k character code contexts, lower token count per request
- `blazedit_10k`: ~10k character code contexts, higher memory pressure
This guide covers profiling OpenAI-compatible chat completions endpoints using the Blazedit public datasets.
## Start a vLLM Server
Launch a vLLM server with a chat model:
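For example (the model name here is a placeholder; substitute any chat-capable model you have access to):

```shell
# Serve an OpenAI-compatible chat endpoint with vLLM.
# Qwen/Qwen2.5-7B-Instruct is an example model, not a requirement.
vllm serve Qwen/Qwen2.5-7B-Instruct \
  --host 0.0.0.0 \
  --port 8000
```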
Verify the server is ready:
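A simple readiness check is to query the OpenAI-compatible models endpoint (this assumes the server is listening on port 8000 as launched above):

```shell
# A JSON listing that includes the served model indicates the server is up
curl http://localhost:8000/v1/models
```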
## Profile with Blazedit Dataset
AIPerf loads the Blazedit dataset from Hugging Face and constructs prompts that include the full code file alongside the change request, matching the evaluation approach used by vLLM's benchmark suite. Each prompt averages ~1,500 input tokens for the 5k variant.
Use `--prompt-output-tokens-mean` to cap output length; without it, the model regenerates the
entire modified file, producing thousands of output tokens per request.
5k character variant:
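A sketch of the invocation for the 5k variant. The `--public-dataset`, `--endpoint-type`, `--model`, and `--url` option names are assumptions based on common AIPerf/GenAI-Perf conventions, and the model name is a placeholder; check `aiperf profile --help` for the exact flags in your version.

```shell
# Profile the blazedit_5k variant against the local vLLM server.
# Flag names other than --prompt-output-tokens-mean are assumed;
# verify them against your AIPerf version.
aiperf profile \
  --model Qwen/Qwen2.5-7B-Instruct \
  --url http://localhost:8000 \
  --endpoint-type chat \
  --public-dataset blazedit_5k \
  --prompt-output-tokens-mean 100
```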
Sample Output (Successful Run):
10k character variant:
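The same invocation applies to the 10k variant, swapping only the dataset name (again, flag names other than `--prompt-output-tokens-mean` are assumptions; verify with `aiperf profile --help`):

```shell
# Profile the blazedit_10k variant; expect longer contexts and
# higher memory pressure than the 5k run
aiperf profile \
  --model Qwen/Qwen2.5-7B-Instruct \
  --url http://localhost:8000 \
  --endpoint-type chat \
  --public-dataset blazedit_10k \
  --prompt-output-tokens-mean 100
```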