# Custom Prompt Benchmarking
Benchmark with prompts from your own file, sent exactly as specified without sampling or generation.
## Overview

This tutorial uses the `mooncake_trace` dataset type with the `text_input` field to send prompts exactly as written.

The `mooncake_trace` dataset type with `text_input` provides:
- Exact Control: Send precisely the text you specify
- Deterministic Testing: Same file produces identical request sequence every time
- Production Replay: Use real user queries for realistic benchmarking
- Debugging: Isolate performance issues with specific prompts
This is different from the `random_pool` dataset type, which samples randomly from a dataset: a trace sends each entry exactly once, in order.
## Setting Up the Server
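The benchmark needs an OpenAI-compatible inference endpoint to target. As one sketch (not the only option), vLLM can serve such an endpoint; the model name and port below are placeholder assumptions, not requirements of the tutorial:

```shell
# Start an OpenAI-compatible server with vLLM (assumption: vLLM is
# installed and the model below is available locally or on the Hub).
# Any OpenAI-compatible endpoint works equally well here.
vllm serve Qwen/Qwen2.5-0.5B-Instruct --port 8000
# The server then accepts requests at http://localhost:8000/v1
```

Leave this running in a separate terminal while you run the benchmark.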
## Running the Benchmark
Create an input file with the specific text inputs you want to send:
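A minimal sketch of such a file, written as JSON Lines with one `text_input` per entry. The prompts themselves are placeholders, and the `mooncake_trace` schema may accept additional fields (e.g. timestamps) not shown here:

```shell
# Write three trace entries to prompts.jsonl; each line is one JSON
# object and will become exactly one request, sent in this order.
cat > prompts.jsonl <<'EOF'
{"text_input": "What is the capital of France?"}
{"text_input": "Explain TCP slow start in two sentences."}
{"text_input": "Write a haiku about GPUs."}
EOF

wc -l < prompts.jsonl   # three entries, so three requests
```

The benchmark is then pointed at this file. Exact flag names vary by tool and version, so treat an invocation along the lines of `--input-file prompts.jsonl --custom-dataset-type mooncake_trace` as an assumption to verify against your tool's `--help` output.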
**Sample Output (Successful Run):**

**Key Points:**
- Each line in the JSONL file becomes exactly one request
- Requests are sent in the order they appear in the file
- The `text_input` is sent exactly as specified
## Use Cases

**Perfect for:**
- Regression testing (detecting performance changes)
- A/B testing different model configurations
- Debugging specific prompt performance
- Production workload replay
**Not ideal for:**
- Load testing with varied request patterns (use `random_pool` instead)
- Scalability testing requiring many unique requests