AIPerf generates synthetic datasets for benchmarking LLM inference servers. This tutorial explains how synthetic data is generated for text, images, audio, and video inputs.
Synthetic datasets enable consistent, reproducible benchmarking with full control over input characteristics. Each modality uses a specialized generator:
All generators use deterministic random sampling for reproducibility (see Reproducibility Guide).
Text prompts are generated by sampling from a pre-tokenized Shakespeare corpus:
- The assets/shakespeare.txt file is tokenized once at startup

Key Feature: Character-based chunking ensures reproducibility across machines with different CPU counts: the same random seed always produces identical prompts.
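The sampling idea above can be sketched in a few lines. This is a minimal illustration, not AIPerf's implementation: the whitespace tokenizer, the inline stand-in corpus, the wrap-around behavior, and the helper names (`sample_length`, `sample_prompt`, `generate`) are all assumptions made for the example.

```python
import random

# Stand-in corpus; AIPerf uses its bundled Shakespeare text instead.
CORPUS = (
    "To be, or not to be, that is the question: "
    "Whether 'tis nobler in the mind to suffer "
    "The slings and arrows of outrageous fortune"
)

# Assumption: whitespace splitting stands in for the model tokenizer.
TOKENS = CORPUS.split()  # tokenized once, reused for every prompt


def sample_length(rng: random.Random, mean: float, stddev: float) -> int:
    """Draw a token count from a normal distribution, clamped to at least 1."""
    return max(1, round(rng.gauss(mean, stddev)))


def sample_prompt(rng: random.Random, num_tokens: int) -> str:
    """Take num_tokens consecutive tokens from a random offset, wrapping around."""
    start = rng.randrange(len(TOKENS))
    picked = [TOKENS[(start + i) % len(TOKENS)] for i in range(num_tokens)]
    return " ".join(picked)


def generate(seed: int, count: int, mean: float, stddev: float) -> list[str]:
    """Generate count prompts; a single seeded generator drives all sampling."""
    rng = random.Random(seed)
    return [sample_prompt(rng, sample_length(rng, mean, stddev)) for _ in range(count)]
```

Calling `generate(42, 5, 12, 3)` twice returns the same five prompts, which is the property the seed is meant to guarantee.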
Options:
- --synthetic-input-tokens-mean: Mean input token count (default: 550)
- --synthetic-input-tokens-stddev: Standard deviation for input length variability (default: 0)
- --output-tokens-mean: Mean number of output tokens requested (default: None, meaning the model decides)
- --output-tokens-stddev: Standard deviation for output token length (default: 0)
- --seq-dist: Distribution of (ISL, OSL) pairs for mixed workload simulation (default: None). See Sequence Length Distributions for format details.
- --random-seed: Seed for reproducible prompt generation (default: None)

For shared-prefix benchmarking (e.g., RAG scenarios):
Each request randomly selects a 512-token prefix from a pool of 10, with a randomly sampled 100-token continuation. See Prefix Synthesis for details.
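The prefix selection described above can be sketched as follows. This is a hedged illustration only: the `VOCAB` list and the helper names are hypothetical stand-ins, and real prompts are built from the tokenized corpus rather than placeholder tokens.

```python
import random

# Hypothetical stand-in vocabulary; not part of AIPerf.
VOCAB = [f"tok{i}" for i in range(1000)]


def random_tokens(rng: random.Random, n: int) -> list[str]:
    """Return n tokens drawn uniformly from the stand-in vocabulary."""
    return [rng.choice(VOCAB) for _ in range(n)]


def build_prefix_pool(rng: random.Random, pool_size: int = 10, prefix_len: int = 512):
    """Create the fixed pool of shared prefixes once, before any requests."""
    return [random_tokens(rng, prefix_len) for _ in range(pool_size)]


def build_request(rng: random.Random, pool, continuation_len: int = 100):
    """Pick one shared prefix at random and append a fresh continuation."""
    prefix = rng.choice(pool)
    return prefix + random_tokens(rng, continuation_len)


rng = random.Random(0)
pool = build_prefix_pool(rng)
requests = [build_request(rng, pool) for _ in range(50)]
```

Every request is 612 tokens long and begins with one of only 10 possible prefixes, so a server with prefix caching can reuse work across requests while the continuation stays unique.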
Images are generated according to the configured --image-source mode. By default, AIPerf generates random-noise images at the requested dimensions, so no on-disk assets are required. The pool is effectively unbounded, so servers cannot deduplicate identical inputs.
Available modes:
- noise (default): A fresh random-noise image is generated at the requested width × height for every request. No filesystem access; pool size is unlimited.
- assets: Resizes one of the 4 bundled source images in assets/source_images/ to the requested dimensions. Payloads are smaller than with noise because natural images compress well, but the pool is only 4 images.
- <path>: Resizes images from a user-supplied directory (e.g. --image-source ./my_images). All readable files in the directory are loaded; non-image files are skipped.

After source selection, the image is converted to the configured format (PNG, JPEG, or randomly selected) and base64-encoded as a data URI for API requests.
Payload-size note: Random-noise images are roughly incompressible, so an N×N noise PNG/JPEG is substantially larger on the wire than a natural-image PNG/JPEG of the same resolution. If you need realistic payload sizes for the modality, use --image-source assets or --image-source <path> with representative images.
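The size difference is easy to reproduce with the standard library alone. The sketch below is not AIPerf's image pipeline: it hand-encodes a minimal PNG, and it uses a horizontal gradient as a stand-in for a "natural" image, which compresses far better than a real photograph would. All function names are hypothetical.

```python
import base64
import random
import struct
import zlib


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data)
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_png(width: int, height: int, rgb: bytes) -> bytes:
    """Encode raw 8-bit RGB pixels as a PNG, using filter type 0 on every row."""
    stride = width * 3
    rows = b"".join(b"\x00" + rgb[y * stride:(y + 1) * stride] for y in range(height))
    # Bit depth 8, color type 2 (RGB), default compression/filter, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(rows))
        + png_chunk(b"IEND", b"")
    )


def to_data_uri(png: bytes) -> str:
    """Base64-encode PNG bytes as a data URI."""
    return "data:image/png;base64," + base64.b64encode(png).decode("ascii")


width = height = 128
rng = random.Random(0)  # seeded so the noise image is reproducible

noise_pixels = bytes(rng.getrandbits(8) for _ in range(width * height * 3))
smooth_pixels = bytes(
    x * 255 // (width - 1) for _y in range(height) for x in range(width) for _c in range(3)
)

noise_png = encode_png(width, height, noise_pixels)
smooth_png = encode_png(width, height, smooth_pixels)
noise_uri = to_data_uri(noise_png)
```

Comparing `len(noise_png)` with `len(smooth_png)` shows the noise image staying close to its raw size while the smooth image shrinks dramatically, which is why noise inflates request payloads.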
Options:
- --image-width-mean: Mean width in pixels (default: 0)
- --image-width-stddev: Width standard deviation (default: 0)
- --image-height-mean: Mean height in pixels (default: 0)
- --image-height-stddev: Height standard deviation (default: 0)
- --image-format: png, jpeg, or random (default: png)
- --image-batch-size: Number of images per request (default: 1)
- --image-source: noise (default), assets, or a directory path

Note: Image generation requires both --image-width-mean and --image-height-mean to be > 0. Setting either to 0 disables images.
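The enable/disable rule and the per-image size sampling can be sketched as below. The normal distribution, the clamp to at least 1 pixel, and the helper name `sample_image_sizes` are assumptions for illustration, not confirmed AIPerf behavior.

```python
import random


def sample_image_sizes(
    rng: random.Random,
    width_mean: float,
    width_stddev: float,
    height_mean: float,
    height_stddev: float,
    batch_size: int = 1,
) -> list[tuple[int, int]]:
    """Return one (width, height) pair per image, or [] when images are disabled."""
    # Either mean set to 0 disables image generation entirely.
    if width_mean <= 0 or height_mean <= 0:
        return []
    sizes = []
    for _ in range(batch_size):
        # Assumption: sizes are normally distributed and clamped to >= 1 pixel.
        width = max(1, round(rng.gauss(width_mean, width_stddev)))
        height = max(1, round(rng.gauss(height_mean, height_stddev)))
        sizes.append((width, height))
    return sizes
```

With both standard deviations at 0, every image in the batch has exactly the mean dimensions; a nonzero standard deviation spreads sizes around the mean while the seed keeps the sequence repeatable.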
Audio files are generated as synthetic Gaussian noise:
- <format>,<base64data> string

Audio Characteristics:
Options:
- --audio-length-mean: Mean duration in seconds (default: 0.0)
- --audio-length-stddev: Duration standard deviation (default: 0.0)
- --audio-sample-rates: List of sample rates in kHz to randomly select from (default: [16.0])
- --audio-depths: List of bit depths (8, 16, 24, 32) to randomly select from (default: [16])
- --audio-format: wav or mp3 (default: wav)
- --audio-num-channels: 1 (mono) or 2 (stereo) (default: 1)
- --audio-batch-size: Number of audio files per request (default: 1)

Note: Set --audio-length-mean > 0 to enable audio generation. MP3 supports a limited set of sample rates; use WAV for custom rates.
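A Gaussian-noise WAV clip and its `<format>,<base64data>` encoding can be approximated with the standard library. This is a sketch under stated assumptions rather than AIPerf's generator: it handles only 16-bit PCM, the noise amplitude (standard deviation 0.3 of full scale) is an arbitrary choice, and both function names are hypothetical.

```python
import base64
import io
import random
import struct
import wave


def generate_noise_wav(
    rng: random.Random,
    length_s: float,
    sample_rate_khz: float = 16.0,
    channels: int = 1,
) -> bytes:
    """Return 16-bit PCM WAV bytes filled with clipped Gaussian noise."""
    sample_rate = int(sample_rate_khz * 1000)  # the option is given in kHz
    num_frames = int(length_s * sample_rate)
    samples = []
    for _ in range(num_frames * channels):
        # Assumption: sigma of 0.3 full scale, clipped to the int16 range.
        value = int(rng.gauss(0.0, 0.3) * 32767)
        samples.append(max(-32768, min(32767, value)))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit depth
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def to_audio_field(fmt: str, payload: bytes) -> str:
    """Join the format name and base64 payload as '<format>,<base64data>'."""
    return f"{fmt},{base64.b64encode(payload).decode('ascii')}"


clip = generate_noise_wav(random.Random(1), length_s=0.5)
field = to_audio_field("wav", clip)
```

A 0.5-second mono clip at 16 kHz contains 8000 frames, and regenerating it with the same seed yields byte-identical output.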
Video generation is fully documented in Synthetic Video Generation. Key points:
- moving_shapes (animated geometry), grid_clock (grid with animation), or noise (random pixels)
- CPU (libvpx-vp9, libx264, libx265) or GPU (h264_nvenc, hevc_nvenc)

Prerequisite: Video generation requires FFmpeg. For installation instructions, see Synthetic Video Tutorial.
See Synthetic Video Tutorial for complete details.