Random Number Generation & Reproducibility
TL;DR: Use --random-seed 42 to get identical dataset content across runs. Performance metrics and worker assignment vary due to distributed system architecture.
AIPerf provides deterministic reproducibility for all seed-controlled randomness using hash-based RNG derivation. This enables reproducible dataset generation while maintaining realistic load testing performance.
Default behavior: Without --random-seed, AIPerf produces non-deterministic results. Set --random-seed <integer> for reproducibility.
Distributed System Constraints: Even with --random-seed, performance metrics and worker assignment are NOT reproducible due to system non-determinism (network timing, async I/O, ZMQ load balancing).
Reproducible (with --random-seed):
Session IDs (session_000000, session_000001, …)
NOT Reproducible (system-dependent):
Testing: Reproducibility is enforced by integration canary tests and CI/CD validation on every commit. See Testing & Validation.
Key Principle: Seeds control WHAT you ask, not WHEN it completes or WHAT the server answers.
Dataset: Prompt text/tokens, image dimensions/formats, audio duration/formats, session IDs
Sampling: Random selection, shuffle order, conversation selection
Timing Decisions: Poisson interval values, cancellation decisions
Worker/Execution: Which worker handles which request, request start/completion order, async I/O timing
Performance: TTFT, ITL, latency, throughput
System: Timestamps, process IDs, request IDs (ZMQ routing)
Server: LLM output text, output token counts, errors/failures
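To illustrate the "Timing Decisions" entry above, here is a minimal sketch of seeded Poisson intervals. It is not AIPerf's implementation: the function name `poisson_intervals` and its parameters are hypothetical, and the standard library's `random.Random` stands in for the project's own generator. It shows only that the planned gaps between requests repeat for a given seed; the moment each request actually completes still depends on the system.

```python
import random


def poisson_intervals(seed, rate_per_s, count):
    """Return `count` planned gaps (in seconds) between consecutive requests."""
    # Hypothetical helper: a private generator, so no global state is touched.
    rng = random.Random(seed)
    # Inter-arrival gaps of a Poisson process are exponentially distributed.
    return [rng.expovariate(rate_per_s) for _ in range(count)]


run_1 = poisson_intervals(seed=42, rate_per_s=10.0, count=5)
run_2 = poisson_intervals(seed=42, rate_per_s=10.0, count=5)
other_seed = poisson_intervals(seed=43, rate_per_s=10.0, count=5)
```

Here `run_1` and `run_2` hold the same five values, while `other_seed` holds a different schedule.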
AIPerf achieves its high throughput through parallel workers, ZMQ load balancing, and async I/O. Full determinism would require single-worker synchronous execution, destroying performance.
Phase 1 (Startup - PROFILE_CONFIGURE):
Phase 2 (Runtime - PROFILE_START):
Analogy: Like a deterministic deck of cards (same 52 cards, same shuffle) dealt to players who play at different speeds. The deck is reproducible; card distribution to players varies based on who finishes hands first.
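The card-deck analogy can be sketched in a few lines. This is an illustration under stated assumptions, not AIPerf code: `build_dataset`, `worker`, and `run` are hypothetical names, an `asyncio.Queue` stands in for ZMQ load balancing, and a random sleep stands in for network latency. The dataset is built once from the seed, then workers drain it in whatever order their simulated latency allows.

```python
import asyncio
import random


def build_dataset(seed, size):
    # Startup phase: content is generated up front from a seeded generator.
    rng = random.Random(seed)
    return [f"prompt-{rng.randrange(10**6):06d}" for _ in range(size)]


async def worker(name, queue, completed):
    # Runtime phase: workers pull from the fixed pool; timing is not seeded.
    while True:
        try:
            prompt = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        await asyncio.sleep(random.random() / 1000)  # stand-in for latency
        completed.append((name, prompt))


async def run(seed, workers=3):
    dataset = build_dataset(seed, size=12)
    queue = asyncio.Queue()
    for prompt in dataset:
        queue.put_nowait(prompt)
    completed = []
    await asyncio.gather(
        *(worker(f"w{i}", queue, completed) for i in range(workers))
    )
    return dataset, completed


dataset_a, completed_a = asyncio.run(run(42))
dataset_b, completed_b = asyncio.run(run(42))
```

Both runs produce the same dataset, and every prompt is handled exactly once, but which worker handles which prompt and the completion order may differ between `completed_a` and `completed_b`.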
Reproducibility is enforced by automated tests on every commit:
Same seed + same config = identical dataset content. Performance metrics always vary.
Debugging: Reproduce exact prompts across runs to isolate prompt-related vs. network/timing issues
Performance Testing: Compare metrics with same dataset
Stress Testing: Vary patterns by omitting seed
Where RNGs Are Used:
Process Flow:
bootstrap.py initializes RNG with rng.init(seed) in each process
Also seeds Python's random.seed() and NumPy's np.random.seed() globally (defensive measure)
Components that need randomness derive their RNGs in __init__
Workers do NOT use RNGs. Only use RNGs in DatasetManager (content generation) or TimingManager (request timing) components.
Rules:
Derive RNGs in __init__, not in methods (or you'll get the same first value every call)
Name each identifier as <module>.<component>.<aspect>
Avoid Python's global random module (technically seeded, but fragile: any code using it affects your sequence)
RNG derivation uses SHA-256 to derive independent seeds: SHA-256(root_seed:identifier) → child seed
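The derivation rule SHA-256(root_seed:identifier) → child seed can be sketched as follows. This is a minimal illustration, not the RandomGenerator class itself: the names `derive_seed` and `derive_rng`, the 8-byte truncation, and the sample identifiers are assumptions made for the example. A private `random.Random` instance is used as the generator, which is separate from the global `random` module state.

```python
import hashlib
import random


def derive_seed(root_seed, identifier):
    """Hash "root_seed:identifier" with SHA-256 and return a 64-bit child seed."""
    digest = hashlib.sha256(f"{root_seed}:{identifier}".encode("utf-8")).digest()
    # Assumption: keep the first 8 bytes; the real truncation may differ.
    return int.from_bytes(digest[:8], "big")


def derive_rng(root_seed, identifier):
    """Return an independent generator for one named source of randomness."""
    return random.Random(derive_seed(root_seed, identifier))


# Hypothetical identifiers following the <module>.<component>.<aspect> pattern.
prompt_rng = derive_rng(42, "dataset.prompt.length")
image_rng = derive_rng(42, "dataset.image.dimensions")
```

Because each child seed depends only on the root seed and the identifier, one component's stream does not depend on the order in which other components are created or how many values they draw.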
Benefits:
❌ Deriving in methods → Returns same first value every call.
✅ Derive in __init__.
❌ Using Python’s random → Fragile (global state affected by any code).
✅ Use rng.derive().
❌ Adding operations to existing RNG → Shifts all subsequent values.
✅ Derive new RNG for new feature.
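The first and third pitfalls above can be reproduced with a small sketch. The `derive_rng` helper is a hypothetical stand-in for rng.derive() (its real signature is not shown here), and the class names and identifiers are invented for the example.

```python
import hashlib
import random


def derive_rng(root_seed, identifier):
    # Hypothetical stand-in for rng.derive(): SHA-256 of "root_seed:identifier".
    digest = hashlib.sha256(f"{root_seed}:{identifier}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


class BadSampler:
    def __init__(self, root_seed):
        self.root_seed = root_seed

    def next_length(self):
        # Pitfall: a fresh generator on every call restarts the sequence.
        rng = derive_rng(self.root_seed, "dataset.prompt.length")
        return rng.randint(1, 10**6)


class GoodSampler:
    def __init__(self, root_seed):
        # Derive once; each call then advances the same stream.
        self._length_rng = derive_rng(root_seed, "dataset.prompt.length")
        # A new feature gets its own stream instead of sharing the one above.
        self._jitter_rng = derive_rng(root_seed, "dataset.prompt.jitter")

    def next_length(self):
        return self._length_rng.randint(1, 10**6)

    def next_jitter(self):
        return self._jitter_rng.random()


bad = BadSampler(42)
bad_values = [bad.next_length() for _ in range(3)]

good = GoodSampler(42)
good_values = [good.next_length() for _ in range(3)]

# Drawing from the jitter stream does not shift the length stream.
other = GoodSampler(42)
other.next_jitter()
other_values = [other.next_length() for _ in range(3)]
```

`bad_values` repeats a single number, while `good_values` advances through the stream, and `other_values` matches `good_values` even though a jitter value was drawn first.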
Q: Performance metrics still vary with same seed. Why?
A: Expected. Seeds control dataset content, not network timing or worker scheduling. See What Is Reproducible.
Q: Can I reuse the same seed across different configs?
A: Yes. Same seed + different config = different but reproducible results.
Q: Multiple workers—how does this work?
A: Workers set the global seed (defensive) but don't derive RNGs. DatasetManager pre-generates content, and workers pull from this fixed pool. Validated with 5+ workers.
Q: Are RNGs thread-safe?
A: No, but not an issue—each process uses RNGs in its own space. If adding multi-threaded RNG usage, derive per-thread.
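If multi-threaded RNG usage were added, per-thread derivation might look like the sketch below. Everything here is hypothetical: `derive_rng` stands in for rng.derive(), and the identifier scheme `example.sampler.thread_<n>` is invented for the illustration. Each thread owns its generator, so no generator is shared and results do not depend on thread scheduling.

```python
import hashlib
import random
import threading


def derive_rng(root_seed, identifier):
    # Hypothetical stand-in for rng.derive(): SHA-256 of "root_seed:identifier".
    digest = hashlib.sha256(f"{root_seed}:{identifier}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def sample(root_seed, thread_index, out):
    # Each thread derives its own generator from a thread-specific identifier.
    rng = derive_rng(root_seed, f"example.sampler.thread_{thread_index}")
    out[thread_index] = [rng.random() for _ in range(3)]


def run(root_seed, n_threads=4):
    out = {}
    threads = [
        threading.Thread(target=sample, args=(root_seed, i, out))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


first = run(42)
second = run(42)
```

The two runs yield identical per-thread values regardless of which thread is scheduled first.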
Q: Session IDs reproducible?
A: Yes. With seed: sequential (session_000000, session_000001). Without: UUIDs.
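The two session ID modes can be sketched as follows. The factory name `make_session_id_factory` is hypothetical; only the `session_000000` format and the UUID fallback come from the answer above.

```python
import uuid
from itertools import count


def make_session_id_factory(seed):
    """Return a zero-argument callable that produces session IDs."""
    if seed is None:
        # No seed: random UUIDs, different on every run.
        return lambda: str(uuid.uuid4())
    # Seeded: a zero-padded counter, identical on every run.
    counter = count()
    return lambda: f"session_{next(counter):06d}"


seeded = make_session_id_factory(42)
seeded_ids = [seeded(), seeded()]

unseeded = make_session_id_factory(None)
unseeded_ids = [unseeded(), unseeded()]
```

In this sketch the seed value only selects the mode; the sequential IDs themselves do not depend on which integer is passed.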
Q: Performance impact?
A: None measurable. Network I/O dominates by 1000×.
Dataset
Timing
Composer
Models
See random_generator.py for the RandomGenerator class and full API details.