
Copyright © 2026, NVIDIA Corporation.


Random Number Generation & Reproducibility


Quick Links:
Overview • What Is Reproducible • User Guide • Developer Guide • Reference


Overview

TL;DR: Use --random-seed 42 to get identical dataset content across runs. Performance metrics and worker assignment still vary because AIPerf runs as a distributed system.

AIPerf provides deterministic reproducibility for all seed-controlled randomness using hash-based RNG derivation. This enables reproducible dataset generation while maintaining realistic load testing performance.

Default behavior: Without --random-seed, AIPerf produces non-deterministic results. Set --random-seed <integer> for reproducibility.

Distributed System Constraints: Even with --random-seed, performance metrics and worker assignment are NOT reproducible due to system non-determinism (network timing, async I/O, ZMQ load balancing).

Reproducible (with --random-seed):

  • ✅ Dataset content (prompts, images, audio)
  • ✅ Dataset sampling order (random/shuffle strategies)
  • ✅ Request timing intervals (Poisson values)
  • ✅ Model selection (random strategy)
  • ✅ Session IDs (session_000000, session_000001, …)

NOT Reproducible (system-dependent):

  • ❌ Worker assignment / request execution order
  • ❌ Performance metrics (TTFT, ITL, throughput)
  • ❌ Server responses / absolute timestamps

Testing: Reproducibility is enforced by integration canary tests and CI/CD validation on every commit. See Testing & Validation.

What Is Reproducible, What Is Not

Key Principle: Seeds control WHAT you ask, not WHEN it completes or WHAT the server answers.

✅ Reproducible with --random-seed

Dataset: Prompt text/tokens, image dimensions/formats, audio duration/formats, session IDs
Sampling: Random selection, shuffle order, conversation selection
Timing Decisions: Poisson interval values, cancellation decisions

❌ NOT Reproducible

Worker/Execution: Which worker handles which request, request start/completion order, async I/O timing
Performance: TTFT, ITL, latency, throughput
System: Timestamps, process IDs, request IDs (ZMQ routing)
Server: LLM output text, output token counts, errors/failures

Why This Architecture?

AIPerf achieves its high throughput through parallel workers, ZMQ load balancing, and async I/O. Full determinism would require single-worker synchronous execution, destroying performance.

How It Works

Phase 1 (Startup - PROFILE_CONFIGURE):

  • DatasetManager pre-generates the complete dataset using derived RNGs and stores it in memory
  • TimingManager creates the credit-issuing strategy with an RNG-based interval generator
  • Workers set the global seed (a defensive measure) but don’t derive or use RNGs

Phase 2 (Runtime - PROFILE_START):

  • TimingManager generates intervals on-the-fly using RNG, sleeps, then drops credits
  • Workers receive credits via ZMQ load balancing
  • Workers request conversations from DatasetManager’s pre-generated pool
  • DatasetManager returns conversations (using sampler RNG or specific ID)
  • Workers send API requests with pre-generated content
  • Result: Same dataset and interval values, but actual timing/worker assignment vary per run

Analogy: Like a deterministic deck of cards (same 52 cards, same shuffle) dealt to players who play at different speeds. The deck is reproducible; card distribution to players varies based on who finishes hands first.
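
The two phases can be illustrated with a small stand-alone sketch. It does not use AIPerf's classes: `build_plan`, `assign_workers`, the toy vocabulary, and the worker names are hypothetical, and Python's `random.Random` stands in for the derived RNGs. The seeded plan (content and interval values) is identical across calls, while the unseeded worker assignment is free to differ.

```python
import random


def build_plan(seed: int, num_requests: int = 5, rate: float = 10.0):
    """Phase 1 stand-in: pre-generate content and interval values from a seed."""
    # Hypothetical stream names; string seeds are hashed deterministically.
    content_rng = random.Random(f"{seed}:content")
    interval_rng = random.Random(f"{seed}:interval")
    vocab = ["alpha", "beta", "gamma", "delta"]
    prompts = [
        " ".join(content_rng.choice(vocab) for _ in range(4))
        for _ in range(num_requests)
    ]
    # Exponential inter-arrival times describe a Poisson arrival process.
    intervals = [interval_rng.expovariate(rate) for _ in range(num_requests)]
    return prompts, intervals


def assign_workers(num_requests: int, workers=("worker-0", "worker-1", "worker-2")):
    """Phase 2 stand-in: assignment is deliberately NOT seeded."""
    system_rng = random.SystemRandom()  # models load-balancer non-determinism
    return [system_rng.choice(workers) for _ in range(num_requests)]


plan_a = build_plan(42)
plan_b = build_plan(42)
print(plan_a == plan_b)   # same seed, so prompts and interval values match
print(assign_workers(5))  # may differ on every run
```

In this sketch the "deck" is the tuple returned by `build_plan`, and `assign_workers` plays the role of players finishing hands at different speeds.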

Testing & Validation

Reproducibility is enforced by automated tests on every commit:

  • test_random_generator_canary.py: Compares payloads against reference snapshots to detect regressions
  • test_deterministic_behavior.py: Verifies byte-for-byte identical outputs with same seed, different outputs with different seeds, tested with 5+ parallel workers
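
As a rough illustration of what such a determinism check can look like, here is a minimal sketch. It is not the content of the test files named above: `generate_payloads` is a hypothetical stand-in for a seeded dataset generator, and the payload fields are invented for the example.

```python
import hashlib
import json
import random


def generate_payloads(seed: int, count: int = 8) -> list[dict]:
    """Hypothetical stand-in for a seeded dataset generator."""
    rng = random.Random(seed)
    return [
        {"session_id": f"session_{i:06d}", "prompt_tokens": rng.randint(16, 512)}
        for i in range(count)
    ]


def digest(payloads: list[dict]) -> str:
    """Serialize canonically and hash, so the comparison is byte-for-byte."""
    blob = json.dumps(payloads, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def test_same_seed_is_identical():
    assert digest(generate_payloads(42)) == digest(generate_payloads(42))


def test_different_seed_differs():
    assert digest(generate_payloads(42)) != digest(generate_payloads(43))


# Run directly here; a test runner such as pytest would collect these by name.
test_same_seed_is_identical()
test_different_seed_differs()
```

A canary-style test would go one step further and compare the digest against a reference value stored in the repository, so that any change to the generated payloads shows up as a failing comparison.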

User Guide

Basic Usage

# Reproducible dataset
aiperf --random-seed 42 [options...]

# Non-reproducible (default)
aiperf [options...]

Same seed + same config = identical dataset content. Performance metrics always vary.

Use Cases

Debugging: Reproduce the exact prompts across runs to separate prompt-related issues from network or timing issues

aiperf --random-seed 42 [...] --profile-export-file run1.json
aiperf --random-seed 42 [...] --profile-export-file run2.json
# Prompts identical; metrics may vary

Performance Testing: Compare metrics with same dataset

aiperf --random-seed 42 [...] --profile-export-file baseline.json
# After optimization...
aiperf --random-seed 42 [...] --profile-export-file optimized.json
# Use statistical analysis (median, p95, p99)
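
Because metrics vary between runs even with the same seed, the comparison is best made on summary statistics rather than individual requests. The sketch below shows one way to compute median, p95, and p99 with the nearest-rank method. The latency lists are made-up illustrative values, and loading latencies from the export files is left out because the export schema is not described in this guide.

```python
import math
import statistics


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Collect the three statistics suggested above for one run."""
    return {
        "median": statistics.median(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }


# Made-up illustrative values, not real measurements.
baseline_ms = [120.0, 125.0, 118.0, 131.0, 122.0, 240.0, 119.0, 127.0, 123.0, 121.0]
optimized_ms = [101.0, 99.0, 104.0, 98.0, 102.0, 180.0, 100.0, 103.0, 97.0, 105.0]

for name, data in (("baseline", baseline_ms), ("optimized", optimized_ms)):
    print(name, summarize(data))
```

With only a handful of samples, p95 and p99 both collapse to the largest value, so tail percentiles become meaningful only with many more requests per run.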

Stress Testing: Vary patterns by omitting the seed

for i in {1..10}; do
    aiperf [...] --profile-export-file run_$i.json
done

Developer Guide

System Architecture

Where RNGs Are Used:

  • DatasetManager: Pre-generates all dataset content at startup using derived RNGs
  • TimingManager: Generates Poisson timing intervals and cancellation decisions
  • Workers: Set global seed (defensive) but do NOT derive RNGs—they only execute API requests with pre-generated content

Process Flow:

  1. bootstrap.py initializes RNG with rng.init(seed) in each process
    • Sets Python’s random.seed() and NumPy’s np.random.seed() globally (defensive measure)
    • Protects against third-party code inadvertently using global random state
  2. DatasetManager creates generators (PromptGenerator, ImageGenerator, etc.) that derive RNGs in __init__
  3. TimingManager creates interval generator that derives RNG in __init__
  4. Workers initialize global seed but don’t derive any RNGs (they only execute API requests)
  5. All dataset content is generated before any requests are sent
  6. Workers pull from pre-generated pool at runtime

How to Use RNGs in Your Code

Workers do NOT use RNGs. Only use RNGs in DatasetManager (content generation) or TimingManager (request timing) components.

from aiperf.common import random_generator as rng


class MyGenerator:
    def __init__(self, config):
        # Derive once in __init__ with unique identifier
        self._rng = rng.derive("dataset.mycomponent.feature")

    def generate(self):
        # Use stored RNG instance
        return self._rng.choice([1, 2, 3, 4, 5])

Rules:

  1. Derive in __init__, not in methods (or you’ll get the same first value every call)
  2. Store as instance variable
  3. Use unique dotted identifier: <module>.<component>.<aspect>
  4. Never use Python’s random module (technically seeded, but fragile—any code using it affects your sequence)

Hash-Based Seed Derivation

Uses SHA-256 to derive independent seeds: SHA-256(root_seed:identifier) → child seed

Benefits:

  • Deterministic: Same identifier always gets same seed
  • Independent: Changing one RNG doesn’t affect others
  • Fast: ~1-2 microseconds per derivation (happens once at init)
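
The derivation rule above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not AIPerf's implementation: `derive_seed` and `derive_rng` are hypothetical names, the digest is truncated to 64 bits as an arbitrary choice, and `random.Random` stands in for the `RandomGenerator` class.

```python
import hashlib
import random


def derive_seed(root_seed: int, identifier: str) -> int:
    """Hash "root_seed:identifier" with SHA-256 and fold the digest into an integer."""
    digest = hashlib.sha256(f"{root_seed}:{identifier}".encode("utf-8")).digest()
    # Assumption: keep the first 8 bytes as a 64-bit seed; AIPerf may use another width.
    return int.from_bytes(digest[:8], "big")


def derive_rng(root_seed: int, identifier: str) -> random.Random:
    """Return an independent generator for one component (stand-in for rng.derive)."""
    return random.Random(derive_seed(root_seed, identifier))


length_rng = derive_rng(42, "dataset.prompt.length")
corpus_rng = derive_rng(42, "dataset.prompt.corpus")

# Drawing extra values from one stream does not disturb the other.
_ = [length_rng.random() for _ in range(100)]
print(corpus_rng.random() == derive_rng(42, "dataset.prompt.corpus").random())
```

Because each child seed depends only on the root seed and its own identifier, adding or removing another component leaves every existing stream unchanged.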

Common Mistakes

❌ Deriving in methods → Returns same first value every call.
✅ Derive in __init__.

❌ Using Python’s random → Fragile (global state affected by any code).
✅ Use rng.derive().

❌ Adding operations to existing RNG → Shifts all subsequent values.
✅ Derive new RNG for new feature.
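
The first mistake is easy to reproduce in isolation. In this sketch `derive` is a hypothetical stand-in that mimics the documented behavior (the same identifier always yields the same seed); the class names are invented for the example.

```python
import hashlib
import random

ROOT_SEED = 42  # hypothetical root seed for this sketch


def derive(identifier: str) -> random.Random:
    """Stand-in for rng.derive(): the same identifier always yields the same stream."""
    digest = hashlib.sha256(f"{ROOT_SEED}:{identifier}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


class DerivesInMethod:
    """Mistake: a fresh RNG per call restarts the stream every time."""

    def generate(self) -> int:
        return derive("dataset.mycomponent.feature").randint(0, 10**9)


class DerivesInInit:
    """Correct: derive once, then keep advancing the stored stream."""

    def __init__(self) -> None:
        self._rng = derive("dataset.mycomponent.feature")

    def generate(self) -> int:
        return self._rng.randint(0, 10**9)


bad, good = DerivesInMethod(), DerivesInInit()
print([bad.generate() for _ in range(3)])   # three identical values
print([good.generate() for _ in range(3)])  # values advance along the stream
```

Both classes return the same first value, which is why the bug can go unnoticed until a component is called more than once.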

FAQ

Q: Performance metrics still vary with same seed. Why?
A: Expected. Seeds control dataset content, not network timing or worker scheduling. See What Is Reproducible.

Q: Same seed across different configs?
A: Yes. Same seed + different config = different but reproducible results.

Q: Multiple workers—how does this work?
A: Workers set global seed (defensive) but don’t derive RNGs. DatasetManager pre-generates content, workers pull from this fixed pool. Validated with 5+ workers.

Q: Are RNGs thread-safe?
A: No, but not an issue—each process uses RNGs in its own space. If adding multi-threaded RNG usage, derive per-thread.
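
If multi-threaded RNG usage is ever added, per-thread derivation could look roughly like the sketch below. Everything here is an assumption for illustration: `derive` is a stand-in built on SHA-256, and the `thread_<index>` identifier suffix is a hypothetical naming scheme, not an existing AIPerf identifier.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def derive(identifier: str, root_seed: int = 42) -> random.Random:
    """Stand-in for rng.derive(), built on SHA-256 as described in this guide."""
    digest = hashlib.sha256(f"{root_seed}:{identifier}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def worker_task(index: int) -> list[int]:
    # Each task owns its RNG, so no generator state is shared across threads.
    # The identifier suffix is hypothetical.
    thread_rng = derive(f"dataset.mycomponent.feature.thread_{index}")
    return [thread_rng.randint(0, 999) for _ in range(4)]


def run(num_threads: int = 4) -> list[list[int]]:
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map returns results in input order, regardless of completion order.
        return list(pool.map(worker_task, range(num_threads)))


print(run() == run())  # results depend on the index, not on thread scheduling
```

Keying the identifier on a stable index rather than on the OS thread ID keeps the output reproducible even when the scheduler interleaves threads differently.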

Q: Session IDs reproducible?
A: Yes. With seed: sequential (session_000000, session_000001). Without: UUIDs.

Q: Performance impact?
A: None measurable. Network I/O dominates by 1000×.

Reference

All Component-Specific RNG Identifiers

Dataset

# Prompts (3)
"dataset.prompt.length"  # Token count distribution
"dataset.prompt.corpus"  # Content position selection
"dataset.prompt.prefix"  # Prefix selection

# Images (4)
"dataset.image.dimensions"  # Width + height (coupled for aspect ratio)
"dataset.image.format"  # PNG/JPEG/etc. selection
"dataset.image.source"  # Source image selection (assets and directory modes only)
"dataset.image.noise"  # Random-noise pixel generation (noise mode, default)

# Audio (3)
"dataset.audio.duration"  # Length distribution
"dataset.audio.format"  # Sample rate + bit depth
"dataset.audio.data"  # Audio sample generation

# Samplers (2)
"dataset.sampler.random"  # Random sampling strategy
"dataset.sampler.shuffle"  # Shuffle sampling strategy

# Loaders (2)
"dataset.loader.random_pool"  # Random pool loader
"dataset.loader.sharegpt"  # ShareGPT loader

Timing

"timing.request.cancellation"  # Cancellation decisions (probabilistic)
"timing.request.poisson_interval"  # Exponential inter-arrival times (Poisson process)

Composer

"composer.turn.model_selection"  # Model selection per turn
"composer.turn.max_tokens"  # max_tokens sampling
"composer.conversation.turn_count"  # Number of turns per conversation
"composer.conversation.turn_delay"  # Delay between turns

Models

"models.sequence.distribution"  # ISL/OSL distribution sampling

Module API

from aiperf.common import random_generator as rng

# Initialize (called automatically in bootstrap.py)
# Signature: rng.init(seed: int | None)
#   seed: any integer for deterministic behavior, None for random
#   Also sets global random.seed() and np.random.seed() defensively
rng.init(42)

# Derive component RNGs (call in __init__)
# Signature: rng.derive(identifier: str) -> RandomGenerator
#   Returns: independent RNG with SHA-256 derived seed
my_rng = rng.derive("dataset.mycomponent.feature")

# Reset (for testing only)
rng.reset()

See random_generator.py for the RandomGenerator class and full API details.