Seed System
FlakeMonster uses deterministic seeding to ensure reproducible test failures. Same seed, same delays, every run.
How Seeds Work
Every FlakeMonster run is governed by a single base seed — a 32-bit integer that serves as the root of all randomness. From this one number, the entire set of injected delays is derived deterministically.
The key properties of the seed system:
- Base seed: the starting point for all runs. Specified via --seed <number> or --seed auto (random). When omitted, auto is the default.
- Derived seeds: each injection point gets a unique derived seed computed from the base seed, the file path, the function name, and the statement index within that function.
- Deterministic delays: the derived seed determines the exact delay value (0–50ms by default) injected at that point.
- Reproducibility: same base seed produces the same derived seeds, the same delays, and therefore the same test outcome.
This means that when a test fails, you can take the reported seed, pass it back to FlakeMonster, and reproduce the exact same timing conditions that triggered the failure.
Seed Derivation
The derivation chain transforms one base seed into thousands of unique, deterministic delay values. Here is how it works step by step:
1. Base Seed
The base seed is the single number you provide (or FlakeMonster generates for you). For example:
$ flake-monster test --seed 12345 --cmd "npm test"
Here 12345 is the base seed. Everything else flows from it.
2. Per-Run Seed
When you run multiple iterations (e.g., --runs 10), each run gets its own derived seed so that each iteration explores a different timing pattern:
    // Run 0
    runSeed = deriveSeed(12345, "run:0") // e.g. 3892047156
    // Run 1
    runSeed = deriveSeed(12345, "run:1") // e.g. 1740283695
    // Run 2
    runSeed = deriveSeed(12345, "run:2") // e.g. 2618493027
The deriveSeed function combines the base seed with a context string using DJB2 hashing:
    export function deriveSeed(baseSeed, context) {
      return (baseSeed + hashString(context)) | 0;
    }
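The per-run derivation can be tried in isolation. The sketch below copies hashString and deriveSeed from this page and adds a hypothetical toUnsigned helper. The `>>> 0` conversion is an assumption about how seeds are displayed, since `| 0` yields a signed 32-bit value. The seed values quoted on this page are illustrative, so the printed numbers need not match them.

```javascript
// DJB2 hash, as listed under "The Hashing Functions".
function hashString(str) {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) + hash + str.charCodeAt(i)) | 0;
  }
  return hash >>> 0;
}

// Combine a seed with a context string, wrapping to 32 bits.
function deriveSeed(baseSeed, context) {
  return (baseSeed + hashString(context)) | 0;
}

// Hypothetical helper (not part of FlakeMonster): view a seed as unsigned.
function toUnsigned(seed) {
  return seed >>> 0;
}

const baseSeed = 12345;
for (let run = 0; run < 3; run++) {
  const runSeed = deriveSeed(baseSeed, `run:${run}`);
  console.log(`run ${run}: seed=${toUnsigned(runSeed)}`);
}
```

Running the script twice prints the same three seeds, which is the property the rest of the chain relies on.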
3. Per-Injection Seed
Within each run, every injection point gets its own seed derived from the run seed and a context string that encodes the file path, function name, and statement index:
    // For the first await in getUser() inside src/api.js
    injectionSeed = deriveSeed(runSeed, "src/api.js:getUser:0")
    // For the second await in getUser()
    injectionSeed = deriveSeed(runSeed, "src/api.js:getUser:1")
    // For the first await in saveOrder() inside src/checkout.js
    injectionSeed = deriveSeed(runSeed, "src/checkout.js:saveOrder:0")
This gives every injection point its own seed, even across different files and functions. Two points would share a seed only if their context strings collide under the 32-bit hash.
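As a rough illustration of how the context strings might be assembled, the sketch below builds them from a file path, function name, and statement index. The injectionContext helper is hypothetical. Only the "file:function:index" format and the deriveSeed and hashString functions come from this page.

```javascript
// DJB2 hash and seed derivation, copied from this page.
function hashString(str) {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) + hash + str.charCodeAt(i)) | 0;
  }
  return hash >>> 0;
}

function deriveSeed(baseSeed, context) {
  return (baseSeed + hashString(context)) | 0;
}

// Hypothetical helper: build the "file:function:index" context string.
function injectionContext(filePath, functionName, statementIndex) {
  return `${filePath}:${functionName}:${statementIndex}`;
}

const runSeed = deriveSeed(12345, "run:0");
const points = [
  ["src/api.js", "getUser", 0],
  ["src/api.js", "getUser", 1],
  ["src/checkout.js", "saveOrder", 0],
];

for (const [file, fn, index] of points) {
  const context = injectionContext(file, fn, index);
  console.log(context, deriveSeed(runSeed, context) >>> 0);
}
```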
4. Delay Value
The injection seed is used to create a Mulberry32 PRNG, which produces a float in the range [0, 1). This float is then scaled to the delay range:
    rng = createRng(injectionSeed)
    delay = Math.round(minDelay + rng() * (maxDelay - minDelay))
With the default range of 0–50ms, if rng() returns 0.3, the delay is:
delay = Math.round(0 + 0.3 * 50) // = Math.round(15) = 15ms
This value is embedded directly into the source code at injection time:
    // @flake-monster[jt92-se2j!] v1
    await __FlakeMonster__(15);
Full Derivation Chain
Putting it all together, here is the complete chain from base seed to injected delay:
    baseSeed = 12345
        |
        v
    runSeed = deriveSeed(12345, "run:0")
        |
        v
    injectionSeed = deriveSeed(runSeed, "src/api.js:getUser:0")
        |
        v
    rng = createRng(injectionSeed)
        |
        v
    delay = Math.round(minDelay + rng() * (maxDelay - minDelay))
        |
        v
    await __FlakeMonster__(15) // injected into source
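The chain can be followed end to end in a few lines. This is a minimal sketch that reuses hashString, deriveSeed, and createRng as listed on this page. The computeDelay wrapper and its parameter names are assumptions made for illustration, not FlakeMonster's actual API.

```javascript
function hashString(str) {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) + hash + str.charCodeAt(i)) | 0;
  }
  return hash >>> 0;
}

function deriveSeed(baseSeed, context) {
  return (baseSeed + hashString(context)) | 0;
}

// Mulberry32 PRNG: returns a function producing floats in [0, 1).
function createRng(seed) {
  return function () {
    seed |= 0;
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical wrapper: base seed -> run seed -> injection seed -> delay.
function computeDelay(baseSeed, runIndex, context, minDelay = 0, maxDelay = 50) {
  const runSeed = deriveSeed(baseSeed, `run:${runIndex}`);
  const injectionSeed = deriveSeed(runSeed, context);
  const rng = createRng(injectionSeed);
  return Math.round(minDelay + rng() * (maxDelay - minDelay));
}

const delay = computeDelay(12345, 0, "src/api.js:getUser:0");
console.log(`await __FlakeMonster__(${delay});`);
```

Calling computeDelay again with the same arguments returns the same delay, and changing only the run index or the context string selects a different point in the sequence.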
The Hashing Functions
Two algorithms power the seed system:
DJB2 Hash — converts context strings (file paths, function names) into 32-bit integers. This is the hashString function used inside deriveSeed:
    export function hashString(str) {
      let hash = 5381;
      for (let i = 0; i < str.length; i++) {
        hash = ((hash << 5) + hash + str.charCodeAt(i)) | 0;
      }
      return hash >>> 0;
    }
Mulberry32 PRNG — a fast, high-quality 32-bit PRNG used to generate the final float values for delay computation:
    export function createRng(seed) {
      return function () {
        seed |= 0;
        seed = (seed + 0x6d2b79f5) | 0;
        let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
        t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
        return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
      };
    }
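A quick way to see why the PRNG matters for reproducibility is to create two generators from the same seed and compare their output. The sketch below uses createRng as listed above. The take helper is hypothetical and only collects values for comparison.

```javascript
// Mulberry32 PRNG, copied from this page.
function createRng(seed) {
  return function () {
    seed |= 0;
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: draw the first n floats from a generator.
function take(rng, n) {
  return Array.from({ length: n }, () => rng());
}

const first = take(createRng(42), 5);
const second = take(createRng(42), 5);

// Same seed, same sequence, element by element.
console.log(first.every((value, i) => value === second[i]));
// Every value lies in [0, 1).
console.log(first.every((value) => value >= 0 && value < 1));
```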
Reproducing Failures
When FlakeMonster detects a flaky test, it reports the exact seed that triggered the failure:
    FlakeMonster v0.4.6 seed=12345 mode=medium runs=10
    Run 1/10 PASS (seed=3892047156)
    Run 2/10 PASS (seed=1740283695)
    Run 3/10 FAIL (seed=948271536)
    Run 4/10 PASS (seed=2618493027)
    ...
    -- Results --
    1 flaky test detected:
      cart > applies discount code
      Failed seed: 948271536
      Flaky rate: 10%
The seed in the FAIL line is the per-run derived seed. To reproduce the exact same failure, pass it back as the base seed with a single run:
$ flake-monster test --runs 1 --seed 948271536 --cmd "npm test"
This injects the exact same delays at the exact same locations, reproducing the timing conditions that caused the failure.
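In a CI script it can be handy to pull failing seeds out of the log automatically. The sketch below assumes the output looks exactly like the sample on this page, so the regular expression may need adjusting for real output. Both failingSeeds and reproCommand are hypothetical helpers, not part of FlakeMonster.

```javascript
// Sample output in the format shown on this page.
const sampleOutput = `FlakeMonster v0.4.6 seed=12345 mode=medium runs=10
Run 1/10 PASS (seed=3892047156)
Run 2/10 PASS (seed=1740283695)
Run 3/10 FAIL (seed=948271536)
Run 4/10 PASS (seed=2618493027)`;

// Hypothetical helper: collect the per-run seeds of all failing runs.
function failingSeeds(text) {
  const pattern = /^\s*Run \d+\/\d+ FAIL \(seed=(\d+)\)/gm;
  return [...text.matchAll(pattern)].map((match) => Number(match[1]));
}

// Hypothetical helper: build the single-run reproduction command.
function reproCommand(seed, testCommand) {
  return `flake-monster test --runs 1 --seed ${seed} --cmd "${testCommand}"`;
}

for (const seed of failingSeeds(sampleOutput)) {
  console.log(reproCommand(seed, "npm test"));
}
```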
Manual Injection for Debugging
For deeper investigation, you can inject the delays manually and leave them in place while you debug:
    # Step 1: Inject with the failing seed
    $ flake-monster inject --seed 948271536 "src/**/*.js"
    # Step 2: Run your tests; the failure will reproduce
    $ npm test
    # Step 3: Open the failing file and inspect the injected delays
    # You can see exactly which delay caused the timing issue
    # Step 4: Debug, fix the race condition, then clean up
    $ flake-monster restore
Because delays are computed at injection time and embedded as literal values (e.g., await __FlakeMonster__(15)), you can read the injected file and see every delay. You can even edit delay values by hand to narrow down which timing window triggers the bug.
Auto vs Fixed Seeds
FlakeMonster supports two seed modes, each suited to different workflows:
Auto Seeds (default)
$ flake-monster test --cmd "npm test" # equivalent to: --seed auto
When --seed auto is used (or no --seed flag is provided), FlakeMonster generates a random base seed using Math.random(). This is ideal for exploratory testing where you want to discover new failure modes across different timing patterns.
The generated seed is always printed in the output header, so you can capture it for reproduction:
FlakeMonster v0.4.6 seed=3221704130 mode=medium runs=10
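The two seed modes could be resolved along the following lines. This is only a sketch: the page states that auto seeds come from Math.random(), but the scaling to an unsigned 32-bit integer and the resolveSeed helper itself are assumptions.

```javascript
// Hypothetical helper mirroring the --seed semantics described above.
function resolveSeed(flagValue = "auto") {
  if (flagValue === "auto") {
    // Assumption: scale Math.random() to an integer in [0, 2^32).
    return Math.floor(Math.random() * 0x100000000);
  }
  const seed = Number(flagValue);
  if (!Number.isInteger(seed)) {
    throw new Error(`Invalid seed: ${flagValue}`);
  }
  // Normalize to an unsigned 32-bit integer.
  return seed >>> 0;
}

console.log(`seed=${resolveSeed()}`);             // random on every call
console.log(`seed=${resolveSeed("3221704130")}`); // always the same value
```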
Fixed Seeds
$ flake-monster test --seed 3221704130 --cmd "npm test"
A fixed seed produces the exact same derived seeds across every invocation. This is useful for:
- Reproduction — re-running a known failing seed to confirm the bug is fixed
- Deterministic CI — running the same timing patterns on every commit to catch regressions
- Bisecting — narrowing down which commit introduced a flaky test under specific timing
Tip: In CI, consider using a rotating seed strategy. For example, convert the first 8 hex digits of the git commit hash to a decimal seed so each commit gets a unique but reproducible timing pattern:
--seed $(printf "%d" "0x$(git rev-parse HEAD | cut -c1-8)")
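If the CI pipeline is driven from a Node script rather than the shell, the same conversion can be sketched in JavaScript. The seedFromCommit helper is hypothetical. It takes the first 8 hex digits of a commit hash, which always fit in 32 bits.

```javascript
// Hypothetical helper: map a commit hash to a 32-bit seed.
function seedFromCommit(commitHash) {
  const prefix = commitHash.slice(0, 8);
  if (!/^[0-9a-f]{8}$/i.test(prefix)) {
    throw new Error(`Not a hex commit hash: ${commitHash}`);
  }
  // 8 hex digits cover exactly the range 0 to 2^32 - 1.
  return parseInt(prefix, 16);
}

// Example with a made-up hash; in CI this would come from `git rev-parse HEAD`.
const seed = seedFromCommit("deadbeef0123456789abcdef0123456789abcdef");
console.log(`--seed ${seed}`);
```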
Per-Run Variation
Even with a fixed base seed, each run within a multi-run session gets different delays. This is the fundamental mechanism that lets FlakeMonster explore different timing orderings.
With --seed 12345 --runs 5, the runs look like this:
| Run | Context String | Derived Run Seed | Delays |
|---|---|---|---|
| 0 | "run:0" | 3892047156 | Different set A |
| 1 | "run:1" | 1740283695 | Different set B |
| 2 | "run:2" | 2618493027 | Different set C |
| 3 | "run:3" | 947261538 | Different set D |
| 4 | "run:4" | 3105827419 | Different set E |
Each run re-injects with its own timing profile, but the base seed is the root — you only need it to reproduce the entire sequence. Run 0 always gets the same derived seed for a given base seed, Run 1 always gets its same seed, and so on.
This is why the output reports both the base seed (in the header) and the per-run seed (next to each PASS/FAIL):
    FlakeMonster v0.4.6 seed=12345 mode=medium runs=5   <-- base seed
    Run 1/5 PASS (seed=3892047156)                      <-- per-run seed
    Run 2/5 PASS (seed=1740283695)
    Run 3/5 FAIL (seed=2618493027)                      <-- use this to reproduce
    Run 4/5 PASS (seed=947261538)
    Run 5/5 PASS (seed=3105827419)
Delay Range
The delay range controls the minimum and maximum milliseconds that can be injected at any point. These values determine how aggressively FlakeMonster perturbs your async timing.
| Flag | Default | Description |
|---|---|---|
| --min-delay <ms> | 0 | Minimum injected delay in milliseconds |
| --max-delay <ms> | 50 | Maximum injected delay in milliseconds |
The delay for each injection point is computed using the Mulberry32 PRNG seeded with the derived seed:
delay = Math.round(minDelay + rng() * (maxDelay - minDelay))
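To see how the formula behaves at the edges of the range, the sketch below applies it to a few hand-picked floats instead of PRNG output. The scaleDelay name is hypothetical; the arithmetic is the formula above.

```javascript
// Scale a float u in [0, 1) to an integer delay in [minDelay, maxDelay].
function scaleDelay(u, minDelay = 0, maxDelay = 50) {
  return Math.round(minDelay + u * (maxDelay - minDelay));
}

console.log(scaleDelay(0.3));            // 15
console.log(scaleDelay(0, 5, 50));       // 5
console.log(scaleDelay(0.999, 10, 200)); // 200
```

Because of Math.round, both endpoints are reachable, but each is hit about half as often as an interior value. With the default range, a delay of 0 requires a float below 0.01, while a delay of 1 covers floats from 0.01 up to 0.03.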
Choosing a Range
Default (0–50ms) — good for most projects. Introduces enough timing variation to surface common race conditions without making tests painfully slow:
$ flake-monster test --cmd "npm test" # delays range from 0ms to 50ms
Narrow range (0–10ms) — for projects where tests are sensitive to even small timing changes, or when you want faster runs:
$ flake-monster test --max-delay 10 --cmd "npm test" # delays range from 0ms to 10ms — subtle but fast
Wide range (10–200ms) — for stress testing. Amplifies timing differences to catch race conditions that only appear under heavy load or slow networks:
$ flake-monster test --min-delay 10 --max-delay 200 --cmd "npm test" # delays range from 10ms to 200ms — aggressive stress test
Non-zero minimum (5–50ms) — ensures every injection point gets at least some delay. Useful when you want to guarantee that no async operation resolves instantly:
$ flake-monster test --min-delay 5 --cmd "npm test" # every injection point gets at least 5ms of delay
Warning: Very high max-delay values (500ms+) will significantly slow down your test suite. With 100 injection points and --max-delay 500, a single run could add up to 50 seconds of cumulative delay. Use wide ranges sparingly or reduce --runs to compensate.
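The arithmetic behind that warning can be wrapped in a small estimator. This is a back-of-the-envelope sketch with a hypothetical estimateAddedDelay helper. It assumes every injection point executes exactly once per run and that the delays add up sequentially, which real test suites may not satisfy.

```javascript
// Hypothetical estimator for the delay added to a single run.
function estimateAddedDelay(injectionPoints, minDelay, maxDelay) {
  return {
    // Every point draws the maximum delay.
    worstCaseMs: injectionPoints * maxDelay,
    // Mean of a uniformly drawn, rounded delay is the midpoint of the range.
    expectedMs: (injectionPoints * (minDelay + maxDelay)) / 2,
  };
}

const { worstCaseMs, expectedMs } = estimateAddedDelay(100, 0, 500);
console.log(`worst case: ${worstCaseMs / 1000}s`); // worst case: 50s
console.log(`expected:   ${expectedMs / 1000}s`);  // expected:   25s
```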
Impact on Flake Detection
The delay range directly affects what kinds of flaky tests FlakeMonster can detect:
- Small range (0–10ms) — catches tight race conditions where operations compete for the same event loop tick
- Medium range (0–50ms) — catches most timing-dependent bugs including those involving debouncing, throttling, and typical async patterns
- Large range (10–200ms+) — catches ordering issues that only manifest when operations take noticeably different amounts of time, simulating network latency or slow I/O
If your initial run with defaults finds no flakes, consider increasing the range before concluding your tests are stable. Some race conditions only appear under wider timing differentials.