
Tuning is not robust to highly variable runtimes #84

@LilithHafner

Description

julia> f(::Int) = nothing
f (generic function with 1 method)

julia> f(::Float64) = sleep(.01)
f (generic function with 2 methods)

julia> using StableRNGs

julia> rng = StableRNG(0)
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000001)

julia> @be rand(rng, (1, 2.0)) f
[ Info: Loading Chairmarks ...
Benchmark: 17 samples with 1 evaluation
min    0 ns
median 42.000 ns
mean   5.206 ms (1.88 allocs: 52.706 bytes)
max    11.079 ms (4 allocs: 112 bytes)

julia> rng = StableRNG(1)
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000003)

julia> @be rand(rng, (1, 2.0)) f
[hangs for 5 minutes]

A variant that reproduces less reliably was originally reported by @mbauman here.
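
My read of what is happening (hedged; I have not traced the tuning code): if the calibration run happens to land on the fast f(::Int) path, the per-eval estimate is tens of nanoseconds, so auto-tuning can ask for a very large number of evals per sample. But each eval that draws 2.0 costs about 10 ms, so even a modest evals count blows past any reasonable budget. Rough arithmetic, with the evals count and the 50/50 split chosen purely for illustration:

# Hypothetical numbers, not Chairmarks internals:
evals = 60_000               # evals count picked from a fast-path (~40 ns) estimate
slow_fraction = 0.5          # rand(rng, (1, 2.0)) returns 2.0 about half the time
time_per_slow_eval = 0.01    # sleep(.01), in seconds
evals * slow_fraction * time_per_slow_eval   # ≈ 300 seconds for a single sample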

Proposed fix:

When reporting final results (or perhaps halfway through the runtime budget), check whether evals is actually reasonable. If it is not, re-run, or warn that auto-tuning failed and prompt the user to tune the benchmark manually.
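
A minimal sketch of what that check could look like (the function name, threshold, and return shape are made up for illustration; this is not the actual Chairmarks tuning code):

# Hypothetical post-hoc sanity check: if the observed per-sample time is far larger
# than what the chosen evals count implied, auto-tuning probably picked a bad evals.
function evals_reasonable(sample_times, evals, budget; slack = 10)
    expected = budget / length(sample_times)   # time each sample was budgeted for
    observed = maximum(sample_times)
    if observed > slack * expected
        @warn "Auto-tuning appears to have failed; consider passing evals manually" evals observed expected
        return false
    end
    return true
end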

When choosing a high number of evals, increase the number of evals run by at most a factor of 10 at a time, and make each of those trials a new sample (with new setup & teardown).
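
A rough sketch of that second idea, capped 10x growth where every trial also counts as a sample (run_sample, the budget handling, and the return shape are illustrative assumptions, not Chairmarks internals):

# Grow evals by at most 10x per trial and keep every trial as a sample, so a
# surprisingly slow trial surfaces immediately instead of being buried inside
# one enormous sample.
function tune_evals(run_sample, target_evals; budget = 0.1)
    samples = NamedTuple[]
    evals, elapsed = 1, 0.0
    while elapsed < budget
        t = run_sample(evals)               # runs `evals` evaluations with fresh setup/teardown, returns seconds
        push!(samples, (evals = evals, time = t))
        elapsed += t
        evals >= target_evals && break
        evals = min(10 * evals, target_evals)   # never jump more than 10x at once
    end
    return samples
end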

This will not cover the @be rand() < .01 if _ sleep(10) end case, but that case is nearly impossible to cover, and this will cover all reasonable cases (I hope).
