
Tuning is not robust to highly variable runtimes #84

@LilithHafner

Description

julia> f(::Int) = nothing
f (generic function with 1 method)

julia> f(::Float64) = sleep(.01)
f (generic function with 2 methods)

julia> using StableRNGs

julia> rng = StableRNG(0)
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000001)

julia> @be rand(rng, (1, 2.0)) f
[ Info: Loading Chairmarks ...
Benchmark: 17 samples with 1 evaluation
min    0 ns
median 42.000 ns
mean   5.206 ms (1.88 allocs: 52.706 bytes)
max    11.079 ms (4 allocs: 112 bytes)

julia> rng = StableRNG(1)
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000003)

julia> @be rand(rng, (1, 2.0)) f
[hangs for 5 minutes]

A variant that reproduces less reliably was originally reported by @mbauman here.
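
My read of what is happening (hedged; I have not traced the tuning code): if the calibration run happens to land on the fast f(::Int) path, the per-eval estimate is tens of nanoseconds, so auto-tuning can ask for a very large number of evals per sample. But each eval that draws 2.0 costs about 10 ms, so even a modest evals count blows past any reasonable budget. Rough arithmetic, with the evals count and the 50/50 split chosen purely for illustration:

# Hypothetical numbers, not Chairmarks internals:
evals = 60_000               # evals count picked from a fast-path (~40 ns) estimate
slow_fraction = 0.5          # rand(rng, (1, 2.0)) returns 2.0 about half the time
time_per_slow_eval = 0.01    # sleep(.01), in seconds
evals * slow_fraction * time_per_slow_eval   # ≈ 300 seconds for a single sample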

Proposed fix:

When reporting final results (or perhaps halfway through the runtime budget), check whether evals is actually reasonable. If it is not, re-run, or warn that auto-tuning failed and prompt the user to tune the benchmark manually.
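
A minimal sketch of what that check could look like (the function name, threshold, and return shape are made up for illustration; this is not the actual Chairmarks tuning code):

# Hypothetical post-hoc sanity check: if the observed per-sample time is far larger
# than what the chosen evals count implied, auto-tuning probably picked a bad evals.
function evals_reasonable(sample_times, evals, budget; slack = 10)
    expected = budget / length(sample_times)   # time each sample was budgeted for
    observed = maximum(sample_times)
    if observed > slack * expected
        @warn "Auto-tuning appears to have failed; consider passing evals manually" evals observed expected
        return false
    end
    return true
end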

When choosing a high number of evals, increase the number of evals run by at most a factor of 10 at a time, and make each of those trials a new sample (with new setup & teardown).
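
A rough sketch of that second idea, capped 10x growth where every trial also counts as a sample (run_sample, the budget handling, and the return shape are illustrative assumptions, not Chairmarks internals):

# Grow evals by at most 10x per trial and keep every trial as a sample, so a
# surprisingly slow trial surfaces immediately instead of being buried inside
# one enormous sample.
function tune_evals(run_sample, target_evals; budget = 0.1)
    samples = NamedTuple[]
    evals, elapsed = 1, 0.0
    while elapsed < budget
        t = run_sample(evals)               # runs `evals` evaluations with fresh setup/teardown, returns seconds
        push!(samples, (evals = evals, time = t))
        elapsed += t
        evals >= target_evals && break
        evals = min(10 * evals, target_evals)   # never jump more than 10x at once
    end
    return samples
end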

This will not cover the @be rand() < .01 if _ sleep(10) end case, but that case is nearly impossible to cover, and this will cover all reasonable cases (I hope).
