These benchmarks are not very reliable: the compiler may be optimizing away large pieces of the code under observation, and the way the benchmarks are performed leaves room for measurement noise.
Using criterion should provide us with better statistics.
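
To illustrate the two problems, here is a minimal standard-library sketch, not the benchmark code from this change. `sum_of_squares`, `measure`, and `median` are hypothetical names chosen for the example. It uses `std::hint::black_box` (a best-effort hint, not a guarantee) to keep the optimizer from discarding the measured work, and takes the median of repeated samples to dampen noise. criterion wraps the same ideas in a proper harness with warm-up and statistical analysis.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical workload standing in for the code under measurement.
fn sum_of_squares(n: u64) -> u64 {
    (0..n).map(|x| x * x).sum()
}

/// Runs `f` `samples` times and returns one wall-clock duration per run.
fn measure<F: FnMut() -> u64>(samples: usize, mut f: F) -> Vec<Duration> {
    let mut timings = Vec::with_capacity(samples);
    for _ in 0..samples {
        let start = Instant::now();
        // Passing the result through black_box asks the optimizer to treat
        // it as used, so the call is less likely to be eliminated.
        black_box(f());
        timings.push(start.elapsed());
    }
    timings
}

/// Median of the samples, which is less sensitive to outliers than the mean.
/// Panics if `timings` is empty.
fn median(mut timings: Vec<Duration>) -> Duration {
    assert!(!timings.is_empty(), "need at least one sample");
    timings.sort();
    timings[timings.len() / 2]
}

fn main() {
    // black_box on the input discourages constant-folding the whole call.
    let timings = measure(101, || sum_of_squares(black_box(10_000)));
    println!("median: {:?}", median(timings));
}
```

Without the `black_box` calls, a release build is free to compute `sum_of_squares(10_000)` at compile time or drop it entirely, in which case the loop would time nothing.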