Skip to content
This repository was archived by the owner on Jul 4, 2023. It is now read-only.

Tweak plotting to collect outliers into a single bin; drop UnicodePlots dependency #13

Merged
merged 6 commits into from
May 25, 2021

Conversation

ericphanson
Copy link
Owner

@ericphanson ericphanson commented May 23, 2021

One can see this in action in the diff to the README. We choose a value Q = quantile(x, 0.999) to represent close-to-the-upper end of the data we might expect. Then, if Q is more than 2 bins away from the actual maximum, we collapse all of those bins into a single bin. The value 0.999 can be configured by BenchmarkHistograms.OUTLIER_QUANTILE. Setting it to 1 makes Q into the actual maximum, which effectively disables the collapsing. This aims to address some of the points @chriselrod brought up in https://discourse.julialang.org/t/ann-benchmarkhistograms-jl/61653/3.

This change is coupled with dropping UnicodePlots and using @brenhinkeller's code from JuliaCI/BenchmarkTools.jl#180 (comment) instead. I hope that's OK @brenhinkeller! I've tried to add you as a coauthor on the commit 2a7af86 but it seems GitHub is not recognizing it. I've also credited you in the README.

The reason for doing this switch is that while UnicodePlots works with StatsBase.Histogram which can handle arbitrary edges, it assumes these edges are a range (and accesses their step field). Thus, it can't handle having a bin that encompasses a larger range than the other bins. Moreover, if it did, I think it might handle it by making that bin really big on the plot. In this particular case, we don't really want a giant bin at the end since that would look really ugly if there was 1 outlier an order of magnitude away from the rest (which can happen!). Benchmarking yields particular types of distributions with generally additive noise (as discussed in the BenchmarkTools paper) and we can use that context here to have a more specialized plot that gives us what we need. I.e. I think the UnicodePlots behavior is right in general, but here we want something else. Thankfully @brenhinkeller already coded up a custom histogram for us to use here.

This has the added benefit of reducing the dependencies to just standard libraries (Printf and Statistics) and BenchmarkTools itself.

I've marked this as a non-breaking change because the programmatic API has not broken, even though the display has (the plots look a little different, and this outlier detection is done by default). I think there's an argument that in a plotting package any display change that's not a bugfix is kind of a breaking change, but I'm not sure we want to commit to that here. I think probably it's not a big deal either way since this package generally should not have dependents.

I'd appreciate any thoughts on this change!

@codecov-commenter
Copy link

codecov-commenter commented May 23, 2021

Codecov Report

Merging #13 (e05bf53) into main (c86970b) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##              main       #13   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            2         3    +1     
  Lines           50        85   +35     
=========================================
+ Hits            50        85   +35     
Impacted Files Coverage Δ
src/BenchmarkHistograms.jl 100.00% <100.00%> (ø)
src/simple_unicode_histogram.jl 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c86970b...e05bf53. Read the comment docs.

(11980.0 - 12430.0] ▏3
(12430.0 - 12880.0] 0
(12880.0 - 13330.0] ▏5
(13330.0 - 18330.0] ▏11
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the outlier change is especially nice here, where on my m1 macbook, I had such severe outliers that @MasonProtter had to generate the plot in order for it to look Gaussian (#6). With the new outlier bin, I can plot it myself 😄

@chriselrod
Copy link

I think this looks great. Once it is released, I'll replace using BenchmarkTools with using BenchmarkHistograms in all my startup.jls. I may have more feedback after trying it out, but for now I think the choice of quantile looks good. The extreme means it should still show the usual sources of various, like GC activity.

@chriselrod
Copy link

chriselrod commented May 25, 2021

I assume stripping the fastest times here is a bug, as it reports them as 0 despite the minimum time being reported as 263.465 ns:

julia> @benchmark axpy_minbatch_1500!($y, eps(), $x)
samples: 10000; evals/sample: 230; memory estimate: 0 bytes; allocs estimate: 0
ns

 (263.0 - 274.0]   0
 (274.0 - 285.0]   0
 (285.0 - 295.0]   0
 (295.0 - 306.0]  ▏1
 (306.0 - 316.0]  ▏1
 (316.0 - 327.0]  ▏1
 (327.0 - 337.0]  ▎65
 (337.0 - 348.0]  ██████████████████████████████▏8708
 (348.0 - 358.0]  ███869
 (358.0 - 369.0]  ▉238
 (369.0 - 379.0]  ▎41
 (379.0 - 390.0]  ▏2
 (390.0 - 400.0]  ▏7
 (400.0 - 411.0]  ▎56
 (411.0 - 489.0]  ▏11

                  Counts

min: 263.465 ns (0.00% GC); mean: 344.388 ns (0.00% GC); median: 342.974 ns (0.00% GC); max: 489.209 ns (0.00% GC).

From the Polyester.jl README.

@ericphanson
Copy link
Owner Author

Good catch! Should be fixed now. Funnily enough, I think the min time was getting put into the bin for the slowest times, which is pretty much the worst place to put it aha.

@ericphanson
Copy link
Owner Author

Decided the change of default (now we bin outliers specially) should indeed count as a breaking change, so I bumped the version to 0.2, and will merge and tag when CI finishes.

@ericphanson ericphanson merged commit 1089a25 into main May 25, 2021
@ericphanson ericphanson deleted the eph/outliers branch May 25, 2021 10:55
@brenhinkeller
Copy link
Contributor

Oh yeah, totally cool to use that code -- glad it was helpful!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants