|
| 1 | +--- |
| 2 | +title: "SolverBenchmark.jl tutorial" |
| 3 | +tags: ["solver", "benchmark", "profile", "latex"] |
| 4 | +author: "Abel S. Siqueira and Dominique Orban" |
| 5 | +--- |
| 6 | + |
| 7 | +In this tutorial we illustrate the main uses of `SolverBenchmark`. |
| 8 | + |
| 9 | +First, let's create fake data. The data for each solver must be stored
| 10 | +in a `DataFrame`, and the collection of solvers must be stored in a dictionary
| 11 | +mapping `Symbol` to `DataFrame`.
| 12 | + |
| 13 | +In our examples we'll use the following data. |
| 14 | + |
| 15 | +```julia |
| 16 | +using DataFrames, Printf, Random |
| 17 | + |
| 18 | +Random.seed!(0) |
| 19 | + |
| 20 | +n = 10 |
| 21 | +names = [:alpha, :beta, :gamma] |
| 22 | +stats = Dict(name => DataFrame(:id => 1:n, |
| 23 | + :name => [@sprintf("prob%03d", i) for i = 1:n], |
| 24 | + :status => map(x -> x < 0.75 ? :first_order : :failure, rand(n)), |
| 25 | + :f => randn(n), |
| 26 | + :t => 1e-3 .+ rand(n) * 1000, |
| 27 | + :iter => rand(10:10:100, n), |
| 28 | + :irrelevant => randn(n)) for name in names) |
| 29 | +``` |
| 30 | + |
| 31 | +The data consists of a (fake) run of three solvers `alpha`, `beta` and `gamma`. |
| 32 | +Each solver's table has a column `id`, which is necessary for joining the solvers (problem names
| 33 | +may be repeated), and columns `name`, `status`, `f`, `t` and `iter` corresponding to
| 34 | +problem results. There is also a column `irrelevant` with extra information that will |
| 35 | +not be used to produce our benchmarks. |
| 36 | + |
| 37 | +Here are the statistics of solver `alpha`: |
| 38 | + |
| 39 | +```julia |
| 40 | +stats[:alpha] |
| 41 | +``` |
| 42 | + |
| 43 | +## Tables |
| 44 | + |
| 45 | +The first thing we may want to do is produce a table for each solver. Notice that the |
| 46 | +solver result is already a DataFrame, so there are a few options available in other |
| 47 | +packages, as well as simply printing the DataFrame. |
| 48 | +Our concern here is two-fold: producing publication-ready LaTeX tables, and web-ready |
| 49 | +markdown tables. |
| 50 | + |
| 51 | +The simplest use is `pretty_stats(io, dataframe)`. |
| 52 | +By default, `io` is `stdout`: |
| 53 | + |
| 54 | +```julia |
| 55 | +using SolverBenchmark |
| 56 | + |
| 57 | +pretty_stats(stats[:alpha]) |
| 58 | +``` |
| 59 | + |
| 60 | +Printing in LaTeX format is achieved with `pretty_latex_stats`:
| 61 | + |
| 62 | +```julia |
| 63 | +pretty_latex_stats(stats[:alpha]) |
| 64 | +``` |
| 65 | + |
| 66 | +Alternatively, you can print to a file. |
| 67 | + |
| 68 | +```julia |
| 69 | +open("alpha.tex", "w") do io |
| 70 | + println(io, "\\documentclass[varwidth=20cm,crop=true]{standalone}") |
| 71 | + println(io, "\\usepackage{longtable}[=v4.13]") |
| 72 | + println(io, "\\begin{document}") |
| 73 | + pretty_latex_stats(io, stats[:alpha]) |
| 74 | + println(io, "\\end{document}") |
| 75 | +end |
| 76 | +``` |
| 77 | + |
| 78 | +```julia |
| 79 | +run(`latexmk -quiet -pdf alpha.tex`) |
| 80 | +run(`pdf2svg alpha.pdf alpha.svg`) |
| 81 | +``` |
| 82 | + |
| 83 | +If only a subset of columns should be printed, the DataFrame should be indexed accordingly: |
| 84 | + |
| 85 | +```julia |
| 86 | +df = stats[:alpha] |
| 87 | +pretty_stats(df[!, [:name, :f, :t]]) |
| 88 | +``` |
| 89 | + |
| 90 | +Markdown tables may be generated by supplying the PrettyTables `tf` keyword argument to specify the table format: |
| 91 | + |
| 92 | +```julia |
| 93 | +pretty_stats(df[!, [:name, :f, :t]], tf=tf_markdown) |
| 94 | +``` |
| 95 | + |
| 96 | +All values of `tf` accepted by PrettyTables may be used in SolverBenchmark. |
| 97 | + |
| 98 | +The `col_formatters` option overrides the formatting of specific columns.
| 99 | +The argument should be a dictionary of `Symbol` to format strings, where the format string will be applied to each element of the column.
| 100 | +
| 101 | +The `hdr_override` option changes the column headers.
| 102 | + |
| 103 | +```julia |
| 104 | +fmt_override = Dict(:f => "%+10.3e", |
| 105 | + :t => "%08.2f") |
| 106 | +hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time") |
| 107 | +pretty_stats(stdout, |
| 108 | + df[!, [:name, :f, :t]], |
| 109 | + col_formatters = fmt_override, |
| 110 | + hdr_override = hdr_override) |
| 111 | +``` |
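| | +
| | +As a quick check of what these format strings produce, here is a minimal sketch that uses only the `Printf` standard library. It assumes the strings follow the usual `Printf` conventions, and the sample values are made up for illustration:
| | +
| | +```julia
| | +using Printf
| | +
| | +# "%+10.3e": explicit sign, scientific notation, 3 decimals, minimum width 10
| | +f_str = @sprintf("%+10.3e", 1.5)   # "+1.500e+00"
| | +
| | +# "%08.2f": fixed point, 2 decimals, zero-padded to width 8
| | +t_str = @sprintf("%08.2f", 3.5)    # "00003.50"
| | +```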
| 112 | + |
| 113 | +While `col_formatters` is for simple format strings, the PrettyTables API lets us define more elaborate formatters in the form of functions: |
| 114 | + |
| 115 | +```julia |
| 116 | +fmt_override = Dict(:f => "%+10.3e", |
| 117 | + :t => "%08.2f") |
| 118 | +hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time") |
| 119 | +pretty_stats(df[!, [:name, :f, :t]], |
| 120 | + col_formatters = fmt_override, |
| 121 | + hdr_override = hdr_override, |
| 122 | + formatters = (v, i, j) -> begin |
| 123 | + if j == 3 # t is the 3rd column |
| 124 | + vi = floor(Int, v) |
| 125 | + minutes = div(vi, 60) |
| 126 | + seconds = vi % 60 |
| 127 | + micros = round(Int, 1e6 * (v - vi)) |
| 128 | + @sprintf("%2dm %02ds %06dμs", minutes, seconds, micros) |
| 129 | + else |
| 130 | + v |
| 131 | + end |
| 132 | + end) |
| 133 | +``` |
| 134 | + |
| 135 | +See the [PrettyTables.jl documentation](https://ronisbr.github.io/PrettyTables.jl/stable/man/formatters/) for more information. |
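| | +
| | +To see what the time formatter above produces for a single value, its body can be pulled out into a standalone function. This is only a sketch: `format_time` is a hypothetical name, not part of SolverBenchmark or PrettyTables, and the input is an arbitrary number of seconds.
| | +
| | +```julia
| | +using Printf
| | +
| | +# Hypothetical helper mirroring the body of the anonymous formatter above.
| | +function format_time(v)
| | +  vi = floor(Int, v)                   # whole seconds
| | +  minutes = div(vi, 60)
| | +  seconds = vi % 60
| | +  micros = round(Int, 1e6 * (v - vi))  # fractional part, in microseconds
| | +  return @sprintf("%2dm %02ds %06dμs", minutes, seconds, micros)
| | +end
| | +
| | +format_time(125.5)  # " 2m 05s 500000μs"
| | +```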
| 136 | + |
| 137 | +When using LaTeX format, the output must be understood by LaTeX. |
| 138 | +By default, numerical data in the table is wrapped in inline math environments. |
| 139 | +But those math environments would interfere with our formatting of the time. |
| 140 | +Thus we must first disable them for the `t` column using `col_formatters`, and then apply the PrettyTables formatter as above:
| 141 | + |
| 142 | +```julia |
| 143 | +fmt_override = Dict(:f => "%+10.3e", |
| 144 | + :t => "%08.2f") |
| 145 | +hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time") |
| 146 | +open("alpha2.tex", "w") do io |
| 147 | + println(io, "\\documentclass[varwidth=20cm,crop=true]{standalone}") |
| 148 | + println(io, "\\usepackage{longtable}[=v4.13]") |
| 149 | + println(io, "\\begin{document}") |
| 150 | + pretty_latex_stats(io, |
| 151 | + df[!, [:name, :status, :f, :t, :iter]], |
| 152 | + col_formatters = Dict(:t => "%f"), # disable default formatting of t |
| 153 | + formatters = (v,i,j) -> begin |
| 154 | + if j == 4 |
| 155 | + xi = floor(Int, v) |
| 156 | + minutes = div(xi, 60) |
| 157 | + seconds = xi % 60 |
| 158 | + micros = round(Int, 1e6 * (v - xi)) |
| 159 | + @sprintf("\\(%2d\\)m \\(%02d\\)s \\(%06d \\mu\\)s", minutes, seconds, micros) |
| 160 | + else |
| 161 | + v |
| 162 | + end |
| 163 | + end) |
| 164 | + println(io, "\\end{document}") |
| 165 | +end |
| 166 | +``` |
| 167 | + |
| 168 | +```julia |
| 169 | +run(`latexmk -quiet -pdf alpha2.tex`) |
| 170 | +run(`pdf2svg alpha2.pdf alpha2.svg`) |
| 171 | +``` |
| 172 | + |
| 173 | +### Joining tables |
| 174 | + |
| 175 | +On some occasions, instead of or in addition to showing individual results, we show
| 176 | +a table with the results of multiple solvers.
| 177 | + |
| 178 | +```julia |
| 179 | +df = join(stats, [:f, :t]) |
| 180 | +pretty_stats(stdout, df) |
| 181 | +``` |
| 182 | + |
| 183 | +The column `:id` is used as a guide on where to join. In addition, some columns may
| 184 | +be identical across solvers. We convey that information with the argument `invariant_cols`.
| 185 | + |
| 186 | +```julia |
| 187 | +df = join(stats, [:f, :t], invariant_cols=[:name]) |
| 188 | +pretty_stats(stdout, df) |
| 189 | +``` |
| 190 | + |
| 191 | +`join` also accepts `hdr_override` for changing the column name before appending |
| 192 | +`_solver`. |
| 193 | + |
| 194 | +```julia |
| 195 | +hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time") |
| 196 | +df = join(stats, [:f, :t], invariant_cols=[:name], hdr_override=hdr_override) |
| 197 | +pretty_stats(stdout, df) |
| 198 | +``` |
| 199 | + |
| 200 | +```julia |
| 201 | +hdr_override = Dict(:name => "Name", :f => "\\(f(x)\\)", :t => "Time") |
| 202 | +df = join(stats, [:f, :t], invariant_cols=[:name], hdr_override=hdr_override) |
| 203 | +open("alpha3.tex", "w") do io |
| 204 | + println(io, "\\documentclass[varwidth=20cm,crop=true]{standalone}") |
| 205 | + println(io, "\\usepackage{longtable}[=v4.13]") |
| 206 | + println(io, "\\begin{document}") |
| 207 | + pretty_latex_stats(io, df) |
| 208 | + println(io, "\\end{document}") |
| 209 | +end |
| 210 | +``` |
| 211 | + |
| 212 | +```julia |
| 213 | +run(`latexmk -quiet -pdf alpha3.tex`) |
| 214 | +run(`pdf2svg alpha3.pdf alpha3.svg`) |
| 215 | +``` |
| 216 | + |
| 217 | +## Profiles |
| 218 | + |
| 219 | +Performance profiles are a comparison tool developed by [Dolan and
| 220 | +Moré, 2002](https://link.springer.com/article/10.1007/s101070100263/) that take into
| 221 | +account the relative performance of a solver and whether it has achieved convergence for each
| 222 | +problem. `SolverBenchmark.jl` uses |
| 223 | +[BenchmarkProfiles.jl](https://github.com/JuliaSmoothOptimizers/BenchmarkProfiles.jl) |
| 224 | +for generating performance profiles from the dictionary of `DataFrame`s. |
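| | +
| | +To make the idea concrete, the following sketch computes a profile by hand, following the Dolan and Moré definition: each cost is divided by the best cost on that problem, and the profile of a solver at `τ` is the fraction of problems whose ratio is at most `τ`. The cost matrix and the name `profile_value` are made up for illustration; this is not how BenchmarkProfiles.jl is implemented.
| | +
| | +```julia
| | +# Hypothetical costs: one row per problem, one column per solver; Inf marks a failure.
| | +T = [1.0 2.0;
| | +     4.0 2.0;
| | +     Inf 3.0]
| | +
| | +best = minimum(T, dims = 2)  # best cost on each problem
| | +ratios = T ./ best           # performance ratios, all >= 1
| | +
| | +# Fraction of problems that solver `s` solves within a factor `τ` of the best solver.
| | +profile_value(s, τ) = count(ratios[:, s] .<= τ) / size(T, 1)
| | +
| | +profile_value(1, 1.0)  # solver 1 is the best on 1 of 3 problems
| | +profile_value(2, 2.0)  # solver 2 is within a factor 2 of the best on all 3 problems
| | +```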
| 225 | + |
| 226 | +The basic usage is `performance_profile(stats, cost)`, where `cost` is a function |
| 227 | +applied to a `DataFrame` and returning a vector. |
| 228 | + |
| 229 | +```julia |
| 230 | +using Plots |
| 231 | +pyplot() |
| 232 | + |
| 233 | +p = performance_profile(stats, df -> df.t) |
| 234 | +``` |
| 235 | + |
| 236 | +Notice that we used `df -> df.t` which corresponds to the column `:t` of the |
| 237 | +`DataFrame`s. |
| 238 | +This does not take into account that the solvers have failed on a few problems
| 239 | +(according to the column `:status`). The next profile takes that into account.
| 240 | + |
| 241 | +```julia |
| 242 | +cost(df) = (df.status .!= :first_order) * Inf + df.t |
| 243 | +p = performance_profile(stats, cost) |
| 244 | +``` |
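| | +
| | +The cost function above relies on `false` acting as a strong zero in Julia: `false * Inf` is `0.0` rather than `NaN`, so solved problems keep their time and failed ones get an infinite cost. A minimal sketch, where a plain `NamedTuple` with made-up values stands in for a solver's `DataFrame`:
| | +
| | +```julia
| | +# Same cost function as above.
| | +cost(df) = (df.status .!= :first_order) * Inf + df.t
| | +
| | +# Hypothetical results for three problems; the second one failed.
| | +df = (status = [:first_order, :failure, :first_order], t = [1.0, 2.0, 3.0])
| | +
| | +cost(df)  # [1.0, Inf, 3.0]
| | +```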
| 245 | + |
| 246 | +### Profile wall |
| 247 | + |
| 248 | +Another profile function is `profile_solvers`, which creates a wall of performance |
| 249 | +profiles, accepting multiple costs and doing 1 vs 1 comparisons in addition to the |
| 250 | +traditional performance profile. |
| 251 | + |
| 252 | +```julia |
| 253 | +solved(df) = (df.status .== :first_order) |
| 254 | +costs = [df -> .!solved(df) * Inf + df.t, df -> .!solved(df) * Inf + df.iter] |
| 255 | +costnames = ["Time", "Iterations"] |
| 256 | +p = profile_solvers(stats, costs, costnames) |
| 257 | +``` |
| 258 | + |
| 259 | +### Example of running a benchmark
| 260 | +The following tutorial shows how to run a benchmark with a specific solver:
| 261 | +[Run a benchmark with OptimizationProblems](https://juliasmoothoptimizers.github.io/OptimizationProblems.jl/dev/benchmark/).
| 262 | +It covers how to use the problems from `OptimizationProblems` to run a benchmark for unconstrained optimization.