Commit 74d813f: add SolverBenchmark tutorial
tmigot and abelsiqueira (authored and committed)
1 parent e4cddf2

File tree: 2 files changed (+276, -0 lines)
File 1 of 2 (project environment file): 14 additions, 0 deletions

```toml
[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
SolverBenchmark = "581a75fa-a23a-52d0-a590-d6201de2218a"

[compat]
DataFrames = "1.3.4"
Plots = "1.31.7"
PyPlot = "2.10.0"
SolverBenchmark = "0.5.3"
```
File 2 of 2 (tutorial source): 262 additions, 0 deletions
---
title: "SolverBenchmark.jl tutorial"
tags: ["solver", "benchmark", "profile", "latex"]
author: "Abel S. Siqueira and Dominique Orban"
---

In this tutorial we illustrate the main uses of `SolverBenchmark`.

First, let's create fake data. It is imperative that the data for each solver be stored
in a `DataFrame`, and the collection of results for the different solvers must be stored in a
dictionary of `Symbol` to `DataFrame`.

In our examples we'll use the following data.

```julia
using DataFrames, Printf, Random

Random.seed!(0)

n = 10
names = [:alpha, :beta, :gamma]
stats = Dict(name => DataFrame(:id => 1:n,
                               :name => [@sprintf("prob%03d", i) for i = 1:n],
                               :status => map(x -> x < 0.75 ? :first_order : :failure, rand(n)),
                               :f => randn(n),
                               :t => 1e-3 .+ rand(n) * 1000,
                               :iter => rand(10:10:100, n),
                               :irrelevant => randn(n)) for name in names)
```

The data consists of a (fake) run of three solvers `alpha`, `beta` and `gamma`.
Each solver has a column `id`, which is necessary for joining the solvers (names
can be repeated), and columns `name`, `status`, `f`, `t` and `iter` corresponding to
problem results. There is also a column `irrelevant` with extra information that will
not be used to produce our benchmarks.

Here are the statistics of solver `alpha`:

```julia
stats[:alpha]
```

## Tables

The first thing we may want to do is produce a table for each solver. Notice that the
solver result is already a `DataFrame`, so there are a few options available in other
packages, as well as simply printing the `DataFrame`.
Our concern here is two-fold: producing publication-ready LaTeX tables, and web-ready
markdown tables.

The simplest use is `pretty_stats(io, dataframe)`.
By default, `io` is `stdout`:

```julia
using SolverBenchmark

pretty_stats(stats[:alpha])
```

Printing in LaTeX format is achieved with `pretty_latex_stats`:

```julia
pretty_latex_stats(stats[:alpha])
```

Alternatively, you can print to a file.

```julia
open("alpha.tex", "w") do io
  println(io, "\\documentclass[varwidth=20cm,crop=true]{standalone}")
  println(io, "\\usepackage{longtable}[=v4.13]")
  println(io, "\\begin{document}")
  pretty_latex_stats(io, stats[:alpha])
  println(io, "\\end{document}")
end
```

```julia
run(`latexmk -quiet -pdf alpha.tex`)
run(`pdf2svg alpha.pdf alpha.svg`)
```

If only a subset of columns should be printed, the `DataFrame` should be indexed accordingly:

```julia
df = stats[:alpha]
pretty_stats(df[!, [:name, :f, :t]])
```

Markdown tables may be generated by supplying the PrettyTables `tf` keyword argument to specify the table format:

```julia
pretty_stats(df[!, [:name, :f, :t]], tf=tf_markdown)
```

All values of `tf` accepted by PrettyTables may be used in SolverBenchmark.
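For instance, a different plain-text style can be selected the same way. The call below is a small sketch that assumes the `tf_unicode_rounded` format constant from PrettyTables is accessible in your session (either re-exported alongside `tf_markdown` above, or via an explicit `using PrettyTables`):

```julia
# Sketch: any PrettyTables text-format constant can be passed through `tf`.
# `tf_unicode_rounded` is assumed to be available from the installed PrettyTables.
pretty_stats(df[!, [:name, :f, :t]], tf=tf_unicode_rounded)
```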

The `col_formatters` option overrides the formatting of specific columns (below we collect the format strings in a dictionary named `fmt_override`).
The argument should be a dictionary of `Symbol` to format strings, where the format string will be applied to each element of the column.

The `hdr_override` option changes the column headers.

```julia
fmt_override = Dict(:f => "%+10.3e",
                    :t => "%08.2f")
hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time")
pretty_stats(stdout,
             df[!, [:name, :f, :t]],
             col_formatters = fmt_override,
             hdr_override = hdr_override)
```

While `col_formatters` is for simple format strings, the PrettyTables API lets us define more elaborate formatters in the form of functions:

```julia
fmt_override = Dict(:f => "%+10.3e",
                    :t => "%08.2f")
hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time")
pretty_stats(df[!, [:name, :f, :t]],
             col_formatters = fmt_override,
             hdr_override = hdr_override,
             formatters = (v, i, j) -> begin
               if j == 3 # t is the 3rd column
                 vi = floor(Int, v)
                 minutes = div(vi, 60)
                 seconds = vi % 60
                 micros = round(Int, 1e6 * (v - vi))
                 @sprintf("%2dm %02ds %06dμs", minutes, seconds, micros)
               else
                 v
               end
             end)
```

See the [PrettyTables.jl documentation](https://ronisbr.github.io/PrettyTables.jl/stable/man/formatters/) for more information.

When using LaTeX format, the output must be understood by LaTeX.
By default, numerical data in the table is wrapped in inline math environments.
But those math environments would interfere with our formatting of the time.
Thus we must first disable them for the `:t` column using `col_formatters`, and then apply the PrettyTables formatter as above:

```julia
fmt_override = Dict(:f => "%+10.3e",
                    :t => "%08.2f")
hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time")
open("alpha2.tex", "w") do io
  println(io, "\\documentclass[varwidth=20cm,crop=true]{standalone}")
  println(io, "\\usepackage{longtable}[=v4.13]")
  println(io, "\\begin{document}")
  pretty_latex_stats(io,
                     df[!, [:name, :status, :f, :t, :iter]],
                     col_formatters = Dict(:t => "%f"), # disable default formatting of t
                     formatters = (v, i, j) -> begin
                       if j == 4 # t is the 4th column here
                         xi = floor(Int, v)
                         minutes = div(xi, 60)
                         seconds = xi % 60
                         micros = round(Int, 1e6 * (v - xi))
                         @sprintf("\\(%2d\\)m \\(%02d\\)s \\(%06d \\mu\\)s", minutes, seconds, micros)
                       else
                         v
                       end
                     end)
  println(io, "\\end{document}")
end
```

```julia
run(`latexmk -quiet -pdf alpha2.tex`)
run(`pdf2svg alpha2.pdf alpha2.svg`)
```

### Joining tables

On some occasions, instead of (or in addition to) showing individual results, we show
a table with the results of multiple solvers.

```julia
df = join(stats, [:f, :t])
pretty_stats(stdout, df)
```

The column `:id` is used as a guide on where to join. In addition, we may have
columns that are repeated between the solvers. We convey that information with the argument `invariant_cols`.

```julia
df = join(stats, [:f, :t], invariant_cols=[:name])
pretty_stats(stdout, df)
```

`join` also accepts `hdr_override` for changing the column names before appending
`_solver`.

```julia
hdr_override = Dict(:name => "Name", :f => "f(x)", :t => "Time")
df = join(stats, [:f, :t], invariant_cols=[:name], hdr_override=hdr_override)
pretty_stats(stdout, df)
```

```julia
hdr_override = Dict(:name => "Name", :f => "\\(f(x)\\)", :t => "Time")
df = join(stats, [:f, :t], invariant_cols=[:name], hdr_override=hdr_override)
open("alpha3.tex", "w") do io
  println(io, "\\documentclass[varwidth=20cm,crop=true]{standalone}")
  println(io, "\\usepackage{longtable}[=v4.13]")
  println(io, "\\begin{document}")
  pretty_latex_stats(io, df)
  println(io, "\\end{document}")
end
```

```julia
run(`latexmk -quiet -pdf alpha3.tex`)
run(`pdf2svg alpha3.pdf alpha3.svg`)
```

## Profiles

Performance profiles are a comparison tool developed by [Dolan and
Moré, 2002](https://link.springer.com/article/10.1007/s101070100263/) that takes into
account the relative performance of a solver and whether it has achieved convergence for each
problem. `SolverBenchmark.jl` uses
[BenchmarkProfiles.jl](https://github.com/JuliaSmoothOptimizers/BenchmarkProfiles.jl)
for generating performance profiles from the dictionary of `DataFrame`s.

The basic usage is `performance_profile(stats, cost)`, where `cost` is a function
applied to a `DataFrame` and returning a vector.

```julia
using Plots
pyplot()

p = performance_profile(stats, df -> df.t)
```

Notice that we used `df -> df.t`, which corresponds to the column `:t` of the
`DataFrame`s.
This does not take into account that the solvers have failed for a few problems
(according to the column `:status`). The next profile takes that into account.

```julia
cost(df) = (df.status .!= :first_order) * Inf + df.t
p = performance_profile(stats, cost)
```

### Profile wall

Another profile function is `profile_solvers`, which creates a wall of performance
profiles, accepting multiple costs and doing 1-vs-1 comparisons in addition to the
traditional performance profile.

```julia
solved(df) = (df.status .== :first_order)
costs = [df -> .!solved(df) * Inf + df.t, df -> .!solved(df) * Inf + df.iter]
costnames = ["Time", "Iterations"]
p = profile_solvers(stats, costs, costnames)
```

### Example of running a benchmark

Here is a useful tutorial on how to run a benchmark with a specific solver:
[Run a benchmark with OptimizationProblems](https://juliasmoothoptimizers.github.io/OptimizationProblems.jl/dev/benchmark/)
The tutorial covers how to use the problems from `OptimizationProblems` to run a benchmark for unconstrained optimization.
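
To connect that workflow with the tools shown here, the following is a minimal sketch only, not part of this tutorial's environment: it assumes the packages `ADNLPModels`, `OptimizationProblems` and `JSOSolvers` are installed, that the problem constructors and the `lbfgs`/`trunk` solvers named below exist with their default arguments, and that `solve_problems` records columns such as `:status` and `:elapsed_time`. `bmark_solvers` returns a dictionary of `DataFrame`s with the same shape as the fake data above, so the tables and profiles from this tutorial apply directly.

```julia
# Sketch under the assumptions stated above; the packages, problem constructors and
# solvers outside SolverBenchmark are illustrative and not verified here.
using ADNLPModels, OptimizationProblems                   # test problems modeled as ADNLPModels
using OptimizationProblems.ADNLPProblems: genrose, woods  # two classic unconstrained problems
using JSOSolvers                                          # assumed to provide lbfgs and trunk
using SolverBenchmark

problems = [genrose(), woods()]  # default dimensions

# Each solver is a function mapping a problem to an execution-stats object.
solvers = Dict(
  :lbfgs => prob -> lbfgs(prob),
  :trunk => prob -> trunk(prob),
)

stats = bmark_solvers(solvers, problems)  # Dict{Symbol,DataFrame}, as in the fake data

# Reuse the profile machinery from above, assuming :status and :elapsed_time columns.
solved(df) = df.status .== :first_order
p = performance_profile(stats, df -> .!solved(df) * Inf + df.elapsed_time)
```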
