Commit edf29bb

efaulhaber, svchb, and LasNikas authored
Improve benchmarking code (#118)
* Improve benchmarking code
* Rename plot.jl
* Use realistic search radius
* Fix update benchmark
* Fix tests
* Reformat
* Fix tests
* Fix tests
* Improve docs

---------

Co-authored-by: Sven Berger <[email protected]>
Co-authored-by: Niklas Neher <[email protected]>
1 parent 70dfeab commit edf29bb

9 files changed

+299
-158
lines changed

benchmarks/benchmarks.jl

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ include("n_body.jl")
 include("smoothed_particle_hydrodynamics.jl")
 include("update.jl")

-include("plot.jl")
+include("run_benchmarks.jl")

benchmarks/count_neighbors.jl

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@ using PointNeighbors
 using BenchmarkTools

 """
-    benchmark_count_neighbors(neighborhood_search, coordinates; parallel = true)
+    benchmark_count_neighbors(neighborhood_search, coordinates;
+                              parallelization_backend = default_backend(coordinates))

 A very cheap and simple neighborhood search benchmark, only counting the neighbors of each
 point. For each point-neighbor pair, only an array entry is incremented.

benchmarks/n_body.jl

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,8 @@ using PointNeighbors
 using BenchmarkTools

 """
-    benchmark_n_body(neighborhood_search, coordinates; parallel = true)
+    benchmark_n_body(neighborhood_search, coordinates;
+                     parallelization_backend = default_backend(coordinates))

 A simple neighborhood search benchmark, computing the right-hand side of an n-body
 simulation with a cutoff (corresponding to the search radius of `neighborhood_search`).
@@ -16,7 +17,6 @@ function benchmark_n_body(neighborhood_search, coordinates_;
                           parallelization_backend = default_backend(coordinates_))
     # Passing a different backend like `CUDA.CUDABackend`
     # allows us to change the type of the array to run the benchmark on the GPU.
-    # Passing `parallel = true` or `parallel = false` will not change anything here.
     coordinates = PointNeighbors.Adapt.adapt(parallelization_backend, coordinates_)
     nhs = PointNeighbors.Adapt.adapt(parallelization_backend, neighborhood_search)

benchmarks/plot.jl

Lines changed: 0 additions & 86 deletions
This file was deleted.

benchmarks/run_benchmarks.jl

Lines changed: 228 additions & 0 deletions
@@ -0,0 +1,228 @@
+using Plots
+using BenchmarkTools
+
+# Generate a rectangular point cloud
+include("../test/point_cloud.jl")
+
+"""
+    run_benchmark(benchmark, n_points_per_dimension, iterations, neighborhood_searches;
+                  parallelization_backend = PolyesterBackend(),
+                  names = ["NeighborhoodSearch 1" "NeighborhoodSearch 2" ...],
+                  seed = 1, perturbation_factor_position = 1.0)
+
+Run a benchmark with several neighborhood searches multiple times for increasing numbers
+of points and return the results as `(n_particles_vec, times)`, where `n_particles_vec`
+is a vector containing the number of particles for each iteration and `times` is a matrix
+containing the runtimes for each neighborhood search and iteration.
+
+See also
+- [`plot_benchmark`](@ref) to plot the results,
+- [`run_benchmark_default`](@ref) to run the benchmark with the most commonly used
+  neighborhood search implementations,
+- [`run_benchmark_gpu`](@ref) to run the benchmark with all GPU-compatible neighborhood
+  search implementations.
+
+# Arguments
+- `benchmark`: The benchmark function. See [`benchmark_count_neighbors`](@ref),
+               [`benchmark_n_body`](@ref), [`benchmark_wcsph`](@ref),
+               [`benchmark_wcsph_fp32`](@ref) and [`benchmark_tlsph`](@ref).
+- `n_points_per_dimension`: Initial resolution as tuple. The product is the initial number
+                            of points. For example, use `(100, 100)` for a 2D benchmark or
+                            `(10, 10, 10)` for a 3D benchmark.
+- `iterations`: Number of refinement iterations
+
+# Keywords
+- `parallelization_backend = PolyesterBackend()`: Parallelization strategy to use. See
+                                                  [`@threaded`](@ref) for a list of available
+                                                  backends.
+- `seed = 1`: Seed to perturb the point positions. Different seeds yield
+              slightly different point positions.
+- `perturbation_factor_position = 1.0`: Scale the point position perturbation by this factor.
+                                        A factor of `1.0` corresponds to a standard deviation
+                                        similar to that of a realistic simulation.
+
+# Examples
+```julia
+include("benchmarks/benchmarks.jl")
+
+run_benchmark(benchmark_count_neighbors, (10, 10), 3,
+              [TrivialNeighborhoodSearch{2}(), GridNeighborhoodSearch{2}()])
+```
+"""
+function run_benchmark(benchmark, n_points_per_dimension, iterations, neighborhood_searches;
+                       parallelization_backend = PolyesterBackend(),
+                       names = ["Neighborhood search $i"
+                                for i in 1:length(neighborhood_searches)]',
+                       seed = 1, perturbation_factor_position = 1.0)
+    # Multiply number of points in each iteration (roughly) by this factor
+    scaling_factor = 4
+    per_dimension_factor = scaling_factor^(1 / length(n_points_per_dimension))
+    sizes = [round.(Int, n_points_per_dimension .* per_dimension_factor^(iter - 1))
+             for iter in 1:iterations]
+
+    n_particles_vec = prod.(sizes)
+    times = zeros(iterations, length(neighborhood_searches))
+
+    for iter in 1:iterations
+        coordinates = point_cloud(sizes[iter]; seed, perturbation_factor_position)
+        domain_size = maximum(sizes[iter]) + 1
+
+        # Normalize domain size to 1
+        coordinates ./= domain_size
+
+        # Make this Float32 to make sure that Float32 benchmarks use Float32 exclusively
+        search_radius = 4.0f0 / domain_size
+        n_particles = size(coordinates, 2)
+
+        neighborhood_searches_copy = copy_neighborhood_search.(neighborhood_searches,
+                                                               search_radius, n_particles)
+
+        for i in eachindex(neighborhood_searches_copy)
+            neighborhood_search = neighborhood_searches_copy[i]
+            PointNeighbors.initialize!(neighborhood_search, coordinates, coordinates)
+
+            time = benchmark(neighborhood_search, coordinates; parallelization_backend)
+            times[iter, i] = time
+            time_string = BenchmarkTools.prettytime(time * 1e9)
+            time_string_per_particle = BenchmarkTools.prettytime(time * 1e9 / n_particles)
+            println("$(names[i])")
+            println("with $(join(sizes[iter], "x")) = $(prod(sizes[iter])) particles " *
+                    "finished in $time_string ($time_string_per_particle per particle)\n")
+        end
+    end
+
+    return n_particles_vec, times
+end
+
+"""
+    run_benchmark_default(benchmark, n_points_per_dimension, iterations; kwargs...)
+
+Shortcut to call [`run_benchmark`](@ref) with the most commonly used neighborhood search
+implementations:
+- `GridNeighborhoodSearch`
+- `GridNeighborhoodSearch` with `FullGridCellList`
+- `PrecomputedNeighborhoodSearch`
+
+# Arguments
+- `benchmark`: The benchmark function. See [`benchmark_count_neighbors`](@ref),
+               [`benchmark_n_body`](@ref), [`benchmark_wcsph`](@ref),
+               [`benchmark_wcsph_fp32`](@ref) and [`benchmark_tlsph`](@ref).
+- `n_points_per_dimension`: Initial resolution as tuple. The product is the initial number
+                            of points. For example, use `(100, 100)` for a 2D benchmark or
+                            `(10, 10, 10)` for a 3D benchmark.
+- `iterations`: Number of refinement iterations
+
+# Keywords
+See [`run_benchmark`](@ref) for a list of available keywords.
+
+# Examples
+```julia
+include("benchmarks/benchmarks.jl")
+
+run_benchmark_default(benchmark_n_body, (10, 10), 3)
+```
+"""
+function run_benchmark_default(benchmark, n_points_per_dimension, iterations; kwargs...)
+    NDIMS = length(n_points_per_dimension)
+    min_corner = 0.0f0 .* n_points_per_dimension
+    max_corner = Float32.(n_points_per_dimension ./ maximum(n_points_per_dimension))
+
+    neighborhood_searches = [
+        GridNeighborhoodSearch{NDIMS}(),
+        GridNeighborhoodSearch{NDIMS}(search_radius = 0.0f0,
+                                      cell_list = FullGridCellList(; search_radius = 0.0f0,
+                                                                   min_corner, max_corner)),
+        PrecomputedNeighborhoodSearch{NDIMS}()
+    ]
+
+    names = ["GridNeighborhoodSearch";;
+             "GridNeighborhoodSearch with FullGridCellList";;
+             "PrecomputedNeighborhoodSearch"]
+
+    run_benchmark(benchmark, n_points_per_dimension, iterations,
+                  neighborhood_searches; names, kwargs...)
+end
+
+"""
+    run_benchmark_gpu(benchmark, n_points_per_dimension, iterations; kwargs...)
+
+Shortcut to call [`run_benchmark`](@ref) with all GPU-compatible neighborhood search
+implementations:
+- `GridNeighborhoodSearch` with `FullGridCellList`
+
+# Arguments
+- `benchmark`: The benchmark function. See [`benchmark_count_neighbors`](@ref),
+               [`benchmark_n_body`](@ref), [`benchmark_wcsph`](@ref),
+               [`benchmark_wcsph_fp32`](@ref) and [`benchmark_tlsph`](@ref).
+- `n_points_per_dimension`: Initial resolution as tuple. The product is the initial number
+                            of points. For example, use `(100, 100)` for a 2D benchmark or
+                            `(10, 10, 10)` for a 3D benchmark.
+- `iterations`: Number of refinement iterations
+
+# Keywords
+See [`run_benchmark`](@ref) for a list of available keywords.
+
+# Examples
+```julia
+include("benchmarks/benchmarks.jl")
+
+run_benchmark_gpu(benchmark_n_body, (10, 10), 3)
+```
+"""
+function run_benchmark_gpu(benchmark, n_points_per_dimension, iterations; kwargs...)
+    NDIMS = length(n_points_per_dimension)
+
+    min_corner = 0.0f0 .* n_points_per_dimension
+    max_corner = Float32.(n_points_per_dimension ./ maximum(n_points_per_dimension))
+    neighborhood_searches = [
+        GridNeighborhoodSearch{NDIMS}(search_radius = 0.0f0,
+                                      cell_list = FullGridCellList(; search_radius = 0.0f0,
+                                                                   min_corner, max_corner))
+    ]
+
+    names = ["GridNeighborhoodSearch with FullGridCellList";;]
+
+    run_benchmark(benchmark, n_points_per_dimension, iterations,
+                  neighborhood_searches; names, kwargs...)
+end
+
+"""
+    plot_benchmark(n_particles_vec, times; kwargs...)
+
+Plot the results of a benchmark run with [`run_benchmark`](@ref).
+Note that the arguments are the outputs of that function.
+
+# Arguments
+- `n_particles_vec`: Vector containing the number of particles for each iteration.
+- `times`: Matrix containing the runtimes for each neighborhood search and iteration.
+
+# Keywords
+Keyword arguments are passed to `Plots.plot`. For example, use `title = "My title"`.
+
+# Examples
+```julia
+include("benchmarks/benchmarks.jl")
+
+n_particles_vec, times = run_benchmark_default(benchmark_count_neighbors, (10, 10), 3)
+plot_benchmark(n_particles_vec, times; title = "Count neighbors benchmark")
+```
+"""
+function plot_benchmark(n_particles_vec, times; kwargs...)
+    function format_n_particles(n)
+        if n >= 1_000_000
+            return "$(round(Int, n / 1_000_000))M"
+        elseif n >= 1_000
+            return "$(round(Int, n / 1_000))k"
+        else
+            return string(n)
+        end
+    end
+    xticks = format_n_particles.(n_particles_vec)
+
+    plot(n_particles_vec, times ./ n_particles_vec .* 1e9;
+         xaxis = :log,
+         xticks = (n_particles_vec, xticks), linewidth = 2,
+         xlabel = "#particles", ylabel = "runtime per particle [ns]",
+         legend = :outerright, size = (700, 350), dpi = 200, margin = 4 * Plots.mm,
+         palette = palette(:tab10), kwargs...)
+end
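
The refinement schedule in `run_benchmark` may be easier to follow in isolation. The sketch below copies only the size computation into a standalone helper; `refinement_sizes` is a hypothetical name used for illustration and is not part of this commit. It needs neither PointNeighbors nor the `point_cloud` helper. The radius line reuses the commit's formula `4.0f0 / domain_size`. If `point_cloud` places points at roughly unit spacing (an assumption, since that file is not shown in this diff), this would correspond to about four point spacings after normalization.

```julia
# Standalone sketch of the refinement schedule used in `run_benchmark`.
# `refinement_sizes` is a hypothetical helper name, not part of the commit.
function refinement_sizes(n_points_per_dimension, iterations; scaling_factor = 4)
    # Spread the growth in total point count evenly over all dimensions
    per_dimension_factor = scaling_factor^(1 / length(n_points_per_dimension))

    return [round.(Int, n_points_per_dimension .* per_dimension_factor^(iter - 1))
            for iter in 1:iterations]
end

sizes_2d = refinement_sizes((10, 10), 3)
sizes_3d = refinement_sizes((10, 10, 10), 3)

# Total number of points per iteration
println(prod.(sizes_2d))
println(prod.(sizes_3d))

# Search radius per iteration after normalizing the domain size to 1,
# using the same formula as the commit (`Float32` on purpose)
search_radii = [4.0f0 / (maximum(s) + 1) for s in sizes_2d]
println(search_radii)
```

In 2D the per-dimension factor is exactly 2, so the point count quadruples in every iteration. In 3D the rounding to integer sizes means the growth is only roughly a factor of 4, which matches the "(roughly)" in the code comment.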

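
The data transformation inside `plot_benchmark` can also be tried without Plots. The sketch below lifts the nested `format_n_particles` helper to top level unchanged and applies the same per-particle normalization as the `plot` call. The runtimes in `times` are made-up numbers for illustration only, not measurements.

```julia
# Same logic as the nested helper in `plot_benchmark`, lifted to top level
function format_n_particles(n)
    if n >= 1_000_000
        return "$(round(Int, n / 1_000_000))M"
    elseif n >= 1_000
        return "$(round(Int, n / 1_000))k"
    else
        return string(n)
    end
end

n_particles_vec = [100, 1600, 1_638_400]
labels = format_n_particles.(n_particles_vec)
println(labels)

# Hypothetical runtimes in seconds (NOT measured):
# one row per iteration, one column per neighborhood search
times = [1.0e-5 2.0e-5
         1.6e-4 3.2e-4
         0.16384 0.32768]

# Broadcasting divides each row by the matching particle count,
# then converts seconds to nanoseconds, as in the `plot` call
per_particle_ns = times ./ n_particles_vec .* 1e9
println(size(per_particle_ns))
```

Note that the labels are rounded to whole thousands or millions, so 1600 points are labeled `2k`. The tick positions themselves stay exact because `xticks` pairs each label with the unrounded value.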