
Commit 4b91f66

carstenbauer, MasonProtter, and fredrikekre authored
Prepare for ChunkSplitters 3.0 (#119)
* prepare for ChunkSplitters 3.0
* update
* support symbol and split
* blub
* check chunks incompatible kwargs
* error when chunks or index_chunks input + chunking=true
* rename + error on incompatible chunking kwargs
* throw errors for unsupported split symbols
* don't export split types
* Update src/schedulers.jl Co-authored-by: Mason Protter <[email protected]>
* Update src/schedulers.jl Co-authored-by: Mason Protter <[email protected]>
* minor improvements
* Update src/schedulers.jl Co-authored-by: Fredrik Ekre <[email protected]>
* Update src/schedulers.jl Co-authored-by: Fredrik Ekre <[email protected]>
* Update src/schedulers.jl Co-authored-by: Fredrik Ekre <[email protected]>
* split -> type parameter

---------

Co-authored-by: Mason Protter <[email protected]>
Co-authored-by: Fredrik Ekre <[email protected]>
1 parent 491dac9 commit 4b91f66

13 files changed: +176 -109 lines changed

CHANGELOG.md

Lines changed: 8 additions & 1 deletion
@@ -1,9 +1,16 @@
 OhMyThreads.jl Changelog
 =========================

+Version 0.7.0
+-------------
+- ![BREAKING][badge-breaking] We now use ChunkSplitters version 3.0. The function `OhMyThreads.chunks` has been renamed to `OhMyThreads.index_chunks`. The new functions `index_chunks` and `chunks` (different from the old one with the same name!) are now exported. See ChunkSplitters.jl for more information.
+- ![BREAKING][badge-breaking] If you provide a `chunks` or `index_chunks` as input we now disable the internal chunking without a warning. Previously, we did show a warning unless you had set `chunking=false`. In contrast, we now throw an error when you set any incompatible chunking related keyword arguments.
+- ![Deprecation][badge-deprecation] The `split` options `:batch` and `:scatter` are now deprecated (they still work but will be dropped at some point). Use `:consecutive` and `:roundrobin`, respectively, instead.
+- ![Enhancement][badge-enhancement] The `split` keyword argument can now also be a `<: OhMyThreads.Split`. Compared to providing a `Symbol`, the former can potentially give better performance. For example, you can replace `:consecutive` by `OhMyThreads.Consecutive()` and `:roundrobin` by `OhMyThreads.RoundRobin()`.
+
 Version 0.6.2
 -------------
-- ![Enhancement][badge-enhancement] Added API support for `enumerate(chunks(...))`. Best used in combination with `chunking=false`.
+- ![Enhancement][badge-enhancement] Added API support for `enumerate(chunks(...))`. Best used in combination with `chunking=false`

 Version 0.6.1
 -------------
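
To make the 0.7.0 changelog entries above more concrete, here is a minimal sketch of the two ways to pass `split`, plus a chunk iterator used as input. It assumes OhMyThreads 0.7 is installed and that `split` is forwarded from `tmapreduce` to the default scheduler as a keyword argument; the data and variable names are made up for illustration.

```julia
using OhMyThreads: OhMyThreads, tmapreduce, index_chunks

data = collect(1:100)

# Symbol form: `:consecutive` / `:roundrobin` replace the deprecated
# `:batch` / `:scatter`.
s_sym = tmapreduce(abs2, +, data; split = :roundrobin)

# Type form: the split types are not exported, hence the module prefix.
# According to the changelog this can potentially perform better.
s_type = tmapreduce(abs2, +, data; split = OhMyThreads.RoundRobin())

# Chunk iterator as input: internal chunking is disabled automatically,
# so no chunking-related keyword arguments are passed here (the changelog
# says incompatible ones now raise an error).
s_chunks = tmapreduce(+, index_chunks(data; n = 4)) do idcs
    sum(abs2, @view data[idcs])
end
```

All three calls compute the same sum of squares. A round-robin split reorders the elements across tasks, so it is only appropriate when the reduction operator is commutative, as `+` is here.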

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ TaskLocalValues = "ed4db957-447d-4319-bfb6-7fa9ae7ecf34"
 [compat]
 Aqua = "0.8"
 BangBang = "0.3.40, 0.4"
-ChunkSplitters = "2.4"
+ChunkSplitters = "3"
 StableTasks = "0.1.5"
 TaskLocalValues = "0.1"
 Test = "1"

docs/src/literate/falsesharing/falsesharing.jl

Lines changed: 4 additions & 4 deletions
@@ -30,11 +30,11 @@ data = rand(1_000_000 * nthreads());
 #
 # A common, manual implementation of this idea might look like this:

-using OhMyThreads: @spawn, chunks
+using OhMyThreads: @spawn, index_chunks

 function parallel_sum_falsesharing(data; nchunks = nthreads())
     psums = zeros(eltype(data), nchunks)
-    @sync for (c, idcs) in enumerate(chunks(data; n = nchunks))
+    @sync for (c, idcs) in enumerate(index_chunks(data; n = nchunks))
         @spawn begin
             for i in idcs
                 psums[c] += data[i]
@@ -102,7 +102,7 @@ nthreads()

 function parallel_sum_tasklocal(data; nchunks = nthreads())
     psums = zeros(eltype(data), nchunks)
-    @sync for (c, idcs) in enumerate(chunks(data; n = nchunks))
+    @sync for (c, idcs) in enumerate(index_chunks(data; n = nchunks))
         @spawn begin
             local s = zero(eltype(data))
             for i in idcs
@@ -131,7 +131,7 @@ end
 # using `map` and reusing the built-in (sequential) `sum` function on each parallel task:

 function parallel_sum_map(data; nchunks = nthreads())
-    ts = map(chunks(data, n = nchunks)) do idcs
+    ts = map(index_chunks(data, n = nchunks)) do idcs
         @spawn @views sum(data[idcs])
     end
     return sum(fetch.(ts))
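
The `parallel_sum_tasklocal` hunk above is cut off after the inner loop header. A self-contained sketch of how such a function might be completed with the renamed `index_chunks` is given below. The loop body, the single write into `psums`, and the final reduction are assumptions based on the visible context, not the committed code.

```julia
using Base.Threads: nthreads
using OhMyThreads: @spawn, index_chunks

function parallel_sum_tasklocal(data; nchunks = nthreads())
    psums = zeros(eltype(data), nchunks)
    @sync for (c, idcs) in enumerate(index_chunks(data; n = nchunks))
        @spawn begin
            # Accumulate into a task-local variable (assumed loop body) ...
            local s = zero(eltype(data))
            for i in idcs
                s += data[i]
            end
            # ... and write to the shared array only once per task.
            psums[c] = s
        end
    end
    return sum(psums)
end

data = rand(1_000_000)
total = parallel_sum_tasklocal(data)
```

Because `index_chunks` yields index ranges rather than elements, the loop indexes into `data` explicitly, exactly as the old `chunks` did before the rename.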

docs/src/literate/falsesharing/falsesharing.md

Lines changed: 4 additions & 4 deletions
@@ -39,11 +39,11 @@ catastrophic numerical errors due to potential rearrangements of terms in the su
 A common, manual implementation of this idea might look like this:

 ````julia
-using OhMyThreads: @spawn, chunks
+using OhMyThreads: @spawn, index_chunks

 function parallel_sum_falsesharing(data; nchunks = nthreads())
     psums = zeros(eltype(data), nchunks)
-    @sync for (c, idcs) in enumerate(chunks(data; n = nchunks))
+    @sync for (c, idcs) in enumerate(index_chunks(data; n = nchunks))
         @spawn begin
             for i in idcs
                 psums[c] += data[i]
@@ -132,7 +132,7 @@ into `psums` (once!).
 ````julia
 function parallel_sum_tasklocal(data; nchunks = nthreads())
     psums = zeros(eltype(data), nchunks)
-    @sync for (c, idcs) in enumerate(chunks(data; n = nchunks))
+    @sync for (c, idcs) in enumerate(index_chunks(data; n = nchunks))
         @spawn begin
             local s = zero(eltype(data))
             for i in idcs
@@ -168,7 +168,7 @@ using `map` and reusing the built-in (sequential) `sum` function on each paralle

 ````julia
 function parallel_sum_map(data; nchunks = nthreads())
-    ts = map(chunks(data, n = nchunks)) do idcs
+    ts = map(index_chunks(data, n = nchunks)) do idcs
         @spawn @views sum(data[idcs])
     end
     return sum(fetch.(ts))

docs/src/literate/mc/mc.jl

Lines changed: 4 additions & 4 deletions
@@ -79,15 +79,15 @@ using OhMyThreads: StaticScheduler

 # ## Manual parallelization
 #
-# First, using the `chunks` function, we divide the iteration interval `1:N` into
+# First, using the `index_chunks` function, we divide the iteration interval `1:N` into
 # `nthreads()` parts. Then, we apply a regular (sequential) `map` to spawn a Julia task
 # per chunk. Each task will locally and independently perform a sequential Monte Carlo
 # simulation. Finally, we fetch the results and compute the average estimate for $\pi$.

-using OhMyThreads: @spawn, chunks
+using OhMyThreads: @spawn, index_chunks

 function mc_parallel_manual(N; nchunks = nthreads())
-    tasks = map(chunks(1:N; n = nchunks)) do idcs
+    tasks = map(index_chunks(1:N; n = nchunks)) do idcs
         @spawn mc(length(idcs))
     end
     pi = sum(fetch, tasks) / nchunks
@@ -104,7 +104,7 @@ mc_parallel_manual(N)
 # `mc(length(idcs))` is faster than the implicit task-local computation within
 # `tmapreduce` (which itself is a `mapreduce`).

-idcs = first(chunks(1:N; n = nthreads()))
+idcs = first(index_chunks(1:N; n = nthreads()))

 @btime mapreduce($+, $idcs) do i
     rand()^2 + rand()^2 < 1.0
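
The manual parallelization described in the hunk above (split `1:N` into chunks, spawn one task per chunk, average the per-task estimates) can be sketched end to end as follows. The `mc` function is not part of this diff, so the version below is a hypothetical stand-in that returns a π estimate from `n` random samples; the real tutorial function may differ.

```julia
using Base.Threads: nthreads
using OhMyThreads: @spawn, index_chunks

# Hypothetical stand-in for the tutorial's sequential `mc` (an assumption):
# the fraction of random points inside the unit quarter circle, times 4.
function mc(n)
    hits = count(_ -> rand()^2 + rand()^2 < 1.0, 1:n)
    return 4 * hits / n
end

function mc_parallel_manual(N; nchunks = nthreads())
    # One task per index chunk; only the chunk length matters here.
    tasks = map(index_chunks(1:N; n = nchunks)) do idcs
        @spawn mc(length(idcs))
    end
    # Average of the per-chunk estimates, as in the diff.
    return sum(fetch, tasks) / nchunks
end

estimate = mc_parallel_manual(10_000_000)
```

The plain average is a fair estimate only when the chunks have (nearly) equal length and `N` is at least `nchunks`; for strongly uneven chunks, a length-weighted average would be the safer choice.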

docs/src/literate/mc/mc.md

Lines changed: 4 additions & 4 deletions
@@ -112,16 +112,16 @@ using OhMyThreads: StaticScheduler

 ## Manual parallelization

-First, using the `chunks` function, we divide the iteration interval `1:N` into
+First, using the `index_chunks` function, we divide the iteration interval `1:N` into
 `nthreads()` parts. Then, we apply a regular (sequential) `map` to spawn a Julia task
 per chunk. Each task will locally and independently perform a sequential Monte Carlo
 simulation. Finally, we fetch the results and compute the average estimate for $\pi$.

 ````julia
-using OhMyThreads: @spawn, chunks
+using OhMyThreads: @spawn, index_chunks

 function mc_parallel_manual(N; nchunks = nthreads())
-    tasks = map(chunks(1:N; n = nchunks)) do idcs
+    tasks = map(index_chunks(1:N; n = nchunks)) do idcs
         @spawn mc(length(idcs))
     end
     pi = sum(fetch, tasks) / nchunks
@@ -151,7 +151,7 @@ It is faster than `mc_parallel` above because the task-local computation
 `tmapreduce` (which itself is a `mapreduce`).

 ````julia
-idcs = first(chunks(1:N; n = nthreads()))
+idcs = first(index_chunks(1:N; n = nthreads()))

 @btime mapreduce($+, $idcs) do i
     rand()^2 + rand()^2 < 1.0

docs/src/literate/tls/tls.jl

Lines changed: 2 additions & 2 deletions
@@ -102,12 +102,12 @@ res ≈ res_naive
 # iterations (i.e. matrix pairs) for which this task is responsible.
 # Before we learn how to do this more conveniently, let's implement this idea of a
 # task-local temporary buffer (for each parallel task) manually.
-using OhMyThreads: chunks, @spawn
+using OhMyThreads: index_chunks, @spawn
 using Base.Threads: nthreads

 function matmulsums_manual(As, Bs)
     N = size(first(As), 1)
-    tasks = map(chunks(As; n = 2 * nthreads())) do idcs
+    tasks = map(index_chunks(As; n = 2 * nthreads())) do idcs
         @spawn begin
             local C = Matrix{Float64}(undef, N, N)
             map(idcs) do i

docs/src/literate/tls/tls.md

Lines changed: 2 additions & 2 deletions
@@ -140,12 +140,12 @@ Before we learn how to do this more conveniently, let's implement this idea of a
 task-local temporary buffer (for each parallel task) manually.

 ````julia
-using OhMyThreads: chunks, @spawn
+using OhMyThreads: index_chunks, @spawn
 using Base.Threads: nthreads

 function matmulsums_manual(As, Bs)
     N = size(first(As), 1)
-    tasks = map(chunks(As; n = 2 * nthreads())) do idcs
+    tasks = map(index_chunks(As; n = 2 * nthreads())) do idcs
         @spawn begin
             local C = Matrix{Float64}(undef, N, N)
             map(idcs) do i

docs/src/refs/api.md

Lines changed: 11 additions & 2 deletions
@@ -37,16 +37,25 @@ GreedyScheduler
 SerialScheduler
 ```

-## Non-Exported
+## Re-exported
+
+| | |
+|------------------------|---------------------------------------------------------------------|
+| `OhMyThreads.chunks` | see [ChunkSplitters.jl](https://juliafolds2.github.io/ChunkSplitters.jl/stable/references/#ChunkSplitters.chunks) |
+| `OhMyThreads.index_chunks` | see [ChunkSplitters.jl](https://juliafolds2.github.io/ChunkSplitters.jl/stable/references/#ChunkSplitters.index_chunks) |
+
+## Public but not exported

 | | |
 |------------------------|---------------------------------------------------------------------|
 | `OhMyThreads.@spawn` | see [StableTasks.jl](https://github.com/JuliaFolds2/StableTasks.jl) |
 | `OhMyThreads.@spawnat` | see [StableTasks.jl](https://github.com/JuliaFolds2/StableTasks.jl) |
 | `OhMyThreads.@fetch` | see [StableTasks.jl](https://github.com/JuliaFolds2/StableTasks.jl) |
 | `OhMyThreads.@fetchfrom` | see [StableTasks.jl](https://github.com/JuliaFolds2/StableTasks.jl) |
-| `OhMyThreads.chunks` | see [ChunkSplitters.jl](https://juliafolds2.github.io/ChunkSplitters.jl/dev/references/#ChunkSplitters.chunks) |
 | `OhMyThreads.TaskLocalValue` | see [TaskLocalValues.jl](https://github.com/vchuravy/TaskLocalValues.jl) |
+| `OhMyThreads.Split` | see [ChunkSplitters.jl](https://juliafolds2.github.io/ChunkSplitters.jl/stable/references/#ChunkSplitters.Split) |
+| `OhMyThreads.Consecutive` | see [ChunkSplitters.jl](https://juliafolds2.github.io/ChunkSplitters.jl/stable/references/#ChunkSplitters.Consecutive) |
+| `OhMyThreads.RoundRobin` | see [ChunkSplitters.jl](https://juliafolds2.github.io/ChunkSplitters.jl/stable/references/#ChunkSplitters.RoundRobin) |


 ```@docs

src/OhMyThreads.jl

Lines changed: 5 additions & 0 deletions
@@ -6,7 +6,12 @@ for mac in Symbol.(["@spawn", "@spawnat", "@fetch", "@fetchfrom"])
 end

 using ChunkSplitters: ChunkSplitters
+const index_chunks = ChunkSplitters.index_chunks
 const chunks = ChunkSplitters.chunks
+const Split = ChunkSplitters.Split
+const Consecutive = ChunkSplitters.Consecutive
+const RoundRobin = ChunkSplitters.RoundRobin
+export chunks, index_chunks

 using TaskLocalValues: TaskLocalValues
 const TaskLocalValue = TaskLocalValues.TaskLocalValue
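
From the user's side, the aliases and the `export` line above suggest the following usage. This is a sketch under the assumption that ChunkSplitters 3.0 behaves as I understand it: `index_chunks` yields index ranges and `chunks` yields views of the elements. The array `x` and the variable names are illustrative only.

```julia
using OhMyThreads                           # exports `chunks` and `index_chunks`
using OhMyThreads: Consecutive, RoundRobin  # not exported, so imported by name

x = collect(11:20)

# Index ranges into `x` (what `OhMyThreads.chunks` returned before 0.7).
idx = collect(index_chunks(x; n = 2))

# Chunks of the elements themselves (the new meaning of `chunks`).
vals = collect(chunks(x; n = 2))

# The split strategy can be passed as a type instance.
rr = collect(index_chunks(x; n = 2, split = RoundRobin()))
```

Code written against OhMyThreads 0.6 that iterates `chunks(x; ...)` and then indexes `x[idcs]` would need `index_chunks` after this commit, since the new `chunks` already hands back the elements.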
