Commit 95febd2

Merge pull request #549 from wolthom/docs-occupancy
add documentation for Task / Thread occupancy
2 parents 5b5f816 + a9e21e0 commit 95febd2

2 files changed: +77 -2 lines changed

.github/workflows/Documentation.yml

Lines changed: 1 addition & 1 deletion
@@ -26,4 +26,4 @@ jobs:
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # For authentication with GitHub Actions token
           DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }} # For authentication with SSH deploy key
-        run: julia --project=docs/ docs/make.jl
+        run: julia --threads=auto --project=docs/ docs/make.jl
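
The `--threads=auto` flag added above asks Julia to pick the thread count from the machine's available CPU threads rather than starting with a single thread. As a minimal sketch using only Base Julia (not part of this commit), the thread count a session actually received can be checked like this:

```julia
# Report how many threads this Julia session was started with.
# With `--threads=auto` this normally matches the available CPU threads;
# without any thread flag, Julia starts with a single thread by default.
n = Threads.nthreads()
println("Julia is running with ", n, " thread(s)")
```

Running the same snippet with and without the flag is a quick way to confirm that the workflow change took effect.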

docs/src/task-spawning.md

Lines changed: 76 additions & 1 deletion
@@ -27,6 +27,11 @@ its result will be passed into the function receiving the argument. If the
 argument is *not* an [`DTask`](@ref) (instead, some other type of Julia object),
 it'll be passed as-is to the function `f` (with some exceptions).
 
+!!! note "Task / thread occupancy"
+    By default, `Dagger` assumes that tasks saturate the thread they are running on and does not try to schedule other tasks on the thread.
+    This default can be controlled by specifying [`Sch.ThunkOptions`](@ref) (more details can be found under [Scheduler and Thunk options](@ref)).
+    The section [Changing the thread occupancy](@ref) shows a runnable example of how to achieve this.
+
 ## Options
 
 The [`Options`](@ref Dagger.Options) struct in the second argument position is
@@ -182,7 +187,7 @@ Note that, as a legacy API, usage of the lazy API is generally discouraged for m
 - Distinct schedulers don't share runtime metrics or learned parameters, thus causing the scheduler to act less intelligently
 - Distinct schedulers can't share work or data directly
 
-### Scheduler and Thunk options
+## Scheduler and Thunk options
 
 While Dagger generally "just works", sometimes one needs to exert some more
 fine-grained control over how the scheduler allocates work. There are two
@@ -215,3 +220,73 @@ Dagger.spawn(+, Dagger.Options(;single=1), 1, 2)
 
 delayed(+; single=1)(1, 2)
 ```
+
+## Changing the thread occupancy
+
+One of the supported [`Sch.ThunkOptions`](@ref) is the `occupancy` keyword.
+This keyword can be used to communicate that a task is not expected to fully saturate a CPU core (e.g. due to being IO-bound).
+The basic usage looks like this:
+
+```julia
+Dagger.@spawn occupancy=Dict(Dagger.ThreadProc=>0) fn
+```
+
+Consider the following function definitions:
+
+```julia
+using Dagger
+
+function inner()
+    sleep(0.1)
+end
+
+function outer_full_occupancy()
+    @sync for _ in 1:2
+        # By default, full occupancy is assumed
+        Dagger.@spawn inner()
+    end
+end
+
+function outer_low_occupancy()
+    @sync for _ in 1:2
+        # Here, we're explicitly telling the scheduler to assume low occupancy
+        Dagger.@spawn occupancy=Dict(Dagger.ThreadProc => 0) inner()
+    end
+end
+```
+
+When running the first outer function N times in parallel, you should see parallelization until all threads are blocked:
+
+```julia
+for N in [1, 2, 4, 8, 16]
+    @time fetch.([Dagger.@spawn outer_full_occupancy() for _ in 1:N])
+end
+```
+
+The results from the above code snippet should look similar to this (the timings will be influenced by your specific machine):
+
+```text
+  0.124829 seconds (44.27 k allocations: 3.055 MiB, 12.61% compilation time)
+  0.104652 seconds (14.80 k allocations: 1.081 MiB)
+  0.110588 seconds (28.94 k allocations: 2.138 MiB, 4.91% compilation time)
+  0.208937 seconds (47.53 k allocations: 2.932 MiB)
+  0.527545 seconds (79.35 k allocations: 4.384 MiB, 0.64% compilation time)
+```
+
+Whereas running the outer function that communicates a low occupancy (`outer_low_occupancy`) should run fully in parallel:
+
+```julia
+for N in [1, 2, 4, 8, 16]
+    @time fetch.([Dagger.@spawn outer_low_occupancy() for _ in 1:N])
+end
+```
+
+In comparison, the `outer_low_occupancy` snippet should show results like this:
+
+```text
+  0.120686 seconds (44.38 k allocations: 3.070 MiB, 13.00% compilation time)
+  0.105665 seconds (15.40 k allocations: 1.072 MiB)
+  0.107495 seconds (28.56 k allocations: 1.940 MiB)
+  0.109904 seconds (55.03 k allocations: 3.631 MiB)
+  0.117239 seconds (87.95 k allocations: 5.372 MiB)
+```
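
The `inner` function in the added documentation only sleeps, which is why an occupancy of 0 is a reasonable hint for it: a sleeping task yields its thread, so many such tasks can share one. The sketch below illustrates that underlying behaviour with Base Julia only. It is not part of this commit and does not use Dagger; `io_bound` is a hypothetical stand-in for `inner`, and the timings in the comments are rough expectations rather than measured results.

```julia
# Base Julia only; Dagger is deliberately not used here.
# `io_bound` is a hypothetical stand-in for an IO-bound task such as `inner`.
io_bound() = sleep(0.1)

# Run 16 waits one after another: roughly 16 * 0.1 seconds in total.
serial = @elapsed for _ in 1:16
    io_bound()
end

# Run 16 waits as concurrent tasks on the current thread: roughly 0.1 seconds,
# because each sleeping task yields the thread to the others.
concurrent = @elapsed @sync for _ in 1:16
    @async io_bound()
end

println("serial:     ", round(serial; digits=2), " s")
println("concurrent: ", round(concurrent; digits=2), " s")
```

A CPU-bound function would not show this gap on a single thread, which is presumably why full occupancy is the scheduler's default assumption.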
