@@ -27,6 +27,11 @@ its result will be passed into the function receiving the argument. If the
argument is *not* a [`DTask`](@ref) (instead, some other type of Julia object),
it'll be passed as-is to the function `f` (with some exceptions).

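+A minimal sketch of this behavior (the helper `double` is purely illustrative):
+
+```julia
+using Dagger
+
+double(x) = 2x
+
+t = Dagger.@spawn 1 + 2      # `t` is a DTask
+s = Dagger.@spawn double(t)  # `t` is fetched first, so `double` receives 3
+fetch(s)                     # == 6
+
+u = Dagger.@spawn double(5)  # 5 is not a DTask, so it is passed as-is
+fetch(u)                     # == 10
+```
+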
+!!! note "Task / thread occupancy"
+    By default, `Dagger` assumes that tasks saturate the thread they are running on and does not try to schedule other tasks on that thread.
+    This default can be controlled by specifying [`Sch.ThunkOptions`](@ref) (more details can be found under [Scheduler and Thunk options](@ref)).
+    The section [Changing the thread occupancy](@ref) shows a runnable example of how to achieve this.
+
## Options

The [`Options`](@ref Dagger.Options) struct in the second argument position is
@@ -182,7 +187,7 @@ Note that, as a legacy API, usage of the lazy API is generally discouraged for m
- Distinct schedulers don't share runtime metrics or learned parameters, thus causing the scheduler to act less intelligently
- Distinct schedulers can't share work or data directly

-### Scheduler and Thunk options
+## Scheduler and Thunk options

While Dagger generally "just works", sometimes one needs to exert some more
fine-grained control over how the scheduler allocates work. There are two
@@ -215,3 +220,73 @@ Dagger.spawn(+, Dagger.Options(;single=1), 1, 2)

delayed(+; single=1)(1, 2)
```
+
+## Changing the thread occupancy
+
+One of the supported [`Sch.ThunkOptions`](@ref) is the `occupancy` keyword.
+This keyword can be used to communicate that a task is not expected to fully saturate a CPU core (e.g. due to being IO-bound).
+The basic usage looks like this:
+
+```julia
+Dagger.@spawn occupancy=Dict(Dagger.ThreadProc=>0) fn
+```
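+
+The same option can also be passed via [`Options`](@ref Dagger.Options) when using the function form, mirroring the `single` example shown earlier (a minimal sketch; `fn` stands in for any zero-argument function):
+
+```julia
+Dagger.spawn(fn, Dagger.Options(;occupancy=Dict(Dagger.ThreadProc=>0)))
+```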
+
+Consider the following function definitions:
+
+```julia
+using Dagger
+
+function inner()
+    sleep(0.1)
+end
+
+function outer_full_occupancy()
+    @sync for _ in 1:2
+        # By default, full occupancy is assumed
+        Dagger.@spawn inner()
+    end
+end
+
+function outer_low_occupancy()
+    @sync for _ in 1:2
+        # Here, we're explicitly telling the scheduler to assume low occupancy
+        Dagger.@spawn occupancy=Dict(Dagger.ThreadProc => 0) inner()
+    end
+end
+```
+
+When running the first outer function (`outer_full_occupancy`) N times in parallel, you should only see parallel speedup until all of your threads are occupied:
+
+```julia
+for N in [1, 2, 4, 8, 16]
+    @time fetch.([Dagger.@spawn outer_full_occupancy() for _ in 1:N])
+end
+```
+
+The results from the above code snippet should look similar to this (the timings will be influenced by your specific machine):
+
+```text
+  0.124829 seconds (44.27 k allocations: 3.055 MiB, 12.61% compilation time)
+  0.104652 seconds (14.80 k allocations: 1.081 MiB)
+  0.110588 seconds (28.94 k allocations: 2.138 MiB, 4.91% compilation time)
+  0.208937 seconds (47.53 k allocations: 2.932 MiB)
+  0.527545 seconds (79.35 k allocations: 4.384 MiB, 0.64% compilation time)
+```
+
+Whereas running the outer function that communicates a low occupancy (`outer_low_occupancy`) should run fully in parallel:
+
+```julia
+for N in [1, 2, 4, 8, 16]
+    @time fetch.([Dagger.@spawn outer_low_occupancy() for _ in 1:N])
+end
+```
+
+In comparison, the `outer_low_occupancy` snippet should show results like this:
+
+```text
+  0.120686 seconds (44.38 k allocations: 3.070 MiB, 13.00% compilation time)
+  0.105665 seconds (15.40 k allocations: 1.072 MiB)
+  0.107495 seconds (28.56 k allocations: 1.940 MiB)
+  0.109904 seconds (55.03 k allocations: 3.631 MiB)
+  0.117239 seconds (87.95 k allocations: 5.372 MiB)
+```
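+
+How much parallelism you actually observe in either case depends on how many threads the Julia session was started with; you can check this with plain Base Julia:
+
+```julia
+# Start Julia with e.g. `julia -t 8` to make 8 threads available to Dagger
+Threads.nthreads()
+```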