Add task affinity support with compute_scope and result_scope in Dagger.jl's `@spawn` macro
- Enhanced the Thunk struct to include compute_scope and result_scope for better task execution control.
- Updated the Thunk constructor to accept new scope parameters.
- Modified the spawn function to handle the new scope parameters appropriately.
- Introduced a new test suite for task affinity, covering various scenarios with scope interactions.
- Added comprehensive documentation for task affinity, detailing the usage of scope, compute_scope, and result_scope.
- Implemented tests to validate behavior when using chunks as inputs in tasks, ensuring correct scope handling.
- Updated documentation for the `@spawn` macro to clarify the usage of `scope`, `compute_scope`, and `result_scope`, including examples with the new syntax.
- Improved error messages in the scheduling logic to provide clearer feedback when scopes are incompatible.
- Refactored test cases for task affinity to ensure they align with the new scope handling and provide better coverage for edge cases.
- Removed deprecated comments and cleaned up the code for better readability.
Dagger allows for precise control over task placement and result availability using scopes. Tasks are assigned based on a combination of scopes: `scope`/`compute_scope` and `result_scope` (all of which can be specified with `@spawn`), plus the scopes of any arguments to the task (in the form of a scope attached to a `Chunk` argument). Let's take a look at how to configure these scopes, and how they work together to direct task placement.

For more information on how scopes work, see [Scopes](@ref).

---

## Task Scopes

### Scope

`scope` defines the general set of locations where a Dagger task can execute. If `scope` is not specified, the task falls back to `DefaultScope()`, allowing it to run wherever execution is possible. Execution occurs on any worker within the defined scope.

**Example:**
```julia
g = Dagger.@spawn scope=Dagger.scope(worker=3) f(x,y)
```
Task `g` executes only on worker 3. Its result can be accessed by any worker.

---

### Compute Scope

Like `scope`, `compute_scope` also specifies where a Dagger task can execute. The key difference is that if both `compute_scope` and `scope` are provided, `compute_scope` takes precedence over `scope` for execution placement. If neither is specified, the effective compute scope defaults to `DefaultScope()`.
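
A minimal sketch of two such tasks, consistent with the description that follows. It assumes a fresh session with three extra workers (ids 2, 3 and 4), each started with two threads. The stand-in function `f`, the input values, and the `scope` value passed to `g1` are assumptions chosen only for illustration.

```julia
using Distributed

# Assumption: a fresh session, so the new workers get ids 2, 3 and 4,
# each with two threads.
addprocs(3; exeflags="--threads=2")
@everywhere using Dagger

# Hypothetical stand-in for the task function.
@everywhere f(x, y) = x + y

x, y = 1, 2

# Both `scope` and `compute_scope` are given: `compute_scope` decides placement.
g1 = Dagger.@spawn scope=Dagger.scope(worker=2, thread=2) compute_scope=Dagger.scope((worker=1, thread=2), (worker=3, thread=1)) f(x, y)

# Only `compute_scope` is given.
g2 = Dagger.@spawn compute_scope=Dagger.scope((worker=1, thread=2), (worker=3, thread=1)) f(x, y)

# Neither task sets `result_scope`, so the results can be fetched from here.
results = (fetch(g1), fetch(g2))
```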
Tasks `g1` and `g2` execute on either thread 2 of worker 1, or thread 1 of worker 3. The `scope` argument to `g1` is ignored. Their results can be accessed by any worker.

---

### Result Scope

`result_scope` limits the processors from which a task's result can be accessed. This can be useful for managing data locality and minimizing transfers. If `result_scope` is not specified, it defaults to `AnyScope()`, meaning the result can be accessed by any processor (including those not enabled by default for task execution, such as GPUs).

**Example:**
```julia
g = Dagger.@spawn result_scope=Dagger.scope(worker=3, threads=[1, 3, 4]) f(x,y)
```
The result of `g` is accessible only from threads 1, 3 and 4 of worker process 3. The task's execution may happen anywhere on threads 1, 3 and 4 of worker 3.

---

## Interaction of `compute_scope` and `result_scope`

When both `scope`/`compute_scope` and `result_scope` are specified, the scheduler executes the task on the intersection of the effective compute scope (which will be `compute_scope` if provided, otherwise `scope`) and the `result_scope`. If the intersection is empty, then the scheduler throws a `Dagger.Sch.SchedulingException` error.

**Example:**
```julia
g = Dagger.@spawn scope=Dagger.scope(worker=3,thread=2) compute_scope=Dagger.scope(worker=2) result_scope=Dagger.scope((worker=2, thread=2), (worker=4, thread=2)) f(x,y)
```
The task `g` computes on thread 2 of worker 2 (as it's the intersection of the compute and result scopes), but accessing its result is restricted to thread 2 of worker 2 and thread 2 of worker 4.

---

## Function as a Chunk

This section explains how `scope`/`compute_scope` and `result_scope` affect tasks when a `Chunk` is used to specify the function to be executed by `@spawn` (e.g. created via `Dagger.tochunk(...)` or by calling `fetch(task; raw=true)` on a task). Using a `Chunk` to specify the function to be executed may seem strange, but it can be useful when working with callable structs, such as closures or Flux.jl models.

Assume `g` is some function, e.g. `g(x, y) = x * 2 + y * 3`, and `chunk_scope` is its defined affinity.

When `Dagger.tochunk(...)` is used to pass a `Chunk` as the function to be executed by `@spawn`:
- The result is accessible only on processors in `chunk_scope`.
- Dagger validates that there is an intersection between `chunk_scope`, the effective `compute_scope` (derived from `@spawn`'s `compute_scope` or `scope`), and the `result_scope`. If no intersection exists, the scheduler throws an exception.

!!! info
    While `chunk_proc` is currently required when constructing a chunk, it is only used to pick the most optimal processor for accessing the chunk; it does not affect which set of processors the task may execute on.

**Usage:**
```julia
chunk_scope = Dagger.scope(worker=3)
chunk_proc = Dagger.OSProc(3) # not important, just needs to be a valid processor
# ...
```
In all these cases (`h1` through `h5`), the task is executed on any processor within `chunk_scope`, and its result is accessible only within `chunk_scope`.

---

## Chunk arguments

This section details behavior when some or all of a task's arguments are `Chunk`s.

Assume `g(x, y) = x * 2 + y * 3`, and `arg = Dagger.tochunk(g(1, 2), arg_proc, arg_scope)`, where `arg_scope` is the argument's defined scope. Assume `arg_scope = Dagger.scope(worker=2)`.

### Scope
If `arg_scope` and `scope` do not intersect, the scheduler throws an exception. Otherwise, execution occurs on the intersection of `scope` and `arg_scope`.

```julia
h = Dagger.@spawn scope=Dagger.scope(worker=2) g(arg, 11)
```
Task `h` executes on any processor within the intersection of `scope` and `arg_scope`. The result is accessible from any processor.

---

### Compute scope and Chunk argument scopes interaction
If `arg_scope` and `compute_scope` do not intersect, the scheduler throws an exception. Otherwise, execution happens on the intersection of the effective compute scope (which will be `compute_scope` if provided, otherwise `scope`) and `arg_scope`.
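
A minimal sketch of this interaction, assuming a fresh session with three extra workers (ids 2, 3 and 4) started with two threads each. The choice of `arg_proc` and the `scope` value passed to `h1` are assumptions for illustration; `arg_proc` mirrors the way `chunk_proc` is used earlier in this document.

```julia
using Distributed

# Assumption: a fresh session, so the new workers get ids 2, 3 and 4.
addprocs(3; exeflags="--threads=2")
@everywhere using Dagger

@everywhere g(x, y) = x * 2 + y * 3

arg_scope = Dagger.scope(worker=2)
arg_proc = Dagger.OSProc(2)  # assumption: only needs to be a valid processor
arg = Dagger.tochunk(g(1, 2), arg_proc, arg_scope)

# `scope` and `compute_scope` both given: placement follows `compute_scope`,
# intersected with `arg_scope`.
h1 = Dagger.@spawn scope=Dagger.scope(worker=3) compute_scope=Dagger.scope(worker=2) g(arg, 11)

# Only `compute_scope` given.
h2 = Dagger.@spawn compute_scope=Dagger.scope(worker=2) g(arg, 11)

# No `result_scope` is set, so the results can be fetched from here.
results = (fetch(h1), fetch(h2))
```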
Tasks `h1` and `h2` execute on any processor within the intersection of `compute_scope` and `arg_scope`. `scope` is ignored if `compute_scope` is specified. The result is accessible from any processor.

---

### Result scope and Chunk argument scopes interaction
If only `result_scope` is specified, computation happens on any processor within the intersection of `arg_scope` and `result_scope`, and the result is only accessible within `result_scope`.

```julia
h = Dagger.@spawn result_scope=Dagger.scope(worker=2) g(arg, 11)
```
Task `h` executes on any processor within the intersection of `arg_scope` and `result_scope`. The result is accessible only from within `result_scope`.

---

### Compute, result, and Chunk argument scopes interaction
When `scope`/`compute_scope`, `result_scope`, and `Chunk` argument scopes are all used, the scheduler executes the task on the intersection of `arg_scope`, the effective compute scope (which is `compute_scope` if provided, otherwise `scope`), and `result_scope`. If no intersection exists, the scheduler throws an exception.
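
A minimal sketch combining all three kinds of scope, consistent with the description that follows. It assumes a fresh session with three extra workers (ids 2, 3 and 4), each started with two threads so that thread 2 exists on workers 2 and 4. The choice of `arg_proc` is an assumption for illustration.

```julia
using Distributed

# Assumption: a fresh session, so the new workers get ids 2, 3 and 4,
# each with two threads.
addprocs(3; exeflags="--threads=2")
@everywhere using Dagger

@everywhere g(x, y) = x * 2 + y * 3

arg_scope = Dagger.scope(worker=2)
arg_proc = Dagger.OSProc(2)  # assumption: only needs to be a valid processor
arg = Dagger.tochunk(g(1, 2), arg_proc, arg_scope)

# Placement is the intersection of `arg_scope` (worker 2), `compute_scope`
# (worker 2) and `result_scope` (thread 2 of worker 2 or of worker 4).
h = Dagger.@spawn compute_scope=Dagger.scope(worker=2) result_scope=Dagger.scope((worker=2, thread=2), (worker=4, thread=2)) g(arg, 11)

# Block until the task has finished.
wait(h)
```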
Task `h` computes on thread 2 of worker 2 (as it's the intersection of `arg_scope`, `compute_scope`, and `result_scope`), and its result access is restricted to thread 2 of worker 2 or thread 2 of worker 4.