@@ -10,30 +10,33 @@ import kotlin.jvm.internal.Ref.ObjectRef
import kotlin.math.*
/**
- * Coroutine scheduler (pool of shared threads) which primary target is to distribute dispatched coroutines
- * over worker threads, including both CPU-intensive and blocking tasks, in the most efficient manner.
+ * Coroutine scheduler (pool of shared threads) with a primary target to distribute dispatched coroutines
+ * over worker threads, including both CPU-intensive and potentially blocking tasks, in the most efficient manner.
*
- * Current scheduler implementation has two optimization targets:
- * - Efficiency in the face of communication patterns (e.g. actors communicating via channel)
- * - Dynamic resizing to support blocking calls without re-dispatching coroutine to separate "blocking" thread pool.
+ * The current scheduler implementation has two optimization targets:
+ * - Efficiency in the face of communication patterns (e.g. actors communicating via channel).
+ * - Dynamic thread state and resizing to schedule blocking calls without re-dispatching a coroutine to a separate "blocking" thread pool.
*
* ### Structural overview
*
- * Scheduler consists of [corePoolSize] worker threads to execute CPU-bound tasks and up to
- * [maxPoolSize] lazily created threads to execute blocking tasks.
- * Every worker has a local queue in addition to a global scheduler queue
- * and the global queue has priority over local queue to avoid starvation of externally-submitted
- * (e.g. from Android UI thread) tasks.
- * Work-stealing is implemented on top of that queues to provide
- * even load distribution and illusion of centralized run queue.
+ * The scheduler consists of [corePoolSize] worker threads to execute CPU-bound tasks and up to
+ * [maxPoolSize] lazily created threads to execute blocking tasks.
+ * The scheduler has two global queues -- one for CPU tasks and one for blocking tasks.
+ * These queues are used for tasks that are submitted externally (from threads not belonging to the scheduler)
+ * and as overflow buffers for the thread-local queues.
+ *
+ * Every worker has a local queue in addition to the global scheduler queues.
+ * The queue to pick the next task from is selected randomly, so that neither locally
+ * nor globally submitted tasks are starved.
+ * Work-stealing is implemented on top of these queues to provide even load distribution and an illusion of a centralized run queue.
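The random choice between the local and the global queue could be sketched roughly as follows (a simplified, single-threaded model; `nextTask` and the plain `ArrayDeque`s are illustrative, not the scheduler's actual lock-free structures, and the real scheduler polls the global queue with a tuned probability rather than a fair coin flip):

```kotlin
import kotlin.random.Random

// Simplified model of task selection: the worker randomly decides whether to
// check the global queue or its local queue first, so neither source starves.
// Plain ArrayDeques stand in for the real lock-free queues.
fun nextTask(local: ArrayDeque<Int>, global: ArrayDeque<Int>, random: Random): Int? {
    val globalFirst = random.nextBoolean()
    val first = if (globalFirst) global else local
    val second = if (globalFirst) local else global
    // Whatever queue is checked first, fall back to the other one,
    // so a task is never missed when one queue is empty.
    return first.removeFirstOrNull() ?: second.removeFirstOrNull()
}
```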
*
* ### Scheduling policy
*
* When a coroutine is dispatched from within a scheduler worker, it's placed into the head of the worker's run queue.
* If the head is not empty, the task from the head is moved to the tail. Though it is an unfair scheduling policy,
* it effectively couples communicating coroutines into one and eliminates the scheduling latency
* that arises from placing tasks at the end of the queue.
- * Placing former head to the tail is necessary to provide semi-FIFO order, otherwise, queue degenerates to stack.
+ * Placing the former head to the tail is necessary to provide semi-FIFO order; otherwise, the queue degenerates into a stack.
* When a coroutine is dispatched from an external thread, it's put into the global queue.
* The original idea with a single-slot LIFO buffer comes from the Golang runtime scheduler by D. Vyukov.
* It was proven to be "fair enough", performant and generally well accepted and initially was a significant inspiration
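The head-slot policy above can be illustrated with a tiny single-threaded model (`Task`, `LocalQueue`, and `lastScheduled` are illustrative names; the real run queue is a lock-free array with a single-slot buffer):

```kotlin
data class Task(val id: Int)

// Single-threaded sketch of a worker's local queue with a single-slot
// LIFO buffer at its head and a FIFO tail behind it.
class LocalQueue {
    private var lastScheduled: Task? = null  // single-slot LIFO head
    private val buffer = ArrayDeque<Task>()  // FIFO tail

    // A coroutine dispatched from within the worker goes into the head slot;
    // the previous head is moved to the tail, preserving semi-FIFO order
    // instead of degenerating into a pure stack.
    fun add(task: Task) {
        val previous = lastScheduled
        lastScheduled = task
        if (previous != null) buffer.addLast(previous)
    }

    // The head slot is always drained before the FIFO tail.
    fun poll(): Task? {
        val head = lastScheduled
        if (head != null) {
            lastScheduled = null
            return head
        }
        return buffer.removeFirstOrNull()
    }
}
```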
@@ -45,39 +48,41 @@ import kotlin.math.*
* before parking when its local queue is empty.
* A non-standard solution is implemented to provide task affinity: a task from the FIFO buffer may be stolen
* only if it is stale enough based on the value of [WORK_STEALING_TIME_RESOLUTION_NS].
- * For this purpose, monotonic global clock is used, and every task has associated with its submission time.
+ * For this purpose, a monotonic global clock is used, and every task has a submission time associated with it.
* This approach shows outstanding results when coroutines are cooperative,
- * but as downside scheduler now depends on a high-resolution global clock,
- * which may limit scalability on NUMA machines. Tasks from LIFO buffer can be stolen on a regular basis.
+ * but as a downside, the scheduler now depends on a high-resolution global clock,
+ * which may limit scalability on NUMA machines.
*
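The staleness rule amounts to a simple time comparison against the monotonic clock. A minimal sketch, assuming hypothetical names (`TimedTask`, `canBeStolen`) and an example resolution value rather than the real tuned constant:

```kotlin
// Example resolution; the actual [WORK_STEALING_TIME_RESOLUTION_NS] is a
// tuned constant, not this value.
const val STEALING_RESOLUTION_NS = 100_000L

class TimedTask(val submissionTimeNs: Long)

// A thief may take a task from another worker's FIFO buffer only once the
// task is stale enough relative to the monotonic clock (System.nanoTime()).
fun canBeStolen(task: TimedTask, nowNs: Long): Boolean =
    nowNs - task.submissionTimeNs >= STEALING_RESOLUTION_NS
```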
* ### Thread management
- * One of the hardest parts of the scheduler is decentralized management of the threads with the progress guarantees
+ *
+ * One of the hardest parts of the scheduler is decentralized management of the threads with progress guarantees
* similar to the regular centralized executors.
* The state of the threads consists of [controlState] and [parkedWorkersStack] fields.
- * The former field incorporates the amount of created threads, CPU-tokens and blocking tasks
- * that require a thread compensation,
- * while the latter represents intrusive versioned Treiber stack of idle workers.
- * When a worker cannot find any work, they first add themselves to the stack,
+ * The former field incorporates the number of created threads, CPU-tokens and blocking tasks
+ * that require thread compensation,
+ * while the latter represents an intrusive versioned Treiber stack of idle workers.
+ * When a worker cannot find any work, it first adds itself to the stack,
* then re-scans the queue to avoid missing signals and then attempts to park
- * with additional rendezvous against unnecessary parking.
+ * with an additional rendezvous against unnecessary parking.
* If a worker finds a task that it cannot yet steal due to time constraints, it stores this fact in its state
* (to be uncounted when additional work is signalled) and parks for that duration.
*
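For readers unfamiliar with the data structure, the lock-free push/pop shape of a Treiber stack looks like this (a simplified, non-intrusive, non-versioned sketch; the real [parkedWorkersStack] is intrusive and versioned to avoid ABA problems):

```kotlin
import java.util.concurrent.atomic.AtomicReference

// Simplified Treiber stack: a lock-free stack built on a single CAS'd
// top pointer. Idle workers would push themselves here before parking.
class TreiberStack<T> {
    private class Node<E>(val value: E, val next: Node<E>?)

    private val top = AtomicReference<Node<T>?>(null)

    fun push(value: T) {
        while (true) {
            val current = top.get()
            // Retry until the new node is installed atomically on top.
            if (top.compareAndSet(current, Node(value, current))) return
        }
    }

    fun pop(): T? {
        while (true) {
            val current = top.get() ?: return null // empty stack
            // Retry until the top node is unlinked atomically.
            if (top.compareAndSet(current, current.next)) return current.value
        }
    }
}
```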
- * When a new task arrives in the scheduler (whether it is local or global queue),
+ * When a new task arrives in the scheduler (whether to a local or a global queue),
* either an idle worker is signalled, or the scheduler attempts to create a new worker.
* (Only [corePoolSize] workers can be created for regular CPU tasks.)
*
* ### Support for blocking tasks
+ *
* The scheduler also supports the notion of [blocking][TASK_PROBABLY_BLOCKING] tasks.
- * When executing or enqueuing blocking tasks, the scheduler notifies or creates one more worker in
- * addition to core pool size, so at any given moment, it has [corePoolSize] threads (potentially not yet created)
- * to serve CPU-bound tasks. To properly guarantee liveness, the scheduler maintains
- * "CPU permits" -- [corePoolSize] special tokens that permit an arbitrary worker to execute and steal CPU-bound tasks.
- * When worker encounters blocking tasks, it basically hands off its permit to another thread (not directly though) to
- * keep invariant "scheduler always has at least min(pending CPU tasks, core pool size)
+ * When executing or enqueuing blocking tasks, the scheduler notifies or creates an additional worker in
+ * addition to the core pool size, so at any given moment, it has [corePoolSize] threads (potentially not yet created)
+ * available to serve CPU-bound tasks. To properly guarantee liveness, the scheduler maintains
+ * "CPU permits" -- [corePoolSize] special tokens that allow an arbitrary worker to execute and steal CPU-bound tasks.
+ * When a worker encounters a blocking task, it releases its permit to the scheduler to
+ * keep the invariant "the scheduler always has at least min(pending CPU tasks, core pool size)
* and at most core pool size threads to execute CPU tasks".
* To avoid overprovision, workers without a CPU permit are allowed to scan [globalBlockingQueue]
- * and steal **only** blocking tasks from other workers.
+ * and steal **only** blocking tasks from other workers, which imposes non-trivial complexity on the queue management.
*
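The CPU-permit idea can be sketched with a plain `java.util.concurrent.Semaphore` (an illustrative model only; the real scheduler packs the permit count into the [controlState] atomic field rather than using a semaphore):

```kotlin
import java.util.concurrent.Semaphore

// Sketch of "CPU permits": at most corePoolSize workers may run CPU-bound
// tasks at once. A worker entering a blocking call releases its permit so
// another thread can pick it up and keep serving CPU tasks.
class CpuPermits(corePoolSize: Int) {
    private val permits = Semaphore(corePoolSize)

    // A worker must hold a permit to execute or steal CPU-bound tasks.
    fun tryAcquireCpuPermit(): Boolean = permits.tryAcquire()

    // Called by a permit-holding worker before it runs a blocking task.
    fun releaseCpuPermit() = permits.release()
}
```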
82
87
* The scheduler does not limit the count of pending blocking tasks, potentially creating up to [maxPoolSize] threads.
83
88
* End users do not have access to the scheduler directly and can dispatch blocking tasks only with