10 | 10 | * DOC: teo-description |
11 | 11 | * |
12 | 12 | * The idea of this governor is based on the observation that on many systems |
13 | | - * timer events are two or more orders of magnitude more frequent than any |
14 | | - * other interrupts, so they are likely to be the most significant cause of CPU |
15 | | - * wakeups from idle states. Moreover, information about what happened in the |
16 | | - * (relatively recent) past can be used to estimate whether or not the deepest |
17 | | - * idle state with target residency within the (known) time till the closest |
18 | | - * timer event, referred to as the sleep length, is likely to be suitable for |
19 | | - * the upcoming CPU idle period and, if not, then which of the shallower idle |
20 | | - * states to choose instead of it. |
| 13 | + * timer interrupts are two or more orders of magnitude more frequent than any |
| 14 | + * other interrupt types, so they are likely to dominate CPU wakeup patterns. |
| 15 | + * Moreover, in principle, the time when the next timer event is going to occur |
| 16 | + * can be determined at the idle state selection time, although doing that may |
| 17 | + * be costly, so it can be regarded as the most reliable source of information |
| 18 | + * for idle state selection. |
21 | 19 | * |
22 | | - * Of course, non-timer wakeup sources are more important in some use cases |
23 | | - * which can be covered by taking a few most recent idle time intervals of the |
24 | | - * CPU into account. However, even in that context it is not necessary to |
25 | | - * consider idle duration values greater than the sleep length, because the |
26 | | - * closest timer will ultimately wake up the CPU anyway unless it is woken up |
27 | | - * earlier. |
| 20 | + * Of course, non-timer wakeup sources are more important in some use cases, |
| 21 | + * but even then it is generally unnecessary to consider idle duration values |
| 22 | + * greater than the time till the next timer event, referred to as the sleep |
| 23 | + * length in what follows, because the closest timer will ultimately wake up the |
| 24 | + * CPU anyway unless it is woken up earlier. |
28 | 25 | * |
29 | | - * Thus this governor estimates whether or not the prospective idle duration of |
30 | | - * a CPU is likely to be significantly shorter than the sleep length and selects |
31 | | - * an idle state for it accordingly. |
| 26 | + * However, since obtaining the sleep length may be costly, the governor first |
| 27 | + * checks if it can select a shallow idle state using wakeup pattern information |
| 28 | + * from recent times, in which case it can do without knowing the sleep length |
| 29 | + * at all. For this purpose, it counts CPU wakeup events and looks for an idle |
| 30 | + * state whose target residency has not exceeded the idle duration (measured |
| 31 | + * after wakeup) in the majority of relevant recent cases. If the target |
| 32 | + * residency of that state is small enough, it may be used right away and the |
| 33 | + * sleep length need not be determined. |
32 | 34 | * |
33 | 35 | * The computations carried out by this governor are based on using bins whose |
34 | 36 | * boundaries are aligned with the target residency parameter values of the CPU |
|
39 | 41 | * idle state 2, the third bin spans from the target residency of idle state 2 |
40 | 42 | * up to, but not including, the target residency of idle state 3 and so on. |
41 | 43 | * The last bin spans from the target residency of the deepest idle state |
42 | | - * supplied by the driver to infinity. |
| 44 | + * supplied by the driver to the scheduler tick period length or to infinity if |
| 45 | + * the tick period length is less than the target residency of that state. In |
| 46 | + * the latter case, the governor also counts events with the measured idle |
| 47 | + * duration between the tick period length and the target residency of the |
| 48 | + * deepest idle state. |
43 | 49 | * |
44 | 50 | * Two metrics called "hits" and "intercepts" are associated with each bin. |
45 | 51 | * They are updated every time before selecting an idle state for the given CPU |
|
49 | 55 | * sleep length and the idle duration measured after CPU wakeup fall into the |
50 | 56 | * same bin (that is, the CPU appears to wake up "on time" relative to the sleep |
51 | 57 | * length). In turn, the "intercepts" metric reflects the relative frequency of |
52 | | - * situations in which the measured idle duration is so much shorter than the |
53 | | - * sleep length that the bin it falls into corresponds to an idle state |
54 | | - * shallower than the one whose bin is fallen into by the sleep length (these |
55 | | - * situations are referred to as "intercepts" below). |
| 58 | + * non-timer wakeup events for which the measured idle duration falls into a bin |
| 59 | + * that corresponds to an idle state shallower than the one whose bin is fallen |
| 60 | + * into by the sleep length (these events are also referred to as "intercepts" |
| 61 | + * below). |
56 | 62 | * |
57 | 63 | * In order to select an idle state for a CPU, the governor takes the following |
58 | 64 | * steps (modulo the possible latency constraint that must be taken into account |
59 | 65 | * too): |
60 | 66 | * |
61 | | - * 1. Find the deepest CPU idle state whose target residency does not exceed |
62 | | - * the current sleep length (the candidate idle state) and compute 2 sums as |
63 | | - * follows: |
| 67 | + * 1. Find the deepest enabled CPU idle state (the candidate idle state) and |
| 68 | + * compute 2 sums as follows: |
64 | 69 | * |
65 | | - * - The sum of the "hits" and "intercepts" metrics for the candidate state |
66 | | - * and all of the deeper idle states (it represents the cases in which the |
67 | | - * CPU was idle long enough to avoid being intercepted if the sleep length |
68 | | - * had been equal to the current one). |
| 70 | + * - The sum of the "hits" metric for all of the idle states shallower than |
| 71 | + * the candidate one (it represents the cases in which the CPU was likely |
| 72 | + * woken up by a timer). |
69 | 73 | * |
70 | | - * - The sum of the "intercepts" metrics for all of the idle states shallower |
71 | | - * than the candidate one (it represents the cases in which the CPU was not |
72 | | - * idle long enough to avoid being intercepted if the sleep length had been |
73 | | - * equal to the current one). |
| 74 | + * - The sum of the "intercepts" metric for all of the idle states shallower |
| 75 | + * than the candidate one (it represents the cases in which the CPU was |
| 76 | + * likely woken up by a non-timer wakeup source). |
74 | 77 | * |
75 | | - * 2. If the second sum is greater than the first one the CPU is likely to wake |
76 | | - * up early, so look for an alternative idle state to select. |
| 78 | + * 2. If the second sum computed in step 1 is greater than a half of the sum of |
| 79 | + * both metrics for the candidate state bin and all subsequent bins (if any), |
| 80 | + * a shallower idle state is likely to be more suitable, so look for it. |
77 | 81 | * |
78 | | - * - Traverse the idle states shallower than the candidate one in the |
| 82 | + * - Traverse the enabled idle states shallower than the candidate one in the |
79 | 83 | * descending order. |
80 | 84 | * |
81 | 85 | * - For each of them compute the sum of the "intercepts" metrics over all |
82 | 86 | * of the idle states between it and the candidate one (including the |
83 | 87 | * former and excluding the latter). |
84 | 88 | * |
85 | | - * - If each of these sums that needs to be taken into account (because the |
86 | | - * check related to it has indicated that the CPU is likely to wake up |
87 | | - * early) is greater than a half of the corresponding sum computed in step |
88 | | - * 1 (which means that the target residency of the state in question had |
89 | | - * not exceeded the idle duration in over a half of the relevant cases), |
90 | | - * select the given idle state instead of the candidate one. |
| 89 | + * - If this sum is greater than a half of the second sum computed in step 1, |
| 90 | + * use the given idle state as the new candidate one. |
91 | 91 | * |
92 | | - * 3. By default, select the candidate state. |
| 92 | + * 3. If the current candidate state is state 0 or its target residency is short |
| 93 | + * enough, return it and prevent the scheduler tick from being stopped. |
| 94 | + * |
| 95 | + * 4. Obtain the sleep length value and check if it is below the target |
| 96 | + * residency of the current candidate state, in which case a new shallower |
| 97 | + * candidate state needs to be found, so look for it. |
93 | 98 | */ |
94 | 99 |
|
95 | 100 | #include <linux/cpuidle.h> |
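
The bin lookup and steps 1 and 2 of the new description can be illustrated with a minimal, self-contained sketch. It is not the governor's code: `struct bin`, `bin_index()` and `pick_candidate()` are hypothetical names, every idle state is treated as enabled, and the latency constraint, the tick handling in step 3 and the sleep length check in step 4 are left out. The "hits" sum from step 1 is not computed because the comparison in step 2, as worded in the comment, does not use it.

```c
#include <stddef.h>

/* Hypothetical per-bin counters; the names are illustrative only. */
struct bin {
	unsigned int hits;
	unsigned int intercepts;
};

/*
 * Return the index of the bin a duration falls into, that is, the deepest
 * state whose target residency does not exceed the duration.
 * Assumption: residency[] is sorted ascending and residency[0] is 0.
 */
static size_t bin_index(const unsigned int *residency, size_t n,
			unsigned int duration)
{
	size_t i, idx = 0;

	for (i = 1; i < n; i++) {
		if (residency[i] > duration)
			break;
		idx = i;
	}
	return idx;
}

/*
 * Steps 1 and 2 only. Assumption: n >= 1 and all states are enabled, so
 * the initial candidate is the deepest state and has no subsequent bins.
 */
static size_t pick_candidate(const struct bin *bins, size_t n)
{
	size_t cand = n - 1;
	unsigned int intercept_sum = 0, cand_sum, sum = 0;
	size_t i;

	/* Step 1: "intercepts" of all states shallower than the candidate. */
	for (i = 0; i < cand; i++)
		intercept_sum += bins[i].intercepts;

	/* Both metrics for the candidate bin (there are no deeper bins). */
	cand_sum = bins[cand].hits + bins[cand].intercepts;

	/* Step 2: keep the candidate unless intercepts exceed half of that. */
	if (2 * intercept_sum <= cand_sum)
		return cand;

	/* Walk the shallower states in descending order. */
	for (i = cand; i-- > 0; ) {
		sum += bins[i].intercepts;
		if (2 * sum > intercept_sum)
			return i;
	}
	return cand;
}
```

With target residencies of 0, 10, 100 and 1000 time units, a measured duration of 99 lands in bin 1. If most recent intercepts fall into bin 2, `pick_candidate()` moves from state 3 to state 2 rather than all the way down to state 0.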
|