
Commit d76343c

Valentin Schneider authored and Ingo Molnar committed
sched/fair: Align rq->avg_idle and rq->avg_scan_cost
sched/core.c uses update_avg() for rq->avg_idle and sched/fair.c uses an
open-coded version (with the exact same decay factor) for
rq->avg_scan_cost. On top of that, select_idle_cpu() expects to be able
to compare these two fields.

The only difference between the two is that rq->avg_scan_cost is computed
using a pure division rather than a shift. Turns out it actually matters,
first of all because the shifted value can be negative, and the standard
has this to say about it:

  """
  The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...]
  If E1 has a signed type and a negative value, the resulting value is
  implementation-defined.
  """

Not only this, but (arithmetic) right shifting a negative value (using
2's complement) is *not* equivalent to dividing it by the corresponding
power of 2. Let's look at a few examples:

  -4      -> 0xF..FC
  -4 >> 3 -> 0xF..FF == -1 != -4 / 8

  -8      -> 0xF..F8
  -8 >> 3 -> 0xF..FF == -1 == -8 / 8

  -9      -> 0xF..F7
  -9 >> 3 -> 0xF..FE == -2 != -9 / 8

Make update_avg() use a division, and export it to the private scheduler
header to reuse it where relevant. Note that this still lets compilers
use a shift here, but should prevent any unwanted surprise.

The disassembly of select_idle_cpu() remains unchanged on arm64, and
ttwu_do_wakeup() gains 2 instructions; the diff sort of looks like this:

  - sub	x1, x1, x0
  + subs	x1, x1, x0	// set condition codes
  + add	x0, x1, #0x7
  + csel	x0, x0, x1, mi	// x0 = x1 < 0 ? x0 : x1
    add	x0, x3, x0, asr #3

which does the right thing (i.e. gives us the expected result while
still using an arithmetic shift).

Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
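
For illustration only (not part of the commit): a minimal userspace C sketch
of the rounding difference described above, reproducing the -4/-8/-9 examples.
The program and its output formatting are mine, not the kernel's.

/*
 * Illustrative sketch: compare an arithmetic right shift by 3 with a
 * division by 8 on negative 64-bit values. On 2's-complement targets,
 * >> rounds toward negative infinity while / rounds toward zero, hence
 * the mismatch for -4 and -9.
 */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	int64_t samples[] = { -4, -8, -9 };

	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		int64_t v = samples[i];

		/*
		 * Per the C standard, >> on a negative signed value is
		 * implementation-defined; common compilers emit an
		 * arithmetic shift.
		 */
		printf("%3lld >> 3 = %3lld   %3lld / 8 = %3lld\n",
		       (long long)v, (long long)(v >> 3),
		       (long long)v, (long long)(v / 8));
	}
	return 0;
}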
1 parent f5e94d1 commit d76343c

File tree: 3 files changed, +8 additions, -11 deletions

kernel/sched/core.c

Lines changed: 0 additions & 6 deletions

@@ -2119,12 +2119,6 @@ int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags)
 	return cpu;
 }
 
-static void update_avg(u64 *avg, u64 sample)
-{
-	s64 diff = sample - *avg;
-	*avg += diff >> 3;
-}
-
 void sched_set_stop_task(int cpu, struct task_struct *stop)
 {
 	struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };

kernel/sched/fair.c

Lines changed: 2 additions & 5 deletions

@@ -6080,8 +6080,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
 	struct sched_domain *this_sd;
 	u64 avg_cost, avg_idle;
-	u64 time, cost;
-	s64 delta;
+	u64 time;
 	int this = smp_processor_id();
 	int cpu, nr = INT_MAX;
 
@@ -6119,9 +6118,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	}
 
 	time = cpu_clock(this) - time;
-	cost = this_sd->avg_scan_cost;
-	delta = (s64)(time - cost) / 8;
-	this_sd->avg_scan_cost += delta;
+	update_avg(&this_sd->avg_scan_cost, time);
 
 	return cpu;
 }

kernel/sched/sched.h

Lines changed: 6 additions & 0 deletions

@@ -195,6 +195,12 @@ static inline int task_has_dl_policy(struct task_struct *p)
 
 #define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
 
+static inline void update_avg(u64 *avg, u64 sample)
+{
+	s64 diff = sample - *avg;
+	*avg += diff / 8;
+}
+
 /*
  * !! For sched_setattr_nocheck() (kernel) only !!
  *
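
As extra context (my sketch, not from the commit): update_avg() is an
exponential moving average that folds in 1/8 of each sample's deviation
from the current average. A standalone userspace approximation, with
uint64_t/int64_t standing in for the kernel's u64/s64 and made-up sample
values:

#include <stdio.h>
#include <stdint.h>

/* Userspace mirror of the helper now shared via kernel/sched/sched.h. */
static inline void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = sample - *avg;	/* negative when sample < *avg */

	*avg += diff / 8;		/* division, not >> 3: negative
					 * deltas round toward zero */
}

int main(void)
{
	uint64_t avg_scan_cost = 1000;			/* hypothetical start, in ns */
	uint64_t samples[] = { 1200, 800, 950, 400 };	/* made-up scan costs */

	for (int i = 0; i < 4; i++) {
		update_avg(&avg_scan_cost, samples[i]);
		printf("sample=%4llu  avg=%4llu\n",
		       (unsigned long long)samples[i],
		       (unsigned long long)avg_scan_cost);
	}
	return 0;
}

With both rq->avg_idle and rq->avg_scan_cost fed through this one helper,
select_idle_cpu()'s comparison of the two fields sees values decayed by
the same rule.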
