Commit 95f0912

ZhenguoYao1 authored and akpm00 committed
watchdog/softlockup: fix incorrect CPU utilization output during softlockup

Since we use 16-bit precision, the raw data will undergo integer
division, which may sometimes result in data loss. This can lead to
slightly inaccurate CPU utilization calculations. Under normal
circumstances, this isn't an issue. However, when CPU utilization
reaches 100%, the calculated result might exceed 100%. For example,
with raw data like the following:

    sample_period 400000134
    new_stat 83648414036
    old_stat 83247417494

    sample_period=400000134/2^24=23
    new_stat=83648414036/2^24=4985
    old_stat=83247417494/2^24=4961
    util=105%

Below log will output:

    CPU#3 Utilization every 0s during lockup:
        #1: 0% system, 0% softirq, 105% hardirq, 0% idle
        #2: 0% system, 0% softirq, 105% hardirq, 0% idle
        #3: 0% system, 0% softirq, 100% hardirq, 0% idle
        #4: 0% system, 0% softirq, 105% hardirq, 0% idle
        #5: 0% system, 0% softirq, 105% hardirq, 0% idle

To avoid confusion, we enforce a 100% display cap when calculations
exceed this threshold.

We also round to the nearest multiple of 16.8 milliseconds to improve
the accuracy.

[[email protected]: make get_16bit_precision() more accurate, fix comment layout]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhenguoYao <[email protected]>
Cc: Bitao Hu <[email protected]>
Cc: Li Huafei <[email protected]>
Cc: Max Kellermann <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
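
The arithmetic in the example above can be reproduced in user space. The following is a minimal sketch, not kernel code: `to_16bit_truncate()`, `to_16bit_round()` and `util_percent()` are hypothetical names chosen for illustration, and `DIV_ROUND_UP()` is redefined locally to mirror the kernel macro of the same name.

```c
#include <stdint.h>

/* Local stand-in for the kernel's DIV_ROUND_UP(): integer division, rounding up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Old behavior: drop the low 24 bits, i.e. truncate to units of 2^24 ns. */
static uint16_t to_16bit_truncate(uint64_t data_ns)
{
	return (uint16_t)(data_ns >> 24);
}

/* New behavior: add half a unit (2^23 ns) first, so the shift rounds to nearest. */
static uint16_t to_16bit_round(uint64_t data_ns)
{
	return (uint16_t)((data_ns + (1ULL << 23)) >> 24);
}

/*
 * Hypothetical helper: utilization in percent from two 16-bit samples and
 * the 16-bit sample period. The difference is taken modulo 2^16, like
 * unsigned 16-bit counters. Assumes period is nonzero.
 */
static unsigned int util_percent(uint16_t new_stat, uint16_t old_stat,
				 uint16_t period)
{
	unsigned int delta = (uint16_t)(new_stat - old_stat);

	return DIV_ROUND_UP(100u * delta, period);
}
```

With the raw values from the commit message, truncation gives 23, 4985 and 4961, so `util_percent(4985, 4961, 23)` is 105. Rounding gives 24, 4986 and 4962, so the same calculation yields 100.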
1 parent 41f88dd commit 95f0912

kernel/watchdog.c

Lines changed: 13 additions & 1 deletion
@@ -425,7 +425,11 @@ static DEFINE_PER_CPU(u8, cpustat_tail);
  */
 static u16 get_16bit_precision(u64 data_ns)
 {
-	return data_ns >> 24LL; /* 2^24ns ~= 16.8ms */
+	/*
+	 * 2^24ns ~= 16.8ms
+	 * Round to the nearest multiple of 16.8 milliseconds.
+	 */
+	return (data_ns + (1 << 23)) >> 24LL;
 }
 
 static void update_cpustat(void)
@@ -444,6 +448,14 @@ static void update_cpustat(void)
 		old_stat = __this_cpu_read(cpustat_old[i]);
 		new_stat = get_16bit_precision(cpustat[tracked_stats[i]]);
 		util = DIV_ROUND_UP(100 * (new_stat - old_stat), sample_period_16);
+		/*
+		 * Since we use 16-bit precision, the raw data will undergo
+		 * integer division, which may sometimes result in data loss,
+		 * and then result might exceed 100%. To avoid confusion,
+		 * we enforce a 100% display cap when calculations exceed this threshold.
+		 */
+		if (util > 100)
+			util = 100;
 		__this_cpu_write(cpustat_util[tail][i], util);
 		__this_cpu_write(cpustat_old[i], new_stat);
 	}
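
Rounding alone does not guarantee a result of at most 100%, which is presumably why the cap is kept alongside it. The user-space sketch below illustrates this with made-up input values (they are not taken from the commit): the sample period rounds down while the difference between the two samples rounds up. `round_to_16bit()`, `util_raw()` and `util_capped()` are hypothetical names, and `DIV_ROUND_UP()` is a local stand-in for the kernel macro.

```c
#include <stdint.h>

/* Local stand-in for the kernel's DIV_ROUND_UP(): integer division, rounding up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Round a nanosecond count to the nearest multiple of 2^24 ns (~16.8 ms). */
static uint16_t round_to_16bit(uint64_t data_ns)
{
	return (uint16_t)((data_ns + (1ULL << 23)) >> 24);
}

/*
 * Utilization in percent without the cap.
 * Assumes period_ns rounds to a nonzero 16-bit value.
 */
static unsigned int util_raw(uint64_t new_ns, uint64_t old_ns,
			     uint64_t period_ns)
{
	uint16_t new_stat = round_to_16bit(new_ns);
	uint16_t old_stat = round_to_16bit(old_ns);
	uint16_t period = round_to_16bit(period_ns);
	unsigned int delta = (uint16_t)(new_stat - old_stat);

	return DIV_ROUND_UP(100u * delta, period);
}

/* Same calculation with the 100% display cap applied afterwards. */
static unsigned int util_capped(uint64_t new_ns, uint64_t old_ns,
				uint64_t period_ns)
{
	unsigned int util = util_raw(new_ns, old_ns, period_ns);

	return util > 100 ? 100 : util;
}
```

For illustration, take `old_ns = 174483046`, `period_ns = 392586854` and `new_ns = old_ns + period_ns = 567069900`, i.e. a CPU that was busy for the whole period. The rounded values are 10, 23 and 34, so the uncapped result is `DIV_ROUND_UP(2400, 23)`, which is 105, while the capped result is 100.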
