
Commit 950b74d

Peter Zijlstra authored and willdeacon committed
arm64: perf: Implement correct cap_user_time
As reported by Leo, the existing implementation is broken when the clock and counter don't intersect at 0.

Use the sched_clock's struct clock_read_data information to correctly implement cap_user_time and cap_user_time_zero.

Note that the ARM64 counter is architecturally only guaranteed to be 56 bits wide (implementations are allowed to be wider) and the existing perf ABI cannot deal with wrap-around.

This implementation should also be faster than the old one, since we no longer need to recompute mult and shift all the time.

[leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow]

Reported-by: Leo Yan <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
1 parent aadd6e5 commit 950b74d
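
For reference, userspace is expected to consume the newly published time_zero/time_mult/time_shift fields roughly as below. This is a minimal sketch following the conversion documented for struct perf_event_mmap_page in include/uapi/linux/perf_event.h; the helper name cyc_to_sched_clock_ns() is illustrative and not part of this patch.

#include <stdint.h>

/*
 * Illustrative userspace helper (hypothetical name): convert a raw counter
 * value 'cyc' into sched_clock nanoseconds using the fields this patch
 * publishes in the mmap'ed perf_event_mmap_page, following the conversion
 * documented in include/uapi/linux/perf_event.h.
 */
static uint64_t cyc_to_sched_clock_ns(uint64_t cyc, uint32_t time_mult,
                                      uint16_t time_shift, uint64_t time_zero)
{
        /* Split the shift so the 64x32-bit product cannot overflow. */
        uint64_t quot = cyc >> time_shift;
        uint64_t rem  = cyc & (((uint64_t)1 << time_shift) - 1);

        return time_zero + quot * time_mult +
               ((rem * time_mult) >> time_shift);
}

Because the kernel now sets time_zero to rd->epoch_ns minus the nanosecond equivalent of rd->epoch_cyc, this recovers the sched_clock value even when the counter and the clock do not intersect at 0, which is exactly the case the old code got wrong.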

File tree

1 file changed, +29 -9 lines changed


arch/arm64/kernel/perf_event.c

Lines changed: 29 additions & 9 deletions

@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
+#include <linux/sched_clock.h>
 #include <linux/smp.h>
 
 /* ARMv8 Cortex-A53 specific event types. */
@@ -1168,28 +1169,47 @@ device_initcall(armv8_pmu_driver_init)
 void arch_perf_update_userpage(struct perf_event *event,
                                struct perf_event_mmap_page *userpg, u64 now)
 {
-        u32 freq;
-        u32 shift;
+        struct clock_read_data *rd;
+        unsigned int seq;
+        u64 ns;
 
         /*
          * Internal timekeeping for enabled/running/stopped times
          * is always computed with the sched_clock.
          */
-        freq = arch_timer_get_rate();
         userpg->cap_user_time = 1;
+        userpg->cap_user_time_zero = 1;
+
+        do {
+                rd = sched_clock_read_begin(&seq);
+
+                userpg->time_mult = rd->mult;
+                userpg->time_shift = rd->shift;
+                userpg->time_zero = rd->epoch_ns;
+
+                /*
+                 * This isn't strictly correct, the ARM64 counter can be
+                 * 'short' and then we get funnies when it wraps. The correct
+                 * thing would be to extend the perf ABI with a cycle and mask
+                 * value, but because wrapping on ARM64 is very rare in
+                 * practise this 'works'.
+                 */
+                ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
+                userpg->time_zero -= ns;
+
+        } while (sched_clock_read_retry(seq));
+
+        userpg->time_offset = userpg->time_zero - now;
 
-        clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
-                        NSEC_PER_SEC, 0);
         /*
          * time_shift is not expected to be greater than 31 due to
          * the original published conversion algorithm shifting a
          * 32-bit value (now specifies a 64-bit value) - refer
          * perf_event_mmap_page documentation in perf_event.h.
          */
-        if (shift == 32) {
-                shift = 31;
+        if (userpg->time_shift == 32) {
+                userpg->time_shift = 31;
                 userpg->time_mult >>= 1;
         }
-        userpg->time_shift = (u16)shift;
-        userpg->time_offset = -now;
+
 }
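
The [leoyan] note in the commit message refers to the cyc-to-ns conversion inside the loop above: with a counter that may be 56 bits wide (or wider), a plain 64-bit (epoch_cyc * mult) multiply can overflow before the shift. The sketch below shows the idea behind mul_u64_u32_shr(); the real helper lives in include/linux/math64.h with per-architecture implementations, so this standalone version is for illustration only and assumes shift <= 32.

#include <stdint.h>

/* Standalone sketch of (a * mult) >> shift without 64-bit overflow. */
static uint64_t mul_u64_u32_shr_sketch(uint64_t a, uint32_t mult,
                                       unsigned int shift)
{
        uint64_t lo = (uint32_t)a;      /* low 32 bits of the cycle count  */
        uint64_t hi = a >> 32;          /* high 32 bits of the cycle count */

        /*
         * ((hi << 32) * mult) >> shift is computed as (hi * mult) << (32 - shift),
         * so the intermediate products stay within 64 bits as long as the
         * final nanosecond result itself fits in 64 bits.
         */
        return ((lo * mult) >> shift) + ((hi * mult) << (32 - shift));
}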
