
Commit 1b1cf8f

Merge tag 'x86-splitlock-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 splitlock updates from Ingo Molnar:

 - Add the "ratelimit:N" parameter to the split_lock_detect= boot
   option, to rate-limit the generation of bus-lock exceptions.

   This is both easier on system resources and kinder to offending
   applications than the current policy of outright killing them.

 - Document the split-lock detection feature and its parameters.

* tag 'x86-splitlock-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Add ratelimit in buslock.rst
  Documentation/admin-guide: Add bus lock ratelimit
  x86/bus_lock: Set rate limit for bus lock
  Documentation/x86: Add buslock.rst
2 parents 5f49832 + d28397e commit 1b1cf8f

4 files changed, +175 -2 lines changed


Documentation/admin-guide/kernel-parameters.txt

Lines changed: 8 additions & 0 deletions
@@ -5278,6 +5278,14 @@
 			exception. Default behavior is by #AC if
 			both features are enabled in hardware.
 
+			ratelimit:N -
+				  Set system wide rate limit to N bus locks
+				  per second for bus lock detection.
+				  0 < N <= 1000.
+
+				  N/A for split lock detection.
+
+
 			If an #AC exception is hit in the kernel or in
 			firmware (i.e. not while executing in user mode)
 			the kernel will oops in either "warn" or "fatal"

Documentation/x86/buslock.rst

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+===============================
+Bus lock detection and handling
+===============================
+
+:Copyright: |copy| 2021 Intel Corporation
+:Authors: - Fenghua Yu <[email protected]>
+          - Tony Luck <[email protected]>
+
+Problem
+=======
+
+A split lock is any atomic operation whose operand crosses two cache lines.
+Since the operand spans two cache lines and the operation must be atomic,
+the system locks the bus while the CPU accesses the two cache lines.
+
+A bus lock is acquired through either split locked access to writeback (WB)
+memory or any locked access to non-WB memory. This is typically thousands of
+cycles slower than an atomic operation within a cache line. It also disrupts
+performance on other cores and brings the whole system to its knees.
+
+Detection
+=========
+
+Intel processors may support either or both of the following hardware
+mechanisms to detect split locks and bus locks.
+
+#AC exception for split lock detection
+--------------------------------------
+
+Beginning with the Tremont Atom CPU, split lock operations may raise an
+Alignment Check (#AC) exception when a split lock operation is attempted.
+
+#DB exception for bus lock detection
+------------------------------------
+
+Some CPUs have the ability to notify the kernel by an #DB trap after a user
+instruction acquires a bus lock and is executed. This allows the kernel to
+terminate the application or to enforce throttling.
+
+Software handling
+=================
+
+The kernel #AC and #DB handlers handle bus lock based on the kernel
+parameter "split_lock_detect". Here is a summary of different options:
+
++------------------+----------------------------+-----------------------+
+|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
++------------------+----------------------------+-----------------------+
+|off               |Do nothing                  |Do nothing             |
++------------------+----------------------------+-----------------------+
+|warn              |Kernel OOPs                 |Warn once per task and |
+|(default)         |Warn once per task and      |continues to run.      |
+|                  |disable future checking     |                       |
+|                  |When both features are      |                       |
+|                  |supported, warn in #AC      |                       |
++------------------+----------------------------+-----------------------+
+|fatal             |Kernel OOPs                 |Send SIGBUS to user.   |
+|                  |Send SIGBUS to user         |                       |
+|                  |When both features are      |                       |
+|                  |supported, fatal in #AC     |                       |
++------------------+----------------------------+-----------------------+
+|ratelimit:N       |Do nothing                  |Limit bus lock rate to |
+|(0 < N <= 1000)   |                            |N bus locks per second |
+|                  |                            |system wide and warn on|
+|                  |                            |bus locks.             |
++------------------+----------------------------+-----------------------+
+
+Usages
+======
+
+Detecting and handling bus locks is useful in several areas:
+
+It is critical for real time system designers who build consolidated real
+time systems. These systems run hard real time code on some cores and run
+"untrusted" user processes on other cores. The hard real time code cannot
+afford to have any bus lock from the untrusted processes hurt real time
+performance. To date the designers have been unable to deploy these
+solutions as they have no way to prevent the "untrusted" user code from
+generating split locks and bus locks that block the hard real time code
+from accessing memory during bus locking.
+
+It's also useful for general computing to prevent guests or user
+applications from slowing down the overall system by executing instructions
+with bus lock.
+
+
+Guidance
+========
+off
+---
+
+Disable checking for split lock and bus lock. This option can be useful if
+there are legacy applications that trigger these events at a low rate so
+that mitigation is not needed.
+
+warn
+----
+
+A warning is emitted when a bus lock is detected, which makes it possible to
+identify the offending application. This is the default behavior.
+
+fatal
+-----
+
+In this case, the bus lock is not tolerated and the process is killed.
+
+ratelimit
+---------
+
+A system wide bus lock rate limit N is specified where 0 < N <= 1000. This
+allows a bus lock rate up to N bus locks per second. When the bus lock rate
+is exceeded then any task which is caught via the buslock #DB exception is
+throttled by enforced sleeps until the rate goes under the limit again.
+
+This is an effective mitigation in cases where a minimal impact can be
+tolerated, but an eventual Denial of Service attack has to be prevented. It
+makes it possible to identify the offending processes and analyze whether
+they are malicious or just badly written.
+
+Selecting a rate limit of 1000 allows the bus to be locked for up to about
+seven million cycles each second (assuming 7000 cycles for each bus
+lock). On a 2 GHz processor that would be about 0.35% system slowdown.

Documentation/x86/index.rst

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ x86-specific Documentation
    microcode
    resctrl
    tsx_async_abort
+   buslock
    usb-legacy-support
    i386/index
    x86_64/index

arch/x86/kernel/cpu/intel.c

Lines changed: 40 additions & 2 deletions
@@ -10,6 +10,7 @@
 #include <linux/thread_info.h>
 #include <linux/init.h>
 #include <linux/uaccess.h>
+#include <linux/delay.h>
 
 #include <asm/cpufeature.h>
 #include <asm/msr.h>
@@ -41,6 +42,7 @@ enum split_lock_detect_state {
 	sld_off = 0,
 	sld_warn,
 	sld_fatal,
+	sld_ratelimit,
 };
 
 /*
@@ -999,13 +1001,30 @@ static const struct {
 	{ "off", sld_off },
 	{ "warn", sld_warn },
 	{ "fatal", sld_fatal },
+	{ "ratelimit:", sld_ratelimit },
 };
 
+static struct ratelimit_state bld_ratelimit;
+
 static inline bool match_option(const char *arg, int arglen, const char *opt)
 {
-	int len = strlen(opt);
+	int len = strlen(opt), ratelimit;
+
+	if (strncmp(arg, opt, len))
+		return false;
+
+	/*
+	 * Min ratelimit is 1 bus lock/sec.
+	 * Max ratelimit is 1000 bus locks/sec.
+	 */
+	if (sscanf(arg, "ratelimit:%d", &ratelimit) == 1 &&
+	    ratelimit > 0 && ratelimit <= 1000) {
+		ratelimit_state_init(&bld_ratelimit, HZ, ratelimit);
+		ratelimit_set_flags(&bld_ratelimit, RATELIMIT_MSG_ON_RELEASE);
+		return true;
+	}
 
-	return len == arglen && !strncmp(arg, opt, len);
+	return len == arglen;
 }
 
 static bool split_lock_verify_msr(bool on)
@@ -1084,6 +1103,15 @@ static void sld_update_msr(bool on)
 
 static void split_lock_init(void)
 {
+	/*
+	 * #DB for bus lock handles ratelimit and #AC for split lock is
+	 * disabled.
+	 */
+	if (sld_state == sld_ratelimit) {
+		split_lock_verify_msr(false);
+		return;
+	}
+
 	if (cpu_model_supports_sld)
 		split_lock_verify_msr(sld_state != sld_off);
 }
@@ -1156,6 +1184,12 @@ void handle_bus_lock(struct pt_regs *regs)
 	switch (sld_state) {
 	case sld_off:
 		break;
+	case sld_ratelimit:
+		/* Enforce no more than bld_ratelimit bus locks/sec. */
+		while (!__ratelimit(&bld_ratelimit))
+			msleep(20);
+		/* Warn on the bus lock. */
+		fallthrough;
 	case sld_warn:
 		pr_warn_ratelimited("#DB: %s/%d took a bus_lock trap at address: 0x%lx\n",
 				    current->comm, current->pid, regs->ip);
@@ -1261,6 +1295,10 @@ static void sld_state_show(void)
 				" from non-WB" : "");
 		}
 		break;
+	case sld_ratelimit:
+		if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
+			pr_info("#DB: setting system wide bus lock rate limit to %u/sec\n", bld_ratelimit.burst);
+		break;
 	}
 }
 