Commit 1df73b2

x86/mce: Fixup exception only for the correct MCEs
The severity grading code returns the IN_KERNEL_RECOV error context for errors which have happened in kernel space but from which the kernel can recover. Whether recovery is possible is determined by the exception table entry having ex_handler_fault() as its handler, which is declared at build time using _ASM_EXTABLE_FAULT().

IN_KERNEL_RECOV is used in mce_severity_intel() to look up the corresponding error severity in the severities table.

However, the mapping back from error severity to whether the error is IN_KERNEL_RECOV is ambiguous. In a very paranoid scenario (which might not be possible right now, but better safe than sorry later), an exception fixup could be attempted for another MCE whose address is in the exception table and which has the proper severity. Which would be unfortunate, to say the least.

Therefore, mark such MCEs explicitly as MCE_IN_KERNEL_RECOV so that the recovery attempt is done only for them.

Document the whole handling, while at it, as it is not trivial.

Reported-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Tested-by: Tony Luck <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
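The ambiguity described in the commit message can be illustrated with a small standalone model. This is only a sketch: the enum names and the two-level severity scale below are hypothetical simplifications, not the kernel's actual severities table. It shows that when two contexts grade to the same severity, the severity value alone cannot tell them apart.

```c
/*
 * Hypothetical, simplified stand-ins for the kernel's error contexts
 * and severities. None of these names exist in the kernel.
 */
enum model_context { CTX_IN_KERNEL, CTX_IN_USER, CTX_IN_KERNEL_RECOV };
enum model_severity { SEV_PANIC, SEV_ACTION_REQUIRED };

/* Forward mapping: context -> severity. It is many-to-one in this model. */
static enum model_severity model_grade(enum model_context ctx)
{
	switch (ctx) {
	case CTX_IN_USER:
	case CTX_IN_KERNEL_RECOV:
		return SEV_ACTION_REQUIRED;
	default:
		return SEV_PANIC;
	}
}
```

Because `model_grade()` is not injective, a consumer that only sees the severity has to infer the context from other state. Recording the context in an explicit flag at classification time avoids that inference.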
1 parent 7fc0b9b commit 1df73b2

3 files changed: 19 additions, 3 deletions

arch/x86/include/asm/mce.h

Lines changed: 1 addition & 0 deletions

@@ -136,6 +136,7 @@
 #define MCE_HANDLED_NFIT	BIT_ULL(3)
 #define MCE_HANDLED_EDAC	BIT_ULL(4)
 #define MCE_HANDLED_MCELOG	BIT_ULL(5)
+#define MCE_IN_KERNEL_RECOV	BIT_ULL(6)
 
 /*
  * This structure contains all data related to the MCE log. Also
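The new define takes the next free bit in the same flag word as the MCE_HANDLED_* bits. A minimal userspace sketch of how such a bit is set and tested follows; `BIT_ULL` is redefined locally as a stand-in for the kernel macro, and `mce_flag_is_set()` is a hypothetical helper that does not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL() so the sketch builds in userspace. */
#define BIT_ULL(n)		(1ULL << (n))

/* Bit positions taken from the hunk above. */
#define MCE_HANDLED_MCELOG	BIT_ULL(5)
#define MCE_IN_KERNEL_RECOV	BIT_ULL(6)

/* Hypothetical helper: true if every bit in 'flag' is set in 'kflags'. */
static bool mce_flag_is_set(uint64_t kflags, uint64_t flag)
{
	return (kflags & flag) == flag;
}
```

Setting the bit with `kflags |= MCE_IN_KERNEL_RECOV` leaves the other bits untouched, so the new flag can coexist with whatever the MCE_HANDLED_* consumers record.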

arch/x86/kernel/cpu/mce/core.c

Lines changed: 13 additions & 2 deletions

@@ -1331,8 +1331,19 @@ void notrace do_machine_check(struct pt_regs *regs, long error_code)
 		local_irq_disable();
 		ist_end_non_atomic();
 	} else {
-		if (!fixup_exception(regs, X86_TRAP_MC, error_code, 0))
-			mce_panic("Failed kernel mode recovery", &m, msg);
+		/*
+		 * Handle an MCE which has happened in kernel space but from
+		 * which the kernel can recover: ex_has_fault_handler() has
+		 * already verified that the rIP at which the error happened is
+		 * a rIP from which the kernel can recover (by jumping to
+		 * recovery code specified in _ASM_EXTABLE_FAULT()) and the
+		 * corresponding exception handler which would do that is the
+		 * proper one.
+		 */
+		if (m.kflags & MCE_IN_KERNEL_RECOV) {
+			if (!fixup_exception(regs, X86_TRAP_MC, error_code, 0))
+				mce_panic("Failed kernel mode recovery", &m, msg);
+		}
 	}
 
 out_ist:
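The control flow of the new else branch can be modeled outside the kernel as a small decision function. This is a sketch under stated assumptions: `enum recovery_outcome` and `kernel_mode_recovery()` are hypothetical names, and the `fixup_ok` parameter stands in for the return value of fixup_exception(), which is not called here.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)		(1ULL << (n))
#define MCE_IN_KERNEL_RECOV	BIT_ULL(6)

/* Hypothetical outcomes of the kernel-mode branch in this model. */
enum recovery_outcome {
	RECOVERY_NOT_ATTEMPTED,	/* flag clear: no fixup is tried */
	RECOVERY_FIXED_UP,	/* flag set and the fixup succeeded */
	RECOVERY_PANIC		/* flag set but the fixup failed */
};

/*
 * Model of the gated recovery: the fixup result only matters when the
 * MCE was explicitly marked recoverable during severity grading.
 */
static enum recovery_outcome kernel_mode_recovery(uint64_t kflags, bool fixup_ok)
{
	if (!(kflags & MCE_IN_KERNEL_RECOV))
		return RECOVERY_NOT_ATTEMPTED;

	return fixup_ok ? RECOVERY_FIXED_UP : RECOVERY_PANIC;
}
```

In this model, an unmarked MCE never reaches the fixup, which is the behavior the commit message asks for: the recovery attempt is done only for MCEs carrying the flag.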

arch/x86/kernel/cpu/mce/severity.c

Lines changed: 5 additions & 1 deletion

@@ -213,8 +213,12 @@ static int error_context(struct mce *m)
 {
 	if ((m->cs & 3) == 3)
 		return IN_USER;
-	if (mc_recoverable(m->mcgstatus) && ex_has_fault_handler(m->ip))
+
+	if (mc_recoverable(m->mcgstatus) && ex_has_fault_handler(m->ip)) {
+		m->kflags |= MCE_IN_KERNEL_RECOV;
 		return IN_KERNEL_RECOV;
+	}
+
 	return IN_KERNEL;
 }
 
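The classification change can be exercised in isolation with a reduced model of error_context(). This is a sketch, not the kernel function: `struct mce_sketch` is a hypothetical cut-down record, and its two boolean fields stand in for the results of mc_recoverable() and ex_has_fault_handler(), which are not reimplemented here.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)		(1ULL << (n))
#define MCE_IN_KERNEL_RECOV	BIT_ULL(6)

enum sketch_context { SKETCH_IN_KERNEL, SKETCH_IN_USER, SKETCH_IN_KERNEL_RECOV };

/* Hypothetical reduced record; the real struct mce has many more fields. */
struct mce_sketch {
	uint64_t cs;		/* code segment selector at the time of the error */
	uint64_t kflags;	/* kernel-internal flags */
	bool recoverable;	/* stand-in for mc_recoverable(m->mcgstatus) */
	bool has_fault_handler;	/* stand-in for ex_has_fault_handler(m->ip) */
};

/* Classify the context and, as a side effect, mark recoverable kernel MCEs. */
static enum sketch_context error_context_sketch(struct mce_sketch *m)
{
	/* Low two selector bits equal to 3 mean the error hit user mode. */
	if ((m->cs & 3) == 3)
		return SKETCH_IN_USER;

	if (m->recoverable && m->has_fault_handler) {
		m->kflags |= MCE_IN_KERNEL_RECOV;
		return SKETCH_IN_KERNEL_RECOV;
	}

	return SKETCH_IN_KERNEL;
}
```

The flag is set at the same point where the context is decided, so later code can check `kflags` instead of reconstructing the context from the severity.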
