Skip to content

Commit 9fe0818

Browse files
committed
[UBSan][BoundsSafety] Implement support for more expressive "trap reasons" (llvm#154618)
In 29992cf (llvm#145967) support was added for "trap reasons" on traps emitted in UBSan in trapping mode (e.g. `-fsanitize-trap=undefined`). This improved the debugging experience by attaching the reason for trapping as a string on the debug info on trap instructions. Consumers such as LLDB can display this trap reason string when the trap is reached. A limitation of that patch is that the trap reason string is hard-coded for each `SanitizerKind` even though the compiler actually has much more information about the trap available at compile time that could be shown to the user. This patch is an incremental step in fixing that. It consists of two main steps. **1. Introduce infrastructure for building trap reason strings** To make it convenient to construct trap reason strings this patch re-uses Clang's powerful diagnostic infrastructure to provide a convenient API for constructing trap reason strings. This is achieved by: * Introducing a new `Trap` diagnostic kind to represent trap diagnostics in TableGen files. * Adding a new `Trap` diagnostic component. While this part probably isn't technically necessary it seemed like I should follow the existing convention used by the diagnostic system. * Adding `DiagnosticTrapKinds.td` to describe the different trap reasons. * Add the `TrapReasonBuilder` and `TrapReason` classes to provide an interface for constructing trap reason strings and the trap category. Note this API while similar to `DiagnosticBuilder` has different semantics which are described in the code comments. In particular the behavior when the destructor is called is very different. * Adding `CodeGenModule::BuildTrapReason()` as a convenient constructor for the `TrapReasonBuilder`. This use of the diagnostic system is a little unusual in that the emitted trap diagnostics aren't actually consumed by normal diagnostic consumers (e.g. the console). Instead the `TrapReasonBuilder` is just used to format a string, so in effect the builder is somewhat analagous to "printf". However, re-using the diagnostics system in this way brings a several benefits: * The powerful diagnostic templating languge (e.g. `%select`) can be used. * Formatting Clang data types (e.g. `Type`, `Expr`, etc.) just work out-of-the-box. * Describing trap reasons in tablegen files opens the door for translation to different languages in the future. * The `TrapReasonBuilder` API is very similar to `DiagnosticBuilder` which makes it easy to use by anyone already familiar with Clang's diagnostic system. While UBSan is the first consumer of this new infrastructure the intent is to use this to overhaul how trap reasons are implemented in the `-fbounds-safety` implementation (currently exists downstream). **2. Apply the new infrastructure to UBSan checks for arithmetic overflow** To demonstrate using `TrapReasonBuilder` this patch applies it to UBSan traps for arithmetic overflow. The intention is that we would iteratively switch to using the `TrapReasonBuilder` for all UBSan traps where it makes sense in future patches. Previously for code like ``` int test(int a, int b) { return a + b; } ``` The trap reason string looked like ``` Undefined Behavior Sanitizer: Integer addition overflowed ``` now the trap message looks like: ``` Undefined Behavior Sanitizer: signed integer addition overflow in 'a + b' ``` This string is much more specific because * It explains if signed or unsigned overflow occurred * It actually shows the expression that overflowed One possible downside of this approach is it may blow up Debug info size because now there can be many more distinct trap reason strings. To allow users to avoid this a new driver/cc1 flag `-fsanitize-debug-trap-reasons=` has been added which can either be `none` (disable trap reasons entirely), `basic` (use the per `SanitizerKind` hard coded strings), and `detailed` (use the new expressive trap reasons implemented in this patch). The default is `detailed` to give the best out-of-the-box debugging experience. The existing `-fsanitize-debug-trap-reasons` and `-fno-sanitize-debug-trap-reasons` have been kept for compatibility and are aliases of the new flag with `detailed` and `none` arguments passed respectively. rdar://158612755 Conflicts: clang/include/clang/Basic/AllDiagnosticKinds.inc clang/include/clang/Basic/DiagnosticIDs.h clang/lib/Basic/DiagnosticIDs.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CodeGenFunction.h clang/docs/ReleaseNotes.rst clang/include/clang/Basic/DiagnosticIDs.h rdar://159302620 (cherry picked from commit 6d385c3) (cherry picked from commit f1ee047)
1 parent 54c90be commit 9fe0818

30 files changed

+536
-72
lines changed

clang/docs/ReleaseNotes.rst

Lines changed: 24 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -366,18 +366,33 @@ Non-comprehensive list of changes in this release
366366
correct method to check for these features is to test for the ``__PTRAUTH__``
367367
macro.
368368

369-
- Trapping UBSan (e.g. ``-fsanitize-trap=undefined``) now emits a string describing the reason for
370-
trapping into the generated debug info. This feature allows debuggers (e.g. LLDB) to display
371-
the reason for trapping if the trap is reached. The string is currently encoded in the debug
372-
info as an artificial frame that claims to be inlined at the trap location. The function used
373-
for the artificial frame is an artificial function whose name encodes the reason for trapping.
374-
The encoding used is currently the same as ``__builtin_verbose_trap`` but might change in the future.
375-
This feature is enabled by default but can be disabled by compiling with
376-
``-fno-sanitize-annotate-debug-info-traps``.
369+
- Trapping UBSan (e.g. ``-fsanitize=undefined -fsanitize-trap=undefined``) now
370+
emits a string describing the reason for trapping into the generated debug
371+
info. This feature allows debuggers (e.g. LLDB) to display the reason for
372+
trapping if the trap is reached. The string is currently encoded in the debug
373+
info as an artificial frame that claims to be inlined at the trap location.
374+
The function used for the artificial frame is an artificial function whose
375+
name encodes the reason for trapping. The encoding used is currently the same
376+
as ``__builtin_verbose_trap`` but might change in the future. This feature is
377+
enabled by default but can be disabled by compiling with
378+
``-fno-sanitize-debug-trap-reasons``. The feature has a ``basic`` and
379+
``detailed`` mode (the default). The ``basic`` mode emits a hard-coded string
380+
per trap kind (e.g. ``Integer addition overflowed``) and the ``detailed`` mode
381+
emits a more descriptive string describing each individual trap (e.g. ``signed
382+
integer addition overflow in 'a + b'``). The ``detailed`` mode produces larger
383+
debug info than ``basic`` but is more helpful for debugging. The
384+
``-fsanitize-debug-trap-reasons=`` flag can be used to switch between the
385+
different modes or disable the feature entirely. Note due to trap merging in
386+
optimized builds (i.e. in each function all traps of the same kind get merged
387+
into the same trap instruction) the trap reasons might be removed. To prevent
388+
this build without optimizations (i.e. use `-O0` or use the `optnone` function
389+
attribute) or use the `fno-sanitize-merge=` flag in optimized builds.
377390

378391
New Compiler Flags
379392
------------------
380-
- New option ``-fno-sanitize-annotate-debug-info-traps`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
393+
- New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
394+
- New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
395+
381396

382397
- New option ``-Wundef-true`` added and enabled by default to warn when `true` is used in the C preprocessor without being defined before C23.
383398

clang/include/clang/Basic/AllDiagnosticKinds.inc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,5 +30,7 @@
3030
#include "clang/Basic/DiagnosticAnalysisKinds.inc"
3131
#include "clang/Basic/DiagnosticRefactoringKinds.inc"
3232
#include "clang/Basic/DiagnosticInstallAPIKinds.inc"
33+
#include "clang/Basic/DiagnosticTrapKinds.inc"
3334
#include "clang/Basic/DiagnosticCASKinds.inc"
35+
3436
// clang-format on

clang/include/clang/Basic/AllDiagnostics.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@
2727
#include "clang/Basic/DiagnosticRefactoring.h"
2828
#include "clang/Basic/DiagnosticSema.h"
2929
#include "clang/Basic/DiagnosticSerialization.h"
30+
#include "clang/Basic/DiagnosticTrap.h"
3031

3132
namespace clang {
3233
template <size_t SizeOfStr, typename FieldType> class StringSizerHelper {

clang/include/clang/Basic/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ clang_diag_gen(Parse)
3434
clang_diag_gen(Refactoring)
3535
clang_diag_gen(Sema)
3636
clang_diag_gen(Serialization)
37+
clang_diag_gen(Trap)
3738
clang_tablegen(DiagnosticGroups.inc -gen-clang-diag-groups
3839
SOURCE Diagnostic.td
3940
TARGET ClangDiagnosticGroups)

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -306,7 +306,7 @@ CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0, Benign) ///< Emit PCs for atomic
306306
CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0, Benign) ///< Emit PCs for start of functions
307307
///< that are subject for use-after-return checking.
308308
CODEGENOPT(SanitizeStats , 1, 0, Benign) ///< Collect statistics for sanitizers.
309-
CODEGENOPT(SanitizeDebugTrapReasons, 1, 1 , Benign) ///< Enable UBSan trapping messages
309+
ENUM_CODEGENOPT(SanitizeDebugTrapReasons, SanitizeDebugTrapReasonKind, 2, SanitizeDebugTrapReasonKind::Detailed, Benign) ///< Control how "trap reasons" are emitted in debug info
310310
CODEGENOPT(SimplifyLibCalls , 1, 1, Benign) ///< Set when -fbuiltin is enabled.
311311
CODEGENOPT(SoftFloat , 1, 0, Benign) ///< -soft-float.
312312
CODEGENOPT(SpeculativeLoadHardening, 1, 0, Benign) ///< Enable speculative load hardening.

clang/include/clang/Basic/CodeGenOptions.h

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -178,6 +178,16 @@ class CodeGenOptions : public CodeGenOptionsBase {
178178
/// The callback for mc result.
179179
std::optional<llvm::MCTargetOptions::ResultCallBackTy> MCCallBack;
180180

181+
enum SanitizeDebugTrapReasonKind {
182+
None, ///< Trap Messages are omitted. This offers the smallest debug info
183+
///< size but at the cost of making traps hard to debug.
184+
Basic, ///< Trap Message is fixed per SanitizerKind. Produces smaller debug
185+
///< info than `Detailed` but is not as helpful for debugging.
186+
Detailed, ///< Trap Message includes more context (e.g. the expression being
187+
///< overflowed). This is more helpful for debugging but produces
188+
///< larger debug info than `Basic`.
189+
};
190+
181191
/// The code model to use (-mcmodel).
182192
std::string CodeModel;
183193

clang/include/clang/Basic/Diagnostic.h

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@
2323
#include "llvm/ADT/DenseMap.h"
2424
#include "llvm/ADT/FunctionExtras.h"
2525
#include "llvm/ADT/IntrusiveRefCntPtr.h"
26+
#include "llvm/ADT/SmallString.h"
2627
#include "llvm/ADT/SmallVector.h"
2728
#include "llvm/ADT/iterator_range.h"
2829
#include "llvm/Support/Compiler.h"
@@ -1269,10 +1270,13 @@ class DiagnosticBuilder : public StreamingDiagnostic {
12691270

12701271
DiagnosticBuilder() = default;
12711272

1273+
protected:
12721274
DiagnosticBuilder(DiagnosticsEngine *DiagObj, SourceLocation DiagLoc,
12731275
unsigned DiagID);
12741276

1275-
protected:
1277+
DiagnosticsEngine *getDiagnosticsEngine() const { return DiagObj; }
1278+
unsigned getDiagID() const { return DiagID; }
1279+
12761280
/// Clear out the current diagnostic.
12771281
void Clear() const {
12781282
DiagObj = nullptr;

clang/include/clang/Basic/Diagnostic.td

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,7 @@ def CLASS_REMARK : DiagClass;
3030
def CLASS_WARNING : DiagClass;
3131
def CLASS_EXTENSION : DiagClass;
3232
def CLASS_ERROR : DiagClass;
33+
def CLASS_TRAP : DiagClass;
3334

3435
// Responses to a diagnostic in a SFINAE context.
3536
class SFINAEResponse;
@@ -144,7 +145,8 @@ class Extension<string str> : Diagnostic<str, CLASS_EXTENSION, SEV_Ignored>;
144145
class ExtWarn<string str> : Diagnostic<str, CLASS_EXTENSION, SEV_Warning>;
145146
// Notes can provide supplementary information on errors, warnings, and remarks.
146147
class Note<string str> : Diagnostic<str, CLASS_NOTE, SEV_Fatal/*ignored*/>;
147-
148+
// Trap messages attached to traps in debug info.
149+
class Trap<string str> : Diagnostic<str, CLASS_TRAP, SEV_Fatal/*ignored*/>;
148150

149151
class DefaultIgnore { Severity DefaultSeverity = SEV_Ignored; }
150152
class DefaultWarn { Severity DefaultSeverity = SEV_Warning; }
@@ -236,3 +238,4 @@ include "DiagnosticParseKinds.td"
236238
include "DiagnosticRefactoringKinds.td"
237239
include "DiagnosticSemaKinds.td"
238240
include "DiagnosticSerializationKinds.td"
241+
include "DiagnosticTrapKinds.td"

clang/include/clang/Basic/DiagnosticIDs.h

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,7 @@ enum {
4747
DIAG_SIZE_ANALYSIS = 100,
4848
DIAG_SIZE_REFACTORING = 1000,
4949
DIAG_SIZE_INSTALLAPI = 100,
50+
DIAG_SIZE_TRAP = 100,
5051
DIAG_SIZE_CAS = 100,
5152
};
5253
// Start position for diagnostics.
@@ -65,7 +66,8 @@ enum {
6566
DIAG_START_ANALYSIS = DIAG_START_SEMA + static_cast<int>(DIAG_SIZE_SEMA),
6667
DIAG_START_REFACTORING = DIAG_START_ANALYSIS + static_cast<int>(DIAG_SIZE_ANALYSIS),
6768
DIAG_START_INSTALLAPI = DIAG_START_REFACTORING + static_cast<int>(DIAG_SIZE_REFACTORING),
68-
DIAG_START_CAS = DIAG_START_INSTALLAPI + static_cast<int>(DIAG_SIZE_INSTALLAPI),
69+
DIAG_START_TRAP = DIAG_START_INSTALLAPI + static_cast<int>(DIAG_SIZE_INSTALLAPI),
70+
DIAG_START_CAS = DIAG_START_TRAP + static_cast<int>(DIAG_SIZE_TRAP),
6971
DIAG_UPPER_LIMIT = DIAG_START_CAS + static_cast<int>(DIAG_SIZE_CAS)
7072
};
7173
// clang-format on
@@ -191,7 +193,8 @@ class DiagnosticIDs : public RefCountedBase<DiagnosticIDs> {
191193
CLASS_REMARK = 0x02,
192194
CLASS_WARNING = 0x03,
193195
CLASS_EXTENSION = 0x04,
194-
CLASS_ERROR = 0x05
196+
CLASS_ERROR = 0x05,
197+
CLASS_TRAP = 0x06
195198
};
196199

197200
static bool IsCustomDiag(diag::kind Diag) {
@@ -361,6 +364,10 @@ class DiagnosticIDs : public RefCountedBase<DiagnosticIDs> {
361364
///
362365
bool isExtensionDiag(unsigned DiagID, bool &EnabledByDefault) const;
363366

367+
bool isTrapDiag(unsigned DiagID) const {
368+
return getDiagClass(DiagID) == CLASS_TRAP;
369+
}
370+
364371
/// Given a group ID, returns the flag that toggles the group.
365372
/// For example, for Group::DeprecatedDeclarations, returns
366373
/// "deprecated-declarations".
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
//===----------------------------------------------------------------------===//
2+
//
3+
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4+
// See https://llvm.org/LICENSE.txt for license information.
5+
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6+
//
7+
//===----------------------------------------------------------------------===//
8+
#ifndef LLVM_CLANG_BASIC_DIAGNOSTICTRAP_H
9+
#define LLVM_CLANG_BASIC_DIAGNOSTICTRAP_H
10+
11+
#include "clang/Basic/Diagnostic.h"
12+
#include "clang/Basic/DiagnosticTrapInterface.inc"
13+
14+
#endif

0 commit comments

Comments
 (0)