From 1c985e6001d970b8c19ac02b702f3f5f7057ed5f Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Sun, 31 Aug 2025 03:18:00 -0700 Subject: [PATCH 01/14] Add GSoC 2025 post on UBSan improvements --- .../2025-09-01-gsoc-ubsan-trap-messages.md | 123 ++++++++++++++++++ 1 file changed, 123 insertions(+) create mode 100644 content/2025-09-01-gsoc-ubsan-trap-messages.md diff --git a/content/2025-09-01-gsoc-ubsan-trap-messages.md b/content/2025-09-01-gsoc-ubsan-trap-messages.md new file mode 100644 index 00000000..496e1d9d --- /dev/null +++ b/content/2025-09-01-gsoc-ubsan-trap-messages.md @@ -0,0 +1,123 @@ +--- +author: "Anthony Tran (anthonyhatran)" +date: "2025-09-01" +tags: ["GSoC", "Clang", "CodeGen"] +title: "GSoC 2025: Usability Improvements for the Undefined Behavior Sanitizer" +--- + +## Introduction + +Hi everyone, my name is Anthony and I had the pleasure of working on improving the Undefined Behavior Sanitizer this Google Summer of Code 2025. My mentors were Dan Liew and Michael Buch. + +## Background + +Undefined Behavior Sanitizer (UBSan) is an undefined behavior checker which detects some of the undefined behaviors in C, C++, and Objective-C languages at runtime. This project focused mainly on trapping UBSan, which is evoked through `-fsanitize-trap=<...>` along with `-fsanitize=<...>`. Trapping UBSan is a lighter-weight version of runtime UBSan because upon detection of undefined behavior a trap instruction is executed rather than calling into a runtime library to handle the undefined behavior. This makes it more appealing for kernel, embedded, and production hardening use cases. + +TODO: Show example of normal userspace UBSan and how it outputs the problem +TODO: show example of trapping UBSan. Both with and without the debugger. This will let you illustrate the problem of it being hard to understand what happened, even in the debugger. This gives you the motivation for your work which is currently missing. 
+ +## Human readable descriptions of UBSan traps in LLDB + +During my GSoC project I implemented support for displaying human readable descriptions of UBSan traps in LLDB to improve the debugging experience. + +The approach used is based on how `__builtin_verbose_trap` is implemented. + +This was done by inserting a fake frame into debug-info, which is formatted like so: `__clang_trap_msg$$`. This specific format +is recognized and can be encoded by LLDB, which resembles what is used for `__builtin_verbose_trap`. + +In both cases, the compiler encodes the trap's context by emitting the trap instruction inside an artificial function with a specially-formatted name (e.g., `__clang_trap_msg$$`). When the trap occurs, LLDB's Verbose Trap StackFrame Recognizer identifies this special name in the debug info and uses the extracted category and message to generate the user-friendly stop reason. By adopting this existing protocol, this feature ensures robust and consistent behavior within the debugger. + +Take for example this erroneous program, `foo.c`: + +``` +#include + +int main() { return INT_MAX + 1; } +``` + +When run with these flags before my change: +`$ clang -fsanitize=signed-integer-overflow -fsanitize-trap=signed-integer-overflow foo.c` + +Debug info would not provide a stopping reason, leaving the user possibly confused. + +After the changes, this line is emitted in LLVM IR: +`!18 = distinct !DISubprogram(name: "__clang_trap_msg$Undefined Behavior Sanitizer$signed integer addition overflow in '2147483647 + 1'", scope: !1, file: !1, type: !19, flags: DIFlagArtificial, spFlags: DISPFlagDefinition, unit: !0)` + +Which looks something like this in LLDB: + +`stop reason = Undefined Behavior Sanitizer: signed integer addition overflow in '2147483647 + 1'` + +Previously, the program would stop execution, but the user would not know why. With the new feature, the stop reason is apparent. 
+ +The `-fsanitize-debug-trap-reasons` flag [1] enables trap messages for UBSan, which provides context for stop reasons in trapping UBSan. + +One concern that a reviewer had was the debug info size difference. Using bloaty, I tested a release build of clang with the `-fsanitize-debug-trap-reasons` flag enabled, and one with it disabled (`-fno-sanitize-debug-trap-reasons`). Results are below. +``` + FILE SIZE VM SIZE + -------------- -------------- + +0.3% +6.01Mi +0.3% +6.01Mi ,__debug_info + +2.0% +2.26Mi [ = ] 0 [Unmapped] + +1.2% +1.35Mi +1.2% +1.35Mi ,__apple_names + +0.0% +1.01Mi +0.0% +1.01Mi ,__debug_str + +0.8% +636Ki +0.8% +635Ki ,__debug_line + +0.4% +161Ki +0.4% +161Ki ,__debug_ranges + +0.4% +47.9Ki +0.4% +47.9Ki ,__debug_abbrev + +0.0% +14 +0.0% +14 ,__apple_types + [ = ] 0 +0.0% +8 ,__common + [ = ] 0 +7.1% +4 ,__thread_bss + -0.0% -4 -0.0% -4 ,__const + -0.0% -1.27Ki -0.0% -1.27Ki ,__cstring + +0.2% +11.5Mi +0.1% +9.19Mi TOTAL + ``` + + +## RFC: Add a warning when `-fsanitize=` is passed without associated `-fsanitize-trap=` + +Currently, clang does not warn about cases where `-fsanitize-trap=` does nothing (silent no-op), particularly in the case where `-fsanitize-trap=` is passed without `-fsanitize=` [2]: + +Ex: +`$ clang -fsanitize-trap=undefined foo.c` + +Emits no warning, even though `-fsanitize-trap=undefined` is not doing anything here. + + +We thought it would be more user-friendly to add a warning for such cases, but due to some initial community pushback [3], it was decided that an RFC should be opened. I ended up writing a sketch patch that emitted a warning for such cases [4]. 
+ +Ex: +`$ clang -fsanitize-trap=undefined foo.c` + +Would now emit: +`warning: -fsanitize-trap=undefined has no effect because the "undefined" sanitizer is disabled; consider passing "-fsanitize=undefined" to enable the sanitizer` + +Unfortunately, we found that the emission of such warnings could become exceedingly complicated and a point of contention due to the the existence of sanitizer groups, subgroups, and individual sanitizers. Determining the correct behavior for various cases, historical precedence with no-ops, interference with current build systems, prioritization of existing build systems over the user experience, and compatibility with gcc led to the end of the RFC [5]. + +## Expand upon the hard-coded strings in `-fsanitize-debug-trap-reasons` to be more specific + +One of my mentors, Dan, created an extension to the clang diagnostics subsystem to work with trap messages [6]. By extending the diagnostics subsystem, it allows us to leverage the powerful semantics that the diagnostics system offers. We still use what was implemented in the first task as sort of a fallback, meaning that if no additional context was needed for the trap message, then the default hard-coded string will be used instead. + +## What I've Learned + +Before I started this GSoC, I barely even knew how to build clang and LLVM or use git in a large open-source project. My mentors showed me the ropes on a lot of things, and I came out of this summer knowing a lot more of how to get my changes properly reviewed and upstreamed. + +## Future Work + +The diagnostics extension for trap messages has been recently upstreamed by Dan [6]. As of right now, only signed and unsigned overflow for addition, subtraction, and multiplication is being used by this system. I've investigated some use cases outside of signed and unsigned overflow, and I plan to implement that within the next week(s). 
[stub, since I'll probably upstream my changes soon so this part will change] + +There is also an issue [8] where trap messages are not emitted in cases where they should be due to a null check. The purpose of the null check was to prevent a nullptr dereference that occurred in debug-info prologue. This is a known issue to which there isn't a concrete solution as of current. + +## Conclusion + +I want to give a special thanks to my mentors, Dan and Michael, for being there for me the whole way. I appreciate their commitment to the project and their patience with me. I'm incredibly grateful that I was able to work on this project and I wouldn't have traded it for anything else. Being a beginner to both LLVM and open-source, I have to admit I was overwhelmed at first, but slowly, along with their help, I was able to gain at least a semblance of understanding of how things worked. I could not have asked for a better set of mentors, so again, a huge thanks to them. I want to also extend my gratitude to the LLVM Foundation for this opportunity. + +I've had a lot of fun with this project and I hope to contribute more to LLVM, or in open-source in general, in the future. 
+ +## External Links + +[1] https://github.com/llvm/llvm-project/pull/145967 +[2] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#id5 +[3] https://discourse.llvm.org/t/clang-gsoc-2025-usability-improvements-for-trapping-undefined-behavior-sanitizer/84568/11 +[4] https://github.com/llvm/llvm-project/pull/147997 +[5] https://discourse.llvm.org/t/rfc-emit-a-warning-when-fsanitize-trap-is-passed-without-associated-fsanitize/87893 +[6] https://github.com/llvm/llvm-project/pull/154618 +[7] https://github.com/llvm/llvm-project/pull/153845 +[8] https://github.com/llvm/llvm-project/issues/150707 \ No newline at end of file From a93e205785ca218ec61df464ad9b9d4c174a5893 Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Tue, 2 Sep 2025 12:59:38 -0700 Subject: [PATCH 02/14] Port over Dan's changes/TODOs from Google Doc --- .../2025-09-01-gsoc-ubsan-trap-messages.md | 334 +++++++++++++++--- 1 file changed, 290 insertions(+), 44 deletions(-) diff --git a/content/2025-09-01-gsoc-ubsan-trap-messages.md b/content/2025-09-01-gsoc-ubsan-trap-messages.md index 496e1d9d..e36a3100 100644 --- a/content/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/2025-09-01-gsoc-ubsan-trap-messages.md @@ -1,123 +1,369 @@ --- -author: "Anthony Tran (anthonyhatran)" +author: "Anthony Tran" date: "2025-09-01" tags: ["GSoC", "Clang", "CodeGen"] title: "GSoC 2025: Usability Improvements for the Undefined Behavior Sanitizer" --- + ## Introduction -Hi everyone, my name is Anthony and I had the pleasure of working on improving the Undefined Behavior Sanitizer this Google Summer of Code 2025. My mentors were Dan Liew and Michael Buch. + +My name is Anthony and I had the pleasure of working on improving the Undefined Behavior Sanitizer this Google Summer of Code 2025. My mentors were Dan Liew and Michael Buch. + ## Background -Undefined Behavior Sanitizer (UBSan) is an undefined behavior checker which detects some of the undefined behaviors in C, C++, and Objective-C languages at runtime. 
This project focused mainly on trapping UBSan, which is evoked through `-fsanitize-trap=<...>` along with `-fsanitize=<...>`. Trapping UBSan is a lighter-weight version of runtime UBSan because upon detection of undefined behavior a trap instruction is executed rather than calling into a runtime library to handle the undefined behavior. This makes it more appealing for kernel, embedded, and production hardening use cases. -TODO: Show example of normal userspace UBSan and how it outputs the problem -TODO: show example of trapping UBSan. Both with and without the debugger. This will let you illustrate the problem of it being hard to understand what happened, even in the debugger. This gives you the motivation for your work which is currently missing. +Undefined Behavior Sanitizer (UBSan) is a tool for detecting a subset of the undefined behaviors in the C, C++, and Objective-C languages at runtime. This project focused mainly on the trapping variant of UBSan, which is evoked through `-fsanitize-trap=<...>` along with `-fsanitize=<...>`. Trapping UBSan is a lighter-weight version of UBSan because upon detection of undefined behavior a trap instruction is executed rather than calling into a runtime library to handle the undefined behavior. This makes it more appealing for kernel, embedded, and production hardening use cases. + + +However, an issue with trapping UBSan prior to my work was that it was much harder to debug undefined behavior when it is detected when compared to the non-trapping mode. To illustrate this consider this C program that reads integers from the command line arguments and adds them. 
+ + +``` +#include <stdio.h> +#include <stdlib.h> + + +int add(int a, int b) { + return a+b; +} + + +int main(int argc, const char** argv) { + if (argc < 3) + return 1; + int a = atoi(argv[1]); + int b = atoi(argv[2]); + int result = add(a, b); + printf("Added %d + %d = %d\n", a, b, result); + return 0; +} +``` + + +If this program is compiled and executed using UBSan with its userspace runtime, it provides helpful output diagnosing the problem and also allows execution to continue. + + +``` +$ bin/clang -fsanitize=undefined add.c -g -o add && ./add 2147483647 1 +add.c:5:13: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int' +SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior tmp/add.c:5:13 +Added 2147483647 + 1 = -2147483648 +``` + + +In contrast, when using UBSan in trapping mode, the program immediately terminates when undefined behavior is detected, as shown below. + + +``` +$ clang -fsanitize=undefined -fsanitize-trap=undefined add.c -g -o add && ./add 2147483647 1 +[1] 54357 trace trap ./tmp/add 2147483647 1 +``` + + +This is the expected behavior of trapping mode, but how should a developer debug what happened when a trap is hit? If we attach a debugger and run the example program, this is the output LLDB shows.
+ + +``` +$ lldb ./add -- 2147483647 1 +(lldb) target create "./tmp/add" +(lldb) settings set -- target.run-args "2147483647" "1" +(lldb) r +Process 17347 launched: '/tmp/add' (arm64) +Process 17347 stopped +* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BREAKPOINT (code=1, subcode=0x100003d3c) + frame #0: 0x0000000100003d3c add`add(a=2147483647, b=1) at add.c:5:13 + 2 #include + 3 + 4 int add(int a, int b) { +-> 5 return a+b; + 6 } + 7 + 8 int main(int argc, const char** argv) { +(lldb) bt +* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BREAKPOINT (code=1, subcode=0x100003d3c) + * frame #0: 0x0000000100003d3c add`add(a=2147483647, b=1) at add.c:5:13 + frame #1: 0x0000000100003eec add`main(argc=3, argv=0x000000016fdff110) at add.c:13:18 + frame #2: 0x00000001842bab98 dyld`start + 6076 + + +(lldb) dis -p +add`add: +-> 0x100003d3c <+40>: brk #0x5500 + 0x100003d40 <+44>: ldr w0, [sp, #0x4] + 0x100003d44 <+48>: add sp, sp, #0x10 + 0x100003d48 <+52>: ret +``` + + +We can see that a `brk` instruction was hit while handling the `a+b` expression, but there is no good explanation of what happened. `brk` is the trap instruction on arm64, but it is not particularly clear that this has anything to do with UBSan. For this toy example we can speculate that integer overflow occurred because the program was built with trapping UBSan and the trap was hit while handling `a+b`. However, in real programs built with trapping UBSan and potentially other hardening mechanisms, it is often far less obvious what happened. + + +For this particular example, the information that this is an integer overflow UBSan check is actually there, but it is not very obvious. On x86_64 and arm64 the reason for trapping is encoded in the operand to the trap instruction [9]. In this case the `#0x5500` immediate to the `brk` instruction encodes that this is a UBSan trap for integer overflow.
The UBSan immediate is encoded as `('U' << 8) + SanitizerHandler` where `SanitizerHandler` is the enum value from the `SanitizerHandler` enum inside Clang’s internals. + + +As we can see, the debugging experience with UBSan traps is not ideal, and improving this was the primary goal of the GSoC project. + + + ## Human readable descriptions of UBSan traps in LLDB + During my GSoC project I implemented support for displaying human readable descriptions of UBSan traps in LLDB to improve the debugging experience. -The approach used is based on how `__builtin_verbose_trap` is implemented. -This was done by inserting a fake frame into debug-info, which is formatted like so: `__clang_trap_msg$$`. This specific format -is recognized and can be encoded by LLDB, which resembles what is used for `__builtin_verbose_trap`. +The approach used is based on how `__builtin_verbose_trap` is currently implemented inside Clang. At a high level this works by encoding the reason for trapping as a string on the trap instruction in the debug info for the program being compiled. Then when a trap is hit in the debugger, the debugger retrieves this string and shows it as the reason for trapping. + + +### An Alternative approach + + +An alternative to this approach would be to teach debuggers (e.g. LLDB) to decode the trap reason encoded in trap instructions in the debugger. However, this approach wasn’t taken for several reasons: + + +Using the trap reason encoded in trap instructions only works for x86_64 and arm64. The approach that I used works for all targets where debug info is supported (many more). +Relying on decoding the trap reason encoded in the trap instruction creates a tight coupling between the compiler and the debugger because if the encoding ever changes: +The debugger would need to be changed to adapt to the new encoding. +Older versions of the debugger would fail to work with binaries using the new encoding.
+New versions of the debugger would fail to work with binaries using the old encoding. +In contrast, encoding the trap reason as a string in the debug info is a much looser coupling because the compiler is free to change the trap reason without changes to the debugger. -In both cases, the compiler encodes the trap's context by emitting the trap instruction inside an artificial function with a specially-formatted name (e.g., `__clang_trap_msg$$`). When the trap occurs, LLDB's Verbose Trap StackFrame Recognizer identifies this special name in the debug info and uses the extracted category and message to generate the user-friendly stop reason. By adopting this existing protocol, this feature ensures robust and consistent behavior within the debugger. -Take for example this erroneous program, `foo.c`: + + +### Encoding the trap reason in the debug info + + +As previously mentioned, the approach I took is based on how `__builtin_verbose_trap` encodes its message into debug info. This is done by pretending in the debug info that the trap instruction was inlined from another function, where that function is artificially generated and its name is of the form `__clang_trap_msg$<category>$<message>`, where `<category>` and `<message>` are the trap category and the message to display when trapping, respectively. This function does not actually exist in the compiled program. It only exists in the debug info as a convenient (albeit hacky) way to describe the reason for trapping. + + +If we take the example shown earlier and compile it, we can see this in the LLVM IR.
+ ``` -#include +$ clang -fsanitize=undefined -fsanitize-trap=undefined add.c -g -o - -o - -S -emit-llvm -fsanitize-debug-trap-reasons=basic + + +; Function Attrs: noinline nounwind optnone ssp uwtable(sync) +define i32 @add(i32 noundef %a, i32 noundef %b) #0 !dbg !17 !func_sanitize !22 { +entry: + %a.addr = alloca i32, align 4 + %b.addr = alloca i32, align 4 + store i32 %a, ptr %a.addr, align 4 + #dbg_declare(ptr %a.addr, !23, !DIExpression(), !24) + store i32 %b, ptr %b.addr, align 4 + #dbg_declare(ptr %b.addr, !25, !DIExpression(), !26) + %0 = load i32, ptr %a.addr, align 4, !dbg !27 + %1 = load i32, ptr %b.addr, align 4, !dbg !28 + %2 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %0, i32 %1), !dbg !29, !nosanitize !21 + %3 = extractvalue { i32, i1 } %2, 0, !dbg !29, !nosanitize !21 + %4 = extractvalue { i32, i1 } %2, 1, !dbg !29, !nosanitize !21 + %5 = xor i1 %4, true, !dbg !29, !nosanitize !21 + br i1 %5, label %cont, label %trap, !dbg !29, !prof !30, !nosanitize !21 + + +trap: ; preds = %entry + call void @llvm.ubsantrap(i8 0) #4, !dbg !31, !nosanitize !21 + unreachable, !dbg !31, !nosanitize !21 + + +cont: ; preds = %entry + ret i32 %3, !dbg !34 +} + + +;... + -int main() { return INT_MAX + 1; } +!29 = !DILocation(line: 5, column: 13, scope: !17) +!30 = !{!"branch_weights", i32 1048575, i32 1} +!31 = !DILocation(line: 0, scope: !32, inlinedAt: !29) +!32 = distinct !DISubprogram(name: "__clang_trap_msg$Undefined Behavior Sanitizer$Integer addition overflowed", scope: !2, file: !2, type: !33, flags: DIFlagArtificial, spFlags: DISPFlagDefinition, unit: !14) ``` -When run with these flags before my change: -`$ clang -fsanitize=signed-integer-overflow -fsanitize-trap=signed-integer-overflow foo.c` -Debug info would not provide a stopping reason, leaving the user possibly confused. +The debug metadata for the `@llvm.ubsantrap` call is `!31`. 
That `DILocation` has the scope of the `DISubprogram` assigned to `!32` which is the artificial function which encodes the trap category (`Undefined Behavior Sanitizer`) and the trap message (`Integer addition overflowed`). Note that the `DILocation` for `!31` has `inlinedAt:` which tells us that the trap was inlined from !32 into the location at !29 which is the location of the `a+b` expression in the `add` function. -After the changes, this line is emitted in LLVM IR: -`!18 = distinct !DISubprogram(name: "__clang_trap_msg$Undefined Behavior Sanitizer$signed integer addition overflow in '2147483647 + 1'", scope: !1, file: !1, type: !19, flags: DIFlagArtificial, spFlags: DISPFlagDefinition, unit: !0)` -Which looks something like this in LLDB: +I implemented this change in [1]. -`stop reason = Undefined Behavior Sanitizer: signed integer addition overflow in '2147483647 + 1'` -Previously, the program would stop execution, but the user would not know why. With the new feature, the stop reason is apparent. +### Debug info size changes + + +One concern that a reviewer had was the debug info size difference. Using bloaty, I tested a release build of clang with the `-fsanitize-debug-trap-reasons` flag enabled, and one with it disabled (`-fno-sanitize-debug-trap-reasons`). We found that the size difference was negligible; results are below. -The `-fsanitize-debug-trap-reasons` flag [1] enables trap messages for UBSan, which provides context for stop reasons in trapping UBSan. -One concern that a reviewer had was the debug info size difference. Using bloaty, I tested a release build of clang with the `-fsanitize-debug-trap-reasons` flag enabled, and one with it disabled (`-fno-sanitize-debug-trap-reasons`). Results are below. 
``` - FILE SIZE VM SIZE - -------------- -------------- - +0.3% +6.01Mi +0.3% +6.01Mi ,__debug_info - +2.0% +2.26Mi [ = ] 0 [Unmapped] - +1.2% +1.35Mi +1.2% +1.35Mi ,__apple_names - +0.0% +1.01Mi +0.0% +1.01Mi ,__debug_str - +0.8% +636Ki +0.8% +635Ki ,__debug_line - +0.4% +161Ki +0.4% +161Ki ,__debug_ranges - +0.4% +47.9Ki +0.4% +47.9Ki ,__debug_abbrev - +0.0% +14 +0.0% +14 ,__apple_types - [ = ] 0 +0.0% +8 ,__common - [ = ] 0 +7.1% +4 ,__thread_bss - -0.0% -4 -0.0% -4 ,__const - -0.0% -1.27Ki -0.0% -1.27Ki ,__cstring - +0.2% +11.5Mi +0.1% +9.19Mi TOTAL - ``` + FILE SIZE VM SIZE +-------------- -------------- + +0.3% +6.01Mi +0.3% +6.01Mi ,__debug_info + +2.0% +2.26Mi [ = ] 0 [Unmapped] + +1.2% +1.35Mi +1.2% +1.35Mi ,__apple_names + +0.0% +1.01Mi +0.0% +1.01Mi ,__debug_str + +0.8% +636Ki +0.8% +635Ki ,__debug_line + +0.4% +161Ki +0.4% +161Ki ,__debug_ranges + +0.4% +47.9Ki +0.4% +47.9Ki ,__debug_abbrev + +0.0% +14 +0.0% +14 ,__apple_types + [ = ] 0 +0.0% +8 ,__common + [ = ] 0 +7.1% +4 ,__thread_bss + -0.0% -4 -0.0% -4 ,__const + -0.0% -1.27Ki -0.0% -1.27Ki ,__cstring + +0.2% +11.5Mi +0.1% +9.19Mi TOTAL + ``` + + +Note that the size difference is likely negligible because in optimized builds trap instructions in a function get merged together, which causes the additional debug info my patch adds to be dropped. + + +TODO: We should probably do a comparison in an unoptimized build and show the results there. + + +### Displaying the trap reason in the debugger + + +With the support in the compiler for encoding the trap reasons for UBSan implemented, I then turned my attention to displaying these in the LLDB debugger. + + +In this particular case nothing new needed to be implemented in LLDB because the `VerboseTrapFrameRecognizer` in LLDB, which was implemented for `__builtin_verbose_trap`, is general enough that it already supports any artificial function in the debug info of the form `__clang_trap_msg$<category>$<message>`.
+ + +So if we take the running example and run it under LLDB, its output now looks like this: + + +``` +$ clang -fsanitize=undefined -fsanitize-trap=undefined add.c -g -o add +$ lldb ./add -- 2147483647 1 +(lldb) target create "tmp/add" +(lldb) settings set -- target.run-args "2147483647" "1" +(lldb) r +Process 81705 launched: '/tmp/add' (arm64) +Process 81705 stopped +* thread #1, queue = 'com.apple.main-thread', stop reason = Undefined Behavior Sanitizer: Integer addition overflowed + frame #1: 0x0000000100003d3c add`add(a=2147483647, b=1) at add.c:5:13 + 2 #include + 3 + 4 int add(int a, int b) { +-> 5 return a+b; + 6 } + 7 + 8 int main(int argc, const char** argv) { +(lldb) bt +* thread #1, queue = 'com.apple.main-thread', stop reason = Undefined Behavior Sanitizer: Integer addition overflowed + frame #0: 0x0000000100003d3c add`__clang_trap_msg$Undefined Behavior Sanitizer$Integer addition overflowed at add.c:0 [inlined] + * frame #1: 0x0000000100003d3c add`add(a=2147483647, b=1) at add.c:5:13 + frame #2: 0x0000000100003eec add`main(argc=3, argv=0x000000016fdff110) at add.c:13:18 + frame #3: 0x00000001842bab98 dyld`start + 6076 +(lldb) dis -p +add`__clang_trap_msg$Undefined Behavior Sanitizer$Integer addition overflowed: +-> 0x100003d3c <+40>: brk #0x5500 + 0x100003d40 <+44>: ldr w0, [sp, #0x4] + 0x100003d44 <+48>: add sp, sp, #0x10 + 0x100003d48 <+52>: ret +``` + + +Notice that: + + +The stop reason now shows as `Undefined Behavior Sanitizer: Integer addition overflowed`. Previously no helpful stop reason was shown. +We are stopped with `frame #1` selected and the artificial frame (`frame #0`) is present in the backtrace. LLDB does this so that the stop is not shown inside the artificial function, which would be confusing. +The `dis -p` output claims we are inside the artificial function. This is an artifact of the implementation that is a little confusing but worth the trade-off.
+ + +So for this portion of my GSoC project the only thing I needed to do was add a test case to ensure LLDB behaved appropriately. I did this in [10]. + + ## RFC: Add a warning when `-fsanitize=` is passed without associated `-fsanitize-trap=` + +The next part of my GSoC project was to post an RFC and implement a sketch fix for a usability problem with trapping UBSan. + + Currently, clang does not warn about cases where `-fsanitize-trap=` does nothing (silent no-op), particularly in the case where `-fsanitize-trap=` is passed without `-fsanitize=` [2]: + Ex: + + `$ clang -fsanitize-trap=undefined foo.c` + Emits no warning, even though `-fsanitize-trap=undefined` is not doing anything here. -We thought it would be more user-friendly to add a warning for such cases, but due to some initial community pushback [3], it was decided that an RFC should be opened. I ended up writing a sketch patch that emitted a warning for such cases [4]. +We thought it would be more user-friendly to add a warning for such cases, but due to some initial community pushback [3], it was decided that an RFC should be opened. I ended up writing a sketch patch that emitted a warning for such cases [4]. + Ex: + + `$ clang -fsanitize-trap=undefined foo.c` + Would now emit: + + `warning: -fsanitize-trap=undefined has no effect because the "undefined" sanitizer is disabled; consider passing "-fsanitize=undefined" to enable the sanitizer` -Unfortunately, we found that the emission of such warnings could become exceedingly complicated and a point of contention due to the the existence of sanitizer groups, subgroups, and individual sanitizers. Determining the correct behavior for various cases, historical precedence with no-ops, interference with current build systems, prioritization of existing build systems over the user experience, and compatibility with gcc led to the end of the RFC [5].
+ +Unfortunately, we found that the emission of such warnings could become exceedingly complicated and a point of contention due to the existence of sanitizer groups, subgroups, and individual sanitizers. Determining the correct behavior for various cases, historical precedence with no-ops, interference with current build systems, prioritization of existing build systems over the user experience, and compatibility with gcc led to the end of the RFC [5]. + ## Expand upon the hard-coded strings in `-fsanitize-debug-trap-reasons` to be more specific -One of my mentors, Dan, created an extension to the clang diagnostics subsystem to work with trap messages [6]. By extending the diagnostics subsystem, it allows us to leverage the powerful semantics that the diagnostics system offers. We still use what was implemented in the first task as sort of a fallback, meaning that if no additional context was needed for the trap message, then the default hard-coded string will be used instead. + +One of my mentors, Dan, created an extension to the clang diagnostics subsystem to work with trap messages [6]. By extending the diagnostics subsystem, it allows us to leverage the powerful string formatting engine of the diagnostics system. We still use what was implemented in the first task as sort of a fallback, meaning that if no additional context was needed for the trap message, then the default hard-coded string will be used instead. + ## What I've Learned + Before I started this GSoC, I barely even knew how to build clang and LLVM or use git in a large open-source project. My mentors showed me the ropes on a lot of things, and I came out of this summer knowing a lot more of how to get my changes properly reviewed and upstreamed. + ## Future Work + The diagnostics extension for trap messages has been recently upstreamed by Dan [6]. As of right now, only signed and unsigned overflow for addition, subtraction, and multiplication is being used by this system. 
I've investigated some use cases outside of signed and unsigned overflow, and I plan to implement that within the next week(s). [stub, since I'll probably upstream my changes soon so this part will change] -There is also an issue [8] where trap messages are not emitted in cases where they should be due to a null check. The purpose of the null check was to prevent a nullptr dereference that occurred in debug-info prologue. This is a known issue to which there isn't a concrete solution as of current. + +There is also an issue [8] where trap messages are not emitted in cases where they should be due to a null check. The purpose of the null check was to prevent a nullptr dereference that occurred in the debug-info prologue. This is a known issue to which there isn't a concrete solution as of current. + ## Conclusion -I want to give a special thanks to my mentors, Dan and Michael, for being there for me the whole way. I appreciate their commitment to the project and their patience with me. I'm incredibly grateful that I was able to work on this project and I wouldn't have traded it for anything else. Being a beginner to both LLVM and open-source, I have to admit I was overwhelmed at first, but slowly, along with their help, I was able to gain at least a semblance of understanding of how things worked. I could not have asked for a better set of mentors, so again, a huge thanks to them. I want to also extend my gratitude to the LLVM Foundation for this opportunity. + +I want to give a special thanks to my mentors, Dan and Michael, for being there for me the whole way. They helped a lot with guiding me through git, the LLVM code base, and even this blog post. I appreciate their commitment to the project and their patience with me. + +I'm incredibly grateful that I was able to work on this project and I wouldn't have traded it for anything else. 
Being a beginner to both LLVM and open-source, I have to admit I was overwhelmed at first, but slowly, along with their help, I was able to gain at least a semblance of understanding of how things worked. I could not have asked for a better set of mentors, so again, a huge thanks to them. I want to also extend my gratitude to the LLVM Foundation for this opportunity. + I've had a lot of fun with this project and I hope to contribute more to LLVM, or in open-source in general, in the future. + ## External Links [1] https://github.com/llvm/llvm-project/pull/145967 + [2] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#id5 + [3] https://discourse.llvm.org/t/clang-gsoc-2025-usability-improvements-for-trapping-undefined-behavior-sanitizer/84568/11 + [4] https://github.com/llvm/llvm-project/pull/147997 + [5] https://discourse.llvm.org/t/rfc-emit-a-warning-when-fsanitize-trap-is-passed-without-associated-fsanitize/87893 + [6] https://github.com/llvm/llvm-project/pull/154618 + [7] https://github.com/llvm/llvm-project/pull/153845 -[8] https://github.com/llvm/llvm-project/issues/150707 \ No newline at end of file + +[8] https://github.com/llvm/llvm-project/issues/150707 + +[9] https://maskray.me/blog/2023-01-29-all-about-undefined-behavior-sanitizer + +[10] https://github.com/llvm/llvm-project/pull/151231 \ No newline at end of file From ba53586cc2cb60b01692cd906ced32d46714e83f Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Tue, 2 Sep 2025 21:05:47 -0700 Subject: [PATCH 03/14] Move post into correct dir --- content/{ => posts}/2025-09-01-gsoc-ubsan-trap-messages.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename content/{ => posts}/2025-09-01-gsoc-ubsan-trap-messages.md (100%) diff --git a/content/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md similarity index 100% rename from content/2025-09-01-gsoc-ubsan-trap-messages.md rename to content/posts/2025-09-01-gsoc-ubsan-trap-messages.md From 
7d5dcb7062cd9b822e53013c4062153135eb98d8 Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Fri, 5 Sep 2025 21:32:19 -0700 Subject: [PATCH 04/14] Address most of feedback, port over more changes --- .../2025-09-01-gsoc-ubsan-trap-messages.md | 133 ++++++++++++------ 1 file changed, 87 insertions(+), 46 deletions(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index e36a3100..12310a0e 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -17,20 +17,53 @@ My name is Anthony and I had the pleasure of working on improving the Undefined Undefined Behavior Sanitizer (UBSan) is a tool for detecting a subset of the undefined behaviors in the C, C++, and Objective-C languages at runtime. This project focused mainly on the trapping variant of UBSan, which is evoked through `-fsanitize-trap=<...>` along with `-fsanitize=<...>`. Trapping UBSan is a lighter-weight version of UBSan because upon detection of undefined behavior a trap instruction is executed rather than calling into a runtime library to handle the undefined behavior. This makes it more appealing for kernel, embedded, and production hardening use cases. +Here are some basic examples of undefined behavior: -However, an issue with trapping UBSan prior to my work was that it was much harder to debug undefined behavior when it is detected when compared to the non-trapping mode. To illustrate this consider this C program that reads integers from the command line arguments and adds them. 
+**Signed integer overflow** (addition) +``` +#include +int main() { + int overflow = INT_MAX + 1; + return 0; +} ``` -#include -#include +**Shift overflow** (right) +``` +int main() { + signed char x = 1; + int y = x >> 8; + + return 0; +} +``` -int add(int a, int b) { - return a+b; +**Float cast overflow** +``` +#include +#include + +int main() { + float f = 2.0f * INT_MAX; + int i = (int)f; + + return 0; } +``` + +For all other undefined behavior that can be detected by UBSan, check out the [official clang documentation on it](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html). + +An issue with trapping UBSan prior to my work was that it was much harder to debug undefined behavior when it is detected when compared to the non-trapping mode. To illustrate this consider this C program that reads integers from the command line arguments and adds them. +``` +#include +#include +int add(int a, int b) { + return a + b; +} int main(int argc, const char** argv) { if (argc < 3) return 1; @@ -49,7 +82,7 @@ If this program is compiled and executed using UBSan with its userspace runtime ``` $ bin/clang -fsanitize=undefined add.c -g -o add && ./add 2147483647 1 add.c:5:13: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int' -SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior tmp/add.c:5:13 +SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior add.c:5:13 Added 2147483647 + 1 = -2147483648 ``` @@ -59,7 +92,7 @@ In contrast when using UBSan in trapping mode the program immediately terminates ``` $ clang -fsanitize=undefined -fsanitize-trap=undefined add.c -g -o add && ./add 2147483647 1 -[1] 54357 trace trap ./tmp/add 2147483647 1 +[1] 54357 trace trap ./add 2147483647 1 ``` @@ -68,17 +101,17 @@ This is the expected behavior of trapping mode but how should a developer debug ``` $ lldb ./add -- 2147483647 1 -(lldb) target create "./tmp/add" +(lldb) target create "./add" (lldb) settings set -- target.run-args "2147483647" 
"1" (lldb) r -Process 17347 launched: '/tmp/add' (arm64) +Process 17347 launched: 'add' (arm64) Process 17347 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BREAKPOINT (code=1, subcode=0x100003d3c) frame #0: 0x0000000100003d3c add`add(a=2147483647, b=1) at add.c:5:13 2 #include 3 4 int add(int a, int b) { --> 5 return a+b; +-> 5 return a + b; 6 } 7 8 int main(int argc, const char** argv) { @@ -87,8 +120,6 @@ Process 17347 stopped * frame #0: 0x0000000100003d3c add`add(a=2147483647, b=1) at add.c:5:13 frame #1: 0x0000000100003eec add`main(argc=3, argv=0x000000016fdff110) at add.c:13:18 frame #2: 0x00000001842bab98 dyld`start + 6076 - - (lldb) dis -p add`add: -> 0x100003d3c <+40>: brk #0x5500 @@ -98,10 +129,10 @@ add`add: ``` -We can see that a `brk` instruction was hit while handling the `a+b` expression but there is no good explanation of what happened. `brk` is the trap instruction on arm64 but it is not particularly clear this has anything to do with UBSan. For this toy example we can speculate that integer overflow occurred because the program was built with trapping UBSan and the trap was hit while handling `a+b`. However, in real programs built with trapping UBSan and potentially other hardening mechanisms it is often far less obvious what happened. +We can see that a `brk` instruction was hit while handling the `a + b` expression but there is no good explanation of what happened. `brk` is the trap instruction on arm64 but it is not particularly clear this has anything to do with UBSan. For this toy example we can speculate that integer overflow occurred because the program was built with trapping UBSan and the trap was hit while handling `a + b`. However, in real programs built with trapping UBSan and potentially other hardening mechanisms it is often far less obvious what happened. -For this particular example. The information that this is an integer overflow UBSan check is actually there but it is not very obvious. 
On x86_64 and arm64 the reason for trapping is actually encoded in the operand to the trap instruction [9]. In this case the `#0x5500` immediate to the brk instruction encodes that this is a UBSan trap for integer overflow. The UBSan immediate is encoded as `('U' << 8) + SanitizerHandler` where `SanitizerHandler` is the enum value from the `SanitierHandler` enum inside Clang’s internals. +For this particular example. The information that this is an integer overflow UBSan check is actually there but it is not very obvious. On x86_64 and arm64 the reason for trapping is actually [encoded in the operand to the trap instruction](https://maskray.me/blog/2023-01-29-all-about-undefined-behavior-sanitizer ). In this case the `#0x5500` immediate to the brk instruction encodes that this is a UBSan trap for integer overflow. The UBSan immediate is encoded as `('U' << 8) + SanitizerHandler` where `SanitizerHandler` is the enum value from the `SanitizerHandler` enum inside Clang’s internals. As we can see the debugging experience with UBSan traps is not ideal and improving this was the primary goal of the GSoC project. @@ -115,7 +146,7 @@ As we can see the debugging experience with UBSan traps is not ideal and improvi During my GSoC project I implemented support for displaying human readable descriptions of UBSan traps in LLDB to improve the debugging experience. -The approach used is based on how `__builtin_verbose_trap` is currently implemented inside Clang. At a high-level this works by encoding the reason for trapping as a string on the trap instruction in the debug info for the program being compiled. Then when a trap is hit in the debugger, the debugger retrieves this string and shows it as the reason for trapping. +The approach used is based on how `__builtin_verbose_trap` is currently implemented inside Clang [11] [12]. `__builtin_verbose_trap` was implemented in the past for [libc++ hardening](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925). 
At a high-level this works by encoding the reason for trapping as a string on the trap instruction in the debug info for the program being compiled. Then when a trap is hit in the debugger, the debugger retrieves this string and shows it as the reason for trapping. ### An Alternative approach @@ -144,8 +175,6 @@ If we take the example shown earlier and compile we can see this in the LLVM IR. ``` $ clang -fsanitize=undefined -fsanitize-trap=undefined add.c -g -o - -o - -S -emit-llvm -fsanitize-debug-trap-reasons=basic - - ; Function Attrs: noinline nounwind optnone ssp uwtable(sync) define i32 @add(i32 noundef %a, i32 noundef %b) #0 !dbg !17 !func_sanitize !22 { entry: @@ -162,21 +191,13 @@ entry: %4 = extractvalue { i32, i1 } %2, 1, !dbg !29, !nosanitize !21 %5 = xor i1 %4, true, !dbg !29, !nosanitize !21 br i1 %5, label %cont, label %trap, !dbg !29, !prof !30, !nosanitize !21 - - trap: ; preds = %entry call void @llvm.ubsantrap(i8 0) #4, !dbg !31, !nosanitize !21 unreachable, !dbg !31, !nosanitize !21 - - cont: ; preds = %entry ret i32 %3, !dbg !34 } - - ;... - - !29 = !DILocation(line: 5, column: 13, scope: !17) !30 = !{!"branch_weights", i32 1048575, i32 1} !31 = !DILocation(line: 0, scope: !32, inlinedAt: !29) @@ -184,16 +205,18 @@ cont: ; preds = %entry ``` -The debug metadata for the `@llvm.ubsantrap` call is `!31`. That `DILocation` has the scope of the `DISubprogram` assigned to `!32` which is the artificial function which encodes the trap category (`Undefined Behavior Sanitizer`) and the trap message (`Integer addition overflowed`). Note that the `DILocation` for `!31` has `inlinedAt:` which tells us that the trap was inlined from !32 into the location at !29 which is the location of the `a+b` expression in the `add` function. +The debug metadata for the `@llvm.ubsantrap` call is `!31`. 
That `DILocation` has the scope of the `DISubprogram` assigned to `!32` which is the artificial function which encodes the trap category (`Undefined Behavior Sanitizer`) and the trap message (`Integer addition overflowed`). Note that the `DILocation` for `!31` has `inlinedAt:` which tells us that the trap was inlined from !32 into the location at !29 which is the location of the `a + b` expression in the `add` function. -I implemented this change in [1]. +I implemented this change on this [PR](https://github.com/llvm/llvm-project/pull/145967). ### Debug info size changes -One concern that a reviewer had was the debug info size difference. Using bloaty, I tested a release build of clang with the `-fsanitize-debug-trap-reasons` flag enabled, and one with it disabled (`-fno-sanitize-debug-trap-reasons`). We found that the size difference was negligible; results are below. +One concern that a reviewer had was the debug info size difference. This was one of the motivations for putting this feature under the new `-fsanitize-debug-trap-reasons` flag because initially (prior to code review), my mentors and I planned to have the trap feature flag accompany the `-fsanitize-trap=` flag. Although the `-fsanitize-debug-trap-reasons` flag is on by default (so long as trapping UBSan is enabled), having the trap reason feature under a flag allows users to opt-out by using the `-fno-sanitize-debug-trap-reasons` flag. + +Using bloaty, I tested a release build of clang with the `-fsanitize-debug-trap-reasons` flag enabled, and one with it disabled (`-fno-sanitize-debug-trap-reasons`). We found that the size difference was negligible; results are below. ``` @@ -215,10 +238,7 @@ One concern that a reviewer had was the debug info size difference. 
Using bloaty ``` -Note it is likely the code size difference is negligible because because in optimized builds trap instructions in a function get merged together which causes the additional debug info my patch adds to be dropped - - -TODO: We should probably do a comparison in an unoptimized build and show the results there. +Note it is likely the code size difference is negligible because in optimized builds trap instructions in a function get merged together which causes the additional debug info my patch adds to be dropped. ### Displaying the trap reason in the debugger @@ -236,17 +256,17 @@ So if we take the running example and run it under LLDB, its output now looks li ``` $ clang -fsanitize=undefined -fsanitize-trap=undefined add.c -g -o add $ lldb ./add -- 2147483647 1 -(lldb) target create "tmp/add" +(lldb) target create "add" (lldb) settings set -- target.run-args "2147483647" "1" (lldb) r -Process 81705 launched: '/tmp/add' (arm64) +Process 81705 launched: '/add' (arm64) Process 81705 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Undefined Behavior Sanitizer: Integer addition overflowed frame #1: 0x0000000100003d3c add`add(a=2147483647, b=1) at add.c:5:13 2 #include 3 4 int add(int a, int b) { --> 5 return a+b; +-> 5 return a + b; 6 } 7 8 int main(int argc, const char** argv) { @@ -265,7 +285,7 @@ add`__clang_trap_msg$Undefined Behavior Sanitizer$Integer addition overflowed: ``` -Notice that +Notice that: The stop reason now shows as `Undefined Behavior Sanitizer: Integer addition overflowed`. Previously no helpful stop reason was shown. @@ -273,7 +293,7 @@ We are stopped with `frame #1` selected and the artificial frame (`frame #0`) is The `dis -pc` output claims we are inside the artificial function. This is an artifact of the implementation that is a little confusing but worth the trade-off. -So for this portion of my GSoC project the only thing I need to do was added a test case to ensure LLDB behaved appropriately.
I did this in [10] +So for this portion of my GSoC project the only thing I needed to do was add a test case to ensure LLDB behaved appropriately. This was done in [this PR](https://github.com/llvm/llvm-project/pull/151231). @@ -284,7 +304,7 @@ So for this portion of my GSoC project the only thing I need to do was added a t The next part of my GSoC project was to post an RFC and implement a sketch fix for a usability problem with trapping UBSan. -Currently, clang does not warn about cases where `-fsanitize-trap=` does nothing (silent no-op), particularly in the case where `-fsanitize-trap=` is passed without `-fsanitize=` [2]: +Currently, clang does not warn about cases where `-fsanitize-trap=` does nothing (silent no-op), particularly [in the case where `-fsanitize-trap=` is passed without `-fsanitize=`](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#id5): Ex: @@ -296,7 +316,7 @@ Ex: Emits no warning, even though `-fsanitize-trap=undefined` is not doing anything here. -We thought it would be more user-friendly to add a warning for such cases, but due to some initial community pushback [3], it was decided that an RFC should be opened. I ended up writing a sketch patch that emitted a warning for such cases [4]. +We thought it would be more user-friendly to add a warning for such cases, but due to some [initial community pushback](https://discourse.llvm.org/t/clang-gsoc-2025-usability-improvements-for-trapping-undefined-behavior-sanitizer/84568/11), it was decided that an RFC should be opened. I ended up writing a [sketch patch that emitted a warning for such cases](https://github.com/llvm/llvm-project/pull/147997).
Ex: @@ -311,28 +331,39 @@ Would now emit: `warning: -fsanitize-trap=undefined has no effect because the "undefined" sanitizer is disabled; consider passing "-fsanitize=undefined" to enable the sanitizer` -Unfortunately, we found that the emission of such warnings could become exceedingly complicated and a point of contention due to the existence of sanitizer groups, subgroups, and individual sanitizers. Determining the correct behavior for various cases, historical precedence with no-ops, interference with current build systems, prioritization of existing build systems over the user experience, and compatibility with gcc led to the end of the RFC [5]. +Unfortunately, we found that the emission of such warnings could become exceedingly complicated and a point of contention due to the existence of sanitizer groups, subgroups, and individual sanitizers. Determining the correct behavior for various cases, historical precedence with no-ops, interference with current build systems, prioritization of existing build systems over the user experience, and compatibility with gcc led to the end of [the RFC](https://discourse.llvm.org/t/rfc-emit-a-warning-when-fsanitize-trap-is-passed-without-associated-fsanitize/87893). ## Expand upon the hard-coded strings in `-fsanitize-debug-trap-reasons` to be more specific +There were two initial design decisions to pick from here. Either I could: + +**(a)** Use some string formatting, such as LLVM's formatvariadic or raw_ostream. For some implementation context, the function that generated the trap messages was called from a function called `EmitTrapCheck`, so the idea was to pass down extra information from earlier in the call stack before `EmitTrapCheck` was called. + +or + +**(b)** Extend clang's diagnostic system to accommodate trap reasons. This is explained further below. + +Due to the time it took to complete the first two tasks, I chose the first option.
I deemed the second option to be a large commitment that I wouldn't have been able to do by the end of the GSoC coding period. Additionally, I was unsure if building on top of the diagnostics subsystem would be approved since diagnostics were originally intended to emit messages within the command line, not debug info. -One of my mentors, Dan, created an extension to the clang diagnostics subsystem to work with trap messages [6]. By extending the diagnostics subsystem, it allows us to leverage the powerful string formatting engine of the diagnostics system. We still use what was implemented in the first task as sort of a fallback, meaning that if no additional context was needed for the trap message, then the default hard-coded string will be used instead. +After taking some time to investigate possible cases where extra information in trap messages could be useful, I put up [a PR](https://github.com/llvm/llvm-project/pull/153845). The patch was admittedly quite messy, so to take a cleaner approach that was more aligned with clang's frontend, one of my mentors, Dan, ended up following through with option (b) to [create the extension to the clang diagnostics subsystem to work with trap messages](https://github.com/llvm/llvm-project/pull/154618). By extending the diagnostics subsystem, it allows us to leverage the powerful string formatting engine of the diagnostics system. + +However, this doesn't mean that the effort used to write proper hard-coded trap messages under `-fsanitize-debug-trap-reasons` was thrown away. Rather as of [Dan's patch](https://github.com/llvm/llvm-project/pull/154618), the flag now has two options: `-fsanitize-debug-trap-reasons=basic` for the hard-coded trap messages and `-fsanitize-debug-trap-reasons=detailed` for the detailed trap messages which utilize the trap reasons diagnostics API. This was done in case users did not want to deal with the larger binary sizes that came with the trap reasons. 
## What I've Learned -Before I started this GSoC, I barely even knew how to build clang and LLVM or use git in a large open-source project. My mentors showed me the ropes on a lot of things, and I came out of this summer knowing a lot more of how to get my changes properly reviewed and upstreamed. +Before I started this GSoC, I barely even knew how to build clang and LLVM or use git in a large open-source project. My mentors showed me the ropes on a lot of things (particularly how to properly use git and build and configure clang), and I came out of this summer knowing a lot more of how to get my changes properly reviewed and upstreamed. I was also able to get a firmer understanding of the Undefined Behavior Sanitizer, better C++ programming practices, and the LLVM codebase. -## Future Work +## Work to Do -The diagnostics extension for trap messages has been recently upstreamed by Dan [6]. As of right now, only signed and unsigned overflow for addition, subtraction, and multiplication is being used by this system. I've investigated some use cases outside of signed and unsigned overflow, and I plan to implement that within the next week(s). [stub, since I'll probably upstream my changes soon so this part will change] +As stated prior, the diagnostics extension for trap messages has been [upstreamed by Dan](https://github.com/llvm/llvm-project/pull/154618). As of right now, only signed and unsigned overflow for addition, subtraction, and multiplication are being used by this system. I plan to integrate what I found on my [abandoned PR](https://github.com/llvm/llvm-project/pull/153845) by building on top of what Dan has already done. This will be done after the GSoC coding period. -There is also an issue [8] where trap messages are not emitted in cases where they should be due to a null check. The purpose of the null check was to prevent a nullptr dereference that occurred in the debug-info prologue. 
This is a known issue to which there isn't a concrete solution as of current. +There is [an issue](https://github.com/llvm/llvm-project/issues/150707) where trap messages are not emitted in cases where they should be due to a null check. The purpose of the null check was to prevent a nullptr dereference that occurred in the debug-info prologue. This is a known issue for which there is no concrete solution yet. ## Conclusion @@ -345,6 +376,8 @@ I'm incredibly grateful that I was able to work on this project and I wouldn't h I've had a lot of fun with this project and I hope to contribute more to LLVM, or in open-source in general, in the future. +## Landed PRs +https://github.com/llvm/llvm-project/commits?author=anthonyhatran ## External Links @@ -366,4 +399,12 @@ I've had a lot of fun with this project and I hope to contribute more to LLVM, o [9] https://maskray.me/blog/2023-01-29-all-about-undefined-behavior-sanitizer -[10] https://github.com/llvm/llvm-project/pull/151231 \ No newline at end of file +[10] https://github.com/llvm/llvm-project/pull/151231 + +[11] https://discourse.llvm.org/t/rfc-adding-builtin-verbose-trap-string-literal/75845 + +[12] https://github.com/llvm/llvm-project/pull/79230 + +[13] https://discourse.llvm.org/t/rfc-hardening-in-libc/73925 + +[14] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html \ No newline at end of file From 81e72cb4db9e8bf984675fc02faf423b361988c3 Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Wed, 10 Sep 2025 16:16:55 -0700 Subject: [PATCH 05/14] Change one header and remove examples --- .../2025-09-01-gsoc-ubsan-trap-messages.md | 42 +------------------ 1 file changed, 2 insertions(+), 40 deletions(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index 12310a0e..610e5da0 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -15,45 +15,7 @@ My name is
Anthony and I had the pleasure of working on improving the Undefined ## Background -Undefined Behavior Sanitizer (UBSan) is a tool for detecting a subset of the undefined behaviors in the C, C++, and Objective-C languages at runtime. This project focused mainly on the trapping variant of UBSan, which is evoked through `-fsanitize-trap=<...>` along with `-fsanitize=<...>`. Trapping UBSan is a lighter-weight version of UBSan because upon detection of undefined behavior a trap instruction is executed rather than calling into a runtime library to handle the undefined behavior. This makes it more appealing for kernel, embedded, and production hardening use cases. - -Here are some basic examples of undefined behavior: - -**Signed integer overflow** (addition) -``` -#include - -int main() { - int overflow = INT_MAX + 1; - - return 0; -} -``` - -**Shift overflow** (right) -``` -int main() { - signed char x = 1; - int y = x >> 8; - - return 0; -} -``` - -**Float cast overflow** -``` -#include -#include - -int main() { - float f = 2.0f * INT_MAX; - int i = (int)f; - - return 0; -} -``` - -For all other undefined behavior that can be detected by UBSan, check out the [official clang documentation on it](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html). +Undefined Behavior Sanitizer (UBSan) is a tool for detecting a subset of the undefined behaviors in the C, C++, and Objective-C languages at runtime. This project focused mainly on the trapping variant of UBSan, which is invoked through `-fsanitize-trap=<...>` along with `-fsanitize=<...>`. Trapping UBSan is a lighter-weight version of UBSan because upon detection of undefined behavior a trap instruction is executed rather than calling into a runtime library to handle the undefined behavior. This makes it more appealing for kernel, embedded, and production hardening use cases.
For cases of undefined behavior that can be detected by UBSan, check out the [official clang documentation](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html). An issue with trapping UBSan prior to my work was that it was much harder to debug undefined behavior when it is detected when compared to the non-trapping mode. To illustrate this consider this C program that reads integers from the command line arguments and adds them. @@ -149,7 +111,7 @@ During my GSoC project I implemented support for displaying human readable descr The approach used is based on how `__builtin_verbose_trap` is currently implemented inside Clang [11] [12]. `__builtin_verbose_trap` was implemented in the past for [libc++ hardening](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925). At a high-level this works by encoding the reason for trapping as a string on the trap instruction in the debug info for the program being compiled. Then when a trap is hit in the debugger, the debugger retrieves this string and shows it as the reason for trapping. -### An Alternative approach +### Let the debugger handle most of the work An alternative to this approach would be to teach debuggers (e.g. LLDB) to decode the trap reason encoded in trap instructions in the debugger. 
However, this approach wasn’t taken for several reasons: From 4336551468941db8d051dab50e81422545ec410c Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Wed, 10 Sep 2025 16:31:14 -0700 Subject: [PATCH 06/14] Modify section to have bullet points --- .../posts/2025-09-01-gsoc-ubsan-trap-messages.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index 610e5da0..599b76c0 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -117,12 +117,14 @@ The approach used is based on how `__builtin_verbose_trap` is currently implemen An alternative to this approach would be to teach debuggers (e.g. LLDB) to decode the trap reason encoded in trap instructions in the debugger. However, this approach wasn’t taken for several reasons: -Using the trap reason encoded in trap instructions only works for x86_64 and arm64. The approach that I used works for all targets where debug info is supported (many more). -Relying on decoding the trap reason encoded in the trap instruction creates a tight coupling between the compiler and the debugger because if the encoding ever changes -The debugger would need to be changed to adapt to the new encoding. -Older versions of the debugger would fail to work with binaries using the new encoding. -New versions of the debugger would fail to work with binaries using the old encoding. -In contrast, encoding the trap reason as a string in the debug info is a much looser coupling because the compiler is free to change the trap reason without changes to the debugger. +* Using the trap reason encoded in trap instructions only works for x86_64 and arm64. The approach that I used works for all targets where debug info is supported (many more). 
+ +* Relying on decoding the trap reason encoded in the trap instruction creates a tight coupling between the compiler and the debugger because if the encoding ever changes: + * The debugger would need to be changed to adapt to the new encoding. + + * Older versions of the debugger would fail to work with binaries using the new encoding. + * New versions of the debugger would fail to work with binaries using the old encoding. +* In contrast, encoding the trap reason as a string in the debug info is a much looser coupling because the compiler is free to change the trap reason without changes to the debugger. From eae4e03d0a3518770939f5b15ffc03e0f8c0791d Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Wed, 10 Sep 2025 16:53:09 -0700 Subject: [PATCH 07/14] Add link to enum --- content/posts/2025-09-01-gsoc-ubsan-trap-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index 599b76c0..b9e7f2e3 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -94,7 +94,7 @@ add`add: We can see that a `brk` instruction was hit while handling the `a + b` expression but there is no good explanation of what happened. `brk` is the trap instruction on arm64 but it is not particularly clear this has anything to do with UBSan. For this toy example we can speculate that integer overflow occurred because the program was built with trapping UBSan and the trap was hit while handling `a + b`. However, in real programs built with trapping UBSan and potentially other hardening mechanisms it is often far less obvious what happened. -For this particular example. The information that this is an integer overflow UBSan check is actually there but it is not very obvious. 
On x86_64 and arm64 the reason for trapping is actually [encoded in the operand to the trap instruction](https://maskray.me/blog/2023-01-29-all-about-undefined-behavior-sanitizer ). In this case the `#0x5500` immediate to the brk instruction encodes that this is a UBSan trap for integer overflow. The UBSan immediate is encoded as `('U' << 8) + SanitizerHandler` where `SanitizerHandler` is the enum value from the `SanitizerHandler` enum inside Clang’s internals. +For this particular example, the information that this is an integer overflow UBSan check is actually there, but it is not very obvious. On x86_64 and arm64 the reason for trapping is actually [encoded in the operand to the trap instruction](https://maskray.me/blog/2023-01-29-all-about-undefined-behavior-sanitizer). In this case the `#0x5500` immediate to the `brk` instruction encodes that this is a UBSan trap for integer overflow. The UBSan immediate is encoded as `('U' << 8) + SanitizerHandler` where [`SanitizerHandler` is the enum value from the `SanitizerHandler` enum inside Clang’s internals](https://github.com/llvm/llvm-project/blob/96195e7d44613e272475c90df187678036f21966/clang/lib/CodeGen/SanitizerHandler.h#L78). As we can see the debugging experience with UBSan traps is not ideal and improving this was the primary goal of the GSoC project.
From 23a6de06cd288ccf96fd9341cceca468d082ad2b Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Thu, 11 Sep 2025 17:26:28 -0700 Subject: [PATCH 08/14] Restructure LLDB section --- content/posts/2025-09-01-gsoc-ubsan-trap-messages.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index b9e7f2e3..6f98b03f 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -108,9 +108,6 @@ As we can see the debugging experience with UBSan traps is not ideal and improvi During my GSoC project I implemented support for displaying human readable descriptions of UBSan traps in LLDB to improve the debugging experience. -The approach used is based on how `__builtin_verbose_trap` is currently implemented inside Clang [11] [12]. `__builtin_verbose_trap` was implemented in the past for [libc++ hardening](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925). At a high-level this works by encoding the reason for trapping as a string on the trap instruction in the debug info for the program being compiled. Then when a trap is hit in the debugger, the debugger retrieves this string and shows it as the reason for trapping. - - ### Let the debugger handle most of the work @@ -131,7 +128,7 @@ An alternative to this approach would be to teach debuggers (e.g. LLDB) to decod ### Encoding the trap reason in the debug info -As previously mentioned the approach I took is based on how `__builtin_verbose_trap` encodes its message into debug info. This is done by pretending in the debug info that the trap instruction was inlined from another function, where that function is artificially generated and its name is of the form `__clang_trap_msg$<category>$<message>`, where `<category>` and `<message>` are the trap category and message to display when trapping respectively. This function does not actually exist in the compiled program.
It only exists in the debug info as a convenient (albeit hacky) way to describe the reason for trapping. +The approach I took is based on how `__builtin_verbose_trap` encodes its message into debug info [11] [12], in which `__builtin_verbose_trap` was implemented in the past for [libc++ hardening](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925). This is done by pretending in the debug info that the trap instruction was inlined from another function, where that function is artificially generated and its name is of the form `__clang_trap_msg$<category>$<message>`, where `<category>` and `<message>` are the trap category and message to display when trapping respectively. This function does not actually exist in the compiled program. It only exists in the debug info as a convenient (albeit hacky) way to describe the reason for trapping. When a trap is hit in the debugger, the debugger retrieves this string from the debug info and shows it as the reason for trapping. If we take the example shown earlier and compile we can see this in the LLVM IR. From fab6a9ebbc607e8203388a36e39e9b5ccdbef3ba Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Thu, 11 Sep 2025 23:08:36 -0700 Subject: [PATCH 09/14] Make modifictions to 'Encoding the trap reason...' section --- .../posts/2025-09-01-gsoc-ubsan-trap-messages.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index 6f98b03f..d774d840 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -128,11 +128,11 @@ An alternative to this approach would be to teach debuggers (e.g.
LLDB) to decod ### Encoding the trap reason in the debug info -The approach I took is based on how `__builtin_verbose_trap` encodes its message into debug info [11] [12], in which `__builtin_verbose_trap` was implemented in the past for [libc++ hardening](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925). This is done by pretending in the debug info that the trap instruction was inlined from another function, where that function is artificially generated and its name is of the form `__clang_trap_msg$<category>$<message>`, where `<category>` and `<message>` are the trap category and message to display when trapping respectively. This function does not actually exist in the compiled program. It only exists in the debug info as a convenient (albeit hacky) way to describe the reason for trapping. When a trap is hit in the debugger, the debugger retrieves this string from the debug info and shows it as the reason for trapping. +The approach I took is based on how `__builtin_verbose_trap` encodes its message into debug info [11] [12], a feature which was implemented in the past for [libc++ hardening](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925). The core idea is that the trap reason string gets encoded directly in the trap's debug information. +To accomplish this, we needed to find a place to "stuff" the string in the DWARF DIE tree. Using a `DW_TAG_subprogram` was deemed the most straightforward and space-efficient location. This means we create a synthetic `DISubprogram` which is not a real function in the compiled program; it exists only in the debug info as a container. While the string could have been placed elsewhere, for reasons outside the scope of this blog post, it resides on this fake function DIE, with the trap reason encoded in the `DW_TAG_subprogram`'s name. For a deeper dive into this design decision, you can see [15](https://github.com/llvm/llvm-project/pull/145967#issuecomment-3054319138). -If we take the example shown earlier and compile we can see this in the LLVM IR.
- +Let's look at the LLVM IR of the previous example to see how this is implemented: ``` $ clang -fsanitize=undefined -fsanitize-trap=undefined add.c -g -o - -o - -S -emit-llvm -fsanitize-debug-trap-reasons=basic @@ -165,8 +165,9 @@ cont: ; preds = %entry !32 = distinct !DISubprogram(name: "__clang_trap_msg$Undefined Behavior Sanitizer$Integer addition overflowed", scope: !2, file: !2, type: !33, flags: DIFlagArtificial, spFlags: DISPFlagDefinition, unit: !14) ``` +The debug metadata for the `@llvm.ubsantrap` call is `!31`. That `DILocation` has the scope of the `DISubprogram` assigned to `!32` which is the artificial function which encodes the trap category. This function's name is formatted as `__clang_trap_msg$<category>$<message>` to encode the trap category (`Undefined Behavior Sanitizer`) and the specific message (`Integer addition overflowed`). This function does not actually exist in the compiled program. It only exists in the debug info as a convenient (albeit hacky) way to describe the reason for trapping. When a trap is hit in the debugger, the debugger retrieves this string from the debug info and shows it as the reason for trapping. -The debug metadata for the `@llvm.ubsantrap` call is `!31`. That `DILocation` has the scope of the `DISubprogram` assigned to `!32` which is the artificial function which encodes the trap category (`Undefined Behavior Sanitizer`) and the trap message (`Integer addition overflowed`). Note that the `DILocation` for `!31` has `inlinedAt:` which tells us that the trap was inlined from !32 into the location at !29 which is the location of the `a + b` expression in the `add` function. +Note that the `DILocation` for `!31` has `inlinedAt:` which tells us that the trap was inlined from `!32` into the location at `!29` which is the location of the `a + b` expression in the `add` function. I implemented this change on this [PR](https://github.com/llvm/llvm-project/pull/145967).
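To make the naming scheme concrete, here is a minimal sketch of how a consumer might split an artificial function name such as `__clang_trap_msg$Undefined Behavior Sanitizer$Integer addition overflowed` into its category and message. It assumes the category itself contains no `$`. The helper `parse_trap_name` is hypothetical and is not LLDB's actual frame recognizer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not LLDB's implementation.
   Splits "__clang_trap_msg$<category>$<message>" into its two parts.
   Returns false if the prefix or separator is missing, or if either
   output buffer is too small. */
static bool parse_trap_name(const char *name,
                            char *category, size_t category_size,
                            char *message, size_t message_size) {
    static const char prefix[] = "__clang_trap_msg$";
    const size_t prefix_len = sizeof(prefix) - 1;

    if (strncmp(name, prefix, prefix_len) != 0)
        return false;

    /* The category runs up to the next '$'; the message is the remainder. */
    const char *cat_begin = name + prefix_len;
    const char *sep = strchr(cat_begin, '$');
    if (sep == NULL)
        return false;

    size_t cat_len = (size_t)(sep - cat_begin);
    size_t msg_len = strlen(sep + 1);
    if (cat_len >= category_size || msg_len >= message_size)
        return false;

    memcpy(category, cat_begin, cat_len);
    category[cat_len] = '\0';
    memcpy(message, sep + 1, msg_len + 1); /* copies the terminating NUL too */
    return true;
}
```

Given the name from the IR above, this sketch produces the category `Undefined Behavior Sanitizer` and the message `Integer addition overflowed`, which a debugger can then join into a stop reason of the form `category: message`.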
@@ -368,4 +369,6 @@ https://github.com/llvm/llvm-project/commits?author=anthonyhatran [13] https://discourse.llvm.org/t/rfc-hardening-in-libc/73925 -[14] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html \ No newline at end of file +[14] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html + +[15] https://github.com/llvm/llvm-project/pull/145967#issuecomment-3054319138 \ No newline at end of file From ba5332860b7e30088f464175fcaececc223f4f66 Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Thu, 11 Sep 2025 23:26:21 -0700 Subject: [PATCH 10/14] Address debug info size increase --- content/posts/2025-09-01-gsoc-ubsan-trap-messages.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index d774d840..e7b03370 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -200,7 +200,9 @@ Using bloaty, I tested a release build of clang with the `-fsanitize-debug-trap- ``` -Note it is likely the code size difference is negligible because because in optimized builds trap instructions in a function get merged together which causes the additional debug info my patch adds to be dropped. +Note it is likely the code size difference is negligible because because in optimized builds trap instructions in a function get merged together which causes the additional debug info my patch adds to be dropped. + +An increase of size would likely be the result of extra bytes per-UBSan trap in debug_info. It would also be contingent on the number of traps emitted since a new `DW_TAG_subprogram` DIE is emitted for each trap with this new feature. [A later comparison on a larger code base ("Big Google Binary")](https://github.com/llvm/llvm-project/pull/154618#issuecomment-3225724300), actually found a rather significant size increase of about 18% with trap reasons enabled. 
Future work may involve looking into why this is happening, and how such drastic size increases can be reduced. ### Displaying the trap reason in the debugger @@ -322,7 +324,9 @@ Before I started this GSoC, I barely even knew how to build clang and LLVM or us ## Work to Do -As stated prior, the diagnostics extension for trap messages has been [upstreamed by Dan](https://github.com/llvm/llvm-project/pull/154618). As of right now, only signed and unsigned overflow for addition, subtraction, and multiplication are being used by this system. I plan to integrate what I found on my [abandoned PR](https://github.com/llvm/llvm-project/pull/153845) by building on top of what Dan has already done. This will be done after the GSoC coding period. +As stated prior, some research is needed to figure out how the size increase can be minimized. + +Also as stated previously, the diagnostics extension for trap messages has been [upstreamed by Dan](https://github.com/llvm/llvm-project/pull/154618). As of right now, only signed and unsigned overflow for addition, subtraction, and multiplication are being used by this system. I plan to integrate what I found on my [abandoned PR](https://github.com/llvm/llvm-project/pull/153845) by building on top of what Dan has already done. This will be done after the GSoC coding period. There is [an issue](https://github.com/llvm/llvm-project/issues/150707) where trap messages are not emitted in cases where they should be due to a null check. The purpose of the null check was to prevent a nullptr dereference that occurred in the debug-info prologue. This is a known issue to which there isn't a concrete solution as of current.
From 94b78c65a999a715d1c1f2e445e3a6a4b21ca362 Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Fri, 12 Sep 2025 13:04:05 -0700 Subject: [PATCH 11/14] Remove 'hacky' text --- content/posts/2025-09-01-gsoc-ubsan-trap-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index e7b03370..dbe79422 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -165,7 +165,7 @@ cont: ; preds = %entry !32 = distinct !DISubprogram(name: "__clang_trap_msg$Undefined Behavior Sanitizer$Integer addition overflowed", scope: !2, file: !2, type: !33, flags: DIFlagArtificial, spFlags: DISPFlagDefinition, unit: !14) ``` -The debug metadata for the `@llvm.ubsantrap` call is `!31`. That `DILocation` has the scope of the `DISubprogram` assigned to `!32` which is the artificial function which encodes the trap category. This function's name is formatted as `__clang_trap_msg$<category>$<message>` to encode the trap category (`Undefined Behavior Sanitizer`) and the specific message (`Integer addition overflowed`). This function does not actually exist in the compiled program. It only exists in the debug info as a convenient (albeit hacky) way to describe the reason for trapping. When a trap is hit in the debugger, the debugger retrieves this string from the debug info and shows it as the reason for trapping. +The debug metadata for the `@llvm.ubsantrap` call is `!31`. That `DILocation` has the scope of the `DISubprogram` assigned to `!32` which is the artificial function which encodes the trap category. This function's name is formatted as `__clang_trap_msg$<category>$<message>` to encode the trap category (`Undefined Behavior Sanitizer`) and the specific message (`Integer addition overflowed`). This function does not actually exist in the compiled program.
It only exists in the debug info as a convenient way to describe the reason for trapping. When a trap is hit in the debugger, the debugger retrieves this string from the debug info and shows it as the reason for trapping. Note that the `DILocation` for `!31` has `inlinedAt:` which tells us that the trap was inlined from `!32` into the location at `!29` which is the location of the `a + b` expression in the `add` function. From cd994ac3a86339d50961cedc3f069664b0978935 Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Fri, 12 Sep 2025 13:41:37 -0700 Subject: [PATCH 12/14] Replace transition --- content/posts/2025-09-01-gsoc-ubsan-trap-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index dbe79422..d3ac64e9 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -105,7 +105,7 @@ As we can see the debugging experience with UBSan traps is not ideal and improvi ## Human readable descriptions of UBSan traps in LLDB -During my GSoC project I implemented support for displaying human readable descriptions of UBSan traps in LLDB to improve the debugging experience. +The natural place to tackle the debugging experience was to look at debugger integration. 
### Let the debugger handle most of the work From c06688978b6a683ceaa73d3934b9bf631c0e8ec2 Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Fri, 12 Sep 2025 13:43:56 -0700 Subject: [PATCH 13/14] Reword opener --- content/posts/2025-09-01-gsoc-ubsan-trap-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index d3ac64e9..671528a5 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -111,7 +111,7 @@ The natural place to tackle the debugging experience was to look at debugger int ### Let the debugger handle most of the work -An alternative to this approach would be to teach debuggers (e.g. LLDB) to decode the trap reason encoded in trap instructions in the debugger. However, this approach wasn’t taken for several reasons: +One approach to this would be to teach debuggers (e.g. LLDB) to decode the trap reason encoded in trap instructions in the debugger. However, this approach wasn’t taken for several reasons: * Using the trap reason encoded in trap instructions only works for x86_64 and arm64. The approach that I used works for all targets where debug info is supported (many more). From 1637eb1055cbf9072b1c4f0dc2928ed3e7ca099f Mon Sep 17 00:00:00 2001 From: Anthony Tran Date: Sat, 13 Sep 2025 23:28:05 -0700 Subject: [PATCH 14/14] Add Michael's comment --- content/posts/2025-09-01-gsoc-ubsan-trap-messages.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md index 671528a5..c9989796 100644 --- a/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md +++ b/content/posts/2025-09-01-gsoc-ubsan-trap-messages.md @@ -130,7 +130,7 @@ One approach to this would be to teach debuggers (e.g. 
LLDB) to decode the trap The approach I took is based on how `__builtin_verbose_trap` encodes its message into debug info [11] [12], a feature which was implemented in the past for [libc++ hardening](https://discourse.llvm.org/t/rfc-hardening-in-libc/73925). The core idea is that the trap reason string gets encoded directly in the trap's debug information. -To accomplish this, we needed to find a place to "stuff" the string in the DWARF DIE tree. Using a `DW_TAG_subprogram` was deemed the most straightforward and space-efficient location. This means we create a synthetic `DISubprogram` which is not a real function in the compiled program; it exists only in the debug info as a container. While the string could have been placed elsewhere, for reasons outside the scope of this blog post, it resides on this fake function DIE, with the trap reason encoded in the `DW_TAG_subprogram`'s name. For a deeper dive into this design decision, you can see [15](https://github.com/llvm/llvm-project/pull/145967#issuecomment-3054319138). +To accomplish this, we needed to find a place to "stuff" the string in the DWARF DIE tree. Using a `DW_TAG_subprogram` was deemed the most straightforward and space-efficient location. This means we create a synthetic `DISubprogram` which is not a real function in the compiled program; it exists only in the debug info as a container. While the string could have been placed elsewhere, for reasons outside the scope of this blog post, it resides on this fake function DIE, with the trap reason encoded in the `DW_TAG_subprogram`'s name. For a deeper dive into this design decision, you can see [[15]](https://github.com/llvm/llvm-project/pull/145967#issuecomment-3054319138). 
Let's look at the LLVM IR of the previous example to see how this is implemented: @@ -202,7 +202,7 @@ Using bloaty, I tested a release build of clang with the `-fsanitize-debug-trap- Note it is likely the code size difference is negligible because because in optimized builds trap instructions in a function get merged together which causes the additional debug info my patch adds to be dropped. -An increase of size would likely be the result of extra bytes per-UBSan trap in debug_info. It would also be contingent on the number of traps emitted since a new `DW_TAG_subprogram` DIE is emitted for each trap with this new feature. [A later comparison on a larger code base ("Big Google Binary")](https://github.com/llvm/llvm-project/pull/154618#issuecomment-3225724300), actually found a rather significant size increase of about 18% with trap reasons enabled. Future work may involve looking into why this is happening, and how such drastic size increases can be reduced. +Realistically this will add a few more abbreviations into .debug_abbrev (the DWARF abbreviation section) and only a few extra bytes per-UBSAN trap (abbreviation code + 1 ULEB128 for the index into the string offset table) into .debug_info (the DWARF debug-info section). The rest of the DW_TAG_subprogram is encoded in the abbreviation for that fake frame [[16]](https://github.com/llvm/llvm-project/pull/145967#issuecomment-3068862442). It would also be contingent on the number of traps emitted since a new `DW_TAG_subprogram` DIE is emitted for each trap with this new feature. [A later comparison on a larger code base ("Big Google Binary")](https://github.com/llvm/llvm-project/pull/154618#issuecomment-3225724300), actually found a rather significant size increase of about 18% with trap reasons enabled. Future work may involve looking into why this is happening, and how such drastic size increases can be reduced. 
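The "few extra bytes per trap" estimate above (an abbreviation code plus one ULEB128 index into the string offset table) can be worked through with a small size calculation. This is a rough sketch under that stated assumption only. The function names are hypothetical, and it ignores the abbreviation entries, the strings themselves, and any other per-trap debug info.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes needed to encode `value` as ULEB128: each output byte
   carries 7 bits of payload, and zero still takes one byte. */
static size_t uleb128_size(uint64_t value) {
    size_t bytes = 0;
    do {
        value >>= 7;
        bytes++;
    } while (value != 0);
    return bytes;
}

/* Hypothetical estimate of the .debug_info bytes for one fake-frame DIE,
   assuming it consists only of an abbreviation code and a ULEB128 string
   index, as described in the text. */
static size_t per_trap_debug_info_bytes(uint64_t abbrev_code,
                                        uint64_t string_index) {
    return uleb128_size(abbrev_code) + uleb128_size(string_index);
}
```

Under these assumptions an abbreviation code of 5 and a string index of 300 cost 1 + 2 = 3 bytes, so the per-trap DIE cost alone is small, and a large overall size increase would have to come from elsewhere or from a very large number of traps.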
### Displaying the trap reason in the debugger @@ -375,4 +375,6 @@ https://github.com/llvm/llvm-project/commits?author=anthonyhatran [14] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html -[15] https://github.com/llvm/llvm-project/pull/145967#issuecomment-3054319138 \ No newline at end of file +[15] https://github.com/llvm/llvm-project/pull/145967#issuecomment-3054319138 + +[16] https://github.com/llvm/llvm-project/pull/145967#issuecomment-3068862442 \ No newline at end of file