SemaOverload: rework IsFunctionConversion #1

brunodf-snps · 2025-09-30T12:42:43Z

Local PR to demonstrate alternate approach.

…ypes (llvm#162278) When we take the following C program: ``` int main() { return 0; } ``` and create a statically-linked executable from it: ``` clang -static -g -o main main.c ``` Then we can observe the following `lldb` behavior: ``` $ lldb (lldb) target create main Current executable set to '.../main' (x86_64). (lldb) breakpoint set --name main Breakpoint 1: where = main`main + 11 at main.c:2:3, address = 0x000000000022aa7b (lldb) process launch Process 3773637 launched: '/home/me/tmp/built-in/main' (x86_64) Process 3773637 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000000000022aa7b main`main at main.c:2:3 1 int main() { -> 2 return 0; 3 } (lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("__int128").size 0 (lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("unsigned __int128").size 0 (lldb) quit ``` The value return by the `SBTarget::FindFirstType` method is wrong for the `__int128` and `unsigned __int128` basic types. The proposed changes make the `TypeSystemClang::GetBasicTypeEnumeration` method consistent with `gcc` and `clang` C [language extension](https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html) related to 128-bit integer types as well as with the `BuiltinType::getName` method in the LLVM codebase itself. When the above change is applied, the behavior of the `lldb` changes in the following (desired) way: ``` $ lldb (lldb) target create main Current executable set to '.../main' (x86_64). (lldb) breakpoint set --name main Breakpoint 1: where = main`main + 11 at main.c:2:3, address = 0x000000000022aa7b (lldb) process launch Process 3773637 launched: '/home/me/tmp/built-in/main' (x86_64) Process 3773637 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000000000022aa7b main`main at main.c:2:3 1 int main() { -> 2 return 0; 3 } (lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("__int128").size 16 (lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("unsigned __int128").size 16 (lldb) quit ``` --------- Co-authored-by: Matej Košík <[email protected]>

**Mitigation for:** google/sanitizers#749 **Disclosure:** I'm not an ASan compiler expert yet (I'm trying to learn!), I primarily work in the runtime. Some of this PR was developed with the help of AI tools (primarily as a "fuzzy `grep` engine"), but I've manually refined and tested the output, and can speak for every line. In general, I used it only to orient myself and for "rubberducking". **Context:** The msvc ASan team (👋 ) has received an internal request to improve clang's exception handling under ASan for Windows. Namely, we're interested in **mitigating** this bug: google/sanitizers#749 To summarize, today, clang + ASan produces a false-positive error for this program: ```C++ #include <cstdio> #include <exception> int main() { try { throw std::exception("test"); }catch (const std::exception& ex){ puts(ex.what()); } return 0; } ``` The error reads as such: ``` C:\Users\dajusto\source\repros\upstream>type main.cpp #include <cstdio> #include <exception> int main() { try { throw std::exception("test"); }catch (const std::exception& ex){ puts(ex.what()); } return 0; } C:\Users\dajusto\source\repros\upstream>"C:\Users\dajusto\source\repos\llvm-project\build.runtimes\bin\clang.exe" -fsanitize=address -g -O0 main.cpp C:\Users\dajusto\source\repros\upstream>a.exe ================================================================= ==19112==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x7ff72c7c11d9 bp 0x0080000ff960 sp 0x0080000fcf50 T0) ==19112==The signal is caused by a READ memory access. ==19112==Hint: address points to the zero page. #0 0x7ff72c7c11d8 in main C:\Users\dajusto\source\repros\upstream\main.cpp:8 #1 0x7ff72c7d479f in _CallSettingFrame C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm:49 llvm#2 0x7ff72c7c8944 in __FrameHandler3::CxxCallCatchBlock(struct _EXCEPTION_RECORD *) C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\frame.cpp:1567 llvm#3 0x7ffb4a90e3e5 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18012e3e5) llvm#4 0x7ff72c7c1128 in main C:\Users\dajusto\source\repros\upstream\main.cpp:6 llvm#5 0x7ff72c7c33db in invoke_main C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78 llvm#6 0x7ff72c7c33db in __scrt_common_main_seh C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288 llvm#7 0x7ffb49b05c06 (C:\WINDOWS\System32\KERNEL32.DLL+0x180035c06) llvm#8 0x7ffb4a8455ef (C:\WINDOWS\SYSTEM32\ntdll.dll+0x1800655ef) ==19112==Register values: rax = 0 rbx = 80000ff8e0 rcx = 27d76d00000 rdx = 80000ff8e0 rdi = 80000fdd50 rsi = 80000ff6a0 rbp = 80000ff960 rsp = 80000fcf50 r8 = 100 r9 = 19930520 r10 = 8000503a90 r11 = 80000fd540 r12 = 80000fd020 r13 = 0 r14 = 80000fdeb8 r15 = 0 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: access-violation C:\Users\dajusto\source\repros\upstream\main.cpp:8 in main ==19112==ABORTING ``` The root of the issue _appears to be_ that ASan's instrumentation is incompatible with Window's assumptions for instantiating `catch`-block's parameters (`ex` in the snippet above). The nitty gritty details are lost on me, but I understand that to make this work without loss of ASan coverage, a "serious" refactoring is needed. In the meantime, users risk false positive errors when pairing ASan + catch-block parameters on Windows. **To mitigate this** I think we should avoid instrumenting catch-block parameters on Windows. It appears to me this is as "simple" as marking catch block parameters as "uninteresting" in `AddressSanitizer::isInterestingAlloca`. My manual tests seem to confirm this. I believe this is strictly better than today's status quo, where the runtime generates false positives. Although we're now explicitly choosing to instrument less, the benefit is that now more programs can run with ASan without _funky_ macros that disable ASan on exception blocks. **This PR:** implements the mitigation above, and creates a simple new test for it. _Thanks!_ --------- Co-authored-by: Antonio Frighetto <[email protected]>

…nteger registers (llvm#163646) Fix the `RegisterValue::SetValueFromData` method so that it works also for 128-bit registers that contain integers. Without this change, the `RegisterValue::SetValueFromData` method does not work correctly for 128-bit registers that contain (signed or unsigned) integers. --- Steps to reproduce the problem: (1) Create a program that writes a 128-bit number to a 128-bit registers `xmm0`. E.g.: ``` #include <stdint.h> int main() { __asm__ volatile ( "pinsrq $0, %[lo], %%xmm0\n\t" // insert low 64 bits "pinsrq $1, %[hi], %%xmm0" // insert high 64 bits : : [lo]"r"(0x7766554433221100), [hi]"r"(0xffeeddccbbaa9988) ); return 0; } ``` (2) Compile this program with LLVM compiler: ``` $ $YOUR/clang -g -o main main.c ``` (3) Modify LLDB so that when it will be reading value from the `xmm0` register, instead of assuming that it is vector register, it will treat it as if it contain an integer. This can be achieved e.g. this way: ``` diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp index 0e99451..a4b51db3e56d 100644 --- a/lldb/source/Utility/RegisterValue.cpp +++ b/lldb/source/Utility/RegisterValue.cpp @@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info, break; case eEncodingUint: case eEncodingSint: + case eEncodingVector: if (reg_info.byte_size == 1) SetUInt8(src.GetMaxU32(&src_offset, src_len)); else if (reg_info.byte_size <= 2) @@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info, else if (reg_info.byte_size == sizeof(long double)) SetLongDouble(src.GetLongDouble(&src_offset)); break; - case eEncodingVector: { - m_type = eTypeBytes; - assert(reg_info.byte_size <= kMaxRegisterByteSize); - buffer.bytes.resize(reg_info.byte_size); - buffer.byte_order = src.GetByteOrder(); - if (src.CopyByteOrderedData( - src_offset, // offset within "src" to start extracting data - src_len, // src length - buffer.bytes.data(), // dst buffer - buffer.bytes.size(), // dst length - buffer.byte_order) == 0) // dst byte order - { - error = Status::FromErrorStringWithFormat( - "failed to copy data for register write of %s", reg_info.name); - return error; - } - } } if (m_type == eTypeInvalid) ``` (4) Rebuild the LLDB. (5) Observe what happens how LLDB will print the content of this register after it was initialized with 128-bit value. ``` $YOUR/lldb --source ./main (lldb) target create main Current executable set to '.../main' (x86_64). (lldb) breakpoint set --file main.c --line 11 Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d (lldb) settings set stop-line-count-before 20 (lldb) process launch Process 2568735 launched: '.../main' (x86_64) Process 2568735 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555564d main`main at main.c:11:3 1 #include <stdint.h> 2 3 int main() { 4 __asm__ volatile ( 5 "pinsrq $0, %[lo], %%xmm0\n\t" // insert low 64 bits 6 "pinsrq $1, %[hi], %%xmm0" // insert high 64 bits 7 : 8 : [lo]"r"(0x7766554433221100), 9 [hi]"r"(0xffeeddccbbaa9988) 10 ); -> 11 return 0; 12 } (lldb) register read --format hex xmm0 xmm0 = 0x7766554433221100ffeeddccbbaa9988 ``` You can see that the upper and lower 64-bit wide halves are swapped. --------- Co-authored-by: Matej Košík <[email protected]>

…lvm#162993) Early if conversion can create instruction sequences such as ``` mov x1, #1 csel x0, x1, x2, eq ``` which could be simplified into the following instead ``` csinc x0, x2, xzr, ne ``` One notable example that generates code like this is `cmpxchg weak`. This is fixed by handling an immediate value of 1 as `add(wzr, 1)` so that the addition can be folded into CSEL by using CSINC instead.

SemaOverload: rework IsFunctionConversion

e482da4

brunodf-snps mentioned this pull request Sep 30, 2025

[clang][Sema] Fix incomplete C mode incompatible ExtInfo/ExtProtoInfo conversion diagnostic llvm/llvm-project#160477

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

SemaOverload: rework IsFunctionConversion #1

SemaOverload: rework IsFunctionConversion #1

Uh oh!

brunodf-snps commented Sep 30, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

SemaOverload: rework IsFunctionConversion #1

Are you sure you want to change the base?

SemaOverload: rework IsFunctionConversion #1

Uh oh!

Conversation

brunodf-snps commented Sep 30, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant