Skip to content

Conversation

@easyonaadit
Copy link
Owner

No description provided.

@easyonaadit easyonaadit force-pushed the test-branch-wave-reduce branch from 7e9cb1c to cbeee96 Compare October 8, 2024 10:23
changes to SIISelLowering.cpp

changes to SIInstructions.td

Code cleanup in tableGen files.
@easyonaadit easyonaadit force-pushed the test-branch-wave-reduce branch from cbeee96 to b2b49a9 Compare October 8, 2024 10:28
@easyonaadit easyonaadit force-pushed the wave-reduce-add-intrinsic branch 4 times, most recently from 0a2d3aa to e58b310 Compare October 9, 2024 12:12
@easyonaadit easyonaadit force-pushed the wave-reduce-add-intrinsic branch 6 times, most recently from 45fc9f9 to 8477064 Compare October 22, 2024 07:02
easyonaadit pushed a commit that referenced this pull request Dec 6, 2024
…ne symbol size as symbols are created (llvm#117079)"

This reverts commit ba668eb.

Below test started failing again on x86_64 macOS CI. We're unsure
if this patch is the exact cause, but since this patch has broken
this test before, we speculatively revert it to see if it was indeed
the root cause.
```
FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162)
******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument]
RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
         ^
<stdin>:26:64: note: scanning from here
 frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
                                                               ^
<stdin>:27:2: note: possible intended match here
 frame #2: 0x00007ff7bfeff6c0
 ^

Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
           21:  0x100003ed1 <+0>: pushq %rbp
           22:  0x100003ed2 <+1>: movq %rsp, %rbp
           23: (lldb) thread backtrace -u
           24: * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
           25:  * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar
           26:  frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
check:21'0                                                                    X error: no match found
           27:  frame #2: 0x00007ff7bfeff6c0
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:21'1      ?                             possible intended match
           28:  frame #3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  frame #4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  frame #5: 0x00007ff8193cc41f dyld`start + 1903
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: (lldb) exit
check:21'0     ~~~~~~~~~~~~
>>>>>>
```
easyonaadit pushed a commit that referenced this pull request Dec 7, 2024
## Description

This PR fixes a segmentation fault that occurs when passing options
requiring arguments via `-Xopenmp-target=<triple>`. The issue was that
the function `Driver::getOffloadArchs` did not properly parse the
extracted option, but instead assumed it was valid, leading to a crash
when incomplete arguments were provided.

## Backtrace

```sh
llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o 
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o
1.      Compilation construction
2.      Building compilation actions
 #0 0x0000562fb21c363b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (llvm-project/build/bin/clang+++0x392f63b)
 #1 0x0000562fb21c0e3c SignalHandler(int) Signals.cpp:0:0
 #2 0x00007fcbf6c81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #3 0x0000562fb1fa5d70 llvm::opt::Option::matches(llvm::opt::OptSpecifier) const (llvm-project/build/bin/clang+++0x3711d70)
 #4 0x0000562fb2a78e7d clang::driver::Driver::getOffloadArchs(clang::driver::Compilation&, llvm::opt::DerivedArgList const&, clang::driver::Action::OffloadKind, clang::driver::ToolChain const*, bool) const (llvm-project/build/bin/clang+++0x41e4e7d)
 #5 0x0000562fb2a7a9aa clang::driver::Driver::BuildOffloadingActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, std::pair<clang::driver::types::ID, llvm::opt::Arg const*> const&, clang::driver::Action*) const (.part.1164) Driver.cpp:0:0
 #6 0x0000562fb2a7c093 clang::driver::Driver::BuildActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, llvm::SmallVector<std::pair<clang::driver::types::ID, llvm::opt::Arg const*>, 16u> const&, llvm::SmallVector<clang::driver::Action*, 3u>&) const (llvm-project/build/bin/clang+++0x41e8093)
 #7 0x0000562fb2a8395d clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) (llvm-project/build/bin/clang+++0x41ef95d)
 llvm#8 0x0000562faf92684c clang_main(int, char**, llvm::ToolContext const&) (llvm-project/build/bin/clang+++0x109284c)
 llvm#9 0x0000562faf826cc6 main (llvm-project/build/bin/clang+++0xf92cc6)
llvm#10 0x00007fcbf6699083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3
llvm#11 0x0000562faf923a5e _start (llvm-project/build/bin/clang+++0x108fa5e)
[1]    2628042 segmentation fault (core dumped)   main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu  -o
```
easyonaadit pushed a commit that referenced this pull request Dec 11, 2024
llvm#118923)

…d reentry.

These utilities provide new, more generic and easier to use support for
lazy compilation in ORC.

LazyReexportsManager is an alternative to LazyCallThroughManager. It
takes requests for lazy re-entry points in the form of an alias map:
lazy-reexports = {
  ( <entry point symbol #1>, <implementation symbol #1> ),
  ( <entry point symbol #2>, <implementation symbol #2> ),
  ...
  ( <entry point symbol #n>, <implementation symbol #n> )
}

LazyReexportsManager then:
1. binds the entry points to the implementation names in an internal
table.
2. creates a JIT re-entry trampoline for each entry point.
3. creates a redirectable symbol for each of the entry point name and
binds redirectable symbol to the corresponding reentry trampoline.

When an entry point symbol is first called at runtime (which may be on
any thread of the JIT'd program) it will re-enter the JIT via the
trampoline and trigger a lookup for the implementation symbol stored in
LazyReexportsManager's internal table. When the lookup completes the
entry point symbol will be updated (via the RedirectableSymbolManager)
to point at the implementation symbol, and execution will proceed to the
implementation symbol.

Actual construction of the re-entry trampolines and redirectable symbols
is delegated to an EmitTrampolines functor and the
RedirectableSymbolsManager respectively.

JITLinkReentryTrampolines.h provides a JITLink-based implementation of
the EmitTrampolines functor. (AArch64 only in this patch, but other
architectures will be added in the near future).

Register state save and reentry functionality is added to the ORC
runtime in the __orc_rt_sysv_resolve and __orc_rt_resolve_implementation
functions (the latter is generic, the former will need custom
implementations for each ABI and architecture to be supported, however
this should be much less effort than the existing OrcABISupport
approach, since the ORC runtime allows this code to be written as native
assembly).

The resulting system:
1. Works equally well for in-process and out-of-process JIT'd code.
2. Requires less boilerplate to set up.

Given an ObjectLinkingLayer and PlatformJD (JITDylib containing the ORC
runtime), setup is just:

```c++
auto RSMgr = JITLinkRedirectableSymbolManager::Create(OLL);
if (!RSMgr)
  return RSMgr.takeError();

auto LRMgr = createJITLinkLazyReexportsManager(OLL, **RSMgr, PlatformJD);
if (!LRMgr)
  return LRMgr.takeError();
```

after which lazy reexports can be introduced with:

```c++
JD.define(lazyReexports(LRMgr, <alias map>));
```

LazyObectLinkingLayer is updated to use this new method, but the LLVM-IR
level CompileOnDemandLayer will continue to use LazyCallThroughManager
and OrcABISupport until the new system supports a wider range of
architectures and ABIs.

The llvm-jitlink utility's -lazy option now uses the new scheme. Since
it depends on the ORC runtime, the lazy-link.ll testcase and associated
helpers are moved to the ORC runtime.
easyonaadit added a commit that referenced this pull request Dec 18, 2024
easyonaadit pushed a commit that referenced this pull request Dec 20, 2024
According to the documentation described at
https://github.com/loongson/la-abi-specs/blob/release/ladwarf.adoc, the
dwarf numbers for floating-point registers range from 32 to 63.

An incorrect dwarf number will prevent the register values from being
properly restored during unwinding.

This test reflects this problem:

```
loongson@linux:~$ cat test.c

void foo() {
  asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
}
int main() {
  asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
  foo();
  return 0;
}

loongson@linux:~$ clang -g test.c  -o test

```
Without this patch:
```
loongson@linux:~$ ./_build/bin/lldb ./t
(lldb) target create "./t"
Current executable set to
'/home/loongson/llvm-project/_build_lldb/t' (loongarch64).
(lldb) b foo
Breakpoint 1: where = t`foo + 20 at test.c:4:1, address =
0x0000000000000714
(lldb) r
Process 2455626 launched: '/home/loongson/llvm-project/_build_lldb/t' (loongarch64)
Process 2455626 stopped
* thread #1, name = 't', stop reason = breakpoint 1.1
    frame #0: 0x0000555555554714 t`foo at test.c:4:1
   1    #include <stdio.h>
   2
   3    void foo() {
-> 4    asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
(lldb) si
Process 2455626 stopped
* thread #1, name = 't', stop reason = instruction step into
    frame #0: 0x0000555555554718 t`foo at test.c:4:1
   1    #include <stdio.h>
   2
   3    void foo() {
-> 4    asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
(lldb) f 1
frame #1: 0x0000555555554768 t`main at test.c:8:1
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
-> 8    foo();
   9    return 0;
   10   }
(lldb) register read -a
General Purpose Registers:
        r1 = 0x0000555555554768  t`main + 40 at test.c:8:1
        r3 = 0x00007ffffffef780
       r22 = 0x00007ffffffef7b0
       r23 = 0x00007ffffffef918
       r24 = 0x0000000000000001
       r25 = 0x0000000000000000
       r26 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
       r27 = 0x0000555555554740  t`main at test.c:6
       r28 = 0x00007ffffffef928
       r29 = 0x00007ffff7febc88  ld-linux-loongarch-lp64d.so.1`_rtld_global_ro
       r30 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
        pc = 0x0000555555554768  t`main + 40 at test.c:8:1
33 registers were unavailable.

Floating Point Registers:
       f13 = 0x00007ffffffef780 !!!!! wrong register
       f24 = 0xffffffffffffffff
       f25 = 0xffffffffffffffff
       f26 = 0x0000555555554768  t`main + 40 at test.c:8:1
       f27 = 0xffffffffffffffff
       f28 = 0xffffffffffffffff
       f29 = 0xffffffffffffffff
       f30 = 0xffffffffffffffff
       f31 = 0xffffffffffffffff
32 registers were unavailable.
```
With this patch:
```
The previous operations are the same.
(lldb) register read -a
General Purpose Registers:
        r1 = 0x0000555555554768  t`main + 40 at test.c:8:1
        r3 = 0x00007ffffffef780
       r22 = 0x00007ffffffef7b0
       r23 = 0x00007ffffffef918
       r24 = 0x0000000000000001
       r25 = 0x0000000000000000
       r26 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
       r27 = 0x0000555555554740  t`main at test.c:6
       r28 = 0x00007ffffffef928
       r29 = 0x00007ffff7febc88  ld-linux-loongarch-lp64d.so.1`_rtld_global_ro
       r30 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
        pc = 0x0000555555554768  t`main + 40 at test.c:8:1
33 registers were unavailable.

Floating Point Registers:
       f24 = 0xffffffffffffffff
       f25 = 0xffffffffffffffff
       f26 = 0x00007ffffffef780
       f27 = 0xffffffffffffffff
       f28 = 0xffffffffffffffff
       f29 = 0xffffffffffffffff
       f30 = 0xffffffffffffffff
       f31 = 0xffffffffffffffff
33 registers were unavailable.
```

Reviewed By: SixWeining

Pull Request: llvm#120391
easyonaadit pushed a commit that referenced this pull request Jan 21, 2025
Fix for the Coverity hit with CID1579964 in VPlan.cpp.

Coverity message with some context follows.

[Cov] var_compare_op: Comparing TermBr to null implies that TermBr might
be null.
434    } else if (TermBr && !TermBr->isConditional()) {
435      TermBr->setSuccessor(0, NewBB);
436    } else {
437 // Set each forward successor here when it is created, excluding
438 // backedges. A backward successor is set when the branch is
created.
439      unsigned idx = PredVPSuccessors.front() == this ? 0 : 1;
     	
[Cov] CID 1579964: (#1 of 1): Dereference after null check
(FORWARD_NULL)
[Cov] var_deref_model: Passing null pointer TermBr to getSuccessor,
which dereferences it.
easyonaadit pushed a commit that referenced this pull request Jan 21, 2025
This will be sent by Arm's Guarded Control Stack extension when an
invalid return is executed.

The signal does have an address we could show, but it's the PC at which
the fault occured. The debugger has plenty of ways to show you that
already, so I've left it out.

```
(lldb) c
Process 460 resuming
Process 460 stopped
* thread #1, name = 'test', stop reason = signal SIGSEGV: control protection fault
    frame #0: 0x0000000000400784 test`main at main.c:57:1
   54  	  afunc();
   55  	  printf("return from main\n");
   56  	  return 0;
-> 57  	}
(lldb) dis
<...>
->  0x400784 <+100>: ret
```

The new test case generates the signal by corrupting the link register
then attempting to return. This will work whether we manually enable GCS
or the C library does it for us.

(in the former case you could just return from main and it would fault)
easyonaadit pushed a commit that referenced this pull request Jan 23, 2025
llvm#123877)

Reverts llvm#122811 due to buildbot breakage e.g.,
https://lab.llvm.org/buildbot/#/builders/52/builds/5421/steps/11/logs/stdio

ASan output from local re-run:
```
==2780289==ERROR: AddressSanitizer: use-after-poison on address 0x7e0b87e28d28 at pc 0x55a979a99e7e bp 0x7ffe4b18f0b0 sp 0x7ffe4b18f0a8
READ of size 1 at 0x7e0b87e28d28 thread T0
    #0 0x55a979a99e7d in getStorageClass /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:344
    #1 0x55a979a99e7d in isSectionDefinition /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:429:9
    #2 0x55a979a99e7d in getSymbols /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:54:42
    #3 0x55a979a99e7d in lld::coff::writeLLDMapFile(lld::coff::COFFLinkerContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:103:40
    #4 0x55a979a16879 in (anonymous namespace)::Writer::run() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:810:3
    #5 0x55a979a00aac in lld::coff::writeResult(lld::coff::COFFLinkerContext&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:354:15
    #6 0x55a97985f7ed in lld::coff::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:2826:3
    #7 0x55a97984cdd3 in lld::coff::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:97:15
    llvm#8 0x55a9797f9793 in lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:163:12
    llvm#9 0x55a9797fa3b6 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:188:15
    llvm#10 0x55a9797fa3b6 in void llvm::function_ref<void ()>::callback_fn<lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>)::$_0>(long) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
    llvm#11 0x55a97966cb93 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:12
    llvm#12 0x55a97966cb93 in llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
    llvm#13 0x55a9797f9dc3 in lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:187:14
    llvm#14 0x55a979627512 in lld_main(int, char**, llvm::ToolContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/tools/lld/lld.cpp:103:14
    llvm#15 0x55a979628731 in main /usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/tools/lld/tools/lld/lld-driver.cpp:17:10
    llvm#16 0x7ffb8b202c89 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    llvm#17 0x7ffb8b202d44 in __libc_start_main csu/../csu/libc-start.c:360:3
    llvm#18 0x55a97953ef60 in _start (/usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/bin/lld+0x8fd1f60)
```
easyonaadit pushed a commit that referenced this pull request Jan 24, 2025
Prevents avoidable memory leaks.

Looks like exchange added in aa1333a
didn't take "continue" into account.

```
==llc==2150782==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x5f1b0f9ac14a in strdup llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:593:3
    #1 0x5f1b1768428d in FileToRemoveList llvm-project/llvm/lib/Support/Unix/Signals.inc:105:55
```
easyonaadit pushed a commit that referenced this pull request Feb 10, 2025
…ible (llvm#123752)

This patch adds a new option `-aarch64-enable-zpr-predicate-spills`
(which is disabled by default), this option replaces predicate spills
with vector spills in streaming[-compatible] functions.

For example:

```
str	p8, [sp, #7, mul vl]            // 2-byte Folded Spill
// ...
ldr	p8, [sp, #7, mul vl]            // 2-byte Folded Reload
```

Becomes:

```
mov	z0.b, p8/z, #1
str	z0, [sp]                        // 16-byte Folded Spill
// ...
ldr	z0, [sp]                        // 16-byte Folded Reload
ptrue	p4.b
cmpne	p8.b, p4/z, z0.b, #0
```

This is done to avoid streaming memory hazards between FPR/vector and
predicate spills, which currently occupy the same stack area even when
the `-aarch64-stack-hazard-size` flag is set.

This is implemented with two new pseudos SPILL_PPR_TO_ZPR_SLOT_PSEUDO
and FILL_PPR_FROM_ZPR_SLOT_PSEUDO. The expansion of these pseudos
handles scavenging the required registers (z0 in the above example) and,
in the worst case spilling a register to an emergency stack slot in the
expansion. The condition flags are also preserved around the `cmpne` in
case they are live at the expansion point.
easyonaadit pushed a commit that referenced this pull request Feb 10, 2025
…StrictPackMatch field (llvm#126215)

This addresses the MSAN failure reported
in
llvm#125791 (comment):
```
==5633==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 in clang::ASTNodeImporter::CallOverloadedCreateFun<clang::ClassTemplateSpecializationDecl>::operator()
    #1 in bool clang::ASTNodeImporter::GetImportedOrCreateSpecialDecl<...>
...
```

The ASTImporter reads `D->hasStrictPackMatch()` and forwards it to the
constructor of the destination `ClassTemplateSpecializationDecl`. But if
`D` is a decl that LLDB created from debug-info, it would've been
created using `ClassTemplateSpecializationDecl::CreateDeserialized`,
which doesn't initialize the `StrictPackMatch` field.

This patch just initializes the field to a fixed value of `false`, to
preserve previous behaviour and avoid the use-of-uninitialized-value.

An alternative would be to always initialize it in the
`ClassTemplateSpecializationDecl` constructor, but there were
reservations about providing a default value for it because it might
lead to hard-to-diagnose problems down the line.
easyonaadit pushed a commit that referenced this pull request Apr 8, 2025
…d A520 (llvm#132246)

Inefficient SVE codegen occurs on at least two in-order cores,
those being Cortex-A510 and Cortex-A520. For example a simple vector
add

```
void foo(float a, float b, float dst, unsigned n) {
    for (unsigned i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Vectorizes the inner loop into the following interleaved sequence
of instructions.

```
        add     x12, x1, x10
        ld1b    { z0.b }, p0/z, [x1, x10]
        add     x13, x2, x10
        ld1b    { z1.b }, p0/z, [x2, x10]
        ldr     z2, [x12, #1, mul vl]
        ldr     z3, [x13, #1, mul vl]
        dech    x11
        add     x12, x0, x10
        fadd    z0.s, z1.s, z0.s
        fadd    z1.s, z3.s, z2.s
        st1b    { z0.b }, p0, [x0, x10]
        addvl   x10, x10, #2
        str     z1, [x12, #1, mul vl]
```

By adjusting the target features to prefer fixed over scalable if the
cost is equal we get the following vectorized loop.

```
         ldp q0, q3, [x11, #-16]
         subs    x13, x13, llvm#8
         ldp q1, q2, [x10, #-16]
         add x10, x10, llvm#32
         add x11, x11, llvm#32
         fadd    v0.4s, v1.4s, v0.4s
         fadd    v1.4s, v2.4s, v3.4s
         stp q0, q1, [x12, #-16]
         add x12, x12, llvm#32
```

Which is more efficient.
easyonaadit pushed a commit that referenced this pull request Apr 8, 2025
… A510/A520 (llvm#134606)

Recommit. This work was done by llvm#132246 but failed buildbots due to the
test introduced needing updates

Inefficient SVE codegen occurs on at least two in-order cores, those
being Cortex-A510 and Cortex-A520. For example a simple vector add

```
void foo(float a, float b, float dst, unsigned n) {
    for (unsigned i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Vectorizes the inner loop into the following interleaved sequence of
instructions.

```
        add     x12, x1, x10
        ld1b    { z0.b }, p0/z, [x1, x10]
        add     x13, x2, x10
        ld1b    { z1.b }, p0/z, [x2, x10]
        ldr     z2, [x12, #1, mul vl]
        ldr     z3, [x13, #1, mul vl]
        dech    x11
        add     x12, x0, x10
        fadd    z0.s, z1.s, z0.s
        fadd    z1.s, z3.s, z2.s
        st1b    { z0.b }, p0, [x0, x10]
        addvl   x10, x10, #2
        str     z1, [x12, #1, mul vl]
```

By adjusting the target features to prefer fixed over scalable if the
cost is equal we get the following vectorized loop.

```
         ldp q0, q3, [x11, #-16]
         subs    x13, x13, llvm#8
         ldp q1, q2, [x10, #-16]
         add x10, x10, llvm#32
         add x11, x11, llvm#32
         fadd    v0.4s, v1.4s, v0.4s
         fadd    v1.4s, v2.4s, v3.4s
         stp q0, q1, [x12, #-16]
         add x12, x12, llvm#32
```

Which is more efficient.
easyonaadit pushed a commit that referenced this pull request Apr 17, 2025
…s=128. (llvm#134068)

When compiling with -msve-vector-bits=128 or vscale_range(1, 1) and when
the offsets allow it, we can pair SVE LDR/STR instructions into Neon
LDP/STP.

For example, given:
```cpp
#include <arm_sve.h>

void foo(double const *ldp, double *stp) {
  svbool_t pg = svptrue_b64();
  svfloat64_t ld1 = svld1_f64(pg, ldp);
  svfloat64_t ld2 = svld1_f64(pg, ldp+svcntd());
  svst1_f64(pg, stp, ld1);
  svst1_f64(pg, stp+svcntd(), ld2);
}
```

When compiled with `-msve-vector-bits=128`, we currently generate:
```gas
foo:
        ldr     z0, [x0]
        ldr     z1, [x0, #1, mul vl]
        str     z0, [x1]
        str     z1, [x1, #1, mul vl]
        ret
```

With this patch, we instead generate:
```gas
foo:
        ldp     q0, q1, [x0]
        stp     q0, q1, [x1]
        ret
```

This is an alternative, more targetted approach to llvm#127500.
easyonaadit pushed a commit that referenced this pull request Apr 17, 2025
…ctor-bits=128." (llvm#134997)

Reverts llvm#134068

Caused a stage 2 build failure:
https://lab.llvm.org/buildbot/#/builders/41/builds/6016

```
FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o 
/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror=global-constructors -O3 -DNDEBUG -std=c++17 -UNDEBUG  -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp
Opcode has unknown scale!
UNREACHABLE executed at ../llvm/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:4530!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror=global-constructors -O3 -DNDEBUG -std=c++17 -UNDEBUG -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp'.
4.	Running pass 'AArch64 load / store optimization pass' on function '@"_ZNSt17_Function_handlerIFN4llvm8ExpectedISt8functionIFNS1_ISt10unique_ptrINS0_16CachedFileStreamESt14default_deleteIS4_EEEEjRKNS0_5TwineEEEEEjNS0_9StringRefESB_EZNS0_10localCacheESB_SB_SB_S2_IFvjSB_S3_INS0_12MemoryBufferES5_ISH_EEEEE3$_0E9_M_invokeERKSt9_Any_dataOjOSF_SB_"'
 #0 0x0000b6eae9b67bf0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x81c7bf0)
 #1 0x0000b6eae9b65aec llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x81c5aec)
 #2 0x0000b6eae9acd5f4 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x0000f16c1aff28f8 (linux-vdso.so.1+0x8f8)
 #4 0x0000f16c1aacf1f0 __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x0000f16c1aa8a67c gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #6 0x0000f16c1aa77130 abort ./stdlib/abort.c:81:7
 #7 0x0000b6eae9ad6628 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x8136628)
 llvm#8 0x0000b6eae72e95a8 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x59495a8)
 llvm#9 0x0000b6eae74ca9a8 (anonymous namespace)::AArch64LoadStoreOpt::findMatchingInsn(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, (anonymous namespace)::LdStPairFlags&, unsigned int, bool) AArch64LoadStoreOptimizer.cpp:0:0
llvm#10 0x0000b6eae74c85a8 (anonymous namespace)::AArch64LoadStoreOpt::tryToPairLdStInst(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&) AArch64LoadStoreOptimizer.cpp:0:0
llvm#11 0x0000b6eae74c624c (anonymous namespace)::AArch64LoadStoreOpt::optimizeBlock(llvm::MachineBasicBlock&, bool) AArch64LoadStoreOptimizer.cpp:0:0
llvm#12 0x0000b6eae74c429c (anonymous namespace)::AArch64LoadStoreOpt::runOnMachineFunction(llvm::MachineFunction&) AArch64LoadStoreOptimizer.cpp:0:0
```
easyonaadit pushed a commit that referenced this pull request Apr 17, 2025
…vailable (llvm#135343)

When a frame is inlined, LLDB will display its name in backtraces as
follows:
```
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.3
  * frame #0: 0x0000000100000398 a.out`func() [inlined] baz(x=10) at inline.cpp:1:42
    frame #1: 0x0000000100000398 a.out`func() [inlined] bar() at inline.cpp:2:37
    frame #2: 0x0000000100000398 a.out`func() at inline.cpp:4:15
    frame #3: 0x00000001000003c0 a.out`main at inline.cpp:7:5
    frame #4: 0x000000026eb29ab8 dyld`start + 6812
```
The longer the names get the more confusing this gets because the first
function name that appears is the parent frame. My assumption (which may
need some more surveying) is that for the majority of cases we only care
about the actual frame name (not the parent). So this patch removes all
the special logic that prints the parent frame.

Another quirk of the current format is that the inlined frame name does
not abide by the `${function.name-XXX}` format variables. We always just
print the raw demangled name. With this patch, we would format the
inlined frame name according to the `frame-format` setting (see the
test-cases).

If we really want to have the `parentFrame [inlined] inlinedFrame`
format, we could expose it through a new `frame-format` variable (e..g.,
`${function.inlined-at-name}` and let the user decide where to place
things.
easyonaadit pushed a commit that referenced this pull request Apr 17, 2025
Currently, given:
```cpp
uint64_t incb(uint64_t x) {
  return x+svcntb();
}
```
LLVM generates:
```gas
incb:
        addvl   x0, x0, #1
        ret
```
Which is equivalent to:
```gas
incb:
        incb    x0
        ret
```

However, on microarchitectures like the Neoverse V2 and Neoverse V3,
the second form (with INCB) can have significantly better latency and
throughput (according to their SWOG). On the Neoverse V2, for example,
ADDVL has a latency and throughput of 2, whereas some forms of INCB
have a latency of 1 and a throughput of 4. The same applies to DECB.
This patch adds patterns to prefer the cheaper INCB/DECB forms over
ADDVL where applicable.
easyonaadit pushed a commit that referenced this pull request Apr 25, 2025
- Avoid dereferencing the end() iterator to get the end pointer, instead
calculate it explicitly
- Fixes a regression introduced in
llvm#136220.
- The windows build failure shows the following call stack:

```
 | Exception Code: 0x80000003
 |  #0 0x00007ff74bc05897 std::_Vector_const_iterator<class std::_Vector_val<struct std::_Simple_types<unsigned char>>>::operator*(void) const C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.37.32822\include\vector:52:0
 |  #1 0x00007ff74bbd3d64 `anonymous namespace'::DecoderEmitter::emitTable D:\buildbot\llvm-worker\clang-cmake-x86_64-avx512-win\llvm\llvm\utils\TableGen\DecoderEmitter.cpp:852:0
```
easyonaadit pushed a commit that referenced this pull request Apr 25, 2025
… collection (llvm#136795)

Fix a [test
failure](llvm#136236 (comment))
in llvm#136236, apply a minor renaming of statistics, and remerge. See
details below.

# Changes in llvm#136236

Currently, `DebuggerStats::ReportStatistics()` calls
`Module::GetSymtab(/*can_create=*/false)`, but then the latter calls
`SymbolFile::GetSymtab()`. This will load symbols if haven't yet. See
stacktrace below.

The problem is that `DebuggerStats::ReportStatistics` should be
read-only. This is especially important because it reports stats for
symtab parsing/indexing time, which could be affected by the reporting
itself if it's not read-only.

This patch fixes this problem by adding an optional parameter
`SymbolFile::GetSymtab(bool can_create = true)` and receiving the
`false` value passed down from `Module::GetSymtab(/*can_create=*/false)`
when the call is initiated from `DebuggerStats::ReportStatistics()`.

---

Notes about the following stacktrace:
1. This can be reproduced. Create a helloworld program on **macOS** with
dSYM, add `settings set target.preload-symbols false` to `~/.lldbinit`,
do `lldb a.out`, then `statistics dump`.
2. `ObjectFile::GetSymtab` has `llvm::call_once`. So the fact that it
called into `ObjectFileMachO::ParseSymtab` means that the symbol table
is actually being parsed.

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x0000000124c4d5a0 LLDB`ObjectFileMachO::ParseSymtab(this=0x0000000111504e40, symtab=0x0000600000a05e00) at ObjectFileMachO.cpp:2259:44
  * frame #1: 0x0000000124fc50a0 LLDB`lldb_private::ObjectFile::GetSymtab()::$_0::operator()(this=0x000000016d35c858) const at ObjectFile.cpp:761:9
    frame #5: 0x0000000124fc4e68 LLDB`void std::__1::__call_once_proxy[abi:v160006]<std::__1::tuple<lldb_private::ObjectFile::GetSymtab()::$_0&&>>(__vp=0x000000016d35c7f0) at mutex:652:5
    frame #6: 0x0000000198afb99c libc++.1.dylib`std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*)) + 196
    frame #7: 0x0000000124fc4dd0 LLDB`void std::__1::call_once[abi:v160006]<lldb_private::ObjectFile::GetSymtab()::$_0>(__flag=0x0000600003920080, __func=0x000000016d35c858) at mutex:670:9
    frame llvm#8: 0x0000000124fc3cb0 LLDB`void llvm::call_once<lldb_private::ObjectFile::GetSymtab()::$_0>(flag=0x0000600003920080, F=0x000000016d35c858) at Threading.h:88:5
    frame llvm#9: 0x0000000124fc2bc4 LLDB`lldb_private::ObjectFile::GetSymtab(this=0x0000000111504e40) at ObjectFile.cpp:755:5
    frame llvm#10: 0x0000000124fe0a28 LLDB`lldb_private::SymbolFileCommon::GetSymtab(this=0x0000000104865200) at SymbolFile.cpp:158:39
    frame llvm#11: 0x0000000124d8fedc LLDB`lldb_private::Module::GetSymtab(this=0x00000001113041a8, can_create=false) at Module.cpp:1027:21
    frame llvm#12: 0x0000000125125bdc LLDB`lldb_private::DebuggerStats::ReportStatistics(debugger=0x000000014284d400, target=0x0000000115808200, options=0x000000014195d6d1) at Statistics.cpp:329:30
    frame llvm#13: 0x0000000125672978 LLDB`CommandObjectStatsDump::DoExecute(this=0x000000014195d540, command=0x000000016d35d820, result=0x000000016d35e150) at CommandObjectStats.cpp:144:18
    frame llvm#14: 0x0000000124f29b40 LLDB`lldb_private::CommandObjectParsed::Execute(this=0x000000014195d540, args_string="", result=0x000000016d35e150) at CommandObject.cpp:832:9
    frame llvm#15: 0x0000000124efbd70 LLDB`lldb_private::CommandInterpreter::HandleCommand(this=0x0000000141b22f30, command_line="statistics dump", lazy_add_to_history=eLazyBoolCalculate, result=0x000000016d35e150, force_repeat_command=false) at CommandInterpreter.cpp:2134:14
    frame llvm#16: 0x0000000124f007f4 LLDB`lldb_private::CommandInterpreter::IOHandlerInputComplete(this=0x0000000141b22f30, io_handler=0x00000001419b2aa8, line="statistics dump") at CommandInterpreter.cpp:3251:3
    frame llvm#17: 0x0000000124d7b5ec LLDB`lldb_private::IOHandlerEditline::Run(this=0x00000001419b2aa8) at IOHandler.cpp:588:22
    frame llvm#18: 0x0000000124d1e8fc LLDB`lldb_private::Debugger::RunIOHandlers(this=0x000000014284d400) at Debugger.cpp:1225:16
    frame llvm#19: 0x0000000124f01f74 LLDB`lldb_private::CommandInterpreter::RunCommandInterpreter(this=0x0000000141b22f30, options=0x000000016d35e63c) at CommandInterpreter.cpp:3543:16
    frame llvm#20: 0x0000000122840294 LLDB`lldb::SBDebugger::RunCommandInterpreter(this=0x000000016d35ebd8, auto_handle_events=true, spawn_thread=false) at SBDebugger.cpp:1212:42
    frame llvm#21: 0x0000000102aa6d28 lldb`Driver::MainLoop(this=0x000000016d35ebb8) at Driver.cpp:621:18
    frame llvm#22: 0x0000000102aa75b0 lldb`main(argc=1, argv=0x000000016d35f548) at Driver.cpp:829:26
    frame llvm#23: 0x0000000198858274 dyld`start + 2840
```

# Changes in this PR top of the above

Fix a [test
failure](llvm#136236 (comment))
in `TestStats.py`. The original version of the added test checks that
all modules have symbol count zero when `target.preload-symbols ==
false`. The test failed on macOS. Due to various reasons, on macOS,
symbols can be loaded for dylibs even with that setting, but not for the
main module. For now, the fix of the test is to limit the assertion to
only the main module. The test now passes on macOS. In the future, when
we have a way to control a specific list of plug-ins to be loaded, there
may be a configuration that this test can use to assert that all modules
have symbol count zero.

Apply a minor renaming of statistics, per the
[suggestion](llvm#136226 (comment))
in llvm#136226 after merge.
easyonaadit pushed a commit that referenced this pull request Jun 18, 2025
These were failing on our Windows on Arm bot, or more precisely,
not even completing.

This is because Microsoft's C runtime does extra parameter validation.
So when we called _read with an invalid fd, it called an invalid
parameter handler instead of returning an error.

https://learn.microsoft.com/en-us/%20cpp/c-runtime-library/reference/read?view=msvc-170
https://learn.microsoft.com/en-us/%20cpp/c-runtime-library/parameter-validation?view=msvc-170

(lldb) run
Process 8440 launched: 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\unittests\Host\HostTests.exe' (aarch64)
Process 8440 stopped
* thread #1, stop reason = Exception 0xc0000409 encountered at address 0x7ffb7453564c
    frame #0: 0x00007ffb7453564c ucrtbase.dll`_get_thread_local_invalid_parameter_handler + 652
ucrtbase.dll`_get_thread_local_invalid_parameter_handler:
->  0x7ffb7453564c <+652>: brk    #0xf003

ucrtbase.dll`_invalid_parameter_noinfo:
    0x7ffb74535650 <+0>:   b      0x7ffb745354d8 ; _get_thread_local_invalid_parameter_handler + 280
    0x7ffb74535654 <+4>:   nop
    0x7ffb74535658 <+8>:   nop

You can override this handler but I'm assuming that this reading
after close isn't a crucial feature, so disabling the tests seems
like the way to go.

If it is crucial, we can check the fd before we use it.

Tests added by llvm#143946.
easyonaadit pushed a commit that referenced this pull request Jul 19, 2025
Fix unnecessary conversion of C-String to StringRef in the `Cmp` lambda
inside `lookupLLVMIntrinsicByName`. This both fixes an ASAN error in the
code that happens when the `Name` StringRef passed in is not a Null
terminated StringRef, and additionally can potentially speed up the code
as well by eliminating the unnecessary computation of string length
every time a C String is converted to StringRef in this code (It seems
practically this computation is eliminated in optimized builds, but this
will avoid it in O0 builds as well).

Added a unit test that demonstrates this issue by building LLVM with
these options:

```
CMAKE_BUILD_TYPE=Debug
LLVM_USE_SANITIZER=Address
LLVM_OPTIMIZE_SANITIZED_BUILDS=OFF
```

The error reported is as follows:

```
==462665==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5030000391a2 at pc 0x56525cc30bbf bp 0x7fff9e4ccc60 sp 0x7fff9e4cc428
READ of size 19 at 0x5030000391a2 thread T0
    #0 0x56525cc30bbe in strlen (upstream-llvm-second/llvm-project/build/unittests/IR/IRTests+0x713bbe) (BuildId: 0651acf1e582a4d2)
    #1 0x7f8ff22ad334 in std::char_traits<char>::length(char const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/char_traits.h:399:9
    #2 0x7f8ff22a34a0 in llvm::StringRef::StringRef(char const*) /home/rjoshi/upstream-llvm-second/llvm-project/llvm/include/llvm/ADT/StringRef.h:96:33
    #3 0x7f8ff28ca184 in _ZZL25lookupLLVMIntrinsicByNameN4llvm8ArrayRefIjEENS_9StringRefES2_ENK3$_0clIjPKcEEDaT_T0_ upstream-llvm-second/llvm-project/llvm/lib/IR/Intrinsics.cpp:673:18
```
easyonaadit pushed a commit that referenced this pull request Jul 19, 2025
…lvm#148205)

In the original motivating test case,
[FoldList](https://github.com/llvm/llvm-project/blob/d8a2141ff98ee35cd1886f536ccc3548b012820b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp#L1764)
had entries:
```
  #0: UseMI: %224:sreg_32 = S_OR_B32 %219.sub0:sreg_64, %219.sub1:sreg_64, implicit-def dead $scc
      UseOpNo: 1

  #1: UseMI: %224:sreg_32 = S_OR_B32 %219.sub0:sreg_64, %219.sub1:sreg_64, implicit-def dead $scc
      UseOpNo: 2
```
After calling
[updateOperand(#0)](https://github.com/llvm/llvm-project/blob/d8a2141ff98ee35cd1886f536ccc3548b012820b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp#L1773),
[tryConstantFoldOp(#0.UseMI)](https://github.com/llvm/llvm-project/blob/d8a2141ff98ee35cd1886f536ccc3548b012820b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp#L1786)
removed operand 1, and entry #&llvm#8203;1.UseOpNo was no longer valid,
resulting in an
[assert](https://github.com/llvm/llvm-project/blob/4a35214bddbb67f9597a500d48ab8c4fb25af150/llvm/include/llvm/ADT/ArrayRef.h#L452).

This change defers constant folding until all operands have been updated
so that UseOpNo values remain stable.
easyonaadit pushed a commit that referenced this pull request Oct 3, 2025
Specifically, `X & M ?= C --> (C << clz(M)) ?= (X << clz(M))` where M is
a non-empty sequence of ones starting at the least significant bit with
the remainder zero and C is a constant subset of M that cannot be
materialised into a SUBS (immediate). Proof:
https://alive2.llvm.org/ce/z/haqdJ4.

This improves the comparison in isinf, for example:
```cpp
int isinf(float x) {
  return __builtin_isinf(x);
}
```

Before:
```
isinf:
  fmov    w9, s0
  mov     w8, #2139095040
  and     w9, w9, #0x7fffffff
  cmp     w9, w8
  cset    w0, eq
  ret
```

After:
```
isinf:
  fmov    w9, s0
  mov     w8, #-16777216
  cmp     w8, w9, lsl #1
  cset    w0, eq
  ret
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant