Skip to content

Merge tag 'llvmorg-20.1.8' into emscripten-libs-20 #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 84 commits into from
Jul 21, 2025

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Jul 21, 2025

No description provided.

tstellar and others added 30 commits April 29, 2025 17:24
925e195 introduced a regression since which we started to accept invalid
trailing commas in many expression lists where they're not allowed by
the grammar. The issue came from the fact that an additional invalid
state - previously handled by ParseExpressionList - was overlooked in
that patch.

Fixes llvm#136254

No release entry because I want to backport it.

(cherry picked from commit c7daab2)
TargetLoweringBase::getIRStackGuard refers to a platform-specific guard
variable. Before this change, TargetLoweringBase::getSDagStackGuard only
referred to a different variable.

This means that SelectionDAGBuilder's getLoadStackGuard does not get
memory operands. However, AArch64InstrInfo::expandPostRAPseudo assumes
that the passed MachineInstr has nonzero memoperands, causing a
segfault.

We have two possible options here: either disabling the LOAD_STACK_GUARD
node entirely in AArch64TargetLowering::useLoadStackGuardNode or just
making the platform-specific values match across TargetLoweringBase.
Here, we try the latter.

(cherry picked from commit c180e24)
This caused ZT0 to be preserved around `__arm_tpidr2_save` in functions
with "aarch64_new_zt0". The block in which `__arm_tpidr2_save` is called
is added by the SMEABIPass and may be reachable in cases where ZA has
not been enabled* (so using `str zt0` is invalid).

* (when za_save_buffer is null and num_za_save_slices is zero)
…m#136726)

In llvm#132722 spills of ZT0 were disabled around all SME ABI routines to
avoid a case where ZT0 is spilled before ZA is enabled (resulting in a
crash).

It turns out that the ABI does not promise that routines will preserve
ZT0 (however in practice they do), so generally disabling ZT0 spills for
ABI routines is not correct.

The case where a crash was possible was "aarch64_new_zt0" functions with
ZA disabled on entry and a ZT0 spill around __arm_tpidr2_save. In this
case, ZT0 will be undefined at the call to __arm_tpidr2_save, so this
patch avoids the ZT0 spill by marking the callsite with
"aarch64_zt0_undef". This attribute only applies to callsites and marks
that at the point the call is made ZT0 is not defined, so does not need
preserving.
A PURE subprogram can't have a local variable with the SAVE attribute.
An ASSOCIATE or SELECT TYPE construct entity whose selector is a
variable will return true from IsSave(); exclude them from the local
variable check.

Fixes llvm#131356.

(cherry picked from commit b99dab2)
llvm#137286)

…dy.py

Currently, run_clang_tidy.py does not correctly display the list of
checks picked up from the top-level .clang-tidy file. The reason for
that is that we are passing an empty string as input file.

However, that's not how we are supposed to use clang-tidy to list
checks. Per
llvm@65eccb4,
we simply should not pass any file at all - the internal code of
clang-tidy will pass a "dummy" file if that's the case and get the
.clang-tidy file from the current working directory.

Fixes llvm#136659

Co-authored-by: Carlos Gálvez <[email protected]>
(cherry picked from commit 014ab73)
…able. (llvm#135769)

If we are attempting to combine shuffle+bitcast but the bitcast is
pairable with a subsequent bitcast, we should not fold the shuffle as
doing so can block further simplifications.

The motivation for this is a long-standing regression affecting SIMDe on
AArch64, introduced indirectly by the AlwaysInliner (1a2e77c). Some
reproducers:
* https://godbolt.org/z/53qx18s6M
* https://godbolt.org/z/o5e43h5M7

(cherry picked from commit c91c3f9)
This reverts commit 6a1bdd9 and re-instate behavior that matches what
MSVC link.exe does, that is, error out when trying to dllimport a symbol
from a static library.

A hint is now displayed in stdout, mentioning that we should rather dllimport the symbol
from a import library.

Fixes llvm#131807
Summary:
Different ordering modes aren't supported for an atomic load, so we just
do an add of zero as the same thing. It's less efficient, but it works.

Fixes llvm#138560

(cherry picked from commit dfcb8cb)
If the LocationSize is larger than the index space of the pointer type,
bail out instead of triggering an APInt assertion.

Fixes the issue reported at
llvm#119365 (comment).

(cherry picked from commit 027b203)
llvm#138091)

Check this error for more context
(https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531)

This fails with
```
* thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3)
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830
    frame #2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58
    frame #3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838
    frame #4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13
    frame #5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98
    frame #6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102
    frame #7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13
    frame #8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919
```

Problem :

1) The destructor currently handles no clearance for the DeviceParser
and the DeviceAct. We currently only have this

https://github.com/llvm/llvm-project/blob/976493822443c52a71ed3c67aaca9a555b20c55d/clang/lib/Interpreter/Interpreter.cpp#L416-L419

2) The ownership for DeviceCI currently is present in
IncrementalCudaDeviceParser. But this should be similar to how the
combination for hostCI, hostAction and hostParser are managed by the
Interpreter. As on master the DeviceAct and DeviceParser are managed by
the Interpreter but not DeviceCI. This is problematic because :
IncrementalParser holds a Sema& which points into the DeviceCI. On
master, DeviceCI is destroyed before the base class ~IncrementalParser()
runs, causing Parser::reset() to access a dangling Sema (and as Sema
holds a reference to Preprocessor which owns PragmaNamespace) we see
this
```
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830

```

(cherry picked from commit 529b6fc)
The `_LIBCPP_DISABLE_AVAILABILITY` macro was removed in afae1a5 as an
intended no-op. It turns out that some projects are making use of that
macro to work around a Clang bug with availability annotations that
still exists: llvm#134151.

Since that Clang bug still hasn't been fixed, I feel that we must sill
honor that unfortunate macro until we've figured out how to get rid of
it without breaking code.

(cherry picked from commit 25fc52e)
Futureproofs our single Debian-specific special case for roughly the next 6 years.

See: https://lists.debian.org/debian-devel-announce/2025/01/msg00004.html
(cherry picked from commit 58e6883)
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.

This patch also add the scheduler description for the z17 processor,
provided by Jonas Paulsson.

Manual backport of llvm#135254
This avoids type suffixes for integer constants when the type can be
inferred from the template parameter, such as the unsigned parameter
of A<1> and A<2> in the added test.
Follows the discussion here:
llvm#129309

Recently, the test
`TestRtsan.AccessingALargeAtomicVariableDiesWhenRealtime` has been
failing on newer MacOS versions, because the internal locking mechanism
in `std::atomic<T>::load` (for types `T` that are larger than the
hardware lock-free limit), has changed to a function that wasn't being
intercepted by rtsan.

This PR introduces an interceptor for `_os_nospin_lock_lock`, which is
the new internal locking mechanism.

_Note: we'd probably do well to introduce interceptors for
`_os_nospin_lock_unlock` (and `os_unfair_lock_unlock`) too, which also
appear to have blocking implementations. This can follow in a separate
PR._

(cherry picked from commit 481a55a)
…ts (llvm#132867)

These changes align with these lock types and allows builds and tests to
pass with various SDKS.

rdar://147067322
(cherry picked from commit 7cc4472)
…lvm#134970)

Towards

This change moves WasmSym from a static global struct to an instance
owned by Ctx, allowing it to be reset cleanly between linker runs. This
enables safe support for multiple invocations of wasm-ld within the same
process

Changes done

- Converted WasmSym from a static struct to a regular struct with
instance members.

- Added a std::unique_ptr<WasmSym> wasmSym field inside Ctx.

- Reset wasmSym in Ctx::reset() to clear state between links.

- Replaced all WasmSym:: references with ctx.wasmSym->.

- Removed global symbol definitions from Symbols.cpp that are no longer
needed.

Clearing wasmSym in ctx.reset() ensures a clean slate for each link
invocation, preventing symbol leakage across runs—critical when using
wasm-ld/lld as a reentrant library where global state can cause subtle,
hard-to-debug errors.

---------

Co-authored-by: Vassil Vassilev <[email protected]>
(cherry picked from commit 9cbbb74)
…lvm#134930)

This pull request implements mangling for ConstantMatrixType, allowing
matrices to be used on Windows.

Related issues: llvm#53158, llvm#127127

This example code:
```cpp
#include <typeinfo>
#include <stdio.h>

typedef float Matrix4 __attribute__((matrix_type(4, 4)));

int main()
{
  printf("%s\n", typeid(Matrix4).name());
}
```
Outputs this:
```
struct __clang::__matrix<float,4,4>
```

(cherry picked from commit f5a30f1)
FEAT_FP8DOT4 and FEAT_FP8FMA are supported by FUJITSU-MONAKA. These were
previously enabled due to dependencies, but now require explicit
activation due to modifications in the dependencies.

(cherry picked from commit 9d5a542)
…pression` (llvm#139029)

In the original case, the third call to `getCheaperNegatedExpression`
deletes the SDNode returned by the first call.
Similar to 74e6030, this patch uses
`HandleSDNodes` to prevent nodes from being deleted by subsequent calls.
Closes llvm#138944.

(cherry picked from commit 143cce7)
In powerpc64-unknown-linux-musl, signal.h does not include asm/ptrace.h,
which causes "member access into incomplete type 'struct pt_regs'"
errors. Include the header explicitly to fix this.

Also in sanitizer_linux_libcdep.cpp, there is a usage of TlsPreTcbSize
which is not defined in such a platform. Guard the branch with macro.

(cherry picked from commit 801b519)
…ze (llvm#138247)

The order of operation was slightly incorrect, as we were checking for
incomplete types *before* handling reference types.

Fixes llvm#129397

---------

Co-authored-by: Erich Keane <[email protected]>
…2nd attempt) (llvm#127406)

In my previous attempt (llvm#126913) of fixing the flaky case was on a good
track when I used the begin locations as a stable ordering. However, I
forgot to consider the case when the begin locations are the same among
the Exprs.

In an `EXPENSIVE_CHECKS` build, arrays are randomly shuffled prior to
sorting them. This exposed the flaky behavior much more often basically
breaking the "stability" of the vector - as it should.
Because of this, I had to revert the previous fix attempt in llvm#127034.

To fix this, I use this time `Expr::getID` for a stable ID for an Expr.

Hopefully fixes llvm#126619
Hopefully fixes llvm#126804

(cherry picked from commit f378e52)
This matches GCC and we supported it in LLVM 17/18.

Fixes llvm#136803

(cherry picked from commit 6c33735)
uweigand and others added 27 commits June 11, 2025 17:18
…llvm#105651)

There are 2 problems today that this PR resolves:

libcxx tests assume the thousands separator for fr_FR locale is x00A0 on
Windows. This currently fails when run on newer versions of Windows (it
seems to have been updated to the new correct value of 0x202F around
windows 11. The exact windows version where it changed doesn't seem to
be documented anywhere). Depending the OS version, you need different
values.

There are several ifdefs to determine the environment/platform-specific
locale conversion values and it leads to maintenance as things change
over time.

This PR includes the following changes:

- Provide the environment's locale conversion values through a
  substitution. The test can opt in by placing the substitution value in a
  define flag.
- Remove the platform ifdefs (the swapping of values between Windows,
  Linux, Apple, AIX).

This is accomplished through a lit feature action that fetches the
environment's locale conversions (lconv) for members like
'thousands_sep' that we need to provide. This should ensure that we
don't lose the effectiveness of the test itself.

In addition, as a result of the above, this PR:

- Fixes a handful of locale tests which unexpectedly fail on newer
  Windows versions.
- Resolves 3 XFAIL FIX-MEs.

Originally submitted in llvm#86649.

Co-authored-by: Rodrigo Salazar <[email protected]>
(cherry picked from commit f909b22)
…lvm#131675)

The patch that added the new locale Lit features was created before we
switched to a 0-1 macro for _LIBCPP_HAS_WIDE_CHARACTERS, leading to that
patch referring to the obsolete _LIBCPP_HAS_NO_WIDE_CHARACTERS macro
that is never defined nowadays.

(cherry picked from commit 297f6d9)
…pare (llvm#137594)

This breaks the ABI of `flat_{,multi}map::value_compare`, but this type
has only been introduced in LLVM 20, so it should be very unlikely that
we break anybody if we back-port this now.

(cherry picked from commit ed0aa99)
While
https://learn.microsoft.com/en-us/windows/win32/menurc/accelerators-resource
specifies that ALT only applies to virtkeys, this doesn't seem to be the
case in reality.

https://learn.microsoft.com/en-us/windows/win32/menurc/using-keyboard-accelerators
contains an example that uses this combination:

    "B",   ID_ACCEL5, ALT                   ; ALT_SHIFT+B

Also Microsoft also includes such cases in their repo of test cases:
https://github.com/microsoft/Windows-classic-samples/blob/263dd514ad215d0a40d1ec44b4df84b30ec11dcf/Samples/Win7Samples/begin/sdkdiff/sdkdiff.rc#L161-L164

Also MS rc.exe doesn't warn/error about this. However if applying SHIFT
or CONTROL on a non-virtkey accelerator, MS rc.exe does produce this
warning:

    warning RC4203 : SHIFT or CONTROL used without VIRTKEY

Hence, keep the checks for SHIFT and CONTROL, but remove the checks for
ALT, which seems to have been incorrect.

This fixes one aspect of
llvm#143157.

(cherry picked from commit 77347d6)
…142828)

This is a re-application of llvm#142090 without the unit
test changes. A subsequent PR will follow that adds a unit test for
module dependencies.

- Fix dangling string references in the return value of
getAllRequiredModules()
- Change a couple of calls in getOrBuildModuleFile() to use the loop
variable instead of the ModuleName parameter.
The test fails (sometimes); see discussion on llvm#142828
This cherry-picks 3b4c51b and
beffd15 to 20.x.

Which are:
[clang-repl] Fix error recovery while PTU cleanup (llvm#127467)
[clang][Interpreter] Disable part of lambda test on Windows

The latter commit avoids a test failure seen in Windows builds.
On main, I turned off one of the RUN lines for Windows, but reviewers
on the cherry-pick preferred UNSUPPORTED to disable the whole test.

So I have used UNSUPPORTED in this version for 20.x.
…3821)

When '-march' with LASX feature and '-mno-lsx' options are used
together, '-mno-lsx' fails to disable LASX, leaving
'HasFeatureLASX=true' and causing incorrect '__loongarch_sx/asx=1' macro
definition.

Fixes loongson-community/discussions#95

(cherry picked from commit 2ecbfc0)
This commit fixes using inline assembly with v128 results. Previously
this failed with an internal assertion about a failure to legalize a
`CopyFromReg` where the source register was typed `v8f16`. It looks like
the type used for the destination register was whatever was listed first
in the `def V128 : WebAssemblyRegClass` listing, so the types were
shuffled around to have a default-supported type.

A small test was added as well which failed to generate previously and
should now pass in generation. This test passed on LLVM 18 additionally
and regressed by accident in llvm#93228 which was first included in LLVM 19.

(cherry picked from commit a8a9a7f)
For this test, the `xvshuf.d` instruction should not be generated.

This will be fixed later.

(cherry picked from commit a19ddff)
The optimization level should not be restored into O2.

(cherry picked from commit fcc82cf)
Like many other targets did. And see RISCV for similar fix.

Fix llvm#143239

(cherry picked from commit 90a52f4)
Cherry pick a patch from 1.15.0

Add missing include for raise(3)

google/googletest@7f036c5
(cherry picked from commit 68239b7)
We don't have PACKSS for i64->i32.

Fixes: https://godbolt.org/z/qb8nxnPbK, which was introduced by ddd2f57
(cherry picked from commit 3d63191)
…m#144058)

Code originally added in llvm#120995 and later corrected in llvm#130517 but
apparently still not correct according to llvm#141494 and
rust-lang/rust#141913.

Revert the special handling because the test written in llvm#120995 and
llvm#130517 still passes without those changes. Kept the test and improved
it with a `__DATA` section to keep the current behaviour checked in case
other changes modify the behaviour and break this edge case.

(cherry picked from commit a0662ce)
…es (llvm#145648)

A case found from rust-lang/rust#142752:
https://llvm.godbolt.org/z/T7ce9saWh.

We should emit `@bar_0` for the following code:

```llvm
target triple = "x86_64-unknown-linux-gnu"

@rel_0 = private unnamed_addr constant [1 x i32] [
  i32 trunc (i64 sub (i64 ptrtoint (ptr @bar_0 to i64), i64 ptrtoint (ptr @rel_0 to i64)) to i32)]
@bar_0 = internal unnamed_addr constant ptr @foo_0, align 8
@foo_0 = external global ptr, align 8

define void @foo(ptr %arg0) {
  store ptr @bar_0, ptr %arg0, align 8
  ret void
}
```

(cherry picked from commit 630d55c)
…llvm#143371)

Currently, when hazard-padding is enabled a (fixed-size) hazard slot is
placed in the CS area, just after the frame record. The size of this
slot is part of the "CalleeSaveBaseToFrameRecordOffset". The SVE
epilogue emission code assumed this offset was always zero, and
incorrectly setting the stack pointer, resulting in all SVE registers
being reloaded from incorrect offsets.

```
| prev_lr                           |
| prev_fp                           |
| (a.k.a. "frame record")           |
|-----------------------------------| <- fp(=x29)
|   <hazard padding>                |
|-----------------------------------| <- callee-saved base
|                                   |
| callee-saved fp/simd/SVE regs     |
|                                   |
|-----------------------------------| <- SVE callee-save base
```

i.e. in the above diagram, the code assumed `fp == callee-saved base`.
…id generating GOTPCREL relocations (llvm#146068)

The entry in a relative lookup table is a global variable with a
constant offset, such as `@gv`, `GEP @gv, 1`, and so on.

We cannot only consider the case of a trivial global variable. This PR
handles all cases using the existing `IsConstantOffsetFromGlobal`
function.

(cherry picked from commit c43282a)
…vm#135248)

We can't assume MBBI is still pointing at MBB if we've already expanded
a probe. We need to re-query the MBB from MBBI. Fixes llvm#135206

Co-authored-by: Craig Topper <[email protected]>
(cherry picked from commit b3d2dc3)
…es in false arm (llvm#143020)

Fixes llvm#139050.

This patch adds a check to avoid folding min/max reduction into select, which may block loop vectorization.

The issue is that the following snippet:
```
declare i8 @llvm.umin.i8(i8, i8)

define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
; CHECK-LABEL: @masked_min_fold_bug(
; CHECK:       %cond = icmp eq i8 %mask, 0
; CHECK:       %masked_val = select i1 %cond, i8 %val, i8 255
; CHECK:       call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
;
  %cond = icmp eq i8 %mask, 0
  %masked_val = select i1 %cond, i8 %val, i8 255
  %res = call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
  ret i8 %res
}
```

is being optimized to the following code, which can not be vectorized
later.
```
declare i8 @llvm.umin.i8(i8, i8) #0

define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
  %cond = icmp eq i8 %mask, 0
  %1 = call i8 @llvm.umin.i8(i8 %acc, i8 %val)
  %res = select i1 %cond, i8 %1, i8 %acc
  ret i8 %res
}

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```

Expected:
```
declare i8 @llvm.umin.i8(i8, i8) #0

define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
  %cond = icmp eq i8 %mask, 0
  %masked_val = select i1 %cond, i8 %val, i8 -1
  %res = call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
  ret i8 %res
}

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```

https://godbolt.org/z/cYMheKE5r
(cherry picked from commit 07fa6d1)
@aheejin aheejin requested a review from sbc100 July 21, 2025 23:21
Copy link

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but remember to lang this as a merge commit .. if you try to rebase bad things will happen :)

@aheejin aheejin merged commit 34bf114 into emscripten-libs-20 Jul 21, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.