[AutoBump] Merge with fixes of d3446240 (Jan 29) (3) by jorickert · Pull Request #870 · Xilinx/llvm-aie

jorickert · 2026-03-26T07:30:37Z

No description provided.

Running MLIR Python tests under ASan currently fails with
```
executed command: 'LD_PRELOAD=$(/usr/bin/clang++-17' '-print-file-name=libclang_rt.asan-x86_64.so)' /scratch/slx-llvm/.venv-3.10/bin/python3.10 /scratch/slx-llvm/mlir/test/python/ir/context_lifecycle.py
| 'LD_PRELOAD=$(/usr/bin/clang++-17': command not found
```
because lit does not understand the `$(...)` command-substitution syntax. To fix this, we resolve the path to `libclang_rt.asan-x86_64.so` within the lit configuration. This has the additional benefit that we don't have to call `clang++` for every test.

Co-authored-by: Philipp-Jan Honysz <Philipp.Honysz@amd.com>
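The failure mode and the fix for the ASan preload can be sketched roughly as follows. This is not the actual patch: `resolve_asan_runtime` and `preload_prefix` are hypothetical helper names, and Python's `shlex` is used only as a stand-in for lit's internal shell lexer. The one real tool feature relied on is the compiler's `-print-file-name=` option.

```python
import shlex
import subprocess


def resolve_asan_runtime(clang, runner=subprocess.check_output):
    """Ask the compiler once for the full path of the ASan runtime library.

    `runner` is injectable so the function can be exercised without a compiler.
    """
    out = runner([clang, "-print-file-name=libclang_rt.asan-x86_64.so"], text=True)
    return out.strip()


def preload_prefix(clang, runner=subprocess.check_output):
    """Build the environment prefix as plain text, with no $(...) left in it."""
    return "LD_PRELOAD=" + resolve_asan_runtime(clang, runner)


# A lexer that only splits on whitespace cuts the $(...) substitution in half,
# which matches the two quoted fragments in the error message.
broken = shlex.split(
    "LD_PRELOAD=$(/usr/bin/clang++-17 "
    "-print-file-name=libclang_rt.asan-x86_64.so) python3 test.py"
)
```

Resolving the path once at configuration time means each test command only contains a literal `LD_PRELOAD=<path>` assignment, so nothing depends on how the test runner tokenizes shell substitutions.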

This is a followup to #117152. That patch introduced a check for UB/poison on BEValue. However, the SCEV we're actually going to use is Shifted. In some cases, it's possible for Shifted to contain UB, while BEValue doesn't. In the test case the values are:

BEValue: (-1 * (zext i8 (-83 + ((-83 /u {1,+,1}<%loop>) * {-1,+,-1}<%loop>)) to i32))<nuw><nsw>
Shifted: (-173 + (-1 * (zext i8 ((-83 /u {0,+,1}<%loop>) * {0,+,-1}<%loop>) to i32))<nuw><nsw>)<nuw><nsw>

Fixes llvm/llvm-project#123550.

This patch implements support for the UNROLL directive to control how many times a loop should be unrolled. It must be placed immediately before a `DO LOOP` and applies only to the loop that follows. N is an integer that specifies the unrolling factor. This is done by adding an attribute to the branch into the loop in LLVM to indicate that the loop should be unrolled. The code pushed to support the directive `VECTOR ALWAYS` has been modified to take account of the fact that several directives can be used before a `DO LOOP`.

This commit restricts the use of scalar types in vector math builtins, particularly the `__builtin_elementwise_*` builtins. Previously, small scalar integer types would be promoted to `int`, as per the usual conversions. This would silently do the wrong thing for certain operations, such as `add_sat`, `popcount`, `bitreverse`, and others. Similarly, since unsigned integer types were promoted to `int`, something like `add_sat(unsigned char, unsigned char)` would perform a *signed* operation. With this patch, promotable scalar integer types are not promoted to int, and are kept intact. If any of the types differ in the binary and ternary builtins, an error is issued. Similarly an error is issued if builtins are supplied integer types of different signs. Mixing enums of different types in binary/ternary builtins now consistently raises an error in all language modes. This brings the behaviour surrounding scalar types more in line with that of vector types. No change is made to vector types, which are both not promoted and whose element types must match. Fixes #84047. RFC: https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
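To make the promotion problem concrete, here is a small model of saturating addition and population count at a chosen bit width. It is only an illustration in Python, not Clang's implementation; `int_bounds`, `add_sat`, and `popcount` are hypothetical helpers, and "promoted" is modelled as a signed 32-bit operation.

```python
def int_bounds(bits, signed):
    """Return the (min, max) values representable at this width and signedness."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def add_sat(a, b, bits, signed):
    """Saturating add: clamp the exact sum into the representable range."""
    lo, hi = int_bounds(bits, signed)
    return max(lo, min(hi, a + b))


def popcount(value, bits):
    """Count set bits of the two's-complement encoding at the given width."""
    return bin(value & ((1 << bits) - 1)).count("1")


# unsigned char operands kept at 8 bits saturate at 255 ...
kept = add_sat(200, 100, bits=8, signed=False)       # 255
# ... but once promoted to a signed 32-bit int, nothing saturates.
promoted = add_sat(200, 100, bits=32, signed=True)   # 300

# The same width dependence shows up for popcount of a signed char -1.
narrow = popcount(-1, bits=8)    # 8
wide = popcount(-1, bits=32)     # 32
```

The result depends on the operand type itself, which is why silently widening a small scalar changes the answer for these operations.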

This resolves the same issue addressed in llvm/llvm-project#124286, but for invoke operations. The issue arose from duplicated logic for both imports. This PR also refactors the common import code for call and invoke instructions to mitigate issues in the future.

As decided on https://discourse.llvm.org/t/rfc-lets-document-and-enforce-a-minimum-python-version-for-lldb/82731. LLDB 20 recommended `>= 3.8` but did not remove support for anything earlier. Now we are in what will become LLDB 21, so I'm removing that support and making `>= 3.8` required. See https://docs.python.org/3/c-api/apiabiversion.html#c.PY_VERSION_HEX for the format of PY_VERSION_HEX.
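For readers unfamiliar with the `PY_VERSION_HEX` layout, the encoding can be reproduced in Python, where `sys.hexversion` uses the same format. This is a sketch only: `version_hex`, `MIN_SUPPORTED`, and `is_supported` are hypothetical names, and the exact comparison LLDB performs is not shown in this PR.

```python
import sys

# Release level occupies the high nibble of the lowest byte.
RELEASE_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def version_hex(major, minor, micro=0, level="final", serial=0):
    """Pack a version into the 32-bit layout used by PY_VERSION_HEX."""
    return (
        (major << 24)
        | (minor << 16)
        | (micro << 8)
        | (RELEASE_LEVELS[level] << 4)
        | serial
    )


# Assumed lower bound: any 3.8 build, including pre-releases, compares >= this.
MIN_SUPPORTED = 0x03080000


def is_supported(hexversion=sys.hexversion):
    """True if the encoded version is at least Python 3.8."""
    return hexversion >= MIN_SUPPORTED
```

Because the major and minor numbers sit in the most significant bytes, a plain integer comparison orders versions correctly.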

This patch moves up the checks that verify if it is legal to replace the atomic load/store with memcpy. Currently these checks are done after we determine to convert the load/store to memcpy/memmove, which makes the logic a bit confusing. This patch is a prelude to #50892

…24789) These were based off instruction count, not throughput - we can probably improve these further, but these throughput numbers match the worse expanded shuffles we see in the vector-shuffle-128-v* codegen tests.

LoopInterchange has converted `DVEntry::LE` and `DVEntry::GE` in direction vectors to '<' and '>' respectively. This handling is incorrect because the information about the '=' is lost. This leads to miscompilation in some cases. To resolve this issue, convert them to '*' instead. Resolves #123920.
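One way to see why '*' is safe while '<' and '>' are not is to treat each direction symbol as the set of basic directions it allows. The sketch below is a conceptual model only, not LLVM's data structures; `DIRECTIONS` and `is_conservative` are hypothetical names.

```python
# Each direction symbol stands for the set of basic dependence directions it allows.
DIRECTIONS = {
    "<": {"<"},
    "=": {"="},
    ">": {">"},
    "<=": {"<", "="},
    ">=": {">", "="},
    "*": {"<", "=", ">"},
}


def is_conservative(original, replacement):
    """A replacement is safe only if it allows every direction the original allows."""
    return DIRECTIONS[original] <= DIRECTIONS[replacement]


old_le = is_conservative("<=", "<")   # False: the '=' case is dropped
new_le = is_conservative("<=", "*")   # True: '*' covers '<' and '='
old_ge = is_conservative(">=", ">")   # False
new_ge = is_conservative(">=", "*")   # True
```

Mapping to '*' may reject some interchanges that would have been legal, but it never hides a dependence that actually exists.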

Thread-local code generation requires constant pools because most of the relocations needed for it operate on data, so it cannot be used with -mexecute-only (or -mpure-code, which is aliased in the driver). Without this we hit an assertion in the backend when trying to generate a constant pool.

When using PAuthLR, the PAUTH_PROLOGUE expands into a sequence of instructions which takes the address of one of those instructions, and uses that address to compute the return address signature. If this is duplicated, there will be two different addresses used in calculating the signature, so the epilogue will only be correct for (at most) one of them. This change also restricts code generation when using v8.3-A return address signing, without PAuthLR. This isn't strictly needed, as duplicating the prologue there would be valid. We could fix this by having two copies of PAUTH_PROLOGUE, with and without isNotDuplicable, but I don't think it's worth adding the extra complexity to a security feature for that.

…640) Add new runPass helpers to run a VPlan transformation. This makes it easier to add additional checks/functionality for each transform run. In this patch, an option is added to run the verifier after each VPlan transform. Follow-ups will use the same helper to also support printing VPlans after each transform. Note that the verifier at the moment requires there to be a canonical IV and vector loop region, so the final lowering transforms aren't run via runPass yet. PR: llvm/llvm-project#123640
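The idea of funnelling every transform through one helper can be illustrated with a short Python sketch. The real helpers are C++ and operate on VPlan; everything here (`run_pass`, the list-based toy "plan", the `verifier` callback) is hypothetical and only shows the pattern of an optional check after each transform.

```python
def run_pass(transform, plan, *args, verify_each=False, verifier=None):
    """Run one transform on `plan`; optionally verify the plan afterwards."""
    result = transform(plan, *args)
    if verify_each:
        if verifier is None:
            raise ValueError("verify_each requires a verifier")
        if not verifier(plan):
            name = getattr(transform, "__name__", repr(transform))
            raise RuntimeError(f"plan verification failed after {name}")
    return result


def dedupe(plan):
    """Toy transform: drop repeated entries while keeping order."""
    plan[:] = list(dict.fromkeys(plan))


def non_empty(plan):
    """Toy verifier: a plan must keep at least one entry."""
    return len(plan) > 0


plan = ["load", "add", "add", "store"]
run_pass(dedupe, plan, verify_each=True, verifier=non_empty)
```

Because every transform goes through the same entry point, further per-transform behaviour such as printing the plan can be added in one place.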

Adds AVX512 bf16 dot-product operation and defines lowering to LLVM intrinsics. AVX512 intrinsic operation definition is extended with an optional extension field that allows specifying necessary LLVM mnemonic suffix e.g., `"bf16"` for `x86_avx512bf16_` intrinsics.

The base branch was changed.

jorickert · 2026-03-27T09:34:57Z

I merged the other bump PRs into this one to reduce the review spam a bit.

mgehre-amd and others added 30 commits January 29, 2025 08:36

[TableGen] Use MCRegister::id() to avoid an implicit conversion from MCRegister to unsigned. NFC

d199732

[bazel] Introduce MAX_CLANG_ABI_COMPAT_VERSION (for #123998)

267e293

[AMDGPU][NewPM] Sketch out a AMDGPUPassRegistry skeleton (#124785)

71edfd6

Add a dummy pass skeleton list to help track the progress in porting passes to NPM.

[CodeGen] RegisterCoalescer: Remove unused AliasAnalysis dependency (#124773)

a3aa452

[LoopVectorize][NFC] Disable output for tests that don't need it (#124747)

c836b89

There are a lot of tests that do not depend upon the IR output for validation, relying instead on the debug output. For these tests we can add the -disable-output command line argument.

Reapply "[mlir][python] allow DenseIntElementsAttr for index type (#118947)" (#124804)

5d3ae51

This reapplies #118947 and adapts to nanobind.

[lldb] Remove PATH workaround for Android (#124682)

9326633

[clang][bytecode] Fix dummy handling for p2280r4 (#124396)

51c7338

This makes some other problems show up like the fact that we didn't suppress diagnostics during __builtin_constant_p evaluation.

[flang] Implement CHDIR intrinsic (#124280)

5a34e6f

This intrinsic is a gnu extension (https://gcc.gnu.org/onlinedocs/gfortran/CHDIR.html) and is used in FLEUR (https://github.com/JuDFTteam/FLEUR).

[LLD][COFF] Write both native and EC export symbols to the import library on ARM64X (#124833)

e902cf2

[LoopVectorize][NFC] Regenerate some early exit test CHECK lines (#124900)

776ef9d

[lldb][NFC] Format part of ScriptInterpreterPython.cpp

db567ea

Was flagged in llvm/llvm-project#124735 but done separately so it didn't get in the way of that.

[MCJIT][test] Move remaining MCJIT interpreter tests to Interpreter/ subdirectory (#124744)

e80d934

I left these alone in #124463 but I think it makes sense to clean these up as well (which Philip also noted in #124464).

[MLIR][Linalg] Fixes for Winograd decomposition and for tiling (#123675)

f20b8e3

The PR addresses issues with the filters of 1 x r and of r x 1 and with the tiling. --------- Signed-off-by: Dmitriy Smirnov <dmitriy.smirnov@arm.com>

[flang] Implement IERRNO intrinsic (#124281)

ecc71de

Add the implementation of the IERRNO intrinsic to get the last system error number, as given by the C errno variable. This intrinsic is also used in RAMSES (https://github.com/ramses-organisation/ramses/).

[libclc] Move (add|sub)_sat to CLC; optimize (#124903)

12cdf43

Using the `__builtin_elementwise_(add|sub)_sat` functions allows us to directly optimize to the desired intrinsic, and avoid scalarization for vector types.

[X86] vector-idiv-sdiv-512.ll - regenerate VPTERNLOG comments

9534d27

[AutoBump] Merge with fixes of d344624 (Jan 29)

24824f0

jorickert requested review from ehsan-toosi, ljfitz, roberteg16 and ttjost as code owners March 26, 2026 07:30

jorickert added 3 commits March 26, 2026 02:06

[AutoBump] Merge with c836b89 (Jan 29)

d8ce88b

[AutoBump] Merge with fixes of 5d3ae51 (Jan 29)

a8baad4

[AutoBump] Merge with 983562d (Jan 29)

97e1bc9

ttjost previously approved these changes Mar 26, 2026

Base automatically changed from bump_to_2ec27848 to aie-public March 27, 2026 08:09

jorickert requested a review from ttjost March 27, 2026 09:04

[AutoBump] Merge with c836b89 (Jan 29) (4) (#872)

89207f4

jorickert requested review from F-Stuckmann, SagarMaheshwari99, abhinay-anubola, abnikant, andcarminati, katerynamuts, khallouh, konstantinschwarz, martien-de-jong, mludevid, niwinanto and stephenneuendorffer as code owners March 27, 2026 09:32

jorickert added 2 commits March 27, 2026 10:33

[AutoBump] Merge with fixes of 5d3ae51 (Jan 29) (5) (#873)

f8117b9

[AutoBump] Merge with 983562d (Jan 29) (6) (#874)

255b0e1

konstantinschwarz merged commit c2f5123 into aie-public Mar 27, 2026
7 of 8 checks passed

konstantinschwarz deleted the bump_to_d3446240 branch March 27, 2026 14:05

konstantinschwarz merged 40 commits into aie-public from bump_to_d3446240

20 participants
