[AutoBump] Merge with fixes of d3446240 (Jan 29) (3)#870
Merged
konstantinschwarz merged 40 commits intoaie-publicfrom Mar 27, 2026
Merged
[AutoBump] Merge with fixes of d3446240 (Jan 29) (3)#870konstantinschwarz merged 40 commits intoaie-publicfrom
konstantinschwarz merged 40 commits intoaie-publicfrom
Conversation
Running MLIR python tests unders asan currently fails with ``` executed command: 'LD_PRELOAD=$(/usr/bin/clang++-17' '-print-file-name=libclang_rt.asan-x86_64.so)' /scratch/slx-llvm/.venv-3.10/bin/python3.10 /scratch/slx-llvm/mlir/test/python/ir/context_lifecycle.py | 'LD_PRELOAD=$(/usr/bin/clang++-17': command not found ``` because lit doesn't quite understand the syntax. To fix, we resolve the path to `libclang_rt.asan-x86_64.so` within the lit configuration. This has the additional benefit that we don't have to call `clang++` for every test. Co-authored-by: Philipp-Jan Honysz <Philipp.Honysz@amd.com>
…MCRegister to unsigned. NFC
Add a dummy pass skeleton list to help track the progress in porting passes to NPM.
This is a followup to #117152. That patch introduced a check for
UB/poison on BEValue. However, the SCEV we're actually going to use is
Shifted. In some cases, it's possible for Shifted to contain UB, while
BEValue doesn't.
In the test case the values are:
BEValue: (-1 * (zext i8 (-83 + ((-83 /u {1,+,1}<%loop>) *
{-1,+,-1}<%loop>)) to i32))<nuw><nsw>
Shifted: (-173 + (-1 * (zext i8 ((-83 /u {0,+,1}<%loop>) *
{0,+,-1}<%loop>) to i32))<nuw><nsw>)<nuw><nsw>
Fixes llvm/llvm-project#123550.
…4747) There are a lot of tests that do not depend upon the IR output for validation, relying instead on the debug output. For these tests we can add the -disable-output command line argument.
…18947)" (#124804) This reapplies #118947 and adapts to nanobind.
This makes some other problems show up like the fact that we didn't suppress diagnostics during __builtin_constant_p evaluation.
This patch implements support for the UNROLL directive to control how many times a loop should be unrolled. It must be placed immediately before a `DO LOOP` and applies only to the loop that follows. N is an integer that specifying the unrolling factor. This is done by adding an attribute to the branch into the loop in LLVM to indicate that the loop should unrolled. The code pushed to support the directive `VECTOR ALWAYS` has been modified to take account of the fact that several directives can be used before a `DO LOOP`.
This intrinsic is a gnu extension (https://gcc.gnu.org/onlinedocs/gfortran/CHDIR.html) and is used in FLEUR (https://github.com/JuDFTteam/FLEUR).
…rary on ARM64X (#124833)
This commit restricts the use of scalar types in vector math builtins, particularly the `__builtin_elementwise_*` builtins. Previously, small scalar integer types would be promoted to `int`, as per the usual conversions. This would silently do the wrong thing for certain operations, such as `add_sat`, `popcount`, `bitreverse`, and others. Similarly, since unsigned integer types were promoted to `int`, something like `add_sat(unsigned char, unsigned char)` would perform a *signed* operation. With this patch, promotable scalar integer types are not promoted to int, and are kept intact. If any of the types differ in the binary and ternary builtins, an error is issued. Similarly an error is issued if builtins are supplied integer types of different signs. Mixing enums of different types in binary/ternary builtins now consistently raises an error in all language modes. This brings the behaviour surrounding scalar types more in line with that of vector types. No change is made to vector types, which are both not promoted and whose element types must match. Fixes #84047. RFC: https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
This resolves the same issue addressed in llvm/llvm-project#124286, but for invoke operations. The issue arose from duplicated logic for both imports. This PR also refactors the common import code for call and invoke instructions to mitigate issues in the future.
As decided on https://discourse.llvm.org/t/rfc-lets-document-and-enforce-a-minimum-python-version-for-lldb/82731. LLDB 20 recommended `>= 3.8` but did not remove support for anything earlier. Now we are in what will become LLDB 21, so I'm removing that support and making `>= 3.8` required. See https://docs.python.org/3/c-api/apiabiversion.html#c.PY_VERSION_HEX for the format of PY_VERSION_HEX.
Was flagged in llvm/llvm-project#124735 but done separately so it didn't get in the way of that.
This patch moves up the checks that verify if it is legal to replace the atomic load/store with memcpy. Currently these checks are done after we determine to convert the load/store to memcpy/memmove, which makes the logic a bit confusing. This patch is a prelude to #50892
…subdirectory (#124744) I left these alone in #124463 but I think it makes sense to clean these up as well (which Philip also noted in #124464).
…24789) These were based off instruction count, not throughput - we can probably improve these further, but these throughput numbers match the worse expanded shuffles we see in the vector-shuffle-128-v* codegen tests.
LoopInterchange have converted `DVEntry::LE` and `DVEntry::GE` in direction vectors to '<' and '>' respectively. This handling is incorrect because the information about the '=' it lost. This leads to miscompilation in some cases. To resolve this issue, convert them to '*' instead. Resolve #123920
The PR addresses issues with the filters of 1 x r and of r x 1 and with the tiling. --------- Signed-off-by: Dmitriy Smirnov <dmitriy.smirnov@arm.com>
Thread-local code generation requires constant pools because most of the relocations needed for it operate on data, so it cannot be used with -mexecute-only (or -mpure-code, which is aliased in the driver). Without this we hit an assertion in the backend when trying to generate a constant pool.
When using PAuthLR, the PAUTH_PROLOGUE expands into a sequence of instructions which takes the address of one of those instructions, and uses that address to compute the return address signature. If this is duplicated, there will be two different addresses used in calculating the signature, so the epilogue will only be correct for (at most) one of them. This change also restricts code generation when using v8.3-A return address signing, without PAuthLR. This isn't strictly needed, as duplicating the prologue there would be valid. We could fix this by having two copies of PAUTH_PROLOGUE, with and without isNotDuplicable, but I don't think it's worth adding the extra complexity to a security feature for that.
…640) Add new runPass helpers to run a VPlan transformation. This makes it easier to add additional checks/functionality for each transform run. In this patch, an option is added to run the verifier after each VPlan transform. Follow-ups will use the same helper to also support printing VPlans after each transform. Note that the verifier at the moment requires there to be a canonical IV and vector loop region, so the final lowering transforms aren't run via runPass yet. PR: llvm/llvm-project#123640
Add the implementation of the IERRNO intrinsic to get the last system error number, as given by the C errno variable. This intrinsic is also used in RAMSES (https://github.com/ramses-organisation/ramses/).
Using the `__builtin_elementwise_(add|sub)_sat` functions allows us to directly optimize to the desired intrinsic, and avoid scalarization for vector types.
Adds AVX512 bf16 dot-product operation and defines lowering to LLVM intrinsics. AVX512 intrinsic operation definition is extended with an optional extension field that allows specifying necessary LLVM mnemonic suffix e.g., `"bf16"` for `x86_avx512bf16_` intrinsics.
ttjost
previously approved these changes
Mar 26, 2026
Contributor
Author
|
I merged the other bump-Prs into it, to reduce the review spam a bit |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.