Skip to content

[AutoBump] Merge with fixes of d3446240 (Jan 29) (3)#870

Merged
konstantinschwarz merged 40 commits intoaie-publicfrom
bump_to_d3446240
Mar 27, 2026
Merged

[AutoBump] Merge with fixes of d3446240 (Jan 29) (3)#870
konstantinschwarz merged 40 commits intoaie-publicfrom
bump_to_d3446240

Conversation

@jorickert
Copy link
Copy Markdown
Contributor

No description provided.

mgehre-amd and others added 30 commits January 29, 2025 08:36
Running MLIR python tests unders asan currently fails with
```
executed command: 'LD_PRELOAD=$(/usr/bin/clang++-17' '-print-file-name=libclang_rt.asan-x86_64.so)' /scratch/slx-llvm/.venv-3.10/bin/python3.10 /scratch/slx-llvm/mlir/test/python/ir/context_lifecycle.py

| 'LD_PRELOAD=$(/usr/bin/clang++-17': command not found
```
because lit doesn't quite understand the syntax.
To fix, we resolve the path to `libclang_rt.asan-x86_64.so` within the
lit configuration. This has the additional benefit that we don't have to
call `clang++` for every test.

Co-authored-by: Philipp-Jan Honysz <Philipp.Honysz@amd.com>
Add a dummy pass skeleton list to help track the progress in porting
passes to NPM.
This is a followup to #117152. That patch introduced a check for
UB/poison on BEValue. However, the SCEV we're actually going to use is
Shifted. In some cases, it's possible for Shifted to contain UB, while
BEValue doesn't.

In the test case the values are:

BEValue: (-1 * (zext i8 (-83 + ((-83 /u {1,+,1}<%loop>) *
{-1,+,-1}<%loop>)) to i32))<nuw><nsw>
Shifted: (-173 + (-1 * (zext i8 ((-83 /u {0,+,1}<%loop>) *
{0,+,-1}<%loop>) to i32))<nuw><nsw>)<nuw><nsw>

Fixes llvm/llvm-project#123550.
…4747)

There are a lot of tests that do not depend upon the IR output
for validation, relying instead on the debug output. For these
tests we can add the -disable-output command line argument.
…18947)" (#124804)

This reapplies #118947 and adapts to nanobind.
This makes some other problems show up like the fact that we didn't
suppress diagnostics during __builtin_constant_p evaluation.
This patch implements support for the UNROLL directive to control how
many times a loop should be unrolled.
It must be placed immediately before a `DO LOOP` and applies only to the
loop that follows. N is an integer that specifying the unrolling factor.
This is done by adding an attribute to the branch into the loop in LLVM
to indicate that the loop should unrolled.
The code pushed to support the directive `VECTOR ALWAYS` has been
modified to take account of the fact that several directives can be used
before a `DO LOOP`.
This commit restricts the use of scalar types in vector math builtins,
particularly the `__builtin_elementwise_*` builtins.

Previously, small scalar integer types would be promoted to `int`, as
per the usual conversions. This would silently do the wrong thing for
certain operations, such as `add_sat`, `popcount`, `bitreverse`, and
others. Similarly, since unsigned integer types were promoted to `int`,
something like `add_sat(unsigned char, unsigned char)` would perform a
*signed* operation.

With this patch, promotable scalar integer types are not promoted to
int, and are kept intact. If any of the types differ in the binary and
ternary builtins, an error is issued. Similarly an error is issued if
builtins are supplied integer types of different signs. Mixing enums of
different types in binary/ternary builtins now consistently raises an
error in all language modes.

This brings the behaviour surrounding scalar types more in line with
that of vector types. No change is made to vector types, which are both
not promoted and whose element types must match.

Fixes #84047.

RFC:
https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
This resolves the same issue addressed in
llvm/llvm-project#124286, but for invoke
operations. The issue arose from duplicated logic for both imports. This
PR also refactors the common import code for call and invoke
instructions to mitigate issues in the future.
As decided on
https://discourse.llvm.org/t/rfc-lets-document-and-enforce-a-minimum-python-version-for-lldb/82731.

LLDB 20 recommended `>= 3.8` but did not remove support for anything
earlier. Now we are in what will become LLDB 21, so I'm removing that
support and making
`>= 3.8` required.

See https://docs.python.org/3/c-api/apiabiversion.html#c.PY_VERSION_HEX
for the format of PY_VERSION_HEX.
Was flagged in llvm/llvm-project#124735
but done separately so it didn't get in the way of that.
This patch moves up the checks that verify if it is legal to replace the
atomic load/store with memcpy. Currently these checks are done after we
determine to convert the load/store to memcpy/memmove, which makes the
logic a bit confusing.

This patch is a prelude to #50892
…subdirectory (#124744)

I left these alone in #124463 but I think it makes sense to clean these
up as well (which Philip also noted in #124464).
…24789)

These were based off instruction count, not throughput - we can probably improve these further, but these throughput numbers match the worse expanded shuffles we see in the vector-shuffle-128-v* codegen tests.
LoopInterchange have converted `DVEntry::LE` and `DVEntry::GE` in
direction vectors to '<' and '>' respectively. This handling is
incorrect because the information about the '=' it lost. This leads to
miscompilation in some cases. To resolve this issue, convert them to '*'
instead.

Resolve #123920
The PR addresses issues with the filters of 1 x r and of r x 1 and with
the tiling.

---------

Signed-off-by: Dmitriy Smirnov <dmitriy.smirnov@arm.com>
Thread-local code generation requires constant pools because most of the
relocations needed for it operate on data, so it cannot be used with
-mexecute-only (or -mpure-code, which is aliased in the driver).

Without this we hit an assertion in the backend when trying to generate
a constant pool.
When using PAuthLR, the PAUTH_PROLOGUE expands into a sequence of
instructions which takes the address of one of those instructions, and
uses that address to compute the return address signature. If this is
duplicated, there will be two different addresses used in calculating
the signature, so the epilogue will only be correct for (at most) one of
them.

This change also restricts code generation when using v8.3-A return
address signing, without PAuthLR. This isn't strictly needed, as
duplicating the prologue there would be valid. We could fix this by
having two copies of PAUTH_PROLOGUE, with and without isNotDuplicable,
but I don't think it's worth adding the extra complexity to a security
feature for that.
…640)

Add new runPass helpers to run a VPlan transformation. This makes it
easier to add additional checks/functionality for each transform run. In
this patch, an option is added to run the verifier after each VPlan
transform.

Follow-ups will use the same helper to also support printing VPlans
after each transform.

Note that the verifier at the moment requires there to be a canonical IV
and vector loop region, so the final lowering transforms aren't run via
runPass yet.

PR: llvm/llvm-project#123640
Add the implementation of the IERRNO intrinsic to get the last system
error number, as given by the C errno variable.
This intrinsic is also used in RAMSES
(https://github.com/ramses-organisation/ramses/).
Using the `__builtin_elementwise_(add|sub)_sat` functions allows us to
directly optimize to the desired intrinsic, and avoid scalarization for
vector types.
Adds AVX512 bf16 dot-product operation and defines lowering to LLVM
intrinsics.

AVX512 intrinsic operation definition is extended with an optional
extension field that allows specifying necessary LLVM mnemonic suffix
e.g., `"bf16"` for `x86_avx512bf16_` intrinsics.
ttjost
ttjost previously approved these changes Mar 26, 2026
Base automatically changed from bump_to_2ec27848 to aie-public March 27, 2026 08:09
@jorickert jorickert dismissed ttjost’s stale review March 27, 2026 08:09

The base branch was changed.

@jorickert jorickert requested a review from ttjost March 27, 2026 09:04
@jorickert
Copy link
Copy Markdown
Contributor Author

I merged the other bump-Prs into it, to reduce the review spam a bit

@konstantinschwarz konstantinschwarz merged commit c2f5123 into aie-public Mar 27, 2026
7 of 8 checks passed
@konstantinschwarz konstantinschwarz deleted the bump_to_d3446240 branch March 27, 2026 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.