@pull pull bot commented Aug 6, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.3)

@pull pull bot locked and limited conversation to collaborators Aug 6, 2025
@pull pull bot added the ⤵️ pull label Aug 6, 2025
ampandey-1995 and others added 28 commits August 22, 2025 18:35
#154937)

casts (#153843)"

Error Fixes.

> Replace ssize_t with sanitizer equivalent ssize.

This reverts commit ee5367b.
…PP_COMPRESSED_PAIR (#154559)

This patch adds unit tests to catch the regression described in #154146.
At the moment, these tests are pinning down the post-break ABI.
Implement olMemFill to support filling device memory with arbitrary
length patterns. AMDGPU support will be added in a follow-up PR.
Fix regression introduced by #154102 - the way offload-tblgen handles
names has changed
- Use llvm::endian::read<> to read big/little endian.
- Range check against size of the lookup tables instead of hardcoded
numbers.
- Make lookup tables constexpr.
- Drop {} for single-statement if-else.
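
A minimal sketch of the cleanups listed above, written in standard C++ rather than with `llvm::endian::read<>` so that it stands alone. The table contents and the names `kWidthTable`, `readU16` and `lookupWidth` are hypothetical and do not come from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical lookup table; the real tables touched by the patch are not shown.
constexpr std::array<int, 4> kWidthTable = {1, 2, 4, 8};

// Read a 16-bit value from raw bytes with an explicit byte order,
// independent of the host's endianness.
constexpr std::uint16_t readU16(const unsigned char *p, bool littleEndian) {
  return littleEndian ? static_cast<std::uint16_t>(p[0] | (p[1] << 8))
                      : static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Range-check against the table's own size instead of a hardcoded number.
constexpr std::optional<int> lookupWidth(std::size_t index) {
  if (index >= kWidthTable.size())
    return std::nullopt;
  return kWidthTable[index];
}
```

If the table grows or shrinks, the bounds check follows automatically because it is derived from `kWidthTable.size()`.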
This change adds intrinsics for MMA sparse. The implementation is based on PTX ISA version 8.8.
Using the default constructor makes this header work in both c++17 and
c++20 codebases. Without this, a c++20 codebase will break like this:

```c++
external/llvm-project/mlir/include/mlir/IR/Remarks.h:66:12: error: no matching constructor for initialization of 'RemarkOpts'
   66 |     return RemarkOpts{n, {}, {}, {}};
      |            ^         ~~~~~~~~~~~~~~~
external/llvm-project/mlir/include/mlir/IR/Remarks.h:58:8: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 4 were provided
   58 | struct RemarkOpts {
      |        ^~~~~~~~~~
external/llvm-project/mlir/include/mlir/IR/Remarks.h:58:8: note: candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 4 were provided
   58 | struct RemarkOpts {
      |        ^~~~~~~~~~
external/llvm-project/mlir/include/mlir/IR/Remarks.h:63:3: note: candidate constructor not viable: requires 0 arguments, but 4 were provided
   63 |   RemarkOpts() = delete;
      |   ^
external/llvm-project/mlir/include/mlir/IR/Remarks.h:65:31: error: constexpr function's return type 'RemarkOpts' is not a literal type
   65 |   static constexpr RemarkOpts name(StringRef n) {
      |                               ^
external/llvm-project/mlir/include/mlir/IR/Remarks.h:58:8: note: 'RemarkOpts' is not literal because it is not an aggregate and has no constexpr constructors other than copy or move constructors
   58 | struct RemarkOpts {
      |        ^
```
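
For context: in C++20 a class with any user-declared constructor (including a defaulted or deleted one) is no longer an aggregate, which is why brace-initializing all fields fails in the log above. The sketch below shows one pattern that compiles in both C++17 and C++20. The struct `Opts` and its fields are hypothetical stand-ins, not the actual `RemarkOpts` definition or the actual fix.

```cpp
#include <string_view>

// Hypothetical stand-in for the struct in the error log; the field names
// are assumptions made for illustration.
struct Opts {
  std::string_view name;
  std::string_view category;

  // User-declared, so Opts is not an aggregate in C++20 and
  // `Opts{n, {}}` would be rejected there.
  constexpr Opts() = default;

  static constexpr Opts withName(std::string_view n) {
    Opts o;     // default-construct: valid in C++17 and C++20
    o.name = n; // then set the one field we care about
    return o;
  }
};

// The factory stays usable in constant expressions.
static_assert(Opts::withName("inline").name == "inline");
```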
These tests don't work due to limitations in backend support, so it's
better to mark them uniformly unsupported on AIX and z/OS.
…b` (#154630)

On Windows, in an MSVC environment (e.g. when linking against the UCRT),
`-nostdlib` is used (for example, by CMake) to prevent linking in
non-existent `glibc`. However, an unintended side-effect is that we end
up never linking in the HIP RT in these circumstances, even when
`--hip-link` is explicitly specified. This breaks `hipstdpar`, where we
implicitly link in the HIP RT when `--hipstdpar` is passed as a link
flag. To fix this, we relax the restriction on linking the HIP RT, for
known MSVC environments.
…n `modulemap`s (#148959)

This PR teaches the modulemap parsing logic to report a warning that
defaults to an error when it sees duplicate link declarations in the
same module. Specifically, duplicate link declarations are multiple link
declarations with the same string literal in the same module. No error
is reported if the same link declaration exists in both a submodule and
its enclosing module.

The warning can be disabled with `-Wno-module-link-redeclaration`.
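
The rule can be sketched as follows. This is a simplified model, not Clang's module map parser: `Module` and `findDuplicateLinks` are hypothetical names, and diagnostics are reduced to strings.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a module; not Clang's actual data structures.
struct Module {
  std::string name;
  std::vector<std::string> links; // library names from link declarations
  std::vector<Module> submodules;
};

// Append "module: library" for every link name repeated within one module.
// A name shared by a submodule and its enclosing module is not reported,
// because each module gets its own `seen` set.
void findDuplicateLinks(const Module &m, std::vector<std::string> &out) {
  std::set<std::string> seen;
  for (const std::string &lib : m.links)
    if (!seen.insert(lib).second)
      out.push_back(m.name + ": " + lib);
  for (const Module &sub : m.submodules)
    findDuplicateLinks(sub, out);
}
```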

rdar://155880064
Adds a utility getter to `warp_execute_on_lane_0` which simplifies
access to the op's terminator.

Uses are refactored to utilize the new terminator getter.
This patch fixes:

  mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp:132:3: error: default
  label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]
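
For readers unfamiliar with the warning, a small illustration: when every enumerator already has a `case`, Clang's `-Wcovered-switch-default` flags an extra `default:` label. The enum and strings below are hypothetical and unrelated to the code in NVVMDialect.cpp.

```cpp
#include <string_view>

// Hypothetical enum used only for illustration.
enum class Rounding { Nearest, TowardZero, Up, Down };

// Every enumerator has a case, so no `default:` label is written. Leaving
// it out also lets -Wswitch report any enumerator added later.
constexpr std::string_view roundingName(Rounding r) {
  switch (r) {
  case Rounding::Nearest:
    return "rn";
  case Rounding::TowardZero:
    return "rz";
  case Rounding::Up:
    return "rp";
  case Rounding::Down:
    return "rm";
  }
  return "unknown"; // reached only for values outside the enum
}
```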
…#153993)

This PR adds the following basic math functions for BFloat16 type along
with the tests:
- nextafterbf16
- nextdownbf16
- nexttowardbf16
- nextupbf16
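
As a rough illustration of what a `nextup`-style function does for this format, the sketch below steps a bfloat16 value held as raw bits (1 sign bit, 8 exponent bits, 7 mantissa bits). `nextUpBits` is a hypothetical helper and not the libc implementation.

```cpp
#include <cstdint>

// Smallest bfloat16 value greater than the input, operating on raw bits.
// Hypothetical helper for illustration only.
constexpr std::uint16_t nextUpBits(std::uint16_t bits) {
  constexpr std::uint16_t kSign = 0x8000;
  constexpr std::uint16_t kPosInf = 0x7F80;
  const std::uint16_t magnitude = bits & 0x7FFF;
  if (magnitude > kPosInf || bits == kPosInf)
    return bits;   // NaN stays NaN, +inf stays +inf
  if (magnitude == 0)
    return 0x0001; // +0 or -0 -> smallest positive subnormal
  return (bits & kSign)
             ? static_cast<std::uint16_t>(bits - 1)  // negative: toward zero
             : static_cast<std::uint16_t>(bits + 1); // positive: away from zero
}
```

Because the encoding is sign-magnitude, stepping a negative value upward means decreasing its bit pattern by one.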

---------

Signed-off-by: Krishna Pandey <[email protected]>
Co-authored-by: OverMighty <[email protected]>
Constant fold the NVVM intrinsics for add, mul, div, fma with specific
rounding modes.
This migrates the CompletionHandler to structured types and adds a new
CompletionItem and CompletionItemType to the general types.

---------

Co-authored-by: Ebuka Ezike <[email protected]>
Make it clear that other object file formats (e.g. ELF) do not use this
field.
Fixes #150550.

With the test case 
```
void f(unsigned char *x, unsigned char *y, int n) {
  // should have been vectorized into avgr_u instead of separate vectorized add and logical right shift
  for (int i = 0; i < n; i++)
    x[i] = (x[i] + y[i] + 1) / 2;
}
```

the backend failed to recognize that this can be reduced to avgr_u since
the loop vectorizer doesn't transform into the existing pattern in
tablegen.

This PR sets AVGCEIL_U as legal for v8i16 and v16i8 and selects it to
avgr_u in the tablegen file.
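
For reference, the rounding average that the loop computes can be written per element in two equivalent ways. This is a scalar sketch with hypothetical function names; it says nothing about how the backend lowers the node.

```cpp
#include <cstdint>

// Form used in the test case: the operands promote to int, so the +1
// cannot overflow before the division.
constexpr std::uint8_t avgCeilWide(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>((x + y + 1) / 2);
}

// Same result without needing a wider intermediate. Since
// x + y == 2 * (x | y) - (x ^ y), the rounded-up half of the sum is
// (x | y) - ((x ^ y) >> 1).
constexpr std::uint8_t avgCeilNarrow(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>((x | y) - ((x ^ y) >> 1));
}
```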
See #140071. After looking into the commit, I concluded that removing
the typo was the best fix. Happy to be told if this is incorrect and a
different change would be better.
getAsExpr() already returns Expr *.
getTargetFlags() already returns TargetFlagsType.
skc7 and others added 29 commits August 25, 2025 12:20
…rontend (#154376)

This PR adds the workdistribute MLIR op to the omp dialect and to the
LLVM frontend.

The work in this PR is c-p and updated from @ivanradanov commits from coexecute implementation:
flang_workdistribute_iwomp_2024
…oops (#155077)

We cannot guarantee the validity of the interchange if the loops have
iter_args, since the dependence analysis does not take them into
account. Conservatively return false in such cases.

Add an option to check permutation validity in test-loop-permutation
pass to test this change.

Signed-off-by: Prathamesh Tagore <[email protected]>
Improve const-correctness of `CheckerContext` API by defining the missing
`const` overloads to its accessor member functions.
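
The general pattern looks like this. `Context` is a hypothetical class used for illustration and is not the analyzer's `CheckerContext`.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class; illustrates a const/non-const accessor pair.
class Context {
public:
  explicit Context(std::string s) : state(std::move(s)) {}

  // Non-const overload: callers may mutate the returned object.
  std::string &getState() { return state; }
  // Const overload: callable through `const Context &`, read-only access.
  const std::string &getState() const { return state; }

private:
  std::string state;
};

// Compiles only because the const overload exists.
inline std::size_t stateLength(const Context &c) { return c.getState().size(); }
```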
…kend (#154672)

This patch adds support for reading the global timer low register in the
NVVM dialect and NVPTX backend. This change includes adding the
`NVVM_GlobalTimerLoOp` operation to NVVM dialect and 
`int_nvvm_read_ptx_sreg_globaltimer_lo` intrinsic to the NVPTX backend.

All the lit tests have been added.
This was not checking the alignment requirement for 64-bit
operands which accept inline immediates. Not all custom operand
types were handled in the switch, so round out with explicit
handling of all enum values, and change the default to use
the default checks for unhandled cases.

Fixes #155095
Regenerate these with a newer UTC version, so that the function
signature is included. Otherwise we can get some very confusing
naming on updates.
Fix nvgpu mlir file integration test. This PR fixes the bug by removing
memref.get_global and then using memref.view.
This is a follow-up PR for post-commit comments in #121104.

Details:

- Rename `mergeTwoCounter` to `mergeTwoCounters` (add trailing `s`).
- Avoid duplicated hash lookup.
- Use `///` instead of `//`.
- Fix typo.
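
To illustrate the "avoid duplicated hash lookup" item in general terms: a find-then-insert sequence hashes the key twice, while `try_emplace` does it once. The sketch uses `std::unordered_map` and a hypothetical `mergeCounters` helper; it is not the code from the PR.

```cpp
#include <string>
#include <unordered_map>

using Counters = std::unordered_map<std::string, int>;

// Hypothetical helper: add every count from `src` into `dst`.
void mergeCounters(Counters &dst, const Counters &src) {
  for (const auto &[key, count] : src) {
    // Single lookup: try_emplace inserts 0 if the key is missing and
    // returns an iterator to the entry in either case.
    dst.try_emplace(key, 0).first->second += count;
  }
}
```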
…oro.end (#153404)

As mentioned in #151067, the current design of `llvm.coro.end` mixes two
functionalities: querying where we are and lowering to some code. This
patch separates these functionalities into independent intrinsics by
introducing a new intrinsic `llvm.coro.is_in_ramp`.
…f llvm.coro.end (#153404)"

This reverts commit 19a4f52.

See test failure in #153404
I noticed a typo in the directory name `refwrap.comparissons`, then did
a quick pass to fix typos elsewhere in the tests.

All fixes were manual (some carefully search-and-replaced); I used
[cspell](https://www.npmjs.com/package/cspell) to find them.
This is meant as the inverse of getNamedOperandIdx and returns the
OpName for a given operand index for a given opcode.

---------

Co-authored-by: Matt Arsenault <[email protected]>
Initialize the `OldToLValue` member with the actual old value of
`ToLValue`.

Pointed out by Shafik in
#153601 (comment)
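
A generic sketch of the save-and-restore pattern this kind of fix concerns: the guard must capture the old value before it overwrites the target. `ScopedOverride` is a hypothetical class, not the code changed in the commit.

```cpp
#include <utility>

// Hypothetical RAII guard: sets a new value for the duration of a scope
// and restores the previous one on destruction.
template <typename T> class ScopedOverride {
public:
  ScopedOverride(T &target, T newValue)
      : ref(target), old(target) { // `old` copies the value before the overwrite
    ref = std::move(newValue);
  }
  ~ScopedOverride() { ref = std::move(old); }

  ScopedOverride(const ScopedOverride &) = delete;
  ScopedOverride &operator=(const ScopedOverride &) = delete;

private:
  T &ref;
  T old;
};
```

If `old` were initialized from `newValue` instead, the destructor would "restore" the overridden value and the original would be lost.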
InstCombine tries to convert `freeze(inst(op))` to `inst(freeze(op))`.
Currently, this is limited to the case where a single operand needs to
be frozen, and all other operands are guaranteed non-poison.

This patch allows the transform even if multiple operands need to be
frozen. The existing limitation makes sure that we do not increase the
total number of freezes, but it also means that we may fail to
eliminate freezes (via poison flag dropping) and may prevent
optimizations (as analysis generally can't look past freeze). Overall, I
believe that aggressively pushing freezes upwards is more beneficial
than harmful.

This is the middle-end version of #145939 in DAGCombine (which is
currently reverted for SDAG-specific reasons).
#137975)

An authenticated pointer can be explicitly checked by the compiler via a
sequence of instructions that executes BRK on failure. It is important
to recognize such a BRK instruction as checking every register (as it is
expected to immediately trigger an abnormal program termination) to
prevent false positive reports about authentication oracles:

      autia   x2, x3
      autia   x0, x1
      ; neither x0 nor x2 are checked at this point
      eor     x16, x0, x0, lsl #1
      tbz     x16, #62, on_success ; marks x0 as checked
      ; end of BB: for x2 to be checked here, it must be checked in both
      ; successor basic blocks
    on_failure:
      brk     0xc470
    on_success:
      ; x2 is checked
      ldr     x1, [x2] ; marks x2 as checked
…atten attribute (#154801)

Fixes #149866

---------

Co-authored-by: Aaron Ballman <[email protected]>
…alization (#151267) (#155214)

This reverts commit c075fb8. This
commit introduces a caching bug that causes undesired collisions.
)

We were sizing the table appropriately for the number of LibcallImpls,
but many of those have identical names which were pushing up the
collision count unnecessarily. This ends up decreasing the table size
slightly, and makes it a bit faster.

BM_LookupRuntimeLibcallByNameRandomCalls improves by ~25% and
BM_LookupRuntimeLibcallByNameSampleData by ~5%.

As a secondary change, align the table size up to the next
power of 2. This makes the table larger than before, but improves
the sample data benchmark by an additional 5%.
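
The two sizing ideas can be sketched as below. This is a simplification under stated assumptions: the real table is generated at build time and its load factor is not described here, and `ceilPowerOf2` and `tableSizeFor` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Smallest power of two >= n (returns 1 for n == 0).
// Assumes n is far below the maximum value of std::size_t.
constexpr std::size_t ceilPowerOf2(std::size_t n) {
  std::size_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Size the table from the number of distinct names, so entries that share
// a name do not inflate it, then round up to a power of two.
std::size_t tableSizeFor(const std::vector<std::string> &names) {
  std::unordered_set<std::string> unique(names.begin(), names.end());
  return ceilPowerOf2(unique.size());
}
```

A power-of-two size lets the bucket index be computed with a mask (`hash & (size - 1)`) instead of a modulo.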
…cs (#152575)

Previously, the alignment of pointer arithmetic was inferred from the
pointee type, losing the alignment information from its operands:

https://github.com/llvm/llvm-project/blob/503c0908c3450d228debd64baecf41df8f58476e/clang/lib/CodeGen/CGExpr.cpp#L1446-L1449

This patch preserves alignment information for pointer arithmetic `P
+/- C`, to match the behavior of identical array subscript `&P[C]`:
https://godbolt.org/z/xx1hfTrx4.

Closes #152330. Although the
motivating case can be fixed by
#145733, the alignment cannot
be recovered without a dominating memory access with larger alignment.
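
The arithmetic behind "alignment of `P +/- C`" can be sketched as follows: if `P` is aligned to a power-of-two `baseAlign`, then `P + C` is aligned to the smaller of `baseAlign` and the largest power of two dividing `C`. The helper name is hypothetical and this is not the CodeGen code.

```cpp
#include <cstdint>

// Alignment guaranteed for (P + offset) when P is aligned to baseAlign bytes.
// baseAlign is assumed to be a power of two.
constexpr std::uint64_t alignmentAtOffset(std::uint64_t baseAlign,
                                          std::int64_t offset) {
  if (offset == 0)
    return baseAlign;
  // Magnitude of the offset, computed in unsigned arithmetic.
  const std::uint64_t magnitude =
      offset < 0 ? 0 - static_cast<std::uint64_t>(offset)
                 : static_cast<std::uint64_t>(offset);
  // Lowest set bit == largest power of two that divides the offset.
  const std::uint64_t lowBit = magnitude & (~magnitude + 1);
  return lowBit < baseAlign ? lowBit : baseAlign;
}
```

For example, a 16-byte aligned pointer plus 4 bytes is only known to be 4-byte aligned, while the same pointer plus 32 bytes keeps its 16-byte alignment.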
We need to use the base offset in both cases.
Also, add additional assertions to make sure we don't miss this case
again.

Fixes #155132
…Self` (#155176)

This patch was a part of
#154375.
Two functional changes:
1. Allow matching other commuted patterns.
2. Allow combining loads even if there are multiple uses on a load. It
is beneficial in practice.
Some buildbots build unittests with `-Werror,-Wsign-compare`.
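
A typical trigger for this warning is comparing a signed `int` against an unsigned container size. The sketch below is a generic illustration with a hypothetical function, not the code that was fixed.

```cpp
#include <cstddef>
#include <vector>

// Writing `for (int i = 0; i < v.size(); ++i)` compares signed with
// unsigned and is flagged by -Wsign-compare. Using the container's
// unsigned size type for the index avoids the mixed comparison.
inline int sumAll(const std::vector<int> &v) {
  int total = 0;
  for (std::size_t i = 0, e = v.size(); i != e; ++i)
    total += v[i];
  return total;
}
```

In unit tests the same warning commonly appears when an unsigned size is compared against a plain integer literal; an unsigned literal such as `3u` keeps both sides unsigned.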
Improve the documentation of `replaceUsesOfBlockArgument` to clarify its
semantics in rollback mode. Add an assertion to make sure that the same
block argument is not replaced multiple times. That's an API violation
and messes with the internal state of the conversion driver.

This commit is in preparation for adding full support for
`RewriterBase::replaceAllUsesWith`.
…154377)

This PR adds workdistribute parser and semantic support in flang.

The work in this PR is c-p and updated from @ivanradanov commits from coexecute implementation:
flang_workdistribute_iwomp_2024
Following #144344, #152207, #151690, this PR adds the alignment
attribute to the following operations in the vector dialect:

* `compressstore`
* `expandload`
* `vector.scatter`
* `vector.gather`

---------

Co-authored-by: Jakub Kuderski <[email protected]>
@pull pull bot merged commit 7b467bc into Ericsson:main Aug 25, 2025
2 checks passed