AAarch64 build attributes #118767

sivan-shani · 2024-12-05T08:57:47Z

No description provided.

…max/fmin (llvm#117977) Preserve `nnan` constraint only if present on both `fcmp` and `select`. Alive2: https://alive2.llvm.org/ce/z/ZNDjzt

Alternative for llvm#113764.

It builds on a minimalistic approach with the legality check in match and a blind apply. The precise patterns are used for better compile time and modularity. It also moves the pattern check into the combiner, whereas unary_undef_to_zero and propagate_undef_any_op rely on custom C++ code for pattern matching. Is there a limit on the number of patterns?

G_ANYEXT of undef -> undef
G_SEXT of undef -> 0
G_ZEXT of undef -> 0

The combine is not a member of the post legalizer combiner for AArch64.

Test: llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir

This is the same thing for us, except for diagnostic differences.

`!(*this < RHS) && !(RHS < *this)` is difficult for the optimizer to reason about.

Use references instead of pointers for most state and common up some of the initialization between the legacy and new pass manager paths.

…extension. Add RUN lines to float-convert.ll and double-convert.ll without F extension.

…xes. This reapplies aba6bb0, which was reverted in 28e2a89 due to bot failures. It contains fixes to silence warnings for uncovered switches, and for incorrect initializer-symbol handling on ELF and COFF.

…d X, C2` (llvm#118197) Alive2: https://alive2.llvm.org/ce/z/Ffg64g Closes llvm#104772.

…lvm#118442)

When a memmove happens to clobber source data, and such data have been previously memset'd, the memmove may be redundant.

…ts (llvm#117902) Effectively this models all the accesses that occur between the first and second return as happening at the point of the call. Fixes llvm#116668.

This patch extends the optimize bufferization to deal with the new hlfir.eval_in_mem and to move the evaluation contained in its body to operate directly over the LHS when it can prove there are no accesses to the LHS inside the region (and that the LHS is contiguous). This will allow the array function call optimization when lowering is changed to produce an hlfir.eval_in_mem in the next patch.

…18070) This patch encapsulates array function call lowering into hlfir.eval_in_mem and allows directly evaluating the call into the LHS when possible. The conditions are: the LHS is contiguous, it is not accessed inside the function, it is not a whole allocatable, and the function result need not be finalized. All these conditions are tested in the previous hlfir.eval_in_mem optimization (llvm#118069), which leverages the extension of getModRef to handle function calls (llvm#117164). This yields a 25% speed-up on the polyhedron channel2 benchmark (from 1min to 45s measured on an X86-64 Zen 2).

This PR adds two small convenience Vector types:
* `ScalableVectorType` and `FixedVectorType`.

The goal of these new types is two-fold:
* Enable idiomatic checks like `isa<ScalableVectorType>(...)`.
* Make the split into "Scalable" and "Fixed-width" vectors a bit more explicit and more visible in the code-base.

The new types are added in mlir/include/mlir/IR (instead of e.g. mlir/include/mlir/Dialect/Vector) so that the new types can be used without requiring any new dependency (e.g. on the Vector dialect).

…lvm#117532) SBFunction::GetEndAddress doesn't really make sense for discontinuous functions, so I'm declaring it deprecated. GetStartAddress sort of makes sense, if one uses it to find the function's entry point, so I'm keeping that undeprecated. I've made the test a Shell test because these make it easier to create discontinuous functions regardless of the host OS and architecture. They do make testing the Python API harder, but I think I've managed to come up with something not entirely unreasonable.

… NFC. (llvm#118438) Fix typos introduced by llvm@d9e8ae7 and llvm@42b6c8e.

) partially fixes llvm#70103

### Changes
* Implemented `GroupMemoryBarrierWithGroupSync` clang builtin
* Linked `GroupMemoryBarrierWithGroupSync` clang builtin with `hlsl_intrinsics.h`
* Added sema checks for `GroupMemoryBarrierWithGroupSync` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
* Add codegen for `GroupMemoryBarrierWithGroupSync` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
* Add codegen tests to `clang/test/CodeGenHLSL/builtins/GroupMemoryBarrierWithGroupSync.hlsl`
* Add sema tests to `clang/test/SemaHLSL/BuiltIns/GroupMemoryBarrierWithGroupSync-errors.hlsl`

### Related PRs
* [[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic llvm#111884](llvm#111884)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic llvm#111888](llvm#111888)

... just like strlen.

…17996) This is a follow-up/reimplementation of llvm#115730. While working on that patch, I did not realize that the correct (discontinuous) set of ranges is already stored in the block representing the whole function. The catch -- ranges for this block are only set later, when parsing all of the blocks of the function. This patch changes that by populating the function block ranges eagerly -- from within the Function constructor. This also necessitates a corresponding change in all of the symbol files -- so that they stop populating the ranges of that block. This allows us to avoid some unnecessary work (not parsing the function DW_AT_ranges twice) and also results in some simplification of the parsing code.

…llvm#117996)" This reverts commit ba14dac. I guess "has no conflicts" doesn't mean "it will build".

Emit the "cannot be applied to types" warning instead of silently ignoring the attribute when it is used on a type (instead of a function argument or the function definition). Before this commit, the warning was printed when the attribute was (mis)used on a decl-specifier, but not in other places in a declarator.

Examples where the warning starts being emitted with this commit:

```
int * [[clang::lifetimebound]] x;
void f(int * [[clang::lifetimebound]] x);
void g(int * [[clang::lifetimebound]]);
```

Note that the last example is the case of an unnamed function parameter. While in theory Clang could have supported the `[[clang::lifetimebound]]` analysis for unnamed parameters, it doesn't currently, so the commit at least improves the situation by reporting a warning instead of silently ignoring the attribute, which was reported at llvm#96034.

…nt (llvm#118457)

…18464) Reverts llvm#110898. This change has caused a cyclic module dependency: `fatal error: cyclic dependency in module 'LLVM_Utils': LLVM_Utils -> LLVM_Config_ABI_Breaking -> LLVM_Utils`. Reverting for now until we have the right fix.

When merging STG instructions used for AArch64 stack tagging, we were stopping on reaching a load or store instruction, but not calls, so it was possible for an STG to be moved past a call to memcpy. This test case (reduced from fuzzer-generated C code) was the result of StackColoring merging allocas A and B into one stack slot, and StackSafetyAnalysis proving that B does not need tagging, so we end up with tagged and untagged objects in the same stack slot. The tagged object (A) is live first, so it is important that its memory is restored to the background tag before it gets reused to hold B.

Proof: https://alive2.llvm.org/ce/z/omnQXt

Proof: https://alive2.llvm.org/ce/z/BYNQ7s

…#118460) To match the diagnostic output of the current interpreter.

llvm#117996)" This reverts commit 2526d5b, reapplying ba14dac after fixing the conflict with llvm#117532. The change is that Function::GetAddressRanges now recomputes the returned value instead of returning the member. This means it now returns a value instead of a reference type.

…coming change

Missed opportunity to avoid use of fpu for store(fabs(load()) style patterns

Reverts llvm#118592

…trinsics (llvm#117752) Adds support for the following MSVC intrinsics:
* `__addx18byte`
* `__addx18word`
* `__addx18dword`
* `__addx18qword`
* `__incx18byte`
* `__incx18word`
* `__incx18dword`
* `__incx18qword`

These are documented at: <https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170>

) Currently, we support `-Wdeprecated-array-compare` for C++20 or above and don't report any warning for older versions. This PR supports `-Warray-compare` for older versions and for GCC compatibility. Fixes llvm#114770

The original implementation rejected some valid constructs. The rule is supposed to be:

Gang-on-Kernel cannot have a gang in its region
Worker cannot have a worker or gang in its region
Vector cannot have worker, gang, or vector in its region.

The previous implementation improperly disallowed vector in the other two. This patch fixes it and adds testing for it.

llvm#118679

Calls to `@llvm.abs(undef, i1 true)` and `@llvm.abs(INT_MIN, i1 true)` can be optimized to `poison` instead of `undef`. [Alive2](https://alive2.llvm.org/ce/z/Hg-2ug)

…lvm#118597) When we have legal instructions we want to promote to sXLen and let isel pattern matching remove the and/sext_inreg. When using a libcall we want to use a 'si' libcall for small types instead of 'di'. To match the RV64 ABI, we need to sign extend `unsigned int` arguments. We reuse the shouldSignExtendTypeInLibCall hook from SelectionDAG.

Fix after buildbots issue

…lvm#117808) LLDB can crash in TypeSystemClang::GetIndexOfChildMemberWithName, at a point where it pushes an index onto the child_indexes vector, tries to call itself recursively, then tries to pop the entry from child_indexes. The problem is that the recursive call can clear child_indexes, so that this code ends up trying to pop an already empty vector. This change saves the old vector before the push, then restores the saved vector rather than trying to pop.
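The save-and-restore idea described above can be sketched in isolation. This is only an illustration under stated assumptions, not the LLDB code: `lookupChild` and `lookupThroughBase` are hypothetical names, and the `found` flag stands in for whatever makes the real recursive lookup succeed or fail.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the recursive lookup. On failure it clears the
// shared vector, which is the behavior the description says can happen.
bool lookupChild(std::vector<unsigned> &indexes, bool found) {
  if (!found) {
    indexes.clear();
    return false;
  }
  indexes.push_back(0);
  return true;
}

// Save the vector before the push and restore the copy on failure, instead
// of calling pop_back() on a vector that may already be empty.
bool lookupThroughBase(std::vector<unsigned> &indexes, unsigned baseIndex,
                       bool found) {
  std::vector<unsigned> saved = indexes;
  indexes.push_back(baseIndex);
  if (lookupChild(indexes, found))
    return true;
  indexes = saved; // well-defined even if the callee cleared the vector
  return false;
}
```

Calling `pop_back()` on an empty `std::vector` is undefined behavior, which is why restoring a saved copy is the safer shape here, at the cost of one extra copy per call.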

Avoid iterators, use structured bindings to unpack the [key, value] pairs.

…8681) Currently, the link to the issue tracker takes you to the Github source repository, rather than the Github issue tracker. This fixes the link and includes the lldb-dap label in both the issue and PR URL.

This patch moves the MemProf YAML traits to MemProf.h so that the YAML writer can access them from outside MemProfReader.cpp in the future.

rajatbajpai and others added 30 commits December 4, 2024 20:25

[InstCombine][FP] Fix nnan preservation for transform fcmp + sel => f…

226300a

…max/fmin (llvm#117977) Preserve `nnan` constraint only if present on both `fcmp` and `select`. Alive2: https://alive2.llvm.org/ce/z/ZNDjzt
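A plain C++ analogy may help show why NaN handling matters for this kind of fold. The sketch below is not the InstCombine code; it assumes IEEE-754 doubles without fast-math, and `selectMax`, `libraryMax`, and `formsAgree` are hypothetical helper names. It compares a compare-and-select maximum with `std::fmax`, which returns the non-NaN operand when exactly one operand is NaN.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

const double kNaN = std::numeric_limits<double>::quiet_NaN();

// Hypothetical helper: the compare-and-select form.
double selectMax(double a, double b) { return a > b ? a : b; }

// Hypothetical helper: the library maximum.
double libraryMax(double a, double b) { return std::fmax(a, b); }

// True when both forms agree, treating two NaN results as equal.
bool formsAgree(double a, double b) {
  double s = selectMax(a, b);
  double m = libraryMax(a, b);
  if (std::isnan(s) || std::isnan(m))
    return std::isnan(s) && std::isnan(m);
  return s == m;
}
```

With `a = 1.0` and `b = kNaN`, the comparison is false, so the select form yields NaN while `std::fmax` yields `1.0`. The two forms agree only when NaN inputs are ruled out, which is the kind of guarantee a no-NaNs flag expresses.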

[clang][bytecode] Handle memmove like memcpy (llvm#118431)

91be432

This is the same thing for us, except for diagnostic differences.

[InstructionCost] Optimize operator==

a0fc29f

`!(*this < RHS) && !(RHS < *this)` is difficult for the optimizer to reason about.
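As a rough illustration of the two ways to express equality, the sketch below uses a hypothetical two-field `Cost` struct. It is not LLVM's `InstructionCost`, and the field names are assumptions made for the example.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-in for a cost type with two fields.
struct Cost {
  int state;
  long value;

  // Lexicographic ordering over (state, value).
  bool operator<(const Cost &rhs) const {
    return std::tie(state, value) < std::tie(rhs.state, rhs.value);
  }

  // Equality derived from the ordering: two lexicographic comparisons.
  bool equalViaLess(const Cost &rhs) const {
    return !(*this < rhs) && !(rhs < *this);
  }

  // Equality stated directly: one comparison per field.
  bool operator==(const Cost &rhs) const {
    return state == rhs.state && value == rhs.value;
  }
};
```

Both forms return the same answer for this type. The direct form simply gives the compiler less branching logic to see through.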

[AMDGPU] Refine AMDGPUAtomicOptimizerImpl class. NFC. (llvm#118302)

38f11cc

Use references instead of pointers for most state and common up some of the initialization between the legacy and new pass manager paths.

[RISCV][GISel] Support f64->f32 fptrunc and f32->f64 fpext without D …

5835af7

…extension. Add RUN lines to float-convert.ll and double-convert.ll without F extension.

Re-apply "[ORC][JITLink] Add jitlink::Scope::SideEffectsOnly" with fi…

5bcebbe

…xes. This reapplies aba6bb0, which was reverted in 28e2a89 due to bot failures. It contains fixes to silence warnings for uncovered switches, and for incorrect initializer-symbol handling on ELF and COFF.

[InstCombine] Fold icmp spred (and X, highmask), C1 into `icmp spre…

9894477

…d X, C2` (llvm#118197) Alive2: https://alive2.llvm.org/ce/z/Ffg64g Closes llvm#104772.

[clang][bytecode][NFC] Diagnose non-constexpr builtin strcmp calls (l…

eb261fd

…lvm#118442)

[MemCpyOpt] Introduce test for PR101930 (NFC)

8997a60

[MemCpyOpt] Drop dead memmove calls on memset'd source data

c26cb3d

When a memmove happens to clobber source data, and such data have been previously memset'd, the memmove may be redundant.
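A small standalone example may make the redundancy concrete. It is only a sketch of one simple case, not the exact conditions MemCpyOpt checks: the buffer size, offset, and the `fillThenShift` helper are assumptions chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Fill a buffer with one byte value, then optionally shift its tail to the
// front. Every byte read and written by the memmove was set by the memset,
// so the memmove copies identical bytes over identical bytes.
std::array<unsigned char, 16> fillThenShift(bool doMemmove) {
  std::array<unsigned char, 16> buf{};
  std::memset(buf.data(), 0xAB, buf.size());
  if (doMemmove)
    std::memmove(buf.data(), buf.data() + 4, buf.size() - 4);
  return buf;
}
```

The buffer contents are the same with or without the `memmove`, so in this case the call can be dropped without changing observable behavior.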

[BasicAA] Treat returns_twice functions as clobbering unescaped objec…

f72a042

…ts (llvm#117902) Effectively this models all the accesses that occur between the first and second return as happening at the point of the call. Fixes llvm#116668.

[ValueTracking] Fix typo in isKnownNegative and MaskedValueIsZero…

a390f64

… NFC. (llvm#118438) Fix typos introduced by llvm@d9e8ae7 and llvm@42b6c8e.

[clang][bytecode] Handle __builtin_wcslen (llvm#118446)

dceb943

... just like strlen.

Revert "[lldb] Use the function block as a source for function ranges (…

43e1fab

…llvm#117996)" This reverts commit ba14dac. I guess "has no conflicts" doesn't mean "it will build".

LV/test: clean up a test and regen with UTC (llvm#118394)

bc36b9c

[clang][bytecode] Initialize elements in __builtin_elementwise_popcou…

2c47595

…nt (llvm#118457)

[InstCombine] Support nusw in icmp of gep with base

77b72b8

Proof: https://alive2.llvm.org/ce/z/omnQXt

[InstCombine] Support nusw in icmp of two geps with same base

2ea2cf7

Proof: https://alive2.llvm.org/ce/z/BYNQ7s

[clang][bytecode] Reject memcpy dummy pointers after null check (llvm…

c14445d

…#118460) To match the diagnostic output of the current interpreter.

hidekisaito and others added 17 commits December 4, 2024 20:26

Fix to account for multiple ISA enumeration (llvm#118676)

0580c70

[X86] fsxor-alignment.ll - add nounwind to prevent cfi noise in an up…

0ec47f9

…coming change

[X86] Add fabs/fneg rmw style test coverage for llvm#117557

bf271d1

Missed opportunity to avoid use of fpu for store(fabs(load()) style patterns

Revert "[flang][cuda] Run target rewrite in gpu.module" (llvm#118679)

861940b

Reverts llvm#118592

[flang][OpenMP] Add comments to IsContiguous, improve formatting, NFC

4ee31b5

Reland "[flang][cuda] Run target rewrite in gpu.module" (llvm#118682)

1dae903

llvm#118679

[InstSimplify] Refine abs(min/undef, true) to poison (llvm#118669)

e9bfea0

Calls to `@llvm.abs(undef, i1 true)` and `@llvm.abs(INT_MIN, i1 true)` can be optimized to `poison` instead of `undef`. [Alive2](https://alive2.llvm.org/ce/z/Hg-2ug)

[flang][cuda] Fix test cuda-target-rewrite.mlir

3fa62e2

Fix after buildbots issue

[flang][OpenMP] Use range-for to iterate over SymbolSourceMap, NFC

5823903

Avoid iterators, use structured bindings to unpack the [key, value] pairs.
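The style change can be sketched on a generic map. `SourceMap` here is a hypothetical alias, not flang's `SymbolSourceMap`, and the summing functions exist only to give both loops something to compute.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical map type standing in for the real container.
using SourceMap = std::map<std::string, int>;

// Explicit iterator form.
int sumWithIterators(const SourceMap &m) {
  int total = 0;
  for (auto it = m.begin(); it != m.end(); ++it)
    total += static_cast<int>(it->first.size()) + it->second;
  return total;
}

// Range-for with structured bindings (C++17) unpacking each [key, value].
int sumWithBindings(const SourceMap &m) {
  int total = 0;
  for (const auto &[key, value] : m)
    total += static_cast<int>(key.size()) + value;
  return total;
}
```

Both loops visit the same elements in the same order. The second form names the pair members directly instead of going through `it->first` and `it->second`.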

[lldb-dap] Fix links to LLVM issue tracker and pull requests (llvm#11…

488d16d

…8681) Currently, the link to the issue tracker takes you to the Github source repository, rather than the Github issue tracker. This fixes the link and includes the lldb-dap label in both the issue and PR URL.

[scudo] Use internal list to manage the LRU cache (llvm#117946)

0b91eea

[memprof] Move YAML traits to MemProf.h (NFC) (llvm#118668)

ebfecbb

This patch moves the MemProf YAML traits to MemProf.h so that the YAML writer can access them from outside MemProfReader.cpp in the future.

sivan-shani requested review from Endilll, aaupov, ayermolo, bcardosolopes, cyndyishida, dcci, lanza, maksfb and rafaelauler as code owners December 5, 2024 08:57

sivan-shani closed this Dec 5, 2024

sivan-shani deleted the AArch64BuildAttributes branch December 5, 2024 08:59

sivan-shani restored the AArch64BuildAttributes branch December 5, 2024 09:11

sivan-shani deleted the AArch64BuildAttributes branch December 5, 2024 09:11
