Pr/svkeerthy/134004 #139480

svkeerthy · 2025-05-11T23:10:45Z

No description provided.

…lvm#138685) With the current implementation only one attribute is attached to the argument and the deserializer fails if more decorations are specified, however I believe that the spec does not prohibit having both `Aliased`/`Restrict` and `RelaxedPrecision`. I am not sure how to attach multiple attributes to a single argument with the current code and at the same time I do not have a use case for it, so I think the patch in the current state is a good starting point and can be extended in the future.

…hruVMV_V_V (llvm#138847) Without clearing kill flags, this pass will generate bad machine code.

```
*** Bad machine code: Using a killed virtual register ***
- function: main
- basic block: %bb.0 entry (0x437ef928)
- instruction: %12:vrn7m1 = INSERT_SUBREG %11:vrn7m1(tied-def 0), %0:vr, %subreg.sub_vrm1_0
- operand 2: %0:vr
```

Add a new Cygwin toolchain that just goes through the motions to initialize the Generic_GCC base properly. This allows removing some old, almost certainly wrong hard-coded paths from Lex/InitHeaderSearch.cpp. MSYS2 (GCC triple (arch)-pc-msys) is a fork of Cygwin (GCC triple (arch)-pc-cygwin), and this driver can be used for either. Add a simple test for this driver.

…138879) We often see initializers like `unsigned a = 10;`, which take an integer literal and immediately cast it to another type. Recognize this pattern and omit the cast, emitting the value directly as the target type. This reduces the instruction count by up to 0.13%: http://llvm-compile-time-tracker.com/compare.php?from=303436c6d16518b35288d63a859506ffcc1681e4&to=648f5202f906d1606390b2d1081e4502dc74acc2&stat=instructions:u

…8673)" This reverts commit d35ad58.

This breaks the clang build: https://lab.llvm.org/buildbot/#/builders/132/builds/1033

/home/buildbot-worker/bbroot/clang-riscv-rva23-evl-vec-2stage/stage2/lib/Target/RISCV/RISCVGenGlobalISel.inc:1512:44: note: cannot allocate array; evaluated array bound 2431270 exceeds the limit (1048576); use '-fconstexpr-steps' to increase this limit

The context string can be added to indicate the source of the duplicate definition. E.g. if the context is set to "foo2.o", then: "Duplicate definition of symbol 'foo'" becomes "In foo2.o, duplicate definition of symbol 'foo'". The JITDylib::defineImpl method is updated to use the name of the MaterializationUnit being added as the context string for duplicate definition errors. The JITDylib::defineMaterializing method is updated to use "defineMaterializing operation" as the context string.
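The message change described above can be sketched as follows. This is only an illustration: `duplicateDefinitionMessage` is a hypothetical free function, not part of the ORC API, and the real error class may build its text differently.

```cpp
#include <string>

// Hypothetical helper (an assumption, not the ORC implementation).
// With an empty context it yields the old message; with a context it
// yields the new "In <context>, duplicate definition ..." form.
std::string duplicateDefinitionMessage(const std::string &SymbolName,
                                       const std::string &Context = "") {
  if (Context.empty())
    return "Duplicate definition of symbol '" + SymbolName + "'";
  return "In " + Context + ", duplicate definition of symbol '" + SymbolName +
         "'";
}
```

Calling it with `("foo", "foo2.o")` produces the second message quoted in the commit text, and calling it with only `"foo"` produces the first.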

…lvm#138419) The shuffle needn't be twice the original number of vector elements, so the intermediate type used between the shuffle and the intrinsic should use the ShuffleDstTy number of elements. I found this when looking at shuffle costs and do not have a test where it alters the output, but have added some cases where the shuffle output is not twice the size of the input.

This is needed for forced unwind, for some testcases in libunwind/libcxxabi. This adds an aarch64 case for extracting the LanguageHandler and HandlerData fields from unwind info, in UnwindCursor::getInfoFromSEH, corresponding to the existing case for x86_64. This uses the struct IMAGE_ARM64_RUNTIME_FUNCTION_ENTRY_XDATA; this only became available in WinSDK 10.0.19041.0 and mingw-w64 v11.0 (or a mingw-w64 git snapshot after April 2023). (This is only a build-time requirement though; the format for the unwind data has been fixed since the start of Windows 10 on ARM64, so this doesn't impose any runtime requirement.)

The SCEV multiply by 1 doesn't make sense, because SCEV would fold it: therefore, the OrigPtr == Ptr branch effectively rejects a multiply. However, in this branch, we have a pointer SCEV that cannot be a multiply, and hence the code is dead. Strip it.

We now define FMAXNUM and FMINNUM as IEEE754-2008 with +0.0 > -0.0. LoongArch's fmax/fmin follow these rules fully. FMAXNUM_IEEE and FMINNUM_IEEE will be removed in the future once: (1) FMAXNUM/FMINNUM are fixed for all targets; (2) FMAXNUM_IEEE/FMINNUM_IEEE are no longer used by the middle end.
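The semantics named here (IEEE754-2008 maxNum/minNum plus the ordering +0.0 > -0.0) can be sketched in scalar C++. `maxNumSketch` and `minNumSketch` are hypothetical names for illustration; signaling-NaN behavior is deliberately not modeled, and this is not the code the backend emits.

```cpp
#include <cmath>

// Sketch under stated assumptions: a quiet NaN operand is ignored in
// favor of the other operand, and zeros are ordered as +0.0 > -0.0.
double maxNumSketch(double A, double B) {
  if (std::isnan(A))
    return B;
  if (std::isnan(B))
    return A;
  if (A == 0.0 && B == 0.0)
    return std::signbit(A) ? B : A; // prefer +0.0 when either is +0.0
  return A > B ? A : B;
}

double minNumSketch(double A, double B) {
  if (std::isnan(A))
    return B;
  if (std::isnan(B))
    return A;
  if (A == 0.0 && B == 0.0)
    return std::signbit(A) ? A : B; // prefer -0.0 when either is -0.0
  return A < B ? A : B;
}
```

The signed-zero branch is the part that a plain `A > B ? A : B` would get wrong, since `+0.0 == -0.0` compares equal.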

…8875) This is a fix for issue llvm#137126, which turned out to be a driver issue. FrontendActions has a loop to process multiple input files and `flang -fc1` accepts multiple files, but the semantic, lowering, and LLVM codegen actions were not reentrant, so crashes or unexpected behavior occurred when processing multiple files with `-fc1`. This patch makes the actions reentrant by cleaning up the contexts/modules if needed on entry.

It failed on armv7 with "Architecture not supported", which is because StubTests are not supported on ARM:

/builds/fossdd/aports/main/llvm20/src/llvm-project-20.1.0.src/llvm/unittests/ExecutionEngine/Orc/ReOptimizeLayerTest.cpp:140: Failure
Value of: llvm::detail::TakeError(RM.takeError())
Expected: succeeded
Actual: failed (Architecture not supported) (of type llvm::detail::ErrorHolder)

Extract value state and operand analysis/building into a separate class. This class localizes instruction state and operand building, for future support of copyable elements vectorization.

Recommit after revert 10f5120
Recommit after revert 6a2a8eb

Reviewers: HanKuanChen, RKSimon
Reviewed By: HanKuanChen
Pull Request: llvm#138724

github-actions · 2025-05-11T23:11:02Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

IgWod-IMG and others added 30 commits May 11, 2025 19:45

AMDGPU: Add baseline tests for fneg with min/max intrinsics (llvm#139132)

b7fe573

Copy the minnum and maxnum tests into versions with minimum/maximum and minimumnum/maximumnum.

AMDGPU: Handle minimumnum/maximumnum in fneg combines (llvm#139133)

14b505c

AMDGPU: Add baseline tests for min3/max3 from minimumnum/maximumnum (llvm#139136)

d367b69

AMDGPU: Form min3/max3 from minimumnum/maximumnum (llvm#139137)

ce44290

AMDGPU: Test more subtargets in minimumnum/maximumnum tests (llvm#139144)

f7ec5cc

AMDGPU: Add minimumnum/maximumnum tests with amdgpu-ieee=0 (llvm#139145)

f1f9095

With the IEEE bit disabled, the hardware instructions have the same behavior as these operations.

[DirectX] Adding support for Root Descriptors in obj2yaml/yaml2obj (llvm#137259)

b616dde

closes: [126634](llvm#126634)

---------

Co-authored-by: joaosaffran <[email protected]>

[Clang][CodeGen] Enable pointer overflow check for GCC workaround (llvm#137849)

1cfa8bd

Do not suppress the pointer overflow check for the `(i8*) nullptr + N` idiom. Related issue: llvm#137833

[clang][ExprConst] Check for array size of initlists (llvm#138673)

b99a3cf

Fixes llvm#138653

[gn build] Port 52924a2

454e13c

[NFC] Cleanup dead code in LoadStoreVectorizer.cpp (llvm#139211)

26e60f6

Closes llvm#138691

[lldb-dap] Move the event and progress event threads into DAP (NFC) (llvm#139167)

1ab3998

[AArch64] Fix feature list for FUJITSU-MONAKA processor (llvm#139212)

59eada7

FEAT_FP8DOT4 and FEAT_FP8FMA are supported by FUJITSU-MONAKA. These were previously enabled due to dependencies, but now require explicit activation due to modifications in the dependencies.

[SCEVPatternMatch] Extend with more matchers (llvm#138836)

62d6683

[AggressiveInstCombine] Add test for shifts from or chains. NFC

df8d2d9

[AArch64] Utilize XAR for certain vector rotates (llvm#137629)

9b1085e

Resolves llvm#137162 For cases when there isn't any `XOR` in the transformation, replace with a zero register.

[MISched] Add statistics for heuristics (llvm#137981)

5c55156

When diagnosing scheduling issues it can be useful to know which heuristics are driving the scheduler. This adds pre-RA and post-RA statistics for all heuristics.

[AMDGPU][NFC] Remove _DEFERRED operands. (llvm#139123)

42a14e3

All immediates are deferred now.

Ankur-0429 and others added 18 commits May 11, 2025 19:46

[CIR] Upstream enum support (llvm#136807)

0d88dab

[VPlan] Use VPBBs to look up masks for newly created recipes (NFC).

c489253

Update recipe construction to use VPBBs to look up masks, in preparation for llvm#128420.

[AVR] TableGen-erate SDNode descriptions (NFC) (llvm#139407)

8c6fbc2

This consolidates node definitions into one place and enables automatic node verification. Part of llvm#119709.

[SCEV] Improve code in SCEVLoopGuardRewriter (NFC) (llvm#139257)

8036023

Prefer DenseMap::lookup over DenseMap::find.
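For readers unfamiliar with the idiom: `DenseMap::lookup` returns the mapped value, or a default-constructed value when the key is absent, which removes the explicit iterator check that `find` requires. The sketch below mimics that behavior on `std::unordered_map` so it compiles without LLVM headers; `lookupOrDefault` is a hypothetical stand-in, not the actual change in SCEVLoopGuardRewriter.

```cpp
#include <unordered_map>

// Stand-in for the DenseMap::lookup idiom (an assumption for
// illustration): return the mapped value, or V() if the key is missing.
template <typename K, typename V>
V lookupOrDefault(const std::unordered_map<K, V> &Map, const K &Key) {
  auto It = Map.find(Key);
  return It == Map.end() ? V() : It->second;
}
```

The trade-off is that a missing key and a stored default value become indistinguishable, so the idiom fits best where the default (for example a null pointer) already means "not found".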

[VPlan] Handle most bin-ops in VPReplicateRecipe::computeCost. (NFC)

215f65d

Directly compute costs for binary ops and GEPs in VPReplicateRecipe::computeCost. This simply ports the legacy cost computation for uniform/replicating binary ops to the VPlan cost model.

[TargetParser] Use StringRef::consume_back (NFC) (llvm#139416)

bb3aee0

[X86] Use StringRef::consume_back (NFC) (llvm#139417)

9162f0d

This patch uses consume_back while changing the type of TrailingDot to bool, indicating whether we have consumed "." or not.
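The pattern can be sketched as follows. `consumeBack` is a hypothetical stand-in written against `std::string_view` so the example is self-contained; it imitates `llvm::StringRef::consume_back`, which strips a suffix in place and reports whether it did so. The variable name `TrailingDot` comes from the commit text, while the surrounding usage is an assumption.

```cpp
#include <string_view>

// Stand-in for llvm::StringRef::consume_back (illustrative only):
// if S ends with Suffix, drop the suffix from S and return true.
bool consumeBack(std::string_view &S, std::string_view Suffix) {
  if (S.size() < Suffix.size() ||
      S.substr(S.size() - Suffix.size()) != Suffix)
    return false;
  S.remove_suffix(Suffix.size());
  return true;
}
```

A caller would then write `bool TrailingDot = consumeBack(Name, ".");`, which both trims the string and records whether a "." was consumed.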

[Bitcode] Use range-based for loops (NFC) (llvm#139421)

3a3493d

[BOLT] Use StringRef::starts_with (NFC) (llvm#139437)

7cb5bdf

[clang] Use std::tie to implement operator< (NFC) (llvm#139438)

ab8e75c
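As a reminder of the idiom named in this commit title, `std::tie` builds tuples of references so that `operator<` becomes a single lexicographic comparison. The `Version` struct below is a hypothetical example type, not one taken from the clang change.

```cpp
#include <string>
#include <tuple>

// Hypothetical type used only to illustrate the idiom.
struct Version {
  int Major;
  int Minor;
  std::string Tag;
};

// Lexicographic ordering: compare Major, then Minor, then Tag.
bool operator<(const Version &L, const Version &R) {
  return std::tie(L.Major, L.Minor, L.Tag) <
         std::tie(R.Major, R.Minor, R.Tag);
}
```

Compared with hand-written nested comparisons, this form makes it harder to break the strict weak ordering by accident.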

[SLP][NFC]Add a test with ordering of the operands of unordered loads

e7fa33c

[dsymutil] Deduplicate Swift modules by path before copying them (llvm#139342)

13611d0

[Driver] Use StringRef::substr instead of StringRef::slice (NFC) (llvm#139455)

ca5f37f

StringRef::substr is shorter here because we can rely on its default second parameter.
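The difference can be illustrated with `std::string_view`, whose `substr` also defaults its length argument to "the rest of the string". Both helpers below are hypothetical and only mimic the two call styles; the actual Driver call site is not shown in this commit text.

```cpp
#include <cstddef>
#include <string_view>

// Explicit-bounds style: the caller must spell out how far to go.
std::string_view tailExplicit(std::string_view S, std::size_t Start) {
  return S.substr(Start, S.size() - Start);
}

// Default-parameter style: the omitted length means "to the end".
std::string_view tailDefault(std::string_view S, std::size_t Start) {
  return S.substr(Start);
}
```

Both return the same view for any `Start` up to `S.size()`. Note that `std::string_view::substr` throws for a larger `Start`, which is a property of the standard library type used in this sketch.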

[clangd] Use StringRef::consume_back_insensitive (NFC) (llvm#139456)

cacc2b9

[gn build] Port c11aba9

ec89934

[llvm] Use StringRef::consume_front (NFC) (llvm#139458)

3e6f769

Adding doc for IR2Vec

a02d416

svkeerthy requested review from DeinAlptraum, Endilll, aaupov, ayermolo, bcardosolopes, cyndyishida, lanza, maksfb, rafaelauler and yota9 as code owners May 11, 2025 23:10

svkeerthy closed this May 11, 2025

103 participants
