Pr/svkeerthy/134004 #139480
…lvm#138685) With the current implementation only one attribute is attached to the argument and the deserializer fails if more decorations are specified, however I believe that the spec does not prohibit having both `Aliased`/`Restrict` and `RelaxedPrecision`. I am not sure how to attach multiple attributes to a single argument with the current code and at the same time I do not have a use case for it, so I think the patch in the current state is a good starting point and can be extended in the future.
…hruVMV_V_V (llvm#138847) Without clearing kill flags, this pass will generate bad machine code.
```
*** Bad machine code: Using a killed virtual register ***
- function: main
- basic block: %bb.0 entry (0x437ef928)
- instruction: %12:vrn7m1 = INSERT_SUBREG %11:vrn7m1(tied-def 0), %0:vr, %subreg.sub_vrm1_0
- operand 2: %0:vr
```
With the IEEE bit disabled, the hardware instructions have the same behavior as these operations.
…lvm#137259) closes: [126634](llvm#126634) --------- Co-authored-by: joaosaffran <[email protected]>
…vm#137849) Do not suppress the pointer overflow check for the `(i8*) nullptr + N` idiom. Related issue: llvm#137833
Add a new Cygwin toolchain that just goes through the motions to initialize the Generic_GCC base properly. This allows removing some old, almost certainly wrong hard-coded paths from Lex/InitHeaderSearch.cpp. MSYS2 (GCC triple (arch)-pc-msys) is a fork of Cygwin (GCC triple (arch)-pc-cygwin), and this driver can be used for either. Add a simple test for this driver.
…138879) We often see initializers like `unsigned a = 10;`, which take an integer literal and immediately cast it to another type. Recognize this pattern and omit the cast, simply emitting the value as a different type directly. This reduces the instruction count by up to 0.13%: http://llvm-compile-time-tracker.com/compare.php?from=303436c6d16518b35288d63a859506ffcc1681e4&to=648f5202f906d1606390b2d1081e4502dc74acc2&stat=instructions:u
…8673)" This reverts commit d35ad58. This breaks the clang build: https://lab.llvm.org/buildbot/#/builders/132/builds/1033 /home/buildbot-worker/bbroot/clang-riscv-rva23-evl-vec-2stage/stage2/lib/Target/RISCV/RISCVGenGlobalISel.inc:1512:44: note: cannot allocate array; evaluated array bound 2431270 exceeds the limit (1048576); use '-fconstexpr-steps' to increase this limit
The context string can be added to indicate the source of the duplicate definition. E.g. if the context is set to "foo2.o", then: "Duplicate definition of symbol 'foo'" becomes "In foo2.o, duplicate definition of symbol 'foo'". The JITDylib::defineImpl method is updated to use the name of the MaterializationUnit being added as the context string for duplicate definition errors. The JITDylib::defineMaterializing method is updated to use "defineMaterializing operation" as the context string.
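The message format described above can be sketched as a small standalone function. `duplicateDefinitionMessage` is a hypothetical helper written for illustration only; it is not the ORC error class and makes no claim about how the patch builds the string internally.

```cpp
#include <string>

// Hypothetical helper (not part of the ORC API): builds the error text
// described in the commit message. An empty context keeps the original,
// context-free wording.
std::string duplicateDefinitionMessage(const std::string &SymbolName,
                                       const std::string &Context = "") {
  if (Context.empty())
    return "Duplicate definition of symbol '" + SymbolName + "'";
  return "In " + Context + ", duplicate definition of symbol '" + SymbolName +
         "'";
}
```

Calling it with the symbol `foo` and the context `foo2.o` yields the second message quoted in the commit text.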
…lvm#138419) The shuffle needn't be twice the original number of vector elements, so the intermediate type used between the shuffle and the intrinsic should use the ShuffleDstTy number of elements. I found this when looking at shuffle costs and do not have a test where it alters the output, but have added some cases where the shuffle output is not twice the size of the input.
FEAT_FP8DOT4 and FEAT_FP8FMA are supported by FUJITSU-MONAKA. These were previously enabled due to dependencies, but now require explicit activation due to modifications in the dependencies.
This is needed for forced unwind, for some testcases in libunwind/libcxxabi. This adds an aarch64 case for extracting the LanguageHandler and HandlerData fields from unwind info, in UnwindCursor::getInfoFromSEH, corresponding to the existing case for x86_64. This uses the struct IMAGE_ARM64_RUNTIME_FUNCTION_ENTRY_XDATA; this only became available in WinSDK 10.0.19041.0 and mingw-w64 v11.0 (or a mingw-w64 git snapshot after April 2023). (This is only a build-time requirement though; the format for the unwind data has been fixed since the start of Windows 10 on ARM64, so this doesn't impose any runtime requirement.)
The SCEV multiply by 1 doesn't make sense, because SCEV would fold it: therefore, the OrigPtr == Ptr branch effectively rejects a multiply. However, in this branch, we have a pointer SCEV that cannot be a multiply, and hence the code is dead. Strip it.
Now we define FMAXNUM and FMINNUM as IEEE754-2008 with +0.0>-0.0. LoongArch's fmax/fmin follow these rules fully. FMAXNUM_IEEE and FMINNUM_IEEE will be removed in the future once FMAXNUM/FMINNUM are fixed for all targets and FMAXNUM_IEEE/FMINNUM_IEEE are no longer used by the middle end.
…8875) This is a fix for the issue llvm#137126 that turned out to be a driver issue. FrontendActions has a loop to process multiple input files and `flang -fc1` accepts multiple files, but the semantic, lowering, and LLVM codegen actions were not re-entrant, so crashes or unexpected behavior occurred when processing multiple files with `-fc1`. This patch makes the actions re-entrant by cleaning up the contexts/modules if needed on entry.
It failed on armv7 with "Architecture not supported", which is due to StubTests not being supported on ARM: /builds/fossdd/aports/main/llvm20/src/llvm-project-20.1.0.src/llvm/unittests/ExecutionEngine/Orc/ReOptimizeLayerTest.cpp:140: Failure Value of: llvm::detail::TakeError(RM.takeError()) Expected: succeeded Actual: failed (Architecture not supported) (of type llvm::detail::ErrorHolder)
Resolves llvm#137162 For cases when there isn't any `XOR` in the transformation, replace with a zero register.
When diagnosing scheduling issues it can be useful to know which heuristics are driving the scheduler. This adds pre-RA and post-RA statistics for all heuristics.
All immediates are deferred now.
Update recipe construction to use VPBBs to look up masks, in preparation for llvm#128420.
This consolidates node definitions into one place and enables automatic node verification. Part of llvm#119709.
Prefer DenseMap::lookup over DenseMap::find.
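As a rough illustration of this kind of cleanup, the sketch below contrasts the two styles. It uses `std::unordered_map` so that it compiles without LLVM headers; `lookupOrDefault` is a stand-in written for this example that mimics `DenseMap::lookup` (return the mapped value, or a default-constructed one if the key is absent). The map contents and function names are assumptions, not code from the patch.

```cpp
#include <string>
#include <unordered_map>

// Stand-in for llvm::DenseMap::lookup on a standard container: returns the
// mapped value, or a value-initialized one when the key is missing.
template <typename Map>
typename Map::mapped_type lookupOrDefault(const Map &M,
                                          const typename Map::key_type &Key) {
  auto It = M.find(Key);
  return It == M.end() ? typename Map::mapped_type() : It->second;
}

// Before: explicit find() followed by a comparison against end().
int widthWithFind(const std::unordered_map<std::string, int> &Widths,
                  const std::string &Name) {
  auto It = Widths.find(Name);
  if (It == Widths.end())
    return 0;
  return It->second;
}

// After: a single lookup call with the same result.
int widthWithLookup(const std::unordered_map<std::string, int> &Widths,
                    const std::string &Name) {
  return lookupOrDefault(Widths, Name);
}
```

The two functions agree only because a missing key maps to the default value `0`; where "absent" and "present with a default value" must be distinguished, `find` remains the right tool.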
Directly compute costs for binary ops and GEPs in VPReplicateRecipe::computeCost. This simply ports the legacy cost computation for uniform/replicating binary ops to the VPlan cost model.
This patch uses consume_back while changing the type of TrailingDot to bool, indicating whether we have consumed "." or not.
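A minimal sketch of the pattern, assuming C++17 and `std::string_view` in place of `llvm::StringRef`: `consumeBack` below is a stand-in that mirrors the behavior of `StringRef::consume_back` (drop the suffix if present and report whether it was dropped). The `Stripped` struct and `stripTrailingDot` are hypothetical names for illustration; only `TrailingDot` comes from the commit message.

```cpp
#include <string_view>

// Stand-in mirroring llvm::StringRef::consume_back: if S ends with Suffix,
// remove it from S and return true; otherwise leave S unchanged.
bool consumeBack(std::string_view &S, std::string_view Suffix) {
  if (S.size() < Suffix.size() ||
      S.substr(S.size() - Suffix.size()) != Suffix)
    return false;
  S.remove_suffix(Suffix.size());
  return true;
}

// Hypothetical result type: the name without the dot, plus a flag saying
// whether a "." was consumed.
struct Stripped {
  std::string_view Name;
  bool TrailingDot;
};

Stripped stripTrailingDot(std::string_view Name) {
  // The bool records whether "." was consumed, so no separate string needs
  // to be kept around for the trailing dot.
  bool TrailingDot = consumeBack(Name, ".");
  return {Name, TrailingDot};
}
```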
Extract values state and operands analysis/building into a separate class. This class localizes instruction state and operand building for future support of copyable elements vectorization. Recommit after revert 10f5120 Recommit after revert 6a2a8eb Reviewers: HanKuanChen, RKSimon Reviewed By: HanKuanChen Pull Request: llvm#138724
…m#139455) StringRef::substr is shorter here because we can rely on its default second parameter.
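The idea can be sketched with `std::string_view`, whose `substr` also defaults its second parameter to "the rest of the string". The longer form shown for comparison is an assumption; the commit message does not say what the replaced code looked like, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Longer form (assumed for comparison): spells out the remaining length.
std::string_view tailExplicit(std::string_view S, std::size_t Start) {
  return S.substr(Start, S.size() - Start);
}

// Shorter form: relies on the default second parameter (npos), which means
// "everything from Start to the end".
std::string_view tailDefault(std::string_view S, std::size_t Start) {
  return S.substr(Start);
}
```

One difference to keep in mind: `std::string_view::substr` throws `std::out_of_range` when `Start` exceeds the size, whereas `llvm::StringRef::substr` clamps and returns an empty string, so this sketch is only equivalent for in-range `Start` values.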
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username. If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
No description provided.