Fix for cmplrllvm 71955 #20960
Closed
Opcodes 0xA0-0xA3 can access a 64-bit absolute address. Before this change, LLVM would require this to be written as `movabs`, and writing it as `mov` would silently truncate the address. After this change, if `mov moffset` is used with a constant expression which evaluates to a value that doesn't fit in 32 bits, the instruction will automatically be changed to `movabs`. This should match the behavior of more recent versions of gas. The one existing test which expected a silent truncation + sign-extend is removed. This change does not affect `mov` opcodes that reference an external symbol. Using `mov` will continue to generate a 32-bit address and reloc_signed_4byte, and `movabs` is required to specify a 64-bit address. Fixes #73481
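The selection rule in the message above can be sketched in a few lines. This is an illustration only, not the LLVM implementation: `fitsInSigned32` and `chooseMnemonic` are hypothetical names, and the sketch assumes "fits in 32 bits" means a sign-extended 32-bit field, which the message does not state explicitly.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical helper: true if the value can be stored in a 32-bit field
// and recovered by sign extension (an assumption about the check).
bool fitsInSigned32(std::int64_t value) {
  return value >= std::numeric_limits<std::int32_t>::min() &&
         value <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: pick the mnemonic for a `mov moffset` whose address
// is a constant expression. External symbols are out of scope here.
std::string chooseMnemonic(std::int64_t address) {
  return fitsInSigned32(address) ? "mov" : "movabs";
}
```

Under these assumptions, a constant such as `0x100000000` would be promoted to `movabs`, while small constants keep the 32-bit `mov` form.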
…160900) Refactor the discriminator emission in `AArch64AsmPrinter`:
* factor out ad-hoc "X16 or X17 or not isX16X17Safer" checks into a dedicated `isPtrauthRegSafe` function
* assert that `Disc` is uint16 once in `emitPtrauthDiscriminator` instead of in all its callers
* update the comments and assertions for readability
* rename `MayUseAddrAsScratch` argument to `MayClobberAddrDisc`, as it better reflects the intention
The unordered containers re-use the formatters for `std::list` which were fixed for PDB with #166953. This should be the last fix for PDB in MSVC STL tests. Unfortunately, the type names here are very long, because the types of keys/values are repeated in the template (for hash/eq/allocator).
…66802) Fix missing propagation of fast-math flags in algebraic simplification patterns of the MLIR math dialect.
Previously, while gas syntax had checks for offsets which do not fit within 32 bits, Intel syntax did not and would silently truncate without warning. Apply the same checks in both modes.
…(#160901) Separate the low-level emission of the appropriate variants of `AUT*`, `PAC*` and `B(L)RA*` instructions from the high-level logic of pseudo instruction expansion. Introduce `getBranchOpcodeForKey` helper function by analogy to `get(AUT|PAC)OpcodeForKey`.
This patch adds support for consuming a `char` from the front of a
`StringRef`. Most of the time a user wanting to consume a single
character off the front can just wrap the character in a string literal
(i.e., `consume_front("a")`). But this doesn't work if we don't have a
`char` literal, but instead a variable of type `char`. I.e., `char c =
'a'; str.consume_front(c)`. There's at least one helper in LLDB that
does this. Also there are plenty of examples of `consume_front` being
passed a single character via string literal. This patch adds the `char`
overload. We already have a `starts_with(char)` overload, so there's at
least some related precedent.
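A rough sketch of what a `char` overload does, written against `std::string_view` rather than `llvm::StringRef` so that it is self-contained. The free function `consumeFront` is a hypothetical stand-in and is not the code added by the patch.

```cpp
#include <string_view>

// Hypothetical stand-in for the proposed overload: if `str` starts with
// `prefix`, drop that one character and return true; otherwise leave `str`
// unchanged and return false.
bool consumeFront(std::string_view &str, char prefix) {
  if (str.empty() || str.front() != prefix)
    return false;
  str.remove_prefix(1);
  return true;
}
```

With this shape, a caller holding a `char` variable can pass it directly instead of building a one-character string first.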
…2901) Add the ability to defer parsing and re-enqueue oneself. This enables changing CallSiteLoc parsing to not recurse as deeply: previously this could fail (especially on large inputs in debug mode, where the recursion could overflow). Add a default depth cutoff; this could become a parameter later if needed. Roll-forward of #170993 with a relatively direct change: when processing without resolving, or when parsing a property, it resolves eagerly.
`iterVarKind` is effectively const but not marked as such. Co-authored-by: Jeremy Kun <[email protected]>
…168) This PR implements the following papers: [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3), [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1), and [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html).

At the start of phase 4, an import or module token is treated as starting a directive and is converted to its respective keyword iff:
- After skipping horizontal whitespace, it is
  - at the start of a logical line, or
  - preceded by an export at the start of the logical line.
- It is followed by an identifier pp token (before macro expansion), or
  - <, ", or : (but not ::) pp tokens for import, or
  - ; for module

Otherwise the token is treated as an identifier.

Additionally:
- The entire import or module directive (including the closing ;) must be on a single logical line and for module must not come from an #include.
- The expansion of macros must not result in an import or module directive introducer that was not there prior to macro expansion.
- A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment).
- Preprocessor conditionals shall not span a module declaration.

After this patch, we handle C++ module-import and module-declaration as real pp-directives in the preprocessor. Additionally, we refactor module name lexing, remove the complex state machine, and read the full module name during module/import directive handling. We could possibly introduce a tok::annot_module_name token in the future to avoid parsing the module name twice, in both the preprocessor and the parser, but that makes error recovery much more difficult (e.g. import a; import b; on the same line).

This patch also introduces two new keywords, `__preprocessed_module` and `__preprocessed_import`. These two keywords are generated in `-E` mode. This is useful to avoid confusion with the `module` and `import` keywords in preprocessed output:
```cpp
export module m;
struct import {};
#define EMPTY
EMPTY import foo;
```
Fixes llvm/llvm-project#54047
---------
Signed-off-by: yronglin <[email protected]> Signed-off-by: Wang, Yihan <[email protected]>
Static analysis flagged `BuildLockset` as not following the Rule of Three, so this adds a deleted copy constructor and a deleted copy assignment operator.
Many NEON right-shift intrinsics were not supported by GlobalISel, mainly due to a lack of legalisation logic. This logic has now been implemented. Some intrinsics that involve a narrowing step are lowered to two separate GI nodes, which may then be re-combined later into a single assembly instruction.
After enabling DFLTCC in zlib-ng for s390x this test starts failing, because slightly better compression is produced at level 1. Add 1c as a permissible output.
…EG. (#171619)
This fixes an assertion failure in `SelectionDAG::getNode` on AArch64
during Type Legalization.
The crash is triggered by an `ANY_EXTEND_VECTOR_INREG` operation
involving small vector types (e.g., v16i1).
The crash occurs when the Type Legalizer processes a vector extend
operation where the input vector uses small elements, specifically in
the case of a ShuffleVector that generates a mask vector.
1. **Original Node**: `any_extend_vector_inreg (v16i1) -> v2i16`. (This
is physically valid: 16 bits < 32 bits).
2. **Promotion Issue**: When both the input and result types are
promoted for legality:
* The **Result** (`v2i16`) is promoted to a larger legal type, e.g.,
`v2i32` (**64 bits**).
* The **Input** (`v16i1`) is promoted to `v16i8` (**128 bits**) due to
the necessary scalar promotion of `i1` to `i8`.
3. The legalizer then attempts to create the new node:
`any_extend_vector_inreg (v16i8) -> v2i32`.
4. Since $128 \text{ bits} > 64 \text{ bits}$, the physical constraint
of the `EXTEND_VECTOR_INREG` operation is violated, causing the
assertion to fail.
### Solution
In `DAGTypeLegalizer::PromoteIntRes_EXTEND_VECTOR_INREG`, when the size
of the promoted input vector (`Promoted`) is found to be greater than
the size of the promoted result vector (`NVT`), we explicitly truncate
the promoted input to the size of the result type (`NVT`), ensuring the
final `*_EXTEND_VECTOR_INREG` node satisfies the size constraint before
it is created.
This behavior aligns with the fact that `*_EXTEND_VECTOR_INREG`
typically only requires the low-order lanes of the input vector.
**Test Added**: `llvm/test/CodeGen/AArch64/issue-171032.ll`
Fixes: #171032
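The bit-width arithmetic in the message above can be checked with a small model. This is a sketch under stated assumptions, not the SelectionDAG code: `VecType`, `isValidExtendInReg`, and `truncateInputToResult` are hypothetical names, and the truncated input is assumed to keep its element width and drop high-order lanes.

```cpp
#include <cstdint>

// Hypothetical stand-in for a fixed-width vector type such as v16i1.
struct VecType {
  std::uint32_t numElements;
  std::uint32_t elementBits;
  std::uint32_t sizeInBits() const { return numElements * elementBits; }
};

// The constraint as described in the message: the input vector must not be
// wider than the result vector.
bool isValidExtendInReg(VecType input, VecType result) {
  return input.sizeInBits() <= result.sizeInBits();
}

// Sketch of the fix: if the promoted input is too wide, keep only as many
// low-order lanes as fit in the result's bit width.
VecType truncateInputToResult(VecType input, VecType result) {
  if (input.sizeInBits() <= result.sizeInBits())
    return input;
  return VecType{result.sizeInBits() / input.elementBits, input.elementBits};
}
```

In this model the original `v16i1 -> v2i16` node passes the check (16 bits against 32), the promoted `v16i8 -> v2i32` node fails it (128 bits against 64), and truncating the input to eight `i8` lanes restores it.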
Some downstreams depend on our libraries having the uniform prefix.
…ct conversion (#172883) `ConversionPatternRewriter::replaceUsesWithIf` does not support the `allUsesReplaced` flag and asserts that it's not set. However, the `ValueRange` overload of `RewriterBase::replaceUsesWithIf` always passes this flag when calling the virtual overload of `replaceUsesWithIf`. This means calls made to `RewriterBase::replaceUsesWithIf` from a `ConversionPattern` crash, whether the `allUsesReplaced` flag is set or not. This change tweaks `RewriterBase::replaceUsesWithIf` to only pass that flag if the caller has set it.
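The shape of the fix can be illustrated with a miniature class hierarchy. This is not the MLIR API: `Rewriter`, `ConversionRewriter`, `replaceOne`, and `replaceMany` are hypothetical names chosen only to show a range overload that forwards an optional out-flag when, and only when, its own caller supplied one.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of a rewriter base class.
struct Rewriter {
  virtual ~Rewriter() = default;

  // Virtual single-value overload; `allReplaced` is an optional out-flag.
  virtual void replaceOne(int /*value*/, bool *allReplaced) {
    if (allReplaced)
      *allReplaced = true;
  }

  // Range overload: forward a flag only when the caller asked for one.
  void replaceMany(const std::vector<int> &values,
                   bool *allReplaced = nullptr) {
    bool all = true;
    for (int v : values) {
      bool one = false;
      replaceOne(v, allReplaced ? &one : nullptr);
      all = all && one;
    }
    if (allReplaced)
      *allReplaced = all;
  }
};

// Stand-in for a rewriter that does not support the flag and asserts on it.
struct ConversionRewriter : Rewriter {
  int calls = 0;
  void replaceOne(int /*value*/, bool *allReplaced) override {
    assert(!allReplaced && "flag not supported by this rewriter");
    ++calls;
  }
};
```

If `replaceMany` always passed `&one`, every call through `ConversionRewriter` would trip the assertion, which mirrors the crash described above.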
…172975) Fixes #172964
When matching callsite profile info, we synthesize VP metadata for matched indirect calls from the CalleeGuids recorded with the CallSite profile info. However, those are the callee guids of the leaf-most frame in the callsite. In cases where we match to a portion of the frames, not including the leaf, the callee guid should instead be synthesized from the next leaf-most frame in the list. This addresses the case where indirect call promotion was applied in the profiled binary during SamplePGO matching in a ThinLTO backend, where we didn't have VP metadata.
This is in preparation for adding a DAG combiner for turning INSERT_VECTOR_ELT(undef, ...) -> VECTOR_SPLAT
… constants on RV32. (#172802) This causes a regression due to incomplete handling of RV32 for the P extension in RISCVMatInt. I'll fix that in a follow up.
Adds initial support for the ext-shape extension, including the operations:
- ADD_SHAPE
- SUB_SHAPE
- MUL_SHAPE
- DIV_FLOOR_SHAPE
- DIV_CEIL_SHAPE

to align with the spec change: arm/tosa-specification@efc88a1. This includes the operator definition, same rank checks and level checks during validation. It does not currently include support for folding or shape inference. This will be added in a later commit. Based on work originally implemented by @Tai78641.
…here applicable (#172947) Fixing the instances found by `misc-use-internal-linkage` in #172797.
…MULLQ targets (or when VPMULLQ is unavailable) (#171760) This pull request introduces a new tuning flag "TuningSlowPMULLQ" and uses it to optimize 64-bit vector multiplication on Intel targets where "VPMULLQ" is slow. On recent Intel microarchitectures, the "VPMULLQ" instruction has a high latency of 15 cycles. In contrast, the "VPMADD52LUQ" instruction (available via AVX512IFMA) performs a similar operation with a latency of only 4 cycles.

Reference data from uops.info (Ice Lake):
- "VPMULLQ": Latency 15, TP 1.5
- "VPMADD52LUQ": Latency 4, TP 0.5

Fixes #158854
This is a small cleanup based on post-commit review of #172921. See https://github.com/llvm/llvm-project/pull/172921/changes#r2635381097
…2080) Add definitions of the remaining OpenMP 6.0 clauses to the OMP.td file. Implement the bare-bones skeleton in flang to support the new definitions. Adding a clause to OMP.td automatically generates some flang code which requires manual completion to even compile. This PR adds the absolute minimum for all 6.0 clauses that were still missing. This minimum does not implement any OpenMP functionality, it just allows flang to compile and run. As a benefit, any future clause-related clang work will not require any changes to flang.
…1718) Lowering of i1 loads and stores no longer works for SPIR-V kernels. Rewrite the test cases so that they do not use i1 loads or stores.
…72661) This reduces unnecessary string allocations and copies when handling the variables request.
``TranslationUnit.reparse`` will now throw an exception when an error occurs. Previously, errors were silently ignored.
Previously, git-llvm-push would convert all remote URLs to HTTPS, including SSH remotes, for reasons not motivated in the original PR. This would cause issues in some setups where the HTTPS remote would be read-only. This patch makes it so that git-llvm-push does not convert SSH remotes to HTTPS remotes, preserving what the user originally intended. Fixes #172828.
Add handling for more complicated cases than simple arrays.
…#172535) When we duplicate contexts (due to clones e.g. matching different inlined instances), we were propagating the allocation type but not the ContextSizeInfo, which is used for -memprof-report-hinted-sizes. This meant that we never reported hinting for any of the duplicated contexts, which can result in conservative results as in some cases only the duplicated contexts are able to be cloned and hinted. Note that this change could result in overly optimistic reporting in some cases.
Add test case that triggered revert f42af14.
…st." (#173170) This reverts commit f42af14 and re-applies llvm/llvm-project#172915. It has an additional check for whether the condition is a live-in, which makes sure we preserve the original behavior in that case. This should fix the crash that caused the revert. Original commit message: Look up the predicate from the VPValue condition instead of the underlying IR. This improves cost modeling in some cases, e.g. when we can fold operations like negations in compares. On AArch64, this leads to additional vectorization in a few cases in practice. Example lowering for the modified test case: https://llvm.godbolt.org/z/6nc6jo5eG
Top-level (binary) functions don't have a unique GUID mapping. There are several causes, notably coroutine fragments sharing the same parent source function GUID. Replace the top-level inline tree node GUID lookup with a probe lookup coupled with a walk up the inline tree. Test Plan: added test-coro-probes.yaml
…pec. (#171549) In llvm/llvm-project#170523 it was pointed out that the spec explicitly states that launch/attach should not respond until configurationDone is handled. This means we do need to support async request handlers. To better align with the spec, I've added a new `lldb_dap::AsyncRequestHandler`. This is an additional handler type that allows us to respond at a later point. Additionally, I refactored `launch` and `attach` to only respond once the `configurationDone` is complete, specifically during the `PostRun` operation of the `configurationDone` handler. I merged some of the common behavior between `RequestHandler` and `AsyncRequestHandler` into their common `BaseRequestHandler`. The flow should now be:
```
<-> initialize request / response
--> launch/attach request
<-- event initialized
... optionally ...
<-> setBreakpoints request / response
<-> setFunctionBreakpoints request / response
<-> setExceptionBreakpoints request / response
<-> setInstructionBreakpoints request / response
... finally ...
<-> configurationDone request / response
<-- launch/attach response
```
---------
Co-authored-by: Jonas Devlieghere <[email protected]>
Add OpenACCUtilsLoop.h/.cpp with utilities for converting acc.loop operations to SCF dialect operations:
- convertACCLoopToSCFFor: Convert structured acc.loop to scf.for with loop collapsing support
- convertACCLoopToSCFParallel: Convert acc.loop to scf.parallel
- convertUnstructuredACCLoopToSCFExecuteRegion: Convert unstructured acc.loop (multi-block) to scf.execute_region

Key features:
- Automatic type conversion between integer types and index
- Inclusive-to-exclusive upper bound conversion
- Trip count calculation with clamping for negative counts
- Constant folding via createOrFold for cleaner IR
- Assertions to prevent misuse (e.g., builder inside loop region)
- Error emission for unsupported cases (loops with results)

Comprehensive unit tests covering these APIs are also added.
---------
Co-authored-by: Scott Manley <[email protected]>
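Two of the listed features, the inclusive-to-exclusive upper bound conversion and the clamped trip count, come down to simple integer arithmetic. The sketch below works on plain integers and is not the MLIR utility code; `toExclusiveUpperBound` and `tripCount` are hypothetical names, and the step is assumed to be positive with no overflow.

```cpp
#include <cstdint>

// Convert an inclusive upper bound (i <= ub) to the exclusive form (i < ub).
std::int64_t toExclusiveUpperBound(std::int64_t inclusiveUb) {
  return inclusiveUb + 1;
}

// Number of iterations of `for (i = lb; i <= inclusiveUb; i += step)`,
// clamped to zero when the range is empty. Assumes step > 0.
std::int64_t tripCount(std::int64_t lb, std::int64_t inclusiveUb,
                       std::int64_t step) {
  std::int64_t span = toExclusiveUpperBound(inclusiveUb) - lb;
  if (span <= 0)
    return 0; // clamp: an empty range runs zero times
  return (span + step - 1) / step; // ceiling division
}
```

For example, `lb = 0`, `inclusiveUb = 9`, `step = 3` visits 0, 3, 6, 9, and a range whose lower bound exceeds its upper bound yields zero rather than a negative count.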
…after-move` (#172784) Closes #170635
… storage (#173258) When setName() is called with a StringRef derived from the current name, it results in a use-after-free error reported by AddressSanitizer. A newly added test ValueTest.setNameShrink demonstrates the issue (configure LLVM with -DLLVM_USE_SANITIZER=Address). Fix by creating the new ValueName before removing/destroying the old one.
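The ordering that the fix relies on, building the new name before destroying the old one, can be shown with a small stand-in class. `NamedValue` is hypothetical and uses `std::string` storage rather than LLVM's `ValueName`; it only demonstrates why the order matters when the argument aliases the current name.

```cpp
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical miniature of a value that owns its name in heap storage.
class NamedValue {
  std::unique_ptr<std::string> name_ = std::make_unique<std::string>();

public:
  std::string_view name() const { return *name_; }

  void setName(std::string_view newName) {
    // Build the replacement first: newName may point into *name_.
    auto replacement = std::make_unique<std::string>(newName);
    // Only now release the old storage, after its bytes have been copied.
    name_ = std::move(replacement);
  }
};
```

Destroying the old string first and then copying from `newName` would read freed memory whenever `newName` is a slice of the current name, which is the pattern the new test exercises.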
ffe973a changed some of the internal APIs to return a tuple instead of just the report. This callsite was never updated which resulted in the tuple being printed to the summary view when we only wanted the report.
Fixes: #172965
`MipsAsmParser::expandDivRem` is in a bad state:
1. The `div` may not execute at all in most cases:
```
.set reorder
bnez $3, $tmp0
div $zero, $2, $3
break 7
$tmp0:
```
`.set reorder` may insert a nop after `bnez`, which will skip `div` if $3 is not zero.
2. `break 6` is wrong here.
First, this moves the removal of operands from use lists from `User::operator delete` to `User::~User`. This is straightforward, and nothing blocks that. Second, this makes LLVM more compatible with bug finding tools like MSan, GCC `-flifetime-dse`, and forthcoming enhancements to Clang itself through `dead_on_return` annotations. However, the complication is that `User::operator delete` needs to recover the start of the allocation, and it needs to recover that information somehow without examining the fields of the `User` object. The natural way to handle this is for the destructor to return an adjusted `this` pointer, and that's in fact how deleting destructors are often implemented, but it requires making assumptions about the C++ ABI. Another solution to this problem in C++20 would be to use [destroying delete](https://en.cppreference.com/w/cpp/memory/new/destroying_delete_t), which should be on our roadmap, since it would allow us to eliminate `deleteValue`, and move that polymorphic switch into the destroying delete operator, instead of having to use this funky method. Since we don't have C++20 yet, it seems practical to store the information into the operand memory, to the left of `this`, and to reload the start of the allocation from `((void**)this)[-1]` after the destructor runs. The downside is that zero-operand Users such as `ret void`, `unreachable`, `fence`, and `ConstantInt` must allocate one more pointer worth of memory to the left of the main allocation, just to thread this information through to `User::operator delete`. This change avoids increasing the effective size of all `ConstantData` instances by specializing `ConstantData` new and delete, and adding a type check to `~User`. When we have C++20, we should definitely replace all of this with the destroying delete solution, which is much clearer, but for now, this is a low-cost fix to long-standing UB and it unblocks other work, so it deserves to land. Fixes #24952
…159480) When building rustc std for arm64e, core fails to compile successfully with the error:
```
Constant ValueID not recognized.
UNREACHABLE executed at rust/src/llvm-project/llvm/lib/Transforms/Utils/FunctionComparator.cpp:523!
```
This is a result of function merging, so I modified FunctionComparator.cpp, as the ConstantPtrAuth value would go unchecked in the switch statement. The test case is a reduction from the failure in core and fails on main with:
```
********************
FAIL: LLVM :: Transforms/MergeFunc/ptrauth-const-compare.ll (59809 of 59995)
******************** TEST 'LLVM :: Transforms/MergeFunc/ptrauth-const-compare.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 3
/Users/oskarwirga/llvm-project/build/bin/opt -S -passes=mergefunc < /Users/oskarwirga/llvm-project/llvm/test/Transforms/MergeFunc/ptrauth-const-compare.ll | /Users/oskarwirga/llvm-project/build/bin/FileCheck /Users/oskarwirga/llvm-project/llvm/test/Transforms/MergeFunc/ptrauth-const-compare.ll
# executed command: /Users/oskarwirga/llvm-project/build/bin/opt -S -passes=mergefunc
# .---command stderr------------
# | Constant ValueID not recognized.
# | UNREACHABLE executed at /Users/oskarwirga/llvm-project/llvm/lib/Transforms/Utils/FunctionComparator.cpp:523!
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0. Program arguments: /Users/oskarwirga/llvm-project/build/bin/opt -S -passes=mergefunc
# | 1. Running pass "mergefunc" on module "<stdin>"
# | #0 0x0000000103335770 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/Users/oskarwirga/llvm-project/build/bin/opt+0x102651770)
# | #1 0x00000001033336bc llvm::sys::RunSignalHandlers() (/Users/oskarwirga/llvm-project/build/bin/opt+0x10264f6bc)
# | #2 0x0000000103336218 SignalHandler(int, __siginfo*, void*) (/Users/oskarwirga/llvm-project/build/bin/opt+0x102652218)
# | #3 0x000000018e6c16a4 (/usr/lib/system/libsystem_platform.dylib+0x1804ad6a4)
# | #4 0x000000018e68788c (/usr/lib/system/libsystem_pthread.dylib+0x18047388c)
# | #5 0x000000018e590a3c (/usr/lib/system/libsystem_c.dylib+0x18037ca3c)
# | #6 0x00000001032a84bc llvm::install_out_of_memory_new_handler() (/Users/oskarwirga/llvm-project/build/bin/opt+0x1025c44bc)
# | #7 0x00000001033b37c0 llvm::FunctionComparator::cmpMDNode(llvm::MDNode const*, llvm::MDNode const*) const (/Users/oskarwirga/llvm-project/build/bin/opt+0x1026cf7c0)
# | #8 0x00000001033b4d90 llvm::FunctionComparator::cmpBasicBlocks(llvm::BasicBlock const*, llvm::BasicBlock const*) const (/Users/oskarwirga/llvm-project/build/bin/opt+0x1026d0d90)
# | #9 0x00000001033b5234 llvm::FunctionComparator::compare() (/Users/oskarwirga/llvm-project/build/bin/opt+0x1026d1234)
# | #10 0x0000000102d6d868 (anonymous namespace)::MergeFunctions::insert(llvm::Function*) (/Users/oskarwirga/llvm-project/build/bin/opt+0x102089868)
# | #11 0x0000000102d6bc0c llvm::MergeFunctionsPass::runOnModule(llvm::Module&) (/Users/oskarwirga/llvm-project/build/bin/opt+0x102087c0c)
# | #12 0x0000000102d6b430 llvm::MergeFunctionsPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/Users/oskarwirga/llvm-project/build/bin/opt+0x102087430)
# | #13 0x0000000102b90558 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/Users/oskarwirga/llvm-project/build/bin/opt+0x101eac558)
# | #14 0x0000000103734bc4 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::__1::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool, bool) (/Users/oskarwirga/llvm-project/build/bin/opt+0x102a50bc4)
# | #15 0x000000010373cc28 optMain (/Users/oskarwirga/llvm-project/build/bin/opt+0x102a58c28)
# | #16 0x000000018e2e6b98
# `-----------------------------
# error: command failed with exit status: -6
# executed command: /Users/oskarwirga/llvm-project/build/bin/FileCheck /Users/oskarwirga/llvm-project/llvm/test/Transforms/MergeFunc/ptrauth-const-compare.ll
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line: /Users/oskarwirga/llvm-project/build/bin/FileCheck /Users/oskarwirga/llvm-project/llvm/test/Transforms/MergeFunc/ptrauth-const-compare.ll
# `-----------------------------
# error: command failed with exit status: 2
```
Now that #24952 has been fixed by #170575, we no longer need to specify -fno-lifetime-dse when building with gcc.
CONFLICT (content): Merge conflict in llvm/lib/CMakeLists.txt CONFLICT (content): Merge conflict in llvm/tools/llvm-lto2/CMakeLists.txt
CONFLICT (content): Merge conflict in llvm/cmake/modules/HandleLLVMOptions.cmake
zizmor found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.