Merging upstream 91cdd350 [clang] Improve nested name specifier AST representation #249
Open
YukinoHayakawa wants to merge 4828 commits into bloomberg:p2996
... so we don't have to create Pointer instances when we don't need them.
llvm#152457) Judging from the reaction to llvm#152302, we are not ready to make this a fatal error. Remove the specific version number, and update the libc message to match the others' wording.
This fixes llvm#152097 This commit fixes two instances of a (somewhat) recently enabled assertion. One with a test, the other I can't reproduce (might be dead code) but certainly looks like an instance of the same problem. The PR that introduced the regression: llvm#117558 With this patch, the AVR backend is usable again for TinyGo.
This patch extends llvm#149095 for EOR and ORR. It uses a simple partition scheme to try to find two suitable disjoint bitmasks that can be used with EOR/ORR to reconstruct the original mask. Fixes: llvm#148987.
To avoid noise in PRs such as in llvm#146383.
…lvm#151940) We need to reject plans that contain recipes with invalid costs. LICM can move recipes with invalid costs out of the loop region, which then get missed by the main cost computation. Extend the logic to check recipes for invalid cost currently only covering the middle block to include all skeleton blocks. Fixes llvm#144358 Fixes llvm#151664 PR: llvm#151940
…xpr (llvm#152363)
Closes llvm#152324. Part of llvm#30794.
This PR adds `constexpr` support for the following AVX512 integer reduction intrinsics:
- `_mm512_reduce_add_epi32`
- `_mm512_reduce_add_epi64`
- `_mm512_reduce_mul_epi32`
- `_mm512_reduce_mul_epi64`
- `_mm512_reduce_and_epi32`
- `_mm512_reduce_and_epi64`
- `_mm512_reduce_or_epi32`
- `_mm512_reduce_or_epi64`
- `_mm512_reduce_max_epi32`
- `_mm512_reduce_max_epi64`
- `_mm512_reduce_min_epi32`
- `_mm512_reduce_min_epi64`
- `_mm512_reduce_max_epu32`
- `_mm512_reduce_max_epu64`
- `_mm512_reduce_min_epu32`
- `_mm512_reduce_min_epu64`
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Auto-generate checks for llvm#151925. Also update some naming to make more consistent with other tests.
…lvm#151995) Make it easier for us to add ABI versions. Close llvm#144332
Added by llvm#150846. Checks the size of a structure, which is only correct for 64-bit systems.
…ntrinsics to be used in constexpr (llvm#152435) Fixed llvm#152313 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Previously, specializing the GraphWriter class required a full class specialization. This change introduces CRTP for GraphWriter, allowing for partial specialization.
This change is in support of printing the module dependency graph as part of the RFC for driver-managed module builds, for which we want to print the graph nodes in a more human-readable format by:
- Printing descriptive IDs instead of pointer addresses as node labels.
- Printing the full node labels separately from the node relations to avoid clutter.
With this approach, only GraphWriter::writeNodes() needs to be specialized (aside from DOTGraphTraits).
RFC for driver-managed module builds: https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system
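To make the CRTP idea in the commit message above concrete, here is a minimal sketch. It is not LLVM's `GraphWriter`: `ToyGraph`, `ToyGraphWriterBase`, `DefaultToyWriter`, `ReadableToyWriter`, and `renderToyGraph` are all hypothetical names invented for illustration. It only shows how a derived writer can replace `writeNodes()` while inheriting the rest of the driver logic.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy graph; this is NOT LLVM's GraphTraits/GraphWriter API.
struct ToyGraph {
  std::vector<std::string> Names;                          // one entry per node
  std::vector<std::pair<std::size_t, std::size_t>> Edges;  // (from, to) indices
};

// CRTP base: the shared driver calls into Derived, so a subclass only has to
// redefine the step it wants to change.
template <typename Derived> class ToyGraphWriterBase {
public:
  ToyGraphWriterBase(std::ostream &Out, const ToyGraph &Graph)
      : OS(Out), G(Graph) {}

  void writeGraph() {
    OS << "digraph {\n";
    static_cast<Derived &>(*this).writeNodes(); // resolved at compile time
    OS << "}\n";
  }

  // Default behaviour: opaque numeric labels, each node followed by its edges.
  void writeNodes() {
    for (std::size_t I = 0; I < G.Names.size(); ++I) {
      OS << "  N" << I << " [label=\"node " << I << "\"];\n";
      for (const auto &E : G.Edges)
        if (E.first == I)
          OS << "  N" << E.first << " -> N" << E.second << ";\n";
    }
  }

protected:
  std::ostream &OS;
  const ToyGraph &G;
};

// Writer that keeps every default.
class DefaultToyWriter : public ToyGraphWriterBase<DefaultToyWriter> {
public:
  using ToyGraphWriterBase::ToyGraphWriterBase;
};

// Writer that redefines only writeNodes(): descriptive labels first, then all
// relations in a separate pass.
class ReadableToyWriter : public ToyGraphWriterBase<ReadableToyWriter> {
public:
  using ToyGraphWriterBase::ToyGraphWriterBase;

  void writeNodes() {
    for (std::size_t I = 0; I < G.Names.size(); ++I)
      OS << "  N" << I << " [label=\"" << G.Names[I] << "\"];\n";
    for (const auto &E : G.Edges)
      OS << "  N" << E.first << " -> N" << E.second << ";\n";
  }
};

// Illustration helper: render a graph to a string with the chosen writer.
template <typename Writer> std::string renderToyGraph(const ToyGraph &G) {
  std::ostringstream OS;
  Writer W(OS, G);
  W.writeGraph();
  return OS.str();
}
```

Calling `renderToyGraph<ReadableToyWriter>(G)` emits name-based labels, while `renderToyGraph<DefaultToyWriter>(G)` falls back to the base implementation; no virtual dispatch is involved in either case.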
Desc is only used once and we can get that information from the Block as well.
[NVPTX] Add Prefetch tensormap intrinsics This PR adds prefetch intrinsics with the relevant tensormap_space. * Lit tests are added as part of prefetch.ll * The generated PTX is verified with a 12.3 ptxas executable. * Added docs for these intrinsics in NVPTXUsage.rst. For more information, refer to the PTX ISA for prefetch intrinsic : [Prefetch Tensormap](https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu) @durga4github @schwarzschild-radius
Changes: The original patch, landed as 1336675, was reverted due to a bug in LoopVectorize resulting in a crash. The bug has now been fixed by 95c32bf ([VPlan] Return invalid cost if any skeleton block has invalid costs), and this reland is identical to the original patch.
…ops. (llvm#148424)
Adds `linalg-morph-ops` pass to convert an op from one representation to another:
named-op <--> category_op (elementwise, contraction, ..) <--> generic
e.g.
```mlir
%exp = linalg.exp ins(%A : tensor<16x8xf32>) outs(%B : tensor<16x8xf32>) -> tensor<16x8xf32>
```
After `mlir-opt -linalg-morph-ops=named-to-category ..`
```mlir
%0 = linalg.elementwise kind=#linalg.elementwise_kind<exp> ins(%arg0 : tensor<16x8xf32> ..
```
Note: this is a generalization of
`--linalg-generalize-named-ops`, which is the path `named-op --> generic-op`, and
`--linalg-specialize-generic-ops`, which is the path `named-op <-- generic-op`.
email: quic_mabsar@quicinc.com
…ows (llvm#152318) Currently flang-rt assumes that LLVM was always built with the dynamic MSVC runtime. This may not be the case if the user has specified a different runtime with -DCMAKE_MSVC_RUNTIME_LIBRARY. Since this flag is implied by -DLLVM_ENABLE_RPMALLOC=On, which is used by the Windows release script, this is causing that script to fail. Fixes llvm#151920
Split out from llvm#150248: Use the size of the alloca instead of the size passed to the lifetime intrinsic. As a bonus, this handles dynamic allocas correctly (see the added test) instead of doing a memset with size -1...
…llvm#152478)
Adds missing C++ run lines to test files containing `constexpr` tests. Also adds missing 32/64-bit test coverage to the following tests:
- `clang/test/CodeGen/X86/avx512-reduceIntrin.c`
- `clang/test/CodeGen/X86/avx512-reduceMinMaxIntrin.c`
- `clang/test/CodeGen/X86/avx512vpopcntdq-builtins.c`
- `clang/test/CodeGen/X86/avx512vpopcntdqvl-builtins.c`
Additionally, fixes a `_mm512_popcnt_epi64` `constexpr` test that incorrectly assumed 32-bit integers, leading to incorrect bit counts. This change updates the test result to assume 64-bit integers.
We currently log every single test that we run in premerge. This leads to gigantic logs (200k+ lines on Linux) that can be difficult to parse through. Having an indicator of progress is nice, especially for the LLVM tests, but is not strictly necessary and not often used (I imagine). Having a progress indicator from lit that works in CI cases is on my TODO list. For the rare cases where someone does need to see the list of tests that run, the JUnit XML emitted by lit is available in the artifacts.
…lvm#152007) This allows not having the END CRITICAL directive in certain situations. Update semantic checks and symbol resolution.
…m#152466) `add_conformance_test` checks for libc and prints a warning if it is not found. However, this warning ends up being printed once for each test, spamming the cmake log. Moving it up to the folder cmake allows it to be reported only once.
llvm#152483) This way all the tracking is self-contained in `TrackingOutputBuffer` and we can test the `SuffixRange` properly.
Added initial check for potential fmad conversion in reductions and operands vectorization.
When `LLVM_INSTALL_TOOLCHAIN_ONLY=ON`, the MLIR shared library (`libMLIR*`) is not installed even though it is built with the `INSTALL_WITH_TOOLCHAIN` argument to the `add_mlir_library` cmake function. This patch ensures that `libMLIR*` is installed when `LLVM_INSTALL_TOOLCHAIN_ONLY=ON`. Patch verified [here](llvm#151247 (comment)). fixes llvm#151247
This patch teaches moveFromOldBuckets to take an iterator_range so that it can use a range-based for loop.
getActiveBits() already returns unsigned.
pattern is already of const char *.
…ctions (llvm#152784) Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
…lvm#152038) These are the strided versions of `riscv.segN.store.mask` intrinsics.
Instead of using the word 'offset' it's probably better to just say 'stride'. NFC.
Removes the `(batch_)matmul_transpose_{a|b}` variants from OpDSL and
replaces them with `matmul affine_maps [...]` whenever appropriate. This is
in line with the
[plan](https://discourse.llvm.org/t/rfc-op-explosion-in-linalg/82863),
and can be done since llvm#104783 merged.
See:
https://discourse.llvm.org/t/deprecate-batch-matmul-transpose-a-b-linalg-operations/87245
Issues investigated:
* pad transform tests that could use `matmul` instead, so change to
that.
* ArmSME test using transpose actually needed it, so changed to `matmul`
+ affine maps.
Arm tests validated by @banach-space (thanks!!).
Unlike ptrtoint, ptrtoaddr does not capture provenance, only the address. Note: As defined by the LangRef, we always treat `ptrtoaddr` as a location-independent address capture since it is a direct inspection of the pointer address. Reviewed By: nikic Pull Request: llvm#152221
…e argument (llvm#152791)
Fixes llvm#152754
- Fixes the ArgOperand index in `DXILOpLowering.cpp` used to obtain the pointer operand of a lifetime intrinsic.
- Updates the tests `llvm/test/CodeGen/DirectX/legalize-lifetimes-valver-1.5.ll`, `llvm/test/CodeGen/DirectX/legalize-lifetimes-valver-1.6.ll`, `llvm/test/CodeGen/DirectX/ShaderFlags/lifetimes-noint64op.ll`, and `llvm/test/tools/dxil-dis/lifetimes.ll` to use the new size-less lifetime intrinsic.
- Removes lifetime intrinsics from the test `llvm/test/CodeGen/DirectX/legalize-memset.ll` to be consistent with the corresponding memcpy test, which does not have lifetime intrinsics. (Removal of lifetime intrinsics from tests like this was suggested here in the past: llvm#139173 (comment))
- Rewrites the lifetime legalization functions in the EmbedDXILPass to re-add the explicit size argument for DXIL.
This patch fixes: lldb/unittests/DAP/ProtocolTypesTest.cpp:112:67: error: missing field 'adapterData' initializer [-Werror,-Wmissing-field-initializers] lldb/unittests/DAP/ProtocolTypesTest.cpp:571:70: error: missing field 'adapterData' initializer [-Werror,-Wmissing-field-initializers]
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
…m#152805) This test runs `mlir-opt %s | mlir-opt %s | FileCheck` to test the round trip behavior, but the second command takes input from the pipe, not the lit test, so it should be `mlir-opt %s | mlir-opt | FileCheck`. For some reason I haven't figured out, this causes ~50% flakiness when testing in certain environments (not reproducible in my shell, but reproduces in an internal buildbot), due to the pipeline raising `SIGPIPE`. Test added in llvm#148424.
…152593) Static analysis complained that `child_range(&Init, &Init+1);` in the children member function was potentially out of bounds. This is a false positive because it is forming an iterator range, but it would be invalid if Init were a nullptr. I add an assertion in the constructor for this and remove the FIXME checks that are related to this. I checked the various usages and we always validate that the argument is not nullptr.
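The pattern discussed in the commit message above can be sketched as follows. This is an illustration only, with hypothetical names (`ToyStmt`, `ToyChildRange`, `ToyInitExpr`, `sumChildIds`) rather than Clang's real classes. The point is that `&Init + 1` is a one-past-the-end pointer for a single member: forming it is legal, it is never dereferenced, and the only thing that can go wrong is the stored child itself being null, which the constructor assertion rules out.

```cpp
#include <cassert>

// Hypothetical stand-in for an AST statement node.
struct ToyStmt {
  int Id;
};

// Hypothetical half-open pointer range, standing in for a child range type.
struct ToyChildRange {
  ToyStmt **First;
  ToyStmt **Last;
  ToyStmt **begin() const { return First; }
  ToyStmt **end() const { return Last; }
};

class ToyInitExpr {
  ToyStmt *Init; // the single child of this node

public:
  explicit ToyInitExpr(ToyStmt *Child) : Init(Child) {
    // Reject a null child up front so that consumers of children() never
    // have to check for it.
    assert(Init && "the single child must not be null");
  }

  // A range over exactly one element: [&Init, &Init + 1).
  ToyChildRange children() { return ToyChildRange{&Init, &Init + 1}; }
};

// Walks the range the way a visitor would; relies on the constructor's
// guarantee that every child is non-null.
inline int sumChildIds(ToyInitExpr &E) {
  int Sum = 0;
  for (ToyStmt *Child : E.children())
    Sum += Child->Id;
  return Sum;
}
```

Placing the check in the constructor means the invariant is verified once, where the node is built, instead of at every traversal site.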
Adds missing 16-bit test cases to the test that src mods are not applied to integers in instructions with canonicalizing patterns.
llvm#152813) We'll remove the size estimator after, this change is to get the `ml-*` build bots green after the aforementioned PR. We never used the size estimator again after the initial DQN-based training. Should we want to again, we now have IR2Vec, which the old estimator was approximating in functionality.
Summary: Small fix that just ignores all the extra lanes if we're running the server from a platform that potentially has more.
Without linker relaxation enabled for a particular relocatable file or section (e.g., using .option norelax), the assembler will not generate R_RISCV_ALIGN relocations for alignment directives. This becomes problematic in a two-stage linking process:
```
ld -r a.o b.o -o ab.o   // b.o is norelax. Its alignment information is lost in ab.o.
ld ab.o -o ab
```
When ab.o is linked into an executable, the preceding relaxed section (a.o's content) might shrink. Since there's no R_RISCV_ALIGN relocation in b.o for the linker to act upon, the `.word 0x3a393837` data in b.o may end up unaligned in the final executable.
To address the issue, this patch inserts NOP bytes and synthesizes an R_RISCV_ALIGN relocation at the beginning of a text section when the alignment >= 4.
For simplicity, when RVC is disabled, we synthesize an ALIGN relocation (addend: 2) for a 4-byte aligned section, allowing the linker to trim the excess 2 bytes.
See also https://sourceware.org/bugzilla/show_bug.cgi?id=33236
Pull Request: llvm#151639
This patch defines a couple of helper functions so that we can convert four loops to range-based for loops.
This is a major change in how we represent nested name qualifications in the AST.

* The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy.
* The ElaboratedType node is removed; all type nodes to which it could previously apply can now store the elaborated keyword and name qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size.

This patch offers a great performance benefit.

It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%.

This has great results on compile-time-tracker as well.

This patch also further enables other optimizations in the future, and will reduce the performance impact of template specialization resugaring when that lands.

It has some other miscellaneous drive-by fixes.
About the review: Yes, the patch is huge, sorry about that. Part of the reason is that I started with the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact.

There are also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`.

The rest and bulk of the changes are mostly consequences of the changes in API.

PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just to make rebasing easier. I plan to rename it back after this lands.

Fixes llvm#136624
Fixes llvm#43179
Fixes llvm#68670
Fixes llvm#92757
Notable upstream changes:
- Rearrangement of NamespaceDecl, NamespaceAlias removed.
- Improved consteval propagation.

# Conflicts:
# clang/include/clang/Basic/DeclNodes.td
# clang/include/clang/Basic/LangOptions.def
# clang/include/clang/Serialization/TypeBitCodes.def
# clang/lib/AST/DeclCXX.cpp
# clang/lib/Driver/ToolChains/Clang.cpp
# clang/lib/Parse/ParseExprCXX.cpp
# clang/lib/Parse/ParseTemplate.cpp
# clang/lib/Sema/SemaCXXScopeSpec.cpp
# clang/lib/Sema/SemaDeclCXX.cpp
# clang/lib/Sema/SemaExprCXX.cpp
# clang/lib/Sema/SemaLambda.cpp
…Initializer because removing it would break using std::meta::info to initialize constexpr variables in non-const-evaluated functions.
… when it fails to deduce a temp variable's type made from it. e.g. when reflecting `^^decltype([](this auto &&) { })::operator()`
…erge-upstream-91cdd350-20251223
fixing allocation alignment problems caused by NNS refactoring

# Conflicts:
# clang/include/clang/AST/AbstractBasicReader.h
# clang/include/clang/AST/AbstractBasicWriter.h
# clang/include/clang/AST/NestedNameSpecifier.h
# clang/include/clang/AST/RecursiveASTVisitor.h
# clang/include/clang/AST/Type.h
# clang/lib/AST/ASTContext.cpp
# clang/lib/AST/ASTImporter.cpp
# clang/lib/AST/ASTStructuralEquivalence.cpp
# clang/lib/AST/ComputeDependence.cpp
# clang/lib/AST/ItaniumMangle.cpp
# clang/lib/AST/NestedNameSpecifier.cpp
# clang/lib/AST/ODRHash.cpp
# clang/lib/AST/Type.cpp
# clang/lib/Index/IndexTypeSourceInfo.cpp
# clang/lib/Sema/SemaCXXScopeSpec.cpp
# clang/lib/Sema/SemaExpr.cpp
# clang/lib/Sema/SemaExprCXX.cpp
# clang/lib/Sema/SemaHLSL.cpp
# clang/lib/Sema/SemaTemplate.cpp
# clang/lib/Sema/TreeTransform.h
# clang/lib/Serialization/ASTReader.cpp
# clang/lib/Serialization/ASTWriter.cpp
Author
It seems CI is failing on a job that is not caused by my changes. I hope it has been fixed upstream. I'll keep trying to merge upstream.
This PR merges upstream 91cdd35 ([clang] Improve nested name specifier AST representation), which bumped the Clang version to 22 and carried out a painful refactoring of `NestedNameSpecifier`. I am not really sure how best to handle this change.

Many major changes were introduced up to this commit, including a refactoring of the type system, removing `ElaboratedType` and Identifier as a `SpecifierKind`, and so on, and many types were replaced with simply a `TagType<>`, etc.

`NestedNameSpecifier` got compressed from 24 bytes to 8 bytes thanks to upstream's aggressive bitwise operations, which forced me to set the alignment of the affected types to `16` to accommodate the new `StoredKind` bits. I am not sure whether this is the best way to deal with it.

Again, I tested on my own test code and game code. For test cases that don't compile, I cross-checked with the compiler on Compiler Explorer (CE), and the errors appear to be identical. So far I haven't triggered any ICE, but I suppose some exist. I simply tried to stick with the original reflection logic and not to break it.
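To illustrate why packing kind bits into a pointer-sized handle imposes an alignment requirement on the pointed-to types, here is a minimal sketch. It is not Clang's `NestedNameSpecifier`: `ToyKind`, `ToyNode`, `ToyHandle`, and the choice of 3 tag bits are assumptions made up for this example, and the real layout may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kinds; these are NOT Clang's actual StoredKind enumerators.
enum class ToyKind : std::uintptr_t {
  Type = 0,
  Namespace = 1,
  Global = 2,
  Super = 3,
  Other = 5
};

// Assumption for this sketch: the tag needs 3 low bits.
constexpr std::uintptr_t ToyKindBits = 3;
constexpr std::uintptr_t ToyKindMask =
    (std::uintptr_t{1} << ToyKindBits) - 1;

// alignas(16) guarantees that the 4 low address bits of every ToyNode are
// zero, so a 3-bit tag fits and one bit is left over.
struct alignas(16) ToyNode {
  int Payload;
};

static_assert(alignof(ToyNode) >= (std::uintptr_t{1} << ToyKindBits),
              "node alignment must leave room for the tag bits");

// Pointer-sized handle that stores the kind in the pointer's low bits.
class ToyHandle {
  std::uintptr_t Bits = 0;

public:
  ToyHandle(const ToyNode *Ptr, ToyKind Kind) {
    auto Raw = reinterpret_cast<std::uintptr_t>(Ptr);
    // An under-aligned pointer would have the tag overwrite address bits.
    assert((Raw & ToyKindMask) == 0 && "pointer is under-aligned for the tag");
    Bits = Raw | static_cast<std::uintptr_t>(Kind);
  }

  ToyKind kind() const { return static_cast<ToyKind>(Bits & ToyKindMask); }

  const ToyNode *pointer() const {
    return reinterpret_cast<const ToyNode *>(Bits & ~ToyKindMask);
  }
};

static_assert(sizeof(ToyHandle) == sizeof(std::uintptr_t),
              "the handle stays a single machine word");
```

In this sketch, any type whose address may end up inside the handle must be aligned to at least `1 << ToyKindBits` bytes; otherwise the tag bits would overlap real address bits. Raising the alignment of the affected types is one way to restore that guarantee, at the cost of some padding.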
If you are interested, please perform some thorough tests on this merge.
In all the code I've touched, I have left `todo [merge:yukino...` markers for you to inspect.

This PR concludes #248 and can replace #243 and #244.