-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[LoopVectorize] Add the cost of VPInstruction::AnyOf to vplan #127085
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
While attempting to teach ScalarEvolution about samesign in another effort, a complicated testcase with nested loops, and zero-extends of the induction-variable regresses, but only when the target datalayout is present. The regression was originally reported on IndVarSimplify, but an improvement of symbolic BTC was also observed on SCEV. Check in the test into both IndVarSimplify and SCEV, to ease investigation and further development.
DataLayout is already available as a member variable.
… value in libcxx container summary (llvm#125294) This has two changes: 1. Set show value for libcxx and libstdcxx summary provider. This will print the pointer value for both pointer type and reference type. 2. Remove pointer value printing in libcxx container summary. Discussion: https://discourse.llvm.org/t/lldb-hides-raw-pointer-value-for-libcxx-and-libstdcxx-pointer-types-in-summary-string/84226
This shows missed opportunity to fold (fshl ld1, ld0, c) -> (ld0[ofs]) if the load chain results are used.
…d0[ofs]) combine. (llvm#124871) Happened to notice some odd things related to chains in this code. The code calls hasOneUse on LoadSDNode* which will check users of the data and the chain. I think this was trying to check that the data had one use so one of the loads would definitely be removed by the transform. Load chains don't always have users so our testing may not have noticed that the chains being used would block the transform. The code makes all users of ld1's chain use the new load's chain, but we don't know that ld1 becomes dead. This can cause incorrect dependencies if ld1's chain is used and it isn't deleted. I think the better thing to do is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1 depend on the new load and the original loads. If the olds loads become dead, their chain will be cleaned up later. I'm having trouble getting a test for any ordering issue with the current code. areNonVolatileConsecutiveLoads requires the two loads to have the same input chain. Given that, I don't know how to use one of the load chain results without also using the other. If they are both used we don't do the transform because SDNode::hasOneUse will return false for both.
…m#125526) This is attempt 2 to merge this, the first one is llvm#117622. This properly disables the tests when building for playstation, since the warning is disabled there. When a hidden object is built into multiple shared libraries, each instance of the library will get its own copy. If the object was supposed to be globally unique (e.g. a global variable or static data member), this can cause very subtle bugs. An object might be incorrectly duplicated if it: Is defined in a header (so it might appear in multiple TUs), and Has external linkage (otherwise it's supposed to be duplicated), and Has hidden visibility (or else the dynamic linker will handle it) The duplication is only a problem semantically if one of the following is true: The object is mutable (the copies won't be in sync), or Its initialization has side effects (it may now run more than once), or The value of its address is used (different copies have different addresses). To detect this, we add a new -Wunique-object-duplication warning. It warns on cases (1) and (2) above. To be conservative, we only warn in case (2) if we are certain the initializer has side effects, and we don't warn on new because the only side effect is some extra memory usage. We don't currently warn on case (3) because doing so is prone to false positives: there are many reasons for taking the address which aren't inherently problematic (e.g. passing to a function that expects a pointer). We only run into problems if the code inspects the value of the address. The check is currently disabled for windows, which uses its own analogue of visibility (declimport/declexport). The check is also disabled inside templates, since it can give false positives if a template is never instantiated. Resolving the warning The warning can be fixed in several ways: If the object in question doesn't need to be mutable, it should be made const. Note that the variable must be completely immutable, e.g. we'll warn on const int* p because the pointer itself is mutable. To silence the warning, it should instead be const int* const p. If the object must be mutable, it (or the enclosing function, in the case of static local variables) should be made visible using __attribute((visibility("default"))) If the object is supposed to be duplicated, it should be be given internal linkage. Testing I've tested the warning by running it on clang itself, as well as on chromium. Compiling clang resulted in [10 warnings across 6 files](https://github.com/user-attachments/files/17908069/clang-warnings.txt), while Chromium resulted in [160 warnings across 85 files](https://github.com/user-attachments/files/17908072/chromium-warnings.txt), mostly in third-party code. Almost all warnings were due to mutable variables. I evaluated the warnings by manual inspection. I believe all the resulting warnings are true positives, i.e. they represent potentially-problematic code where duplication might cause a problem. For the clang warnings, I also validated them by either adding const or visibility annotations as appropriate. Limitations I am aware of four main limitations with the current warning: We do not warn when the address of a duplicated object is taken, since doing so is prone to false positives. I'm hopeful that we can create a refined version in the future, however. We only warn for side-effectful initialization if we are certain side effects exist. Warning on potential side effects produced a huge number of false positives; I don't expect there's much that can be done about this in modern C++ code bases, since proving a lack of side effects is difficult. Windows uses a different system (declexport/import) instead of visibility. From manual testing, it seems to behave analogously to the visibility system for the purposes of this warning, but to keep things simple the warning is disabled on windows for now. We don't warn on code inside templates. This is unfortuate, since it masks many real issues, e.g. a templated variable which is implicitly instantiated the same way in multiple TUs should be globally unique, but may accidentally be duplicated. Unfortunately, we found some potential false positives during testing that caused us to disable the warning for now.
Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect E to be nonnull.
Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect typeDecl to be nonnull. Note that getObjCInterfaceType starts out dereferencing Decl.
Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect referent to be nonnull.
…25378) This way we don't need to duplicate the list of supported targets in the release-tasks workflow.
Signed-off-by: Mikhail R. Gadelha <[email protected]>
This also changes the container version numbers in the tag from unix timestamps to the abbreviated commit hash for the workflow. This ensures that the amd64 and arm64 containers have the same tag. For amd64 we now generate 4 tags: * ghcr.io/llvm/ci-ubuntu-22.04:latest * ghcr.io/llvm/ci-ubuntu-22.04:$GITHUB_SHA * ghcr.io/llvm/amd64/ci-ubuntu-22.04:latest * ghcr.io/llvm/amd64/ci-ubuntu-22.04:$GITHUB_SHA For arm64 we generate 2 tags: * ghcr.io/tstellar/arm64v8/ci-ubuntu-22.04:latest * ghcr.io/tstellar/arm64v8/ci-ubuntu-22.04:$GITHUB_SHA
…5260) This change introduces lowering from CIR to LLVM IR of global integer and floating-point variables, using defaults for attributes that aren't yet implemented.
…ationship (llvm#125300) This enables finding the backed thread from the backing thread without going through the thread list, and it will be useful for subsequent commits.
… in getSingleShuffleSrc. (llvm#125455) I have been unsuccessful at further reducing the test. The failure requires a shuffle with 2 scalable->fixed extracts with the same source. 0 is the only valid index for a scalable->fixed extract so the 2 sources must be the same extract. Shuffles with the same source are aggressively canonicalized to a unary shuffle. So it requires the extracts to become identical through other optimizations without the shuffle being canonicalized before it is lowered. Fixes llvm#125306.
…vm#115099) This is similar in spirit to previous changes to make _mm_mfence builtins to avoid conflicts with winnt.h and other MSVC ecosystem headers that pre-declare compiler intrinsics as extern "C" symbols. Also update the feature flag for _mm_prefetch to sse, which is more accurate than mmx. This should fix issue llvm#87515.
Summary: This only shows up during the build of the server, silence it.
Split off from llvm#124432 as suggested. Adds VPBuilder::insert, inspired by IRBuilderBase.
LLDB: correct event when removing all watchpoints
Previously we incorrectly checked for a "breakpoint changed" event
listener removing all watchpoints (e.g. via
SBTarget::DeleteAllWatchpoints()), although we would emit a "watchpoint
changed" event if there were a listener for 'breakpoint changed'.
This meant that we might not emit a "watchpoint changed" event if there
was a listener for this event.
Correct it to check for the "watchpoint changed" event.
---
Updated regression tests which were also incorrectly peeking for the
wrong event type. The 'remove' action actually triggers 2 events which
the test didn't allow, so I updated it to allow specifically what was
requested.
The test fails (expectedly) at the line following "DeleteAllWatchpoints"
prior to this patch, and passes after.
…#125302) Generally speaking, process plugins (e.g. ProcessGDBRemote) should not be aware of OS plugin threads. However, ProcessGDBRemote attempts to check for the existence of OS threads when calculating stop info. When OS threads are present, it sets the stop info directly on the OS plugin thread and leaves the ThreadGDBRemote without a StopInfo. This is problematic for a few reasons: 1. No other process plugins do this, as they shouldn't. They should set the stop info for their own process threads, and let the abstractions built on top propagate StopInfos. 2. This conflicts with the expectations of ThreadMemory, which checks for the backing threads's info, and then attempts to propagate it (in the future, it should probably ask the plugin itself too...). We see this happening in the code below. The `if` condition will not trigger, because `backing_stop_info_sp` will be null (remember, ProcessGDB remote is ignoring its own threads), and then this method returns false. ``` bool ThreadMemory::CalculateStopInfo() { ... lldb::StopInfoSP backing_stop_info_sp( m_backing_thread_sp->GetPrivateStopInfo()); if (backing_stop_info_sp && backing_stop_info_sp->IsValidForOperatingSystemThread(*this)) { backing_stop_info_sp->SetThread(shared_from_this()); ``` ``` Thread::GetPrivateStopInfo ... if (!CalculateStopInfo()) SetStopInfo(StopInfoSP()); ``` To solve this, we change ProcessGDB remote so that it does the principled thing: it now only sets the stop info of its own threads. This change by itself breaks the tests TestPythonOSPlugin.py and TestOSPluginStepping.py and probably explains why ProcessGDB had originally "violated" this isolation of layers. To make this work, BreakpointSites must be aware of BackingThreads when answering the question: "Is this breakpoint valid for this thread?". Why? Breakpoints are created on top of the OS threads (that's what the user sees), but breakpoints are hit by process threads. In the presence of OS threads, a TID-specific breakpoint is valid for a process thread if it is backing an OS thread with that TID.
Using 'compiler-rt' in 'LLVM_ENABLE_PROJECTS' causes the clang runtime libraries to be build and installed with arch suffix names i.e ```'*-<arch>.a'``` and ```'*-<arch>.so'```.
This patch adds an initial implementation of VPInstruction::computeCost with support for only one instruction so far - VPInstruction::AnyOf. This is only used when vectorising loops with uncountable early exits. Change-Id: I5076e92f2cfe6df3d65c5c9f9fbb1731e919b912
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This patch adds an initial implementation of VPInstruction::computeCost with support for only one instruction so far - VPInstruction::AnyOf. This is only used when vectorising loops with uncountable early exits.