-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[NFC][SpecialCaseList] Convert preprocess into LazyInit
#167281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NFC][SpecialCaseList] Convert preprocess into LazyInit
#167281
Conversation
Created using spr 1.3.7 [skip ci]
Created using spr 1.3.7
|
@llvm/pr-subscribers-llvm-support Author: Vitaly Buka (vitalybuka) ChangesCurrently SpecialCaseList created at least twice, Also, deppending on enabled sanitizers, not all In both cases there is unnecessary RadixTree This patch changes And remove empty one from This saves saves 0.5% of clang time building large project. Full diff: https://github.com/llvm/llvm-project/pull/167281.diff 1 Files Affected:
diff --git a/llvm/lib/Support/SpecialCaseList.cpp b/llvm/lib/Support/SpecialCaseList.cpp
index 2ba53d63365d2..79032cbb07f3f 100644
--- a/llvm/lib/Support/SpecialCaseList.cpp
+++ b/llvm/lib/Support/SpecialCaseList.cpp
@@ -20,6 +20,7 @@
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/iterator_range.h"
+#include "llvm/Support/Compiler.h"
#include "llvm/Support/GlobPattern.h"
#include "llvm/Support/LineIterator.h"
#include "llvm/Support/MemoryBuffer.h"
@@ -42,10 +43,9 @@ namespace {
class RegexMatcher {
public:
Error insert(StringRef Pattern, unsigned LineNumber);
- void preprocess();
-
unsigned match(StringRef Query) const;
+private:
struct Reg {
Reg(StringRef Name, unsigned LineNo, Regex &&Rg)
: Name(Name), LineNo(LineNo), Rg(std::move(Rg)) {}
@@ -60,10 +60,9 @@ class RegexMatcher {
class GlobMatcher {
public:
Error insert(StringRef Pattern, unsigned LineNumber);
- void preprocess();
-
unsigned match(StringRef Query) const;
+private:
struct Glob {
Glob(StringRef Name, unsigned LineNo, GlobPattern &&Pattern)
: Name(Name), LineNo(LineNo), Pattern(std::move(Pattern)) {}
@@ -72,15 +71,20 @@ class GlobMatcher {
GlobPattern Pattern;
};
+ void LazyInit() const;
+
std::vector<GlobMatcher::Glob> Globs;
- RadixTree<iterator_range<StringRef::const_iterator>,
- RadixTree<iterator_range<StringRef::const_reverse_iterator>,
- SmallVector<int, 1>>>
+ mutable RadixTree<iterator_range<StringRef::const_iterator>,
+ RadixTree<iterator_range<StringRef::const_reverse_iterator>,
+ SmallVector<int, 1>>>
PrefixSuffixToGlob;
- RadixTree<iterator_range<StringRef::const_iterator>, SmallVector<int, 1>>
+ mutable RadixTree<iterator_range<StringRef::const_iterator>,
+ SmallVector<int, 1>>
SubstrToGlob;
+
+ mutable bool Initialized = false;
};
/// Represents a set of patterns and their line numbers
@@ -89,7 +93,6 @@ class Matcher {
Matcher(bool UseGlobs, bool RemoveDotSlash);
Error insert(StringRef Pattern, unsigned LineNumber);
- void preprocess();
unsigned match(StringRef Query) const;
bool matchAny(StringRef Query) const { return match(Query); }
@@ -122,8 +125,6 @@ Error RegexMatcher::insert(StringRef Pattern, unsigned LineNumber) {
return Error::success();
}
-void RegexMatcher::preprocess() {}
-
unsigned RegexMatcher::match(StringRef Query) const {
for (const auto &R : reverse(RegExes))
if (R.Rg.match(Query))
@@ -142,7 +143,10 @@ Error GlobMatcher::insert(StringRef Pattern, unsigned LineNumber) {
return Error::success();
}
-void GlobMatcher::preprocess() {
+void GlobMatcher::LazyInit() const {
+ if (LLVM_LIKELY(Initialized))
+ return;
+ Initialized = true;
for (const auto &[Idx, G] : enumerate(Globs)) {
StringRef Prefix = G.Pattern.prefix();
StringRef Suffix = G.Pattern.suffix();
@@ -167,6 +171,8 @@ void GlobMatcher::preprocess() {
}
unsigned GlobMatcher::match(StringRef Query) const {
+ LazyInit();
+
int Best = -1;
if (!PrefixSuffixToGlob.empty()) {
for (const auto &[_, SToGlob] : PrefixSuffixToGlob.find_prefixes(Query)) {
@@ -224,10 +230,6 @@ Error Matcher::insert(StringRef Pattern, unsigned LineNumber) {
return std::visit([&](auto &V) { return V.insert(Pattern, LineNumber); }, M);
}
-void Matcher::preprocess() {
- return std::visit([&](auto &V) { return V.preprocess(); }, M);
-}
-
unsigned Matcher::match(StringRef Query) const {
if (RemoveDotSlash)
Query = llvm::sys::path::remove_leading_dotslash(Query);
@@ -237,7 +239,6 @@ unsigned Matcher::match(StringRef Query) const {
class SpecialCaseList::Section::SectionImpl {
friend class SpecialCaseList;
- void preprocess();
const Matcher *findMatcher(StringRef Prefix, StringRef Category) const;
public:
@@ -410,9 +411,6 @@ bool SpecialCaseList::parse(unsigned FileIdx, const MemoryBuffer *MB,
}
}
- for (Section &S : Sections)
- S.Impl->preprocess();
-
return true;
}
@@ -462,13 +460,6 @@ SpecialCaseList::Section::SectionImpl::findMatcher(StringRef Prefix,
return &II->second;
}
-void SpecialCaseList::Section::SectionImpl::preprocess() {
- SectionMatcher.preprocess();
- for (auto &[K1, E] : Entries)
- for (auto &[K2, M] : E)
- M.preprocess();
-}
-
unsigned SpecialCaseList::Section::getLastMatch(StringRef Prefix,
StringRef Query,
StringRef Category) const {
|
Currently SpecialCaseList created at least twice, one on by `Driver`, for diagnostics only, and then the real one by the `ASTContext`. Also, deppending on enabled sanitizers, not all sections will be used. In both cases there is unnecessary RadixTree construction. This patch changes `GlobMatcher` to do initialization lazily only when needed. And remove empty one from `RegexMatcher`. This saves saves 0.5% of clang time building large project. Pull Request: llvm#167281
Created using spr 1.3.7 [skip ci]
Created using spr 1.3.7 [skip ci]
Currently SpecialCaseList created at least twice, one on by `Driver`, for diagnostics only, and then the real one by the `ASTContext`. Also, deppending on enabled sanitizers, not all sections will be used. In both cases there is unnecessary RadixTree construction. This patch changes `GlobMatcher` to do initialization lazily only when needed. And remove empty one from `RegexMatcher`. This saves saves 0.5% of clang time building large project. Pull Request: llvm#167281
Created using spr 1.3.7 [skip ci]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR converts the preprocess() method in SpecialCaseList to use lazy initialization (LazyInit), aiming to improve performance by avoiding unnecessary RadixTree construction. The change eliminates eager preprocessing when SpecialCaseList instances are created but not fully utilized (e.g., when created by Driver for diagnostics only, or when certain sections are never queried).
Key changes:
- Removed
preprocess()methods from RegexMatcher, GlobMatcher, Matcher, and Section::SectionImpl classes - Implemented lazy initialization in GlobMatcher via a new
LazyInit()const method - Made RadixTree members and the
Initializedflag mutable to support lazy initialization from const methods
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/17365 Here is the relevant piece of the build log for the reference |
Currently SpecialCaseList created at least twice,
one on by
Driver, for diagnostics only, and thenthe real one by the
ASTContext.Also, deppending on enabled sanitizers, not all
sections will be used.
In both cases there is unnecessary RadixTree
construction.
This patch changes
GlobMatcherto do initializationlazily only when needed.
And remove empty one from
RegexMatcher.This saves saves 0.5% of clang time building large project.