Commit 47ed51f

DingdWangJaddyen
authored and committed
[MemDep] Optimize SortNonLocalDepInfoCache sorting strategy for large caches with few unsorted entries (llvm#143107)
During compilation of large files with many branches, I observed that the function `SortNonLocalDepInfoCache` in `MemoryDependenceAnalysis` becomes a significant performance bottleneck. `Cache.size()` can be very large (around 20,000), while only a small number of entries (approximately 5 to 8) actually need sorting. The original implementation performs a full sort whenever more than two entries are unsorted, which is inefficient in this situation. This patch introduces a lightweight heuristic that compares the number of unsorted entries against `Log2_32(Cache.size())` and chooses between inserting the new entries one by one and sorting the whole cache. As a result, the GVN pass runtime on a large file is reduced from approximately 26.3 minutes to 16.5 minutes.
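The strategy described above can be sketched outside of LLVM as follows. This is a minimal illustration under stated assumptions, not the patch itself: it works on a `std::vector<int>` instead of `NonLocalDepInfo`, and `floorLog2` and `sortPartiallySorted` are hypothetical names, with `floorLog2` standing in for `llvm::Log2_32`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::Log2_32: floor(log2(n)) for n > 0.
static unsigned floorLog2(std::size_t n) {
  unsigned r = 0;
  while (n >>= 1)
    ++r;
  return r;
}

// Sorts `cache`, whose first `numSorted` elements are already in order.
// Illustrative only: uses int elements rather than NonLocalDepEntry.
void sortPartiallySorted(std::vector<int> &cache, std::size_t numSorted) {
  if (cache.size() < 2)
    return; // Zero or one element is always sorted.

  std::size_t unsorted = cache.size() - numSorted;
  if (unsorted == 0)
    return; // Nothing new was appended.

  // No sorted prefix, or too many new entries: do a full sort.
  if (numSorted == 0 || unsorted >= floorLog2(cache.size())) {
    std::sort(cache.begin(), cache.end());
    return;
  }

  // Few new entries: move each one from the back into the sorted prefix.
  for (; unsorted > 0; --unsorted) {
    int val = cache.back();
    cache.pop_back();
    // After the pop, `unsorted - 1` unsorted elements remain at the tail,
    // so the sorted prefix ends that many positions before end().
    auto prefixEnd = cache.end() - static_cast<std::ptrdiff_t>(unsorted - 1);
    auto pos = std::upper_bound(cache.begin(), prefixEnd, val);
    cache.insert(pos, val);
  }
}
```

Each insertion costs a binary search plus a shift of the elements behind the insertion point, so this path only pays off when the number of unsorted entries is small relative to the cache size, which is the case the commit message describes.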
1 parent b0e1f74 commit 47ed51f

1 file changed: +24 -20 lines changed

llvm/lib/Analysis/MemoryDependenceAnalysis.cpp

Lines changed: 24 additions & 20 deletions
@@ -983,33 +983,37 @@ MemDepResult MemoryDependenceResults::getNonLocalInfoForBlock(
 static void
 SortNonLocalDepInfoCache(MemoryDependenceResults::NonLocalDepInfo &Cache,
                          unsigned NumSortedEntries) {
-  switch (Cache.size() - NumSortedEntries) {
-  case 0:
-    // done, no new entries.
-    break;
-  case 2: {
-    // Two new entries, insert the last one into place.
-    NonLocalDepEntry Val = Cache.back();
-    Cache.pop_back();
-    MemoryDependenceResults::NonLocalDepInfo::iterator Entry =
-        std::upper_bound(Cache.begin(), Cache.end() - 1, Val);
-    Cache.insert(Entry, Val);
-    [[fallthrough]];
+
+  // If only one entry, don't sort.
+  if (Cache.size() < 2)
+    return;
+
+  unsigned s = Cache.size() - NumSortedEntries;
+
+  // If the cache is already sorted, don't sort it again.
+  if (s == 0)
+    return;
+
+  // If no entry is sorted, sort the whole cache.
+  if (NumSortedEntries == 0) {
+    llvm::sort(Cache);
+    return;
   }
-  case 1:
-    // One new entry, Just insert the new value at the appropriate position.
-    if (Cache.size() != 1) {
+
+  // If the number of unsorted entires is small and the cache size is big, using
+  // insertion sort is faster. Here use Log2_32 to quickly choose the sort
+  // method.
+  if (s < Log2_32(Cache.size())) {
+    while (s > 0) {
       NonLocalDepEntry Val = Cache.back();
       Cache.pop_back();
       MemoryDependenceResults::NonLocalDepInfo::iterator Entry =
-          llvm::upper_bound(Cache, Val);
+          std::upper_bound(Cache.begin(), Cache.end() - s + 1, Val);
       Cache.insert(Entry, Val);
+      s--;
     }
-    break;
-  default:
-    // Added many values, do a full scale sort.
+  } else {
     llvm::sort(Cache);
-    break;
   }
 }
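To make the threshold concrete for the numbers quoted in the commit message, the check `s < Log2_32(Cache.size())` can be evaluated by hand. The sketch below assumes `Log2_32` returns the floor of the base-2 logarithm; `floorLog2` and `usesInsertionPath` are hypothetical helper names introduced only for this illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for llvm::Log2_32: floor(log2(n)) for n > 0.
static unsigned floorLog2(std::size_t n) {
  unsigned r = 0;
  while (n >>= 1)
    ++r;
  return r;
}

// Mirrors the patch's condition `s < Log2_32(Cache.size())`:
// true when the new entries would be inserted one by one.
static bool usesInsertionPath(std::size_t cacheSize, std::size_t unsorted) {
  return unsorted < floorLog2(cacheSize);
}
```

For a cache of about 20,000 entries, 2^14 = 16384 <= 20000 < 2^15 = 32768, so the threshold is 14. The 5 to 8 unsorted entries reported in the commit message therefore take the insertion path, while 14 or more unsorted entries would fall back to `llvm::sort`.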
