[MemDep] Optimize SortNonLocalDepInfoCache sorting strategy for large caches with few unsorted entries #143107
@@ -983,33 +983,41 @@ MemDepResult MemoryDependenceResults::getNonLocalInfoForBlock(
 static void
 SortNonLocalDepInfoCache(MemoryDependenceResults::NonLocalDepInfo &Cache,
                          unsigned NumSortedEntries) {
-  switch (Cache.size() - NumSortedEntries) {
-  case 0:
-    // done, no new entries.
-    break;
-  case 2: {
-    // Two new entries, insert the last one into place.
-    NonLocalDepEntry Val = Cache.back();
-    Cache.pop_back();
-    MemoryDependenceResults::NonLocalDepInfo::iterator Entry =
-        std::upper_bound(Cache.begin(), Cache.end() - 1, Val);
-    Cache.insert(Entry, Val);
-    [[fallthrough]];
+
+  // Output number of sorted entries and size of cache for each sort.
+  LLVM_DEBUG(dbgs() << "NumSortedEntries: " << NumSortedEntries
+                    << ", Cache.size: " << Cache.size() << "\n");
+
+  // If only one entry, don't sort.
+  if (Cache.size() < 2)
+    return;
+
+  unsigned s = Cache.size() - NumSortedEntries;
+
+  // If the cache is already sorted, don't sort it again.
+  if (s == 0)
+    return;
+
+  // If no entry is sorted, sort the whole cache.
+  if (NumSortedEntries == 0) {
+    llvm::sort(Cache);
+    return;
+  }
-  case 1:
-    // One new entry, Just insert the new value at the appropriate position.
-    if (Cache.size() != 1) {
+
+  // If the number of unsorted entries is small and the cache size is big, use

Suggested change:
-  // If the number of unsorted entries is small and the cache size is big, use
+  // If the number of unsorted entries is small and the cache size is big, using
done
The choice of log2 here is empirical. The main goal is a cheap way to determine whether the number of unsorted entries (s) is significantly smaller than the cache size. To tune this condition, I experimented with the following four options; based on the timing results, the plain Log2_32(Cache.size()) threshold (the last option) proved to be the fastest. The benchmark results are as follows:
- s < NumSortedEntries: https://llvm-compile-time-tracker.com/compare.php?from=26f3f24a4f0a67eb23d255aba7a73a12bee1db11&to=e174118c88ee3d9d31fb3ed4e29b9ae2fcac46fa&stat=instructions%3Au
- s < Log2_32(Cache.size()) * llvm::numbers::ln2 / llvm::numbers::ln10: https://llvm-compile-time-tracker.com/compare.php?from=26f3f24a4f0a67eb23d255aba7a73a12bee1db11&to=9368621b42fa8b68e1e3081110f82ae9a5d57458&stat=instructions%3Au
- s < Log2_32(Cache.size()) * llvm::numbers::ln2: https://llvm-compile-time-tracker.com/compare.php?from=26f3f24a4f0a67eb23d255aba7a73a12bee1db11&to=19a8584d14dbb95b4a71a92a43da3a2c5d5e550a&stat=instructions%3Au
- s < Log2_32(Cache.size()): https://llvm-compile-time-tracker.com/compare.php?from=26f3f24a4f0a67eb23d255aba7a73a12bee1db11&to=0fa6bc6bdf1c9c5464e81970e973f2c43edac874&stat=instructions%3Au
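
For readers who want to try the threshold idea outside of LLVM, here is a minimal standalone sketch of the strategy discussed above. It is not the code from this patch: it uses `std::vector<int>` instead of `NonLocalDepInfo`, and the names `floorLog2` and `sortPartiallySortedCache` are hypothetical. `floorLog2` is only an assumed stand-in for `llvm::Log2_32`. When the unsorted tail is shorter than log2 of the cache size, each tail element is placed with a binary search. Otherwise the whole vector is sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::Log2_32: floor(log2(n)) for n > 0.
static unsigned floorLog2(std::size_t n) {
  unsigned result = 0;
  while (n >>= 1)
    ++result;
  return result;
}

// Sorts `cache`, assuming its first `numSorted` elements are already sorted
// and the remaining elements were appended in arbitrary order.
static void sortPartiallySortedCache(std::vector<int> &cache,
                                     std::size_t numSorted) {
  assert(numSorted <= cache.size() && "sorted prefix cannot exceed the cache");
  const std::size_t numUnsorted = cache.size() - numSorted;

  // Nothing to do for tiny caches or when no new entries were appended.
  if (cache.size() < 2 || numUnsorted == 0)
    return;

  // No sorted prefix, or too many new entries: sort everything.
  if (numSorted == 0 || numUnsorted >= floorLog2(cache.size())) {
    std::sort(cache.begin(), cache.end());
    return;
  }

  // Few new entries: binary-search each one into the growing sorted prefix.
  for (std::size_t i = numSorted; i < cache.size(); ++i) {
    auto pos = std::upper_bound(cache.begin(), cache.begin() + i, cache[i]);
    // Shift [pos, i) right by one and drop cache[i] into `pos`.
    std::rotate(pos, cache.begin() + i, cache.begin() + i + 1);
  }
}
```

With a sorted prefix of 16 elements and 2 appended entries, `floorLog2(18)` is 4, so the insertion path is taken. Where the crossover pays off depends on the workload, which is why the thresholds above were compared on the compile-time tracker rather than derived analytically.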
Please drop this debug output, it will be very spammy for anyone not specifically trying to optimize this code.
done