-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[GlobalMerge]Prefer use global-merge-max-offset instead of the target-specific constant offset. #165591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…stant offset. In the Dhrystone benchmark, I find some adjacent global not be merged, on the contrary the GCC's anchor optimize is work. Use global-merge-max-offset to set the max offset can yield similar results (still slightly different)
davemgreen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds OK to me. Could we add a test to show the option overriding the target offset?
llvm/lib/CodeGen/GlobalMerge.cpp
Outdated
| ? GlobalMergeAllConst | ||
| : MergeConstAggressiveByDefault; | ||
| return new GlobalMerge(TM, Offset, OnlyOptimizeForSize, MergeExternal, | ||
| unsigned PreferOffset = GlobalMergeMaxOffset ? GlobalMergeMaxOffset : Offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use GlobalMergeOffset.getNumOccurrences() > 0
davemgreen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, could we delete this code too? I don't think it is ever used.
explicit GlobalMerge() : FunctionPass(ID) {
Opt.MaxOffset = GlobalMergeMaxOffset;
Opt.MergeConstantGlobals = EnableGlobalMergeOnConst;
Opt.MergeConstAggressive = GlobalMergeAllConst;
initializeGlobalMergePass(*PassRegistry::getPassRegistry());
}
|
@llvm/pr-subscribers-backend-arm @llvm/pr-subscribers-backend-aarch64 Author: None (hstk30-hw) ChangesIn the Dhrystone benchmark, I find some adjacent global not be merged, on the contrary the GCC's anchor optimize is work. Use global-merge-max-offset to set the max offset can yield similar results (still slightly different, at least we can control the offset). Full diff: https://github.com/llvm/llvm-project/pull/165591.diff 3 Files Affected:
diff --git a/llvm/lib/CodeGen/GlobalMerge.cpp b/llvm/lib/CodeGen/GlobalMerge.cpp
index e58d7e344c28b..0f398b375d799 100644
--- a/llvm/lib/CodeGen/GlobalMerge.cpp
+++ b/llvm/lib/CodeGen/GlobalMerge.cpp
@@ -196,13 +196,6 @@ class GlobalMerge : public FunctionPass {
public:
static char ID; // Pass identification, replacement for typeid.
- explicit GlobalMerge() : FunctionPass(ID) {
- Opt.MaxOffset = GlobalMergeMaxOffset;
- Opt.MergeConstantGlobals = EnableGlobalMergeOnConst;
- Opt.MergeConstAggressive = GlobalMergeAllConst;
- initializeGlobalMergePass(*PassRegistry::getPassRegistry());
- }
-
explicit GlobalMerge(const TargetMachine *TM, unsigned MaximalOffset,
bool OnlyOptimizeForSize, bool MergeExternalGlobals,
bool MergeConstantGlobals, bool MergeConstAggressive)
@@ -772,6 +765,8 @@ Pass *llvm::createGlobalMergePass(const TargetMachine *TM, unsigned Offset,
bool MergeConstAggressive = GlobalMergeAllConst.getNumOccurrences() > 0
? GlobalMergeAllConst
: MergeConstAggressiveByDefault;
- return new GlobalMerge(TM, Offset, OnlyOptimizeForSize, MergeExternal,
+ unsigned PreferOffset = GlobalMergeMaxOffset.getNumOccurrences() > 0 ?
+ GlobalMergeMaxOffset : Offset;
+ return new GlobalMerge(TM, PreferOffset, OnlyOptimizeForSize, MergeExternal,
MergeConstant, MergeConstAggressive);
}
diff --git a/llvm/test/CodeGen/AArch64/global-merge.ll b/llvm/test/CodeGen/AArch64/global-merge.ll
index f2826e4cb00cb..723b033720937 100644
--- a/llvm/test/CodeGen/AArch64/global-merge.ll
+++ b/llvm/test/CodeGen/AArch64/global-merge.ll
@@ -1,17 +1,22 @@
; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -O0 | FileCheck --check-prefix=NO-MERGE %s
+; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -O1 | FileCheck --check-prefix=NO-MERGE %s
+; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -O2 | FileCheck --check-prefix=NO-MERGE %s
+; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -O3 | FileCheck %s
+; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -O3 -global-merge-max-offset=0 | FileCheck %s --check-prefix=NO-MERGE
; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -O0 -global-merge-on-external=true | FileCheck --check-prefix=NO-MERGE %s
; RUN: llc < %s -mtriple=aarch64-apple-ios -O0 | FileCheck %s --check-prefix=CHECK-APPLE-IOS-NO-MERGE
+; RUN: llc < %s -mtriple=aarch64-apple-ios -O1 | FileCheck %s --check-prefix=CHECK-APPLE-IOS-NO-MERGE
+; RUN: llc < %s -mtriple=aarch64-apple-ios -O2 | FileCheck %s --check-prefix=CHECK-APPLE-IOS-NO-MERGE
+; RUN: llc < %s -mtriple=aarch64-apple-ios -O3 | FileCheck %s --check-prefix=CHECK-APPLE-IOS
; RUN: llc < %s -mtriple=aarch64-apple-ios -O0 -global-merge-on-external=true | FileCheck %s --check-prefix=CHECK-APPLE-IOS-NO-MERGE
-; FIXME: add O1/O2 test for aarch64-none-linux-gnu and aarch64-apple-ios
-
@m = internal global i32 0, align 4
@n = internal global i32 0, align 4
define void @f1(i32 %a1, i32 %a2) {
; CHECK-LABEL: f1:
-; CHECK: adrp x{{[0-9]+}}, _MergedGlobals
+; CHECK: adrp x{{[0-9]+}}, .L_MergedGlobals
; CHECK-NOT: adrp
; CHECK-APPLE-IOS-LABEL: f1:
@@ -22,9 +27,9 @@ define void @f1(i32 %a1, i32 %a2) {
ret void
}
-; CHECK: .local _MergedGlobals
-; CHECK: .comm _MergedGlobals,8,8
+; CHECK: .local .L_MergedGlobals
+; CHECK: .comm .L_MergedGlobals,8,4
; NO-MERGE-NOT: .local _MergedGlobals
-; CHECK-APPLE-IOS: .zerofill __DATA,__bss,__MergedGlobals,8,3
-; CHECK-APPLE-IOS-NO-MERGE-NOT: .zerofill __DATA,__bss,__MergedGlobals,8,3
+; CHECK-APPLE-IOS: .zerofill __DATA,__bss,__MergedGlobals,8,2
+; CHECK-APPLE-IOS-NO-MERGE-NOT: .zerofill __DATA,__bss,__MergedGlobals,8,2
diff --git a/llvm/test/CodeGen/ARM/global-merge-1.ll b/llvm/test/CodeGen/ARM/global-merge-1.ll
index 05719ae4eb37d..fe42f7e54b192 100644
--- a/llvm/test/CodeGen/ARM/global-merge-1.ll
+++ b/llvm/test/CodeGen/ARM/global-merge-1.ll
@@ -7,6 +7,7 @@
; RUN: llc %s -O3 -o - | FileCheck -check-prefix=MERGE %s
; RUN: llc %s -O3 -o - -arm-global-merge=false | FileCheck -check-prefix=NO-MERGE %s
; RUN: llc %s -O3 -o - -arm-global-merge=true | FileCheck -check-prefix=MERGE %s
+; RUN: llc %s -O3 -o - -arm-global-merge=true -global-merge-max-offse=0 | FileCheck -check-prefix=NO-MERGE %s
; MERGE-NOT: .zerofill __DATA,__bss,_bar,20,2
; MERGE-NOT: .zerofill __DATA,__bss,_baz,20,2
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
This is for |
I see - perhaps something we can remove when it moves to the NPM. |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/174/builds/27572 Here is the relevant piece of the build log for the reference |
In the Dhrystone benchmark, I find some adjacent global not be merged, on the contrary the GCC's anchor optimize is work. Use global-merge-max-offset to set the max offset can yield similar results (still slightly different, at least we can control the offset).