[Inliner] Add option (default off) to inline all calls regardless of the cost #152365
Pull Request Overview
This PR adds a new command-line option inline-all-viable-calls that allows the LLVM inliner to inline all viable function calls regardless of their cost/benefit analysis or cost threshold calculations. The option is disabled by default.
- Adds a new command-line flag `-inline-all-viable-calls` to bypass cost-based inlining decisions
- Modifies the inline cost calculation to return "always inline" for viable calls when the flag is enabled
- Includes comprehensive test coverage for the new functionality
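The order of checks described above can be modeled outside LLVM. What follows is a minimal sketch under stated assumptions, not the code in InlineCost.cpp: `CalleeModel`, `Decision`, and `decideInline` are hypothetical names, and the cost analysis is reduced to a plain `Cost < Threshold` comparison. It mirrors only what this PR's diff and test suggest: attribute-based decisions are settled first, then the flag applies to viable callees, and otherwise the cost comparison decides.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the facts the inliner knows about a callee.
// None of these types exist in LLVM; they only model the decision order.
struct CalleeModel {
  bool AlwaysInline; // models the `alwaysinline` attribute
  bool NoInline;     // models the `noinline` attribute
  bool Viable;       // models isInlineViable(...).isSuccess()
  int Cost;          // models the computed inline cost
};

struct Decision {
  bool Inline;
  std::string Reason;
};

// Simplified model of the check order around the new option.
Decision decideInline(const CalleeModel &Callee, bool InlineAllViableCalls,
                      int Threshold) {
  // 1. User decisions expressed through attributes are respected first.
  if (Callee.NoInline)
    return {false, "noinline"};
  if (Callee.AlwaysInline)
    return {Callee.Viable, Callee.Viable ? "always inline" : "not viable"};

  // 2. With the flag on, every viable callee is inlined regardless of cost.
  if (InlineAllViableCalls && Callee.Viable)
    return {true, "inline all viable calls"};

  // 3. Otherwise fall back to the cost/threshold comparison (assumed form).
  if (!Callee.Viable)
    return {false, "not viable"};
  return {Callee.Cost < Threshold, "cost vs. threshold"};
}

// Usage example: an expensive but viable callee at threshold 0.
inline void exampleUsage() {
  CalleeModel Expensive{false, false, true, 1000};
  assert(!decideInline(Expensive, /*InlineAllViableCalls=*/false, 0).Inline);
  assert(decideInline(Expensive, /*InlineAllViableCalls=*/true, 0).Inline);
}
```

In this model the flag never overrides `noinline` and never forces a non-viable callee to be inlined, which matches the expectations encoded in the PR's test file.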
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| llvm/lib/Analysis/InlineCost.cpp | Adds the command-line option and logic to bypass cost calculations for viable calls |
| llvm/test/Transforms/Inline/inline-all-viable-calls.ll | Test file verifying the option works correctly with various inlining scenarios |
@llvm/pr-subscribers-llvm-analysis
Author: Justin Fargnoli (justinfargnoli)
Changes
Add a default off option to the inline cost calculation to always inline all viable calls regardless of the cost/benefit and cost/threshold calculations.
Full diff: https://github.com/llvm/llvm-project/pull/152365.diff
2 Files Affected:
diff --git a/llvm/lib/Analysis/InlineCost.cpp b/llvm/lib/Analysis/InlineCost.cpp
index 22f4d08448a22..6c28306896e18 100644
--- a/llvm/lib/Analysis/InlineCost.cpp
+++ b/llvm/lib/Analysis/InlineCost.cpp
@@ -180,6 +180,10 @@ static cl::opt<bool> DisableGEPConstOperand(
"disable-gep-const-evaluation", cl::Hidden, cl::init(false),
cl::desc("Disables evaluation of GetElementPtr with constant operands"));
+static cl::opt<bool> InlineAllViableCalls(
+ "inline-all-viable-calls", cl::Hidden, cl::init(false),
+ cl::desc("Inline all viable calls, even if they exceed the inlining "
+ "threshold"));
namespace llvm {
std::optional<int> getStringFnAttrAsInt(const Attribute &Attr) {
if (Attr.isValid()) {
@@ -3272,6 +3276,9 @@ InlineCost llvm::getInlineCost(
return llvm::InlineCost::getNever(UserDecision->getFailureReason());
}
+ if (InlineAllViableCalls && isInlineViable(*Callee).isSuccess())
+ return llvm::InlineCost::getAlways("inline all viable calls");
+
LLVM_DEBUG(llvm::dbgs() << " Analyzing call of " << Callee->getName()
<< "... (caller:" << Call.getCaller()->getName()
<< ")\n");
diff --git a/llvm/test/Transforms/Inline/inline-all-viable-calls.ll b/llvm/test/Transforms/Inline/inline-all-viable-calls.ll
new file mode 100644
index 0000000000000..2104a30f76db9
--- /dev/null
+++ b/llvm/test/Transforms/Inline/inline-all-viable-calls.ll
@@ -0,0 +1,115 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=inline -inline-threshold=0 -inline-all-viable-calls -S < %s | FileCheck %s
+
+; Check that viable calls that are beyond the cost threshold are still inlined.
+define i32 @callee_simple(i32 %x) {
+ %1 = add i32 %x, 1
+ %2 = mul i32 %1, 2
+ %3 = sub i32 %2, 1
+ %4 = add i32 %3, 3
+ %5 = mul i32 %4, 2
+ %6 = sub i32 %5, 2
+ %7 = add i32 %6, 1
+ ret i32 %7
+}
+
+; Check that user decisions are respected.
+define i32 @callee_alwaysinline(i32 %x) alwaysinline {
+ %sub = sub i32 %x, 3
+ ret i32 %sub
+}
+
+define i32 @callee_noinline(i32 %x) noinline {
+ %div = sdiv i32 %x, 2
+ ret i32 %div
+}
+
+define i32 @callee_optnone(i32 %x) optnone noinline {
+ %rem = srem i32 %x, 2
+ ret i32 %rem
+}
+
+define i32 @caller(i32 %a) {
+; CHECK-LABEL: define i32 @caller(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT: [[TMP7:%.*]] = add i32 [[A]], 1
+; CHECK-NEXT: [[TMP8:%.*]] = mul i32 [[TMP7]], 2
+; CHECK-NEXT: [[TMP3:%.*]] = sub i32 [[TMP8]], 1
+; CHECK-NEXT: [[TMP4:%.*]] = add i32 [[TMP3]], 3
+; CHECK-NEXT: [[TMP5:%.*]] = mul i32 [[TMP4]], 2
+; CHECK-NEXT: [[TMP6:%.*]] = sub i32 [[TMP5]], 2
+; CHECK-NEXT: [[ADD_I:%.*]] = add i32 [[TMP6]], 1
+; CHECK-NEXT: [[SUB_I:%.*]] = sub i32 [[ADD_I]], 3
+; CHECK-NEXT: [[TMP1:%.*]] = call i32 @callee_noinline(i32 [[SUB_I]])
+; CHECK-NEXT: [[TMP2:%.*]] = call i32 @callee_optnone(i32 [[TMP1]])
+; CHECK-NEXT: [[SUM:%.*]] = add i32 [[TMP2]], [[TMP1]]
+; CHECK-NEXT: ret i32 [[SUM]]
+;
+ %1 = call i32 @callee_simple(i32 %a)
+ %2 = call i32 @callee_alwaysinline(i32 %1)
+ %3 = call i32 @callee_noinline(i32 %2)
+ %4 = call i32 @callee_optnone(i32 %3)
+ %sum = add i32 %4, %3
+ ret i32 %sum
+}
+
+; Check that non-viable calls are not inlined
+
+; Test recursive function is not inlined
+define i32 @recursive(i32 %n) {
+entry:
+ %cmp = icmp eq i32 %n, 0
+ br i1 %cmp, label %base, label %recurse
+
+base:
+ ret i32 0
+
+recurse:
+ %dec = sub i32 %n, 1
+ %rec = call i32 @recursive(i32 %dec)
+ %add = add i32 %rec, 1
+ ret i32 %add
+}
+
+define i32 @call_recursive(i32 %x) {
+; CHECK-LABEL: define i32 @call_recursive(
+; CHECK-SAME: i32 [[X:%.*]]) {
+; CHECK-NEXT: [[R:%.*]] = call i32 @recursive(i32 [[X]])
+; CHECK-NEXT: ret i32 [[R]]
+;
+ %r = call i32 @recursive(i32 %x)
+ ret i32 %r
+}
+
+; Test indirectbr prevents inlining
+define void @has_indirectbr(ptr %ptr, i32 %cond) {
+entry:
+ switch i32 %cond, label %default [
+ i32 0, label %target0
+ i32 1, label %target1
+ ]
+
+target0:
+ br label %end
+
+target1:
+ br label %end
+
+default:
+ br label %end
+
+end:
+ indirectbr ptr %ptr, [label %target0, label %target1]
+ ret void
+}
+
+define void @call_indirectbr(ptr %p, i32 %c) {
+; CHECK-LABEL: define void @call_indirectbr(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[C:%.*]]) {
+; CHECK-NEXT: call void @has_indirectbr(ptr [[P]], i32 [[C]])
+; CHECK-NEXT: ret void
+;
+ call void @has_indirectbr(ptr %p, i32 %c)
+ ret void
+}
+
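The last two test cases rely on the distinction between "too expensive" and "not viable". As a rough illustration only, the sketch below models a viability check over a toy instruction list. `ToyFunction`, `Inst`, `Op`, and `whyNotViable` are hypothetical names, the reason strings are illustrative, and only the two conditions exercised by the test (a direct recursive call and an `indirectbr`) are modeled; the real `isInlineViable` covers more cases.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy IR; it only carries what the check below needs.
enum class Op { Arith, Call, IndirectBr, Ret };

struct Inst {
  Op Kind;
  std::string CalledName; // only meaningful when Kind == Op::Call
};

struct ToyFunction {
  std::string Name;
  std::vector<Inst> Body;
};

// Returns an empty string when the function is viable to inline in this
// model, otherwise a short reason for rejecting it.
std::string whyNotViable(const ToyFunction &F) {
  for (const Inst &I : F.Body) {
    // A callee containing an indirect branch is rejected outright.
    if (I.Kind == Op::IndirectBr)
      return "contains indirect branches";
    // A callee that calls itself directly is rejected as well.
    if (I.Kind == Op::Call && I.CalledName == F.Name)
      return "recursive call";
  }
  return "";
}

// Usage example mirroring the callees in the test file.
inline void exampleViability() {
  ToyFunction Simple{"callee_simple", {{Op::Arith, ""}, {Op::Ret, ""}}};
  ToyFunction Recursive{"recursive", {{Op::Call, "recursive"}, {Op::Ret, ""}}};
  assert(whyNotViable(Simple).empty());
  assert(!whyNotViable(Recursive).empty());
}
```

Because the flag is gated on viability, such callees stay as calls even with `-inline-all-viable-calls`, which is what the `call_recursive` and `call_indirectbr` CHECK lines verify.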
Ping @mingmingl-llvm @kazutakahirata @aeubanks for review.
Artem-B
left a comment
I'm OK with this as a hidden option needed to address a special use case. I'll hold off on the approval stamp so others can chime in, as this is a change in common LLVM code that I can claim no ownership of.
It would be great if you could expand the PR description and elaborate on the intended use case so there's a clear rationale for the existence of such an option.
Artem-B
left a comment
We have AMDGPUAlwaysInlinePass already (which I want to delete without replacement)
So is the intention here that end-users are going to do something like Another option would be to use
Yes, that's correct.
I considered this option, but it seemed simpler to implement this in LLVM rather than all of the relevant frontends.
I like the idea behind this approach. However, it looks like
Assuming I understood @nikic's comment correctly, there are no outstanding concerns with this PR. Enabling auto-merge.
This is just a bug in the pass that is fixable
Add a default-off option to the inline cost calculation that always inlines all viable calls, regardless of the cost/benefit and cost/threshold calculations.
For performance reasons, some users require that all calls be inlined. Rather than forcing them to set the inlining threshold to an arbitrarily high value, offer an option to inline all viable calls.