@justinfargnoli justinfargnoli commented Aug 6, 2025

Add a default-off option to the inline cost calculation to always inline all viable calls regardless of the cost/benefit and cost/threshold calculations.

For performance reasons, some users require that all calls be inlined. Rather than forcing them to adjust the inlining threshold to an arbitrarily high value, offer an option to inline all calls.

@justinfargnoli justinfargnoli requested a review from Copilot August 6, 2025 18:51
@justinfargnoli justinfargnoli self-assigned this Aug 6, 2025
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Aug 6, 2025
Copilot AI left a comment

Pull Request Overview

This PR adds a new command-line option inline-all-viable-calls that allows the LLVM inliner to inline all viable function calls regardless of their cost/benefit analysis or cost threshold calculations. The option is disabled by default.

  • Adds a new command-line flag -inline-all-viable-calls to bypass cost-based inlining decisions
  • Modifies the inline cost calculation to return "always inline" for viable calls when the flag is enabled
  • Includes comprehensive test coverage for the new functionality

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
llvm/lib/Analysis/InlineCost.cpp | Adds the command-line option and logic to bypass cost calculations for viable calls
llvm/test/Transforms/Inline/inline-all-viable-calls.ll | Test file verifying the option works correctly with various inlining scenarios
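
The decision order summarized above can be sketched as a small standalone model. This is only an illustration under stated assumptions: `ToyCallee`, `ToyResult`, and `toyGetInlineCost` are hypothetical names, not LLVM types or functions, and the cost comparison is a simplified stand-in for the real analysis. The intent is to show that attribute-based decisions are taken first, the new flag second, and the usual cost/threshold check last.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a callee; these fields are assumptions for the sketch.
struct ToyCallee {
  bool HasAlwaysInline = false; // models the `alwaysinline` attribute
  bool HasNoInline = false;     // models `noinline` (also `optnone noinline`)
  bool IsViable = true;         // false for e.g. recursion or indirectbr
  int Cost = 0;                 // made-up cost estimate
};

struct ToyResult {
  bool Inline;
  std::string Reason;
};

// Toy decision function: user decisions, then the flag, then cost vs. threshold.
ToyResult toyGetInlineCost(const ToyCallee &C, int Threshold,
                           bool InlineAllViableCalls) {
  // 1. Attribute-based (user) decisions are respected before anything else.
  if (C.HasNoInline)
    return {false, "noinline function attribute"};
  if (C.HasAlwaysInline)
    return {C.IsViable, "always inline attribute"};
  // 2. The new flag: skip the cost analysis, but only for viable callees.
  if (InlineAllViableCalls && C.IsViable)
    return {true, "inline all viable calls"};
  // 3. Otherwise fall back to a (simplified) cost vs. threshold comparison.
  if (!C.IsViable)
    return {false, "not viable"};
  return {C.Cost < Threshold, "cost vs. threshold"};
}

// Small usage check mirroring the scenarios in the test file.
void toySelfCheck() {
  ToyCallee Simple;
  Simple.Cost = 100;
  assert(!toyGetInlineCost(Simple, /*Threshold=*/0, false).Inline);
  assert(toyGetInlineCost(Simple, /*Threshold=*/0, true).Inline);
}
```

In this model, a `noinline` callee and a non-viable callee both stay un-inlined even with the flag set, which is the behavior the test file checks with `callee_noinline`, `recursive`, and `has_indirectbr`.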

llvmbot commented Aug 6, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Justin Fargnoli (justinfargnoli)

Changes

Add a default-off option to the inline cost calculation to always inline all viable calls regardless of the cost/benefit and cost/threshold calculations.


Full diff: https://github.com/llvm/llvm-project/pull/152365.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/InlineCost.cpp (+7)
  • (added) llvm/test/Transforms/Inline/inline-all-viable-calls.ll (+115)
diff --git a/llvm/lib/Analysis/InlineCost.cpp b/llvm/lib/Analysis/InlineCost.cpp
index 22f4d08448a22..6c28306896e18 100644
--- a/llvm/lib/Analysis/InlineCost.cpp
+++ b/llvm/lib/Analysis/InlineCost.cpp
@@ -180,6 +180,10 @@ static cl::opt<bool> DisableGEPConstOperand(
     "disable-gep-const-evaluation", cl::Hidden, cl::init(false),
     cl::desc("Disables evaluation of GetElementPtr with constant operands"));
 
+static cl::opt<bool> InlineAllViableCalls(
+    "inline-all-viable-calls", cl::Hidden, cl::init(false),
+    cl::desc("Inline all viable calls, even if they exceed the inlining "
+             "threshold"));
 namespace llvm {
 std::optional<int> getStringFnAttrAsInt(const Attribute &Attr) {
   if (Attr.isValid()) {
@@ -3272,6 +3276,9 @@ InlineCost llvm::getInlineCost(
     return llvm::InlineCost::getNever(UserDecision->getFailureReason());
   }
 
+  if (InlineAllViableCalls && isInlineViable(*Callee).isSuccess())
+    return llvm::InlineCost::getAlways("inline all viable calls");
+
   LLVM_DEBUG(llvm::dbgs() << "      Analyzing call of " << Callee->getName()
                           << "... (caller:" << Call.getCaller()->getName()
                           << ")\n");
diff --git a/llvm/test/Transforms/Inline/inline-all-viable-calls.ll b/llvm/test/Transforms/Inline/inline-all-viable-calls.ll
new file mode 100644
index 0000000000000..2104a30f76db9
--- /dev/null
+++ b/llvm/test/Transforms/Inline/inline-all-viable-calls.ll
@@ -0,0 +1,115 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=inline -inline-threshold=0 -inline-all-viable-calls -S < %s | FileCheck %s
+
+; Check that viable calls that are beyond the cost threshold are still inlined.
+define i32 @callee_simple(i32 %x) {
+  %1 = add i32 %x, 1
+  %2 = mul i32 %1, 2
+  %3 = sub i32 %2, 1
+  %4 = add i32 %3, 3
+  %5 = mul i32 %4, 2
+  %6 = sub i32 %5, 2
+  %7 = add i32 %6, 1
+  ret i32 %7
+}
+
+; Check that user decisions are respected.
+define i32 @callee_alwaysinline(i32 %x) alwaysinline {
+  %sub = sub i32 %x, 3
+  ret i32 %sub
+}
+
+define i32 @callee_noinline(i32 %x) noinline {
+  %div = sdiv i32 %x, 2
+  ret i32 %div
+}
+
+define i32 @callee_optnone(i32 %x) optnone noinline {
+  %rem = srem i32 %x, 2
+  ret i32 %rem
+}
+
+define i32 @caller(i32 %a) {
+; CHECK-LABEL: define i32 @caller(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:    [[TMP7:%.*]] = add i32 [[A]], 1
+; CHECK-NEXT:    [[TMP8:%.*]] = mul i32 [[TMP7]], 2
+; CHECK-NEXT:    [[TMP3:%.*]] = sub i32 [[TMP8]], 1
+; CHECK-NEXT:    [[TMP4:%.*]] = add i32 [[TMP3]], 3
+; CHECK-NEXT:    [[TMP5:%.*]] = mul i32 [[TMP4]], 2
+; CHECK-NEXT:    [[TMP6:%.*]] = sub i32 [[TMP5]], 2
+; CHECK-NEXT:    [[ADD_I:%.*]] = add i32 [[TMP6]], 1
+; CHECK-NEXT:    [[SUB_I:%.*]] = sub i32 [[ADD_I]], 3
+; CHECK-NEXT:    [[TMP1:%.*]] = call i32 @callee_noinline(i32 [[SUB_I]])
+; CHECK-NEXT:    [[TMP2:%.*]] = call i32 @callee_optnone(i32 [[TMP1]])
+; CHECK-NEXT:    [[SUM:%.*]] = add i32 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret i32 [[SUM]]
+;
+  %1 = call i32 @callee_simple(i32 %a)
+  %2 = call i32 @callee_alwaysinline(i32 %1)
+  %3 = call i32 @callee_noinline(i32 %2)
+  %4 = call i32 @callee_optnone(i32 %3)
+  %sum = add i32 %4, %3
+  ret i32 %sum
+}
+
+; Check that non-viable calls are not inlined
+
+; Test recursive function is not inlined
+define i32 @recursive(i32 %n) {
+entry:
+  %cmp = icmp eq i32 %n, 0
+  br i1 %cmp, label %base, label %recurse
+
+base:
+  ret i32 0
+
+recurse:
+  %dec = sub i32 %n, 1
+  %rec = call i32 @recursive(i32 %dec)
+  %add = add i32 %rec, 1
+  ret i32 %add
+}
+
+define i32 @call_recursive(i32 %x) {
+; CHECK-LABEL: define i32 @call_recursive(
+; CHECK-SAME: i32 [[X:%.*]]) {
+; CHECK-NEXT:    [[R:%.*]] = call i32 @recursive(i32 [[X]])
+; CHECK-NEXT:    ret i32 [[R]]
+;
+  %r = call i32 @recursive(i32 %x)
+  ret i32 %r
+}
+
+; Test indirectbr prevents inlining
+define void @has_indirectbr(ptr %ptr, i32 %cond) {
+entry:
+  switch i32 %cond, label %default [
+  i32 0, label %target0
+  i32 1, label %target1
+  ]
+
+target0:
+  br label %end
+
+target1:
+  br label %end
+
+default:
+  br label %end
+
+end:
+  indirectbr ptr %ptr, [label %target0, label %target1]
+  ret void
+}
+
+define void @call_indirectbr(ptr %p, i32 %c) {
+; CHECK-LABEL: define void @call_indirectbr(
+; CHECK-SAME: ptr [[P:%.*]], i32 [[C:%.*]]) {
+; CHECK-NEXT:    call void @has_indirectbr(ptr [[P]], i32 [[C]])
+; CHECK-NEXT:    ret void
+;
+  call void @has_indirectbr(ptr %p, i32 %c)
+  ret void
+}
+

llvmbot commented Aug 6, 2025

@llvm/pr-subscribers-llvm-transforms


@justinfargnoli
Contributor Author

Ping @mingmingl-llvm @kazutakahirata @aeubanks for review.

@Artem-B Artem-B left a comment

I'm OK with this as a hidden option needed to address a special use case. I'll hold off on the approval stamp so others can chime in, as this changes common LLVM code that I can claim no ownership of.

It would be great if you could expand the PR description and elaborate on the intended use case so there's a clear rationale for the existence of such an option.

@Artem-B Artem-B left a comment

LGTM.

@arsenm @yxsamliu I vaguely recall that AMDGPU builds sometimes wanted to inline everything. Perhaps this could be a useful knob which could be used for that purpose while leaving normal inlining heuristics in place for the rest of the builds.


arsenm commented Aug 16, 2025

> LGTM.
>
> @arsenm @yxsamliu I vaguely recall that AMDGPU builds sometimes wanted to inline everything. Perhaps this could be a useful knob which could be used for that purpose while leaving normal inlining heuristics in place for the rest of the builds.

We have AMDGPUAlwaysInlinePass already (which I want to delete without replacement).


nikic commented Aug 16, 2025

> For performance reasons, some users require that all calls be inlined. Rather than forcing them to adjust the inlining threshold to an arbitrarily high value, offer an option to inline all calls.

So is the intention here that end-users are going to do something like -mllvm -inline-all-viable-calls=1? Otherwise I'd expect this to be a frontend option that sets alwaysinline on all functions.

Another option would be to use -force-attribute=alwaysinline.


justinfargnoli commented Aug 18, 2025

> So is the intention here that end-users are going to do something like -mllvm -inline-all-viable-calls=1?

Yes, that's correct.

> Otherwise I'd expect this to be a frontend option that sets alwaysinline on all functions.

I considered this option, but it seemed simpler to implement this in LLVM rather than in all of the relevant frontends.

> Another option would be to use -force-attribute=alwaysinline.

I like the idea behind this approach. However, it looks like ForceFunctionAttrs adds the attribute regardless of the other attributes on the function. Thus, it'll break programs that use noinline.
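
To make the concern concrete, here is a minimal standalone sketch of the difference between adding an attribute unconditionally and adding it only when it does not contradict an existing one. Everything here is an assumption for illustration: `ToyAttrs`, `toyForceAttr`, `toyForceAttrChecked`, and `toyHasConflict` are hypothetical names and do not reproduce the actual ForceFunctionAttrs implementation.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of a function's attribute set; not an LLVM type.
using ToyAttrs = std::set<std::string>;

// Models the behavior described above: add the attribute no matter what
// is already present on the function.
void toyForceAttr(ToyAttrs &Attrs, const std::string &Attr) {
  Attrs.insert(Attr);
}

// True when the set holds both contradictory inlining attributes.
bool toyHasConflict(const ToyAttrs &Attrs) {
  return Attrs.count("noinline") != 0 && Attrs.count("alwaysinline") != 0;
}

// A conflict-aware variant: refuse to add alwaysinline next to noinline
// (and vice versa). Returns true if the attribute was added.
bool toyForceAttrChecked(ToyAttrs &Attrs, const std::string &Attr) {
  if (Attr == "alwaysinline" && Attrs.count("noinline") != 0)
    return false;
  if (Attr == "noinline" && Attrs.count("alwaysinline") != 0)
    return false;
  Attrs.insert(Attr);
  return true;
}

// Small usage check: the unconditional version creates a contradiction,
// the checked version leaves the noinline function untouched.
void toyAttrSelfCheck() {
  ToyAttrs Blind = {"noinline"};
  toyForceAttr(Blind, "alwaysinline");
  assert(toyHasConflict(Blind));

  ToyAttrs Checked = {"noinline"};
  assert(!toyForceAttrChecked(Checked, "alwaysinline"));
  assert(!toyHasConflict(Checked));
}
```

The checked variant leaves `noinline` functions alone, which matches what `-inline-all-viable-calls` does for `callee_noinline` and `callee_optnone` in the test above.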

@justinfargnoli
Contributor Author

Assuming I understood @nikic's comment correctly, there are no outstanding concerns with this PR.

Enabling auto-merge.

@justinfargnoli justinfargnoli enabled auto-merge (squash) August 18, 2025 17:22
@justinfargnoli justinfargnoli merged commit 58de8f2 into llvm:main Aug 18, 2025
9 checks passed

arsenm commented Aug 19, 2025

> I like the idea behind this approach. However, it looks like ForceFunctionAttrs adds the attribute regardless of the other attributes on the function. Thus, it'll break programs that use noinline.

This is just a bug in the pass that is fixable.

Labels

llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms

5 participants