Skip to content

Conversation

@shiltian
Copy link
Contributor

@shiltian shiltian commented Apr 6, 2025

On one hand, we intend to force import all functions when the option is enabled.
On the other hand, we currently drop definitions of some functions and convert
them to declarations, which contradicts this intent.

With this PR, functions will no longer be converted to declarations when
force-import-all is enabled.

@llvmbot llvmbot added backend:AMDGPU LTO Link time optimization (regular/full LTO or ThinLTO) llvm:transforms labels Apr 6, 2025
Copy link
Contributor Author

shiltian commented Apr 6, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@llvmbot
Copy link
Member

llvmbot commented Apr 6, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Shilei Tian (shiltian)

Changes

On one hand, we intend to force-import-all functions when the option is
enabled. On the other hand, we currently drop definitions of some functions and
convert them to declarations, which contradicts this intent.

With this PR, functions will no longer be converted to declarations when
force-import-all is enabled.


Full diff: https://github.com/llvm/llvm-project/pull/134541.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/FunctionImport.cpp (+4-1)
  • (added) llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll (+15)
  • (added) llvm/test/ThinLTO/AMDGPU/force-import-all.ll (+26)
  • (added) llvm/test/ThinLTO/AMDGPU/lit.local.cfg (+2)
diff --git a/llvm/lib/Transforms/IPO/FunctionImport.cpp b/llvm/lib/Transforms/IPO/FunctionImport.cpp
index f1dce5d7904f9..19a6bc55b0147 100644
--- a/llvm/lib/Transforms/IPO/FunctionImport.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionImport.cpp
@@ -1701,13 +1701,16 @@ void llvm::thinLTOFinalizeInModule(Module &TheModule,
     if (NewLinkage == GV.getLinkage())
       return;
 
+    bool ForceImportFunction = isa<Function>(GV) && ForceImportAll;
+
     // Check for a non-prevailing def that has interposable linkage
     // (e.g. non-odr weak or linkonce). In that case we can't simply
     // convert to available_externally, since it would lose the
     // interposable property and possibly get inlined. Simply drop
     // the definition in that case.
     if (GlobalValue::isAvailableExternallyLinkage(NewLinkage) &&
-        GlobalValue::isInterposableLinkage(GV.getLinkage())) {
+        GlobalValue::isInterposableLinkage(GV.getLinkage()) &&
+        !ForceImportFunction) {
       if (!convertToDeclaration(GV))
         // FIXME: Change this to collect replaced GVs and later erase
         // them from the parent module once thinLTOResolvePrevailingGUID is
diff --git a/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll b/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll
new file mode 100644
index 0000000000000..6f93f880c3157
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll
@@ -0,0 +1,15 @@
+define void @f2(ptr %v) #0 {
+entry:
+  %v.addr = alloca ptr, align 8, addrspace(5)
+  store ptr %v, ptr addrspace(5) %v.addr, align 8
+  call void @f3(ptr %v)
+  ret void
+}
+
+define weak hidden void @f3(ptr %v) #0 {
+entry:
+  store i32 12345, ptr %v
+  ret void
+}
+
+attributes #0 = { noinline }
diff --git a/llvm/test/ThinLTO/AMDGPU/force-import-all.ll b/llvm/test/ThinLTO/AMDGPU/force-import-all.ll
new file mode 100644
index 0000000000000..f8beb0dae390f
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/force-import-all.ll
@@ -0,0 +1,26 @@
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %s -o %t1.bc
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %p/Inputs/noinline.ll -o %t2.bc
+; RUN: llvm-lto -thinlto-action=run -force-import-all %t1.bc %t2.bc -thinlto-save-temps=%t3.
+; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=MOD1
+; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=MOD2
+
+define i32 @f1(ptr %p) #0 {
+entry:
+  call void @f2(ptr %p)
+  ret i32 0
+}
+
+define weak hidden void @f3(ptr %v) #0 {
+entry:
+  store i32 12345, ptr %v
+  ret void
+}
+
+declare void @f2(ptr)
+
+attributes #0 = { noinline }
+
+; MOD1: define weak hidden void @f3
+; MOD1: define available_externally void @f2
+; MOD2: define void @f2
+; MOD2: define available_externally hidden void @f3
diff --git a/llvm/test/ThinLTO/AMDGPU/lit.local.cfg b/llvm/test/ThinLTO/AMDGPU/lit.local.cfg
new file mode 100644
index 0000000000000..7c492428aec76
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/lit.local.cfg
@@ -0,0 +1,2 @@
+if not "AMDGPU" in config.root.targets:
+    config.unsupported = True

@llvmbot
Copy link
Member

llvmbot commented Apr 6, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Shilei Tian (shiltian)

Changes

On one hand, we intend to force-import-all functions when the option is
enabled. On the other hand, we currently drop definitions of some functions and
convert them to declarations, which contradicts this intent.

With this PR, functions will no longer be converted to declarations when
force-import-all is enabled.


Full diff: https://github.com/llvm/llvm-project/pull/134541.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/FunctionImport.cpp (+4-1)
  • (added) llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll (+15)
  • (added) llvm/test/ThinLTO/AMDGPU/force-import-all.ll (+26)
  • (added) llvm/test/ThinLTO/AMDGPU/lit.local.cfg (+2)
diff --git a/llvm/lib/Transforms/IPO/FunctionImport.cpp b/llvm/lib/Transforms/IPO/FunctionImport.cpp
index f1dce5d7904f9..19a6bc55b0147 100644
--- a/llvm/lib/Transforms/IPO/FunctionImport.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionImport.cpp
@@ -1701,13 +1701,16 @@ void llvm::thinLTOFinalizeInModule(Module &TheModule,
     if (NewLinkage == GV.getLinkage())
       return;
 
+    bool ForceImportFunction = isa<Function>(GV) && ForceImportAll;
+
     // Check for a non-prevailing def that has interposable linkage
     // (e.g. non-odr weak or linkonce). In that case we can't simply
     // convert to available_externally, since it would lose the
     // interposable property and possibly get inlined. Simply drop
     // the definition in that case.
     if (GlobalValue::isAvailableExternallyLinkage(NewLinkage) &&
-        GlobalValue::isInterposableLinkage(GV.getLinkage())) {
+        GlobalValue::isInterposableLinkage(GV.getLinkage()) &&
+        !ForceImportFunction) {
       if (!convertToDeclaration(GV))
         // FIXME: Change this to collect replaced GVs and later erase
         // them from the parent module once thinLTOResolvePrevailingGUID is
diff --git a/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll b/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll
new file mode 100644
index 0000000000000..6f93f880c3157
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll
@@ -0,0 +1,15 @@
+define void @f2(ptr %v) #0 {
+entry:
+  %v.addr = alloca ptr, align 8, addrspace(5)
+  store ptr %v, ptr addrspace(5) %v.addr, align 8
+  call void @f3(ptr %v)
+  ret void
+}
+
+define weak hidden void @f3(ptr %v) #0 {
+entry:
+  store i32 12345, ptr %v
+  ret void
+}
+
+attributes #0 = { noinline }
diff --git a/llvm/test/ThinLTO/AMDGPU/force-import-all.ll b/llvm/test/ThinLTO/AMDGPU/force-import-all.ll
new file mode 100644
index 0000000000000..f8beb0dae390f
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/force-import-all.ll
@@ -0,0 +1,26 @@
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %s -o %t1.bc
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %p/Inputs/noinline.ll -o %t2.bc
+; RUN: llvm-lto -thinlto-action=run -force-import-all %t1.bc %t2.bc -thinlto-save-temps=%t3.
+; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=MOD1
+; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=MOD2
+
+define i32 @f1(ptr %p) #0 {
+entry:
+  call void @f2(ptr %p)
+  ret i32 0
+}
+
+define weak hidden void @f3(ptr %v) #0 {
+entry:
+  store i32 12345, ptr %v
+  ret void
+}
+
+declare void @f2(ptr)
+
+attributes #0 = { noinline }
+
+; MOD1: define weak hidden void @f3
+; MOD1: define available_externally void @f2
+; MOD2: define void @f2
+; MOD2: define available_externally hidden void @f3
diff --git a/llvm/test/ThinLTO/AMDGPU/lit.local.cfg b/llvm/test/ThinLTO/AMDGPU/lit.local.cfg
new file mode 100644
index 0000000000000..7c492428aec76
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/lit.local.cfg
@@ -0,0 +1,2 @@
+if not "AMDGPU" in config.root.targets:
+    config.unsupported = True

@llvmbot
Copy link
Member

llvmbot commented Apr 6, 2025

@llvm/pr-subscribers-lto

Author: Shilei Tian (shiltian)

Changes

On one hand, we intend to force-import-all functions when the option is
enabled. On the other hand, we currently drop definitions of some functions and
convert them to declarations, which contradicts this intent.

With this PR, functions will no longer be converted to declarations when
force-import-all is enabled.


Full diff: https://github.com/llvm/llvm-project/pull/134541.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/FunctionImport.cpp (+4-1)
  • (added) llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll (+15)
  • (added) llvm/test/ThinLTO/AMDGPU/force-import-all.ll (+26)
  • (added) llvm/test/ThinLTO/AMDGPU/lit.local.cfg (+2)
diff --git a/llvm/lib/Transforms/IPO/FunctionImport.cpp b/llvm/lib/Transforms/IPO/FunctionImport.cpp
index f1dce5d7904f9..19a6bc55b0147 100644
--- a/llvm/lib/Transforms/IPO/FunctionImport.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionImport.cpp
@@ -1701,13 +1701,16 @@ void llvm::thinLTOFinalizeInModule(Module &TheModule,
     if (NewLinkage == GV.getLinkage())
       return;
 
+    bool ForceImportFunction = isa<Function>(GV) && ForceImportAll;
+
     // Check for a non-prevailing def that has interposable linkage
     // (e.g. non-odr weak or linkonce). In that case we can't simply
     // convert to available_externally, since it would lose the
     // interposable property and possibly get inlined. Simply drop
     // the definition in that case.
     if (GlobalValue::isAvailableExternallyLinkage(NewLinkage) &&
-        GlobalValue::isInterposableLinkage(GV.getLinkage())) {
+        GlobalValue::isInterposableLinkage(GV.getLinkage()) &&
+        !ForceImportFunction) {
       if (!convertToDeclaration(GV))
         // FIXME: Change this to collect replaced GVs and later erase
         // them from the parent module once thinLTOResolvePrevailingGUID is
diff --git a/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll b/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll
new file mode 100644
index 0000000000000..6f93f880c3157
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/Inputs/noinline.ll
@@ -0,0 +1,15 @@
+define void @f2(ptr %v) #0 {
+entry:
+  %v.addr = alloca ptr, align 8, addrspace(5)
+  store ptr %v, ptr addrspace(5) %v.addr, align 8
+  call void @f3(ptr %v)
+  ret void
+}
+
+define weak hidden void @f3(ptr %v) #0 {
+entry:
+  store i32 12345, ptr %v
+  ret void
+}
+
+attributes #0 = { noinline }
diff --git a/llvm/test/ThinLTO/AMDGPU/force-import-all.ll b/llvm/test/ThinLTO/AMDGPU/force-import-all.ll
new file mode 100644
index 0000000000000..f8beb0dae390f
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/force-import-all.ll
@@ -0,0 +1,26 @@
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %s -o %t1.bc
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %p/Inputs/noinline.ll -o %t2.bc
+; RUN: llvm-lto -thinlto-action=run -force-import-all %t1.bc %t2.bc -thinlto-save-temps=%t3.
+; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=MOD1
+; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=MOD2
+
+define i32 @f1(ptr %p) #0 {
+entry:
+  call void @f2(ptr %p)
+  ret i32 0
+}
+
+define weak hidden void @f3(ptr %v) #0 {
+entry:
+  store i32 12345, ptr %v
+  ret void
+}
+
+declare void @f2(ptr)
+
+attributes #0 = { noinline }
+
+; MOD1: define weak hidden void @f3
+; MOD1: define available_externally void @f2
+; MOD2: define void @f2
+; MOD2: define available_externally hidden void @f3
diff --git a/llvm/test/ThinLTO/AMDGPU/lit.local.cfg b/llvm/test/ThinLTO/AMDGPU/lit.local.cfg
new file mode 100644
index 0000000000000..7c492428aec76
--- /dev/null
+++ b/llvm/test/ThinLTO/AMDGPU/lit.local.cfg
@@ -0,0 +1,2 @@
+if not "AMDGPU" in config.root.targets:
+    config.unsupported = True

@shiltian shiltian force-pushed the users/shiltian/dont-convert-to-decl-if-force-import-all branch from 70db06f to 60dd2c2 Compare April 6, 2025 14:46
@shiltian shiltian force-pushed the users/shiltian/dont-convert-to-decl-if-force-import-all branch 2 times, most recently from fd9f457 to 35d8ee7 Compare April 8, 2025 02:36
; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %s -o %t.main.bc
; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %p/Inputs/in-0.ll -o %t.in.0.bc
; RUN: opt -mtriple=amdgcn-amd-amdhsa -module-summary %p/Inputs/in-1.ll -o %t.in.1.bc
; RUN: llvm-lto -thinlto-action=run -force-import-all %t.main.bc %t.in.0.bc %t.in.1.bc --thinlto-save-temps=%t.2.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you just need to run -thinlto-action=import instead of running the full pipeline and rely on save-temps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to do that as well, but it looks like thinLTOResolvePrevailingGUID doesn't run if we simply does import, as shown below:

; RUN: llvm-lto -thinlto-action=thinlink -force-import-all %t.main.bc %t.in.bc -o %t.index.bc
; RUN: llvm-lto -thinlto-action=import -force-import-all %t.main.bc %t.in.bc --thinlto-index=%t.index.bc
; RUN: llvm-dis %t.main.bc.thinlto.imported.bc -o - | FileCheck --check-prefix=MOD1 %s
; RUN: llvm-dis %t.in.bc.thinlto.imported.bc -o - | FileCheck --check-prefix=MOD2 %s

Did I miss anything here?

…l` is enabled

On one hand, we intend to `force-import-all` functions when the option is
enabled. On the other hand, we currently drop definitions of some functions and
convert them to declarations, which contradicts this intent.

With this PR, functions will no longer be converted to declarations when
`force-import-all` is enabled.
@shiltian shiltian force-pushed the users/shiltian/dont-convert-to-decl-if-force-import-all branch from 35d8ee7 to 3f4e024 Compare April 12, 2025 18:21
@shiltian
Copy link
Contributor Author

@cachemeifyoucan How about this version?

Copy link
Collaborator

@cachemeifyoucan cachemeifyoucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@shiltian shiltian merged commit b0fede3 into main Apr 12, 2025
11 checks passed
@shiltian shiltian deleted the users/shiltian/dont-convert-to-decl-if-force-import-all branch April 12, 2025 22:22
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 12, 2025

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/15917

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelaySignalBreak.py (624 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelaySignalWatch.py (625 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelayWatchBreak.py (626 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelayedCrashWithBreakpointSignal.py (627 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentDelayedCrashWithBreakpointWatchpoint.py (628 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyBreakpoints.py (629 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalBreak.py (630 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentNWatchNBreak.py (631 of 2122)
PASS: lldb-api :: functionalities/statusline/TestStatusline.py (632 of 2122)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyCrash.py (633 of 2122)
FAIL: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyWatchpoints.py (634 of 2122)
******************** TEST 'lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyWatchpoints.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/thread/concurrent_events -p TestConcurrentManyWatchpoints.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision b0fede358fea768ead2dc5d59ad0b6b2decc4217)
  clang revision b0fede358fea768ead2dc5d59ad0b6b2decc4217
  llvm revision b0fede358fea768ead2dc5d59ad0b6b2decc4217
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test (TestConcurrentManyWatchpoints.ConcurrentManyWatchpoints)
======================================================================
FAIL: test (TestConcurrentManyWatchpoints.ConcurrentManyWatchpoints)
   Test 100 watchpoints from 100 threads.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 148, in wrapper
    return func(*args, **kwargs)
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentManyWatchpoints.py", line 14, in test
    self.do_thread_actions(num_watchpoint_threads=100)
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/concurrent_base.py", line 225, in do_thread_actions
    self.assertEqual(
AssertionError: 1 != 101 : Expected to see 101 threads, but seeing 1. Details:
thread 1 running due to none at
	
Config=aarch64-/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang
----------------------------------------------------------------------
Ran 1 test in 6.644s

FAILED (failures=1)


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants