Skip to content

Conversation

@SLTozer
Copy link
Contributor

@SLTozer SLTozer commented Apr 23, 2025

Fixes the issue reported following the merge of #118026.

When a valid musttail call is made, the function it is made from must return immediately after the call; if there are any cleanups left in the function, then an error is triggered. This is not necessary for fake uses however - it is perfectly valid to simply emit the fake use "cleanup" code before the tail call, and indeed LLVM will automatically move any fake uses following a tail call to come before the tail call.

Therefore, this patch specifically choose to handle fake use cleanups when a musttail call is present by simply emitting them immediately before the call.

Fixes the issue reported following the merge of llvm#118026.

When a valid `musttail` call is made, the function it is made from must
return immediately after the call; if there are any cleanups left in the
function, then an error is triggered. This is not necessary for fake uses
however - it is perfectly valid to simply emit the fake use "cleanup" code
before the tail call, and indeed LLVM will automatically move any fake uses
following a tail call to come before the tail call.

Therefore, this patch specifically choose to handle fake use cleanups when
a musttail call is present by simply emitting them immediately before the
call.
@SLTozer SLTozer added the clang:codegen IR generation bugs: mangling, exceptions, etc. label Apr 23, 2025
@SLTozer SLTozer self-assigned this Apr 23, 2025
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Apr 23, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 23, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Stephen Tozer (SLTozer)

Changes

Fixes the issue reported following the merge of #118026.

When a valid musttail call is made, the function it is made from must return immediately after the call; if there are any cleanups left in the function, then an error is triggered. This is not necessary for fake uses however - it is perfectly valid to simply emit the fake use "cleanup" code before the tail call, and indeed LLVM will automatically move any fake uses following a tail call to come before the tail call.

Therefore, this patch specifically choose to handle fake use cleanups when a musttail call is present by simply emitting them immediately before the call.


Full diff: https://github.com/llvm/llvm-project/pull/136867.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGCall.cpp (+11-1)
  • (added) clang/test/CodeGenCXX/fake-use-musttail.cpp (+72)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8cb27420dd911..c8cddfc06fce7 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -6001,8 +6001,18 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
     for (auto it = EHStack.find(CurrentCleanupScopeDepth); it != EHStack.end();
          ++it) {
       EHCleanupScope *Cleanup = dyn_cast<EHCleanupScope>(&*it);
-      if (!(Cleanup && Cleanup->getCleanup()->isRedundantBeforeReturn()))
+      // Fake uses can be safely emitted immediately prior to the tail call; we
+      // choose to emit the fake use before the call rather than after, to avoid
+      // forcing variable values from every call on the "stack" to be preserved
+      // simultaneously.
+      if (Cleanup && Cleanup->isFakeUse()) {
+        CGBuilderTy::InsertPointGuard IPG(Builder);
+        Builder.SetInsertPoint(CI);
+        Cleanup->getCleanup()->Emit(*this, EHScopeStack::Cleanup::Flags());
+      } else if (!(Cleanup &&
+                   Cleanup->getCleanup()->isRedundantBeforeReturn())) {
         CGM.ErrorUnsupported(MustTailCall, "tail call skipping over cleanups");
+      }
     }
     if (CI->getType()->isVoidTy())
       Builder.CreateRetVoid();
diff --git a/clang/test/CodeGenCXX/fake-use-musttail.cpp b/clang/test/CodeGenCXX/fake-use-musttail.cpp
new file mode 100644
index 0000000000000..7d1420ce5fdb6
--- /dev/null
+++ b/clang/test/CodeGenCXX/fake-use-musttail.cpp
@@ -0,0 +1,72 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -emit-llvm -fextend-variable-liveness -o - %s | FileCheck %s
+
+/// Tests that when we have fake uses in a function ending in a musttail call,
+/// we emit the fake uses and their corresponding loads immediately prior to the
+/// tail call.
+
+struct Class1 {
+  Class1(int);
+};
+
+class Class2 {
+  static const char *foo(int *, const char *, int *, Class1, const int *,
+                         unsigned long);
+  template <class>
+  static char *bar(int *, const char *, int *, Class1, const int *, unsigned long);
+};
+
+// CHECK-LABEL: define dso_local noundef ptr @_ZN6Class23fooEPiPKcS0_6Class1PKim(
+// CHECK-SAME: ptr noundef [[E:%.*]], ptr noundef [[F:%.*]], ptr noundef [[G:%.*]], ptr noundef [[H:%.*]], i64 noundef [[I:%.*]]) #[[ATTR0:[0-9]+]] align 2 {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[TMP0:%.*]] = alloca [[STRUCT_CLASS1:%.*]], align 1
+// CHECK-NEXT:    [[E_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[F_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[G_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[H_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[I_ADDR:%.*]] = alloca i64, align 8
+// CHECK-NEXT:    [[AGG_TMP:%.*]] = alloca [[STRUCT_CLASS1]], align 1
+// CHECK-NEXT:    store ptr [[E]], ptr [[E_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[F]], ptr [[F_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[G]], ptr [[G_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[H]], ptr [[H_ADDR]], align 8
+// CHECK-NEXT:    store i64 [[I]], ptr [[I_ADDR]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[E_ADDR]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[F_ADDR]], align 8
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[G_ADDR]], align 8
+// CHECK-NEXT:    call void @_ZN6Class1C1Ei(ptr noundef nonnull align 1 dereferenceable(1) [[AGG_TMP]], i32 noundef 0)
+// CHECK-NEXT:    [[TMP4:%.*]] = load ptr, ptr [[H_ADDR]], align 8
+// CHECK-NEXT:    [[TMP5:%.*]] = load i64, ptr [[I_ADDR]], align 8
+// CHECK-NEXT:    [[FAKE_USE:%.*]] = load i64, ptr [[I_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(i64 [[FAKE_USE]]) #[[ATTR3:[0-9]+]]
+// CHECK-NEXT:    [[FAKE_USE1:%.*]] = load ptr, ptr [[H_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE1]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE2:%.*]] = load [[STRUCT_CLASS1]], ptr [[TMP0]], align 1
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use([[STRUCT_CLASS1]] [[FAKE_USE2]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE3:%.*]] = load ptr, ptr [[G_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE3]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE4:%.*]] = load ptr, ptr [[F_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE4]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE5:%.*]] = load ptr, ptr [[E_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE5]]) #[[ATTR3]]
+// CHECK-NEXT:    [[CALL:%.*]] = musttail call noundef ptr @_ZN6Class23barIiEEPcPiPKcS2_6Class1PKim(ptr noundef [[TMP1]], ptr noundef [[TMP2]], ptr noundef [[TMP3]], ptr noundef [[TMP4]], i64 noundef [[TMP5]])
+// CHECK-NEXT:    ret ptr [[CALL]]
+// CHECK:       [[BB6:.*:]]
+// CHECK-NEXT:    [[FAKE_USE6:%.*]] = load i64, ptr [[I_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(i64 [[FAKE_USE6]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE7:%.*]] = load ptr, ptr [[H_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE7]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE8:%.*]] = load [[STRUCT_CLASS1]], ptr [[TMP0]], align 1
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use([[STRUCT_CLASS1]] [[FAKE_USE8]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE9:%.*]] = load ptr, ptr [[G_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE9]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE10:%.*]] = load ptr, ptr [[F_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE10]]) #[[ATTR3]]
+// CHECK-NEXT:    [[FAKE_USE11:%.*]] = load ptr, ptr [[E_ADDR]], align 8
+// CHECK-NEXT:    notail call void (...) @llvm.fake.use(ptr [[FAKE_USE11]]) #[[ATTR3]]
+// CHECK-NEXT:    ret ptr undef
+//
+const char *Class2::foo(int *e, const char *f, int *g, Class1, const int *h,
+                        unsigned long i) {
+  [[clang::musttail]] return bar<int>(e, f, g, int(), h, i);
+}

@gulfemsavrun
Copy link
Contributor

Thanks for the fix, and I verified that it resolves the issue that I reported in #118026 (comment).

@wolfy1961
Copy link
Collaborator

LGTM

@gulfemsavrun
Copy link
Contributor

Is there anything that is blocking this PR to be merged? If it needs more time to review, could you please revert the original change and reland with this fix? This is blocking us to roll Clang in our project.

Copy link
Contributor

@rjmccall rjmccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality seems fine. A couple minor requests, then LGTM.

// Fake uses can be safely emitted immediately prior to the tail call; we
// choose to emit the fake use before the call rather than after, to avoid
// forcing variable values from every call on the "stack" to be preserved
// simultaneously.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second half of this comment is odd. There isn't really a choice about this; the fake uses must be emitted prior to the tail call because there cannot be anything after it.

// CHECK-NEXT: [[TMP3:%.*]] = load ptr, ptr [[H_ADDR]], align 8
// CHECK-NEXT: [[TMP4:%.*]] = load i64, ptr [[I_ADDR]], align 8
// CHECK-NEXT: [[FAKE_USE:%.*]] = load i64, ptr [[I_ADDR]], align 8
// CHECK-NEXT: notail call void (...) @llvm.fake.use(i64 [[FAKE_USE]]) #[[ATTR3:[0-9]+]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're not testing the attributes, you can just leave this off; matches don't have to match the entire line.

@SLTozer SLTozer merged commit 2d00c73 into llvm:main Apr 25, 2025
11 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Fixes the issue reported following the merge of llvm#118026.

When a valid `musttail` call is made, the function it is made from must
return immediately after the call; if there are any cleanups left in the
function, then an error is triggered. This is not necessary for fake
uses however - it is perfectly valid to simply emit the fake use
"cleanup" code before the tail call, and indeed LLVM will automatically
move any fake uses following a tail call to come before the tail call.

Therefore, this patch specifically choose to handle fake use cleanups
when a musttail call is present by simply emitting them immediately
before the call.
SLTozer added a commit that referenced this pull request Jun 19, 2025
Relands this feature after several fixes:

* Force fake uses to be emitted before musttail calls (#136867)
* Added soften-float legalization for fake uses (#142714)
* Treat fake uses as size-less instructions in a SystemZ assert (#144390)

If further issues with fake uses are found then this may be reverted again,
but all currently-known issues are resolved.

This reverts commit 2dc6e98.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants