Skip to content

Conversation

@NewSigma
Copy link
Contributor

@NewSigma NewSigma commented Jul 29, 2025

This patch attempts to implement piece of the proposed solution to CWG2563:

[9.6.4 dcl.fct.def.coroutine.p8] This return exits the scope of gro. It exits the scope of promise only if the coroutine completed without suspending.

If a coroutine completes without suspending, it does not exit the scope of the promise until GRO conversion is done, because GRO conversion is considered part of the coroutine execution. The current behavior performs conversion after coroutine state cleanup, which does not conform to the standard:

before.cleanup:
  ; ...
  br label %coro.cleanup

coro.cleanup:
  ; cleanup logics
  br %coro.end

coro.end:
  call void @llvm.coro.end(ptr null, i1 false, token none)
  ; GRO conversion
  ; ret GRO or void

This patch proposes the following codegen:

any.suspend:
  %suspend = call i8 @llvm.coro.suspend(token %8, i1 true)
  switch i8 %suspend, label %pre.gro.conv [
    i8 0, label %ready
    i8 1, label %coro.cleanup
  ]

ready:
  ; ...

pre.gro.conv:
  %body.done = phi i1 [ false, %any.suspend ], [ true, %any.ready ]
  %InRamp = call i1 @llvm.coro.is_in_ramp()
  br i1 %InRamp, label %gro.conv, label %after.gro.conv

gro.conv:
  ; GRO conversion
  br label %after.gro.conv

after.gro.conv:
  br i1 %body.done, label %coro.cleanup, label %coro.ret

coro.cleanup:
  ; cleanup logics
  br %coro.ret

coro.ret:
  call void @llvm.coro.end(ptr null, i1 false, token none)
  ; ret GRO or void

I'm not quite sure I understood the wording correctly. Feel free to correct me.

Close #120200

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. coroutines C++20 coroutines llvm:transforms labels Jul 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 29, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-coroutines
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-llvm-transforms

Author: Weibo He (NewSigma)

Changes

This patch attempts to implement piece of the proposed solution to CWG2563:

> [9.6.4 dcl.fct.def.coroutine.p8] This return exits the scope of gro. It exits the scope of promise only if the coroutine completed without suspending.

The patch contains two parts:

  1. Middle end part: Do not lower coro.end to ret void if its return value will be used. This allows front end separate coro start section and resume section without early return from resume function.
  2. Front end part: Clone and defer the destruction of coro promise if the coroutine completed without suspending. This is achieved by branching just after destruction of the function parameter copies.

I'm not quite sure I understood the wording correctly. Feel free to correct me.

Close #120200


Patch is 20.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151067.diff

11 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+3)
  • (modified) clang/lib/CodeGen/CGCoroutine.cpp (+111-16)
  • (modified) clang/test/CodeGenCXX/type-aware-coroutines.cpp (+5-5)
  • (modified) clang/test/CodeGenCoroutines/coro-dest-slot.cpp (+1-1)
  • (modified) clang/test/CodeGenCoroutines/coro-params.cpp (+11-4)
  • (added) clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp (+53)
  • (modified) clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp (+2-2)
  • (modified) llvm/docs/Coroutines.rst (+6-7)
  • (modified) llvm/include/llvm/Transforms/Coroutines/CoroInstr.h (+1-1)
  • (modified) llvm/lib/Transforms/Coroutines/CoroSplit.cpp (+1-1)
  • (added) llvm/test/Transforms/Coroutines/coro-split-coroend.ll (+42)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index dd6b44dd0c657..e1ea3ec53244a 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -151,6 +151,9 @@ C++20 Feature Support
 - Implement constant evaluation of lambdas that capture structured bindings.
   (#GH145956)
 
+- Fixed premature coroutine promise destruction when the coroutine completes
+  without suspending, as part of CWG2563. (#GH120200)
+
 C++17 Feature Support
 ^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index 5ee908922b5a3..805670249c315 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -73,6 +73,15 @@ struct clang::CodeGen::CGCoroData {
   // the address of the coroutine frame of the current coroutine.
   llvm::CallInst *CoroBegin = nullptr;
 
+  // Stores call to promise destruction and the llvm.lifetime.end of promise alloca
+  // We will clone them into deferred promise free block.
+  llvm::CallInst *PromiseDtor = nullptr;
+  llvm::CallInst *PromiseEnd = nullptr;
+
+  // Stores the llvm.coro.end that identifies if the coroutine exit without
+  // suspend.
+  llvm::CallInst *InResume = nullptr;
+
   // Stores the last emitted coro.free for the deallocate expressions, we use it
   // to wrap dealloc code with if(auto mem = coro.free) dealloc(mem).
   llvm::CallInst *LastCoroFree = nullptr;
@@ -595,7 +604,7 @@ struct CallCoroEnd final : public EHScopeStack::Cleanup {
 namespace {
 // Make sure to call coro.delete on scope exit.
 struct CallCoroDelete final : public EHScopeStack::Cleanup {
-  Stmt *Deallocate;
+  const CoroutineBodyStmt *S;
 
   // Emit "if (coro.free(CoroId, CoroBegin)) Deallocate;"
 
@@ -605,16 +614,73 @@ struct CallCoroDelete final : public EHScopeStack::Cleanup {
   // builds a single call to a deallocation function which is safe to emit
   // multiple times.
   void Emit(CodeGenFunction &CGF, Flags) override {
+    bool Sinked = CGF.CurCoro.Data->InResume != nullptr;
+    if (!Sinked)
+      sinkPromiseBB(CGF);
+    EmitCoroFree(CGF);
+  }
+
+  explicit CallCoroDelete(const CoroutineBodyStmt *S) : S(S) {}
+
+  // Sink promise destructor to separate block
+  void sinkPromiseBB(CodeGenFunction &CGF) {
+    auto &Builder = CGF.Builder;
+    auto &CoroData = *CGF.CurCoro.Data;
+    auto *CleanupBB = Builder.GetInsertBlock()->getSinglePredecessor();
+    auto *Promise = CGF.GetAddrOfLocalVar(S->getPromiseDecl()).getBasePointer();
+    llvm::CallInst *PromiseDtor = nullptr;
+    BasicBlock *PromiseBB = nullptr;
+    for (llvm::User *U : Promise->users()) {
+      auto *I = dyn_cast<llvm::CallInst>(U);
+      if (!I || I->getParent() != CleanupBB)
+        continue;
+
+      if (I->isLifetimeStartOrEnd()) {
+        assert(!CoroData.PromiseEnd && "Unexpected multiple lifetime.end");
+        I->moveBefore(CleanupBB->getTerminator()->getIterator());
+        PromiseBB = CleanupBB->splitBasicBlock(I, "promise.free");
+        CoroData.PromiseEnd = I;
+      }
+      else {
+        assert(!PromiseDtor && "Unexpected multiple destructor call");
+        PromiseDtor = I;
+      }
+    }
+
+    if (PromiseDtor) {
+      PromiseDtor->moveBefore(CoroData.PromiseEnd->getIterator());
+      CoroData.PromiseDtor = PromiseDtor;
+    }
+
+    BasicBlock *SaveInsertBlock = CGF.Builder.GetInsertBlock();
+    CleanupBB->getTerminator()->eraseFromParent();
+    Builder.SetInsertPoint(CleanupBB);
+
+    llvm::Function *CoroEnd = CGF.CGM.getIntrinsic(llvm::Intrinsic::coro_end);
+    auto *NullPtr = llvm::ConstantPointerNull::get(CGF.Int8PtrTy);
+    CoroData.InResume = Builder.CreateCall(
+        CoroEnd,
+        {NullPtr, Builder.getFalse(),
+         llvm::ConstantTokenNone::get(CoroEnd->getContext())},
+        "InResume");
+
+    BasicBlock *EndBB = CoroData.CleanupJD.getBlock();
+    Builder.CreateCondBr(CoroData.InResume, PromiseBB, EndBB);
+    Builder.SetInsertPoint(SaveInsertBlock);
+  }
+
+  void EmitCoroFree(CodeGenFunction &CGF, const Twine &NameSuffix = "") {
     // Remember the current point, as we are going to emit deallocation code
     // first to get to coro.free instruction that is an argument to a delete
     // call.
     BasicBlock *SaveInsertBlock = CGF.Builder.GetInsertBlock();
 
-    auto *FreeBB = CGF.createBasicBlock("coro.free");
+    Stmt *Deallocate = S->getDeallocate();
+    auto *FreeBB = CGF.createBasicBlock("coro.free" + NameSuffix);
     CGF.EmitBlock(FreeBB);
     CGF.EmitStmt(Deallocate);
 
-    auto *AfterFreeBB = CGF.createBasicBlock("after.coro.free");
+    auto *AfterFreeBB = CGF.createBasicBlock("after.coro.free" + NameSuffix);
     CGF.EmitBlock(AfterFreeBB);
 
     // We should have captured coro.free from the emission of deallocate.
@@ -639,7 +705,6 @@ struct CallCoroDelete final : public EHScopeStack::Cleanup {
     InsertPt->eraseFromParent();
     CGF.Builder.SetInsertPoint(AfterFreeBB);
   }
-  explicit CallCoroDelete(Stmt *DeallocStmt) : Deallocate(DeallocStmt) {}
 };
 }
 
@@ -787,13 +852,13 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
   auto *AllocBB = createBasicBlock("coro.alloc");
   auto *InitBB = createBasicBlock("coro.init");
   auto *FinalBB = createBasicBlock("coro.final");
-  auto *RetBB = createBasicBlock("coro.ret");
+  auto *EndBB = createBasicBlock("coro.end");
 
   auto *CoroId = Builder.CreateCall(
       CGM.getIntrinsic(llvm::Intrinsic::coro_id),
       {Builder.getInt32(NewAlign), NullPtr, NullPtr, NullPtr});
   createCoroData(*this, CurCoro, CoroId);
-  CurCoro.Data->SuspendBB = RetBB;
+  CurCoro.Data->SuspendBB = EndBB;
   assert(ShouldEmitLifetimeMarkers &&
          "Must emit lifetime intrinsics for coroutines");
 
@@ -829,23 +894,25 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
 
   EmitBlock(InitBB);
 
-  // Pass the result of the allocation to coro.begin.
-  auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);
-  Phi->addIncoming(NullPtr, EntryBB);
-  Phi->addIncoming(AllocateCall, AllocOrInvokeContBB);
-  auto *CoroBegin = Builder.CreateCall(
-      CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});
-  CurCoro.Data->CoroBegin = CoroBegin;
+  {
+    // Pass the result of the allocation to coro.begin.
+    auto *Phi = Builder.CreatePHI(VoidPtrTy, 2);
+    Phi->addIncoming(NullPtr, EntryBB);
+    Phi->addIncoming(AllocateCall, AllocOrInvokeContBB);
+    auto *CoroBegin = Builder.CreateCall(
+        CGM.getIntrinsic(llvm::Intrinsic::coro_begin), {CoroId, Phi});
+    CurCoro.Data->CoroBegin = CoroBegin;
+  }
 
   GetReturnObjectManager GroManager(*this, S);
   GroManager.EmitGroAlloca();
 
-  CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(RetBB);
+  CurCoro.Data->CleanupJD = getJumpDestInCurrentScope(EndBB);
   {
     CGDebugInfo *DI = getDebugInfo();
     ParamReferenceReplacerRAII ParamReplacer(LocalDeclMap);
     CodeGenFunction::RunCleanupsScope ResumeScope(*this);
-    EHStack.pushCleanup<CallCoroDelete>(NormalAndEHCleanup, S.getDeallocate());
+    EHStack.pushCleanup<CallCoroDelete>(NormalAndEHCleanup, &S);
 
     // Create mapping between parameters and copy-params for coroutine function.
     llvm::ArrayRef<const Stmt *> ParamMoves = S.getParamMoves();
@@ -950,7 +1017,14 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
     }
   }
 
-  EmitBlock(RetBB);
+  EmitBlock(EndBB);
+  auto *Phi = Builder.CreatePHI(Builder.getInt1Ty(), llvm::pred_size(EndBB),
+                                "never.suspend");
+  BasicBlock *CleanupBB = CurCoro.Data->InResume->getParent();
+  for (auto *Pred : llvm::predecessors(EndBB)) {
+    auto *V = (Pred == CleanupBB) ? Builder.getTrue() : Builder.getFalse();
+    Phi->addIncoming(V, Pred);
+  }
   // Emit coro.end before getReturnStmt (and parameter destructors), since
   // resume and destroy parts of the coroutine should not include them.
   llvm::Function *CoroEnd = CGM.getIntrinsic(llvm::Intrinsic::coro_end);
@@ -971,7 +1045,28 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {
     // shouldn't change the AST.
     if (PreviousRetValue)
       cast<ReturnStmt>(Ret)->setRetValue(PreviousRetValue);
+
+    // Defer destruction of promise if coroutine completed without suspending
+    auto *DeferPromiseBB = createBasicBlock("promise.free.defer");
+    auto *RetBB = EndBB->splitBasicBlock(EndBB->getTerminator(), "coro.ret");
+    EndBB->getTerminator()->eraseFromParent();
+
+    Builder.SetInsertPoint(EndBB);
+    Builder.CreateCondBr(Phi, DeferPromiseBB, RetBB);
+
+    EmitBlock(DeferPromiseBB);
+    auto *InsertPt = Builder.CreateBr(RetBB);
+    if (auto *I = CurCoro.Data->PromiseDtor)
+      I->clone()->insertBefore(InsertPt->getIterator());
+    CurCoro.Data->PromiseEnd->clone()->insertBefore(InsertPt->getIterator());
+    InsertPt->eraseFromParent();
+
+    CallCoroDelete(&S).EmitCoroFree(*this, ".defer");
+    Builder.CreateBr(RetBB);
+    Builder.ClearInsertionPoint();
   }
+  else
+    EndBB->setName("coro.ret");
 
   // LLVM require the frontend to mark the coroutine.
   CurFn->setPresplitCoroutine();
diff --git a/clang/test/CodeGenCXX/type-aware-coroutines.cpp b/clang/test/CodeGenCXX/type-aware-coroutines.cpp
index 0a19079d987e9..84b054ff8a214 100644
--- a/clang/test/CodeGenCXX/type-aware-coroutines.cpp
+++ b/clang/test/CodeGenCXX/type-aware-coroutines.cpp
@@ -55,7 +55,7 @@ extern "C" resumable f1(int) {
   co_return;
 // CHECK: coro.alloc:
 // CHECK: _ZN9resumable12promise_typenwEmi
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: _ZN9resumable12promise_typedlEPv
 }
 
@@ -64,7 +64,7 @@ extern "C" resumable f2(float) {
   co_return;
 // CHECK: coro.alloc:
 // CHECK: _ZN9resumable12promise_typenwEmf
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: _ZN9resumable12promise_typedlEPv
 }
 
@@ -74,7 +74,7 @@ extern "C" resumable2 f3(int, float, const char*, Allocator) {
    co_return;
 // CHECK: coro.alloc:
 // CHECK: _ZN10resumable212promise_typenwIJifPKc9AllocatorEEEPvmDpT_
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: _ZN10resumable212promise_typedlEPv
 // CHECK: _ZN10resumable212promise_typedlEPv
 }
@@ -84,7 +84,7 @@ extern "C" resumable f4(int n = 10) {
    for (int i = 0; i < n; i++) co_yield i;
 // CHECK: coro.alloc:
 // CHECK: call {{.*}}@_ZN9resumable12promise_typenwEmi(
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: call void @_ZN9resumable12promise_typedlEPv(
 // CHECK: call void @_ZN9resumable12promise_typedlEPv(
 }
@@ -110,7 +110,7 @@ extern "C" resumable3 f5(float) {
   co_return;
 // CHECK: coro.alloc:
 // CHECK: call {{.*}}@_Znwm(
-// CHECK: coro.free:
+// CHECK: coro.free.defer:
 // CHECK: call void @_ZdlPvm(
 // CHECK: call void @_ZdlPvm(
 }
diff --git a/clang/test/CodeGenCoroutines/coro-dest-slot.cpp b/clang/test/CodeGenCoroutines/coro-dest-slot.cpp
index d794c74cd52d6..b8bc2434f6fc1 100644
--- a/clang/test/CodeGenCoroutines/coro-dest-slot.cpp
+++ b/clang/test/CodeGenCoroutines/coro-dest-slot.cpp
@@ -25,7 +25,7 @@ extern "C" coro f(int) { co_return; }
 // CHECK: %[[CLEANUP_DEST0:.+]] = phi i32 [ 0, %[[INIT_READY]] ], [ 2, %[[INIT_CLEANUP]] ]
 
 // CHECK: %[[FINAL_SUSPEND:.+]] = call i8 @llvm.coro.suspend(
-// CHECK-NEXT: switch i8 %{{.*}}, label %coro.ret [
+// CHECK-NEXT: switch i8 %{{.*}}, label %coro.end [
 // CHECK-NEXT:   i8 0, label %[[FINAL_READY:.+]]
 // CHECK-NEXT:   i8 1, label %[[FINAL_CLEANUP:.+]]
 // CHECK-NEXT: ]
diff --git a/clang/test/CodeGenCoroutines/coro-params.cpp b/clang/test/CodeGenCoroutines/coro-params.cpp
index 719726cca29c5..1f41ddcf34ac0 100644
--- a/clang/test/CodeGenCoroutines/coro-params.cpp
+++ b/clang/test/CodeGenCoroutines/coro-params.cpp
@@ -104,17 +104,24 @@ void f(int val, MoveOnly moParam, MoveAndCopy mcParam, TrivialABI trivialParam)
   // CHECK: call void @_ZNSt16coroutine_traitsIJvi8MoveOnly11MoveAndCopy10TrivialABIEE12promise_type13final_suspendEv(
   // CHECK: call void @_ZN14suspend_always12await_resumeEv(
 
-  // Destroy promise, then parameter copies:
-  // CHECK: call void @_ZNSt16coroutine_traitsIJvi8MoveOnly11MoveAndCopy10TrivialABIEE12promise_typeD1Ev(ptr {{[^,]*}} %__promise)
-  // CHECK-NEXT: call void @llvm.lifetime.end.p0(
-  // CHECK-NEXT: call void @_ZN10TrivialABID1Ev(ptr {{[^,]*}} %[[TrivialCopy]])
+  // Parameter copies:
+  // CHECK: call void @_ZN10TrivialABID1Ev(ptr {{[^,]*}} %[[TrivialCopy]])
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call void @_ZN11MoveAndCopyD1Ev(ptr {{[^,]*}} %[[McCopy]])
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call void @_ZN8MoveOnlyD1Ev(ptr {{[^,]*}} %[[MoCopy]]
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call void @llvm.lifetime.end.p0(
+  // CHECK-NEXT: %InResume = call i1 @llvm.coro.end(ptr null, i1 false, token none)
+  // CHECK-NEXT: br i1 %InResume, label %promise.free, label %coro.ret
+
+  // Destroy promise:
+  // CHECK: promise.free
+  // CHECK-NEXT: call void @_ZNSt16coroutine_traitsIJvi8MoveOnly11MoveAndCopy10TrivialABIEE12promise_typeD1Ev(ptr {{[^,]*}} %__promise)
+  // CHECK-NEXT: call void @llvm.lifetime.end.p0(
   // CHECK-NEXT: call ptr @llvm.coro.free(
+  // CHECK-NEXT: {{[^,]*}} = icmp ne ptr {{[^,]*}}, null
+  // CHECK-NEXT: br i1 {{[^,]*}}, label %coro.free, label %after.coro.free
 
   // The original trivial_abi parameter is destroyed when returning from the ramp.
   // CHECK: call i1 @llvm.coro.end
diff --git a/clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp b/clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp
new file mode 100644
index 0000000000000..0c1b347461acf
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-promise-dtor2.cpp
@@ -0,0 +1,53 @@
+// Test that destruction of promise is deferred if coroutine completed without suspending
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-unknown-linux-gnu -emit-llvm -o - %s -fexceptions -fcxx-exceptions -disable-llvm-passes | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+template<class T>
+struct coro {
+  struct RValueWrapper {
+    T* p;
+
+    operator T&&() const noexcept { return static_cast<T&&>(*p); }
+  };
+
+  using promise_type = T;
+
+  T& getDerived() noexcept { return *static_cast<T*>(this); }
+
+  auto get_return_object() noexcept { return RValueWrapper(&getDerived()); }
+  std::suspend_never initial_suspend() noexcept { return {}; }
+  std::suspend_never final_suspend() noexcept { return {}; }
+  void return_value(T&& x) noexcept { getDerived() = static_cast<T&&>(x); }
+  void unhandled_exception() {}
+};
+
+struct A : public coro<A> {
+  int a;
+
+  ~A() {}
+};
+
+A func() {
+  A aa{};
+  aa.a = 5;
+  co_return static_cast<A&&>(aa);
+}
+
+
+// CHECK-LABEL: define {{.*}} void @_Z4funcv(
+// CHECK: coro.end:
+// CHECK-NEXT: %never.suspend = phi i1 [ false, %cleanup.cont16 ], [ true, %cleanup12 ], [ false, %after.coro.free ], [ false, %final.suspend ], [ false, %init.suspend ]
+// CHECK-NEXT: call i1 @llvm.coro.end
+// CHECK-NEXT: ptr @_ZNK4coroI1AE13RValueWrappercvOS0_Ev
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64
+// CHECK-NEXT: br i1 %never.suspend, label %promise.free.defer, label %coro.ret
+
+// CHECK: promise.free.defer:
+// CHECK-NEXT: call void @_ZN1AD1Ev
+// CHECK-NEXT: call void @llvm.lifetime.end.p0
+// CHECK-NEXT: call ptr @llvm.coro.free
+
+// CHECK: coro.ret:
+// CHECK-NEXT: call void @llvm.lifetime.end.p0
+// CHECK-NEXT: ret void
diff --git a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
index f36f89926505f..41012065f914a 100644
--- a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
+++ b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
@@ -91,7 +91,7 @@ Task bar() {
 // CHECK-NEXT:    %{{.+}} = call token @llvm.coro.save(ptr null)
 // CHECK-NEXT:    call void @llvm.coro.await.suspend.handle
 // CHECK-NEXT:    %{{.+}} = call i8 @llvm.coro.suspend
-// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.ret [
+// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.end [
 // CHECK-NEXT:      i8 0, label %[[CASE1_AWAIT_READY]]
 // CHECK-NEXT:      i8 1, label %[[CASE1_AWAIT_CLEANUP:.+]]
 // CHECK-NEXT:    ]
@@ -105,7 +105,7 @@ Task bar() {
 // CHECK-NEXT:    %{{.+}} = call token @llvm.coro.save(ptr null)
 // CHECK-NEXT:    call void @llvm.coro.await.suspend.handle
 // CHECK-NEXT:    %{{.+}} = call i8 @llvm.coro.suspend
-// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.ret [
+// CHECK-NEXT:    switch i8 %{{.+}}, label %coro.end [
 // CHECK-NEXT:      i8 0, label %[[CASE2_AWAIT_READY]]
 // CHECK-NEXT:      i8 1, label %[[CASE2_AWAIT_CLEANUP:.+]]
 // CHECK-NEXT:    ]
diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst
index 7472c68a70df4..c87657666c31e 100644
--- a/llvm/docs/Coroutines.rst
+++ b/llvm/docs/Coroutines.rst
@@ -1477,7 +1477,7 @@ The purpose of this intrinsic is to allow frontends to mark the cleanup and
 other code that is only relevant during the initial invocation of the coroutine
 and should not be present in resume and destroy parts.
 
-In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
+* In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
 coroutine frame.  If the second argument is `false`, it also returns from
 the coroutine with a null continuation pointer, and the next instruction
 will be unreachable.  If the second argument is `true`, it falls through
@@ -1485,13 +1485,12 @@ so that the following logic can resume unwinding.  In a yield-once
 coroutine, reaching a non-unwind ``llvm.coro.end`` without having first
 reached a ``llvm.coro.suspend.retcon`` has undefined behavior.
 
-The remainder of this section describes the behavior under switched-resume
-lowering.
-
-This intrinsic is lowered when a coroutine is split into
-the start, resume and destroy parts. In the start part, it is a no-op,
+* In switched-resume lowering, This intrinsic is lowered when a coroutine is
+split into the start, resume and destroy parts. In the start part, it is a no-op,
 in resume and destroy parts, it is replaced with `ret void` instruction and
-the rest of the block containing `coro.end` instruction is discarded.
+the rest of the block containing `coro.end` instruction is discarded if the
+return value of `coro.end` is unused.
+
 In landing pads it is replaced with an appropriate instruction to unwind to
 caller. The handling of coro.end differs depending on whether the target is
 using landingpad or WinEH exception model.
diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h b/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h
index 0688068167ae6..7ec2913044e43 100644
--- a/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h
+++ b/llvm/include/llvm/Transforms/Coroutines/CoroInstr.h
@@ -682,7 +682,7 @@ class AnyCoroEndInst : public IntrinsicInst {
   enum { FrameArg, UnwindArg, TokenArg };
 
 public:
-  bool isFallthrough() const { return !isUnwind(); }
+  bool isFallthrough() const { return !isUnwind() && user_empty(); }
   bool isUnwind() const {
     return cast<Constant>(getArgOperand(UnwindArg))->isOneValue();
   }
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 64b33e46404f0..63ba80588a5d3 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -225,7 +225,7 @@ static void replaceFallthroughCoroEnd(AnyCoroEndInst *End,
            "switch coroutine should not return any values");
     // coro.end doesn't immediately end the coroutine in the main function
     // in this lowering, because we need to deallocate the coroutine.
-    if (!InResume)
+    if (!InResume || !End->user_empty())
       return;
     Builder.CreateRetVoid();
     break;
diff --git a/llvm/test/Transforms/Coroutines/coro-split-coroend.ll b/llvm/test/Transforms/Coroutines/coro-split-coroend.ll
new file mode 100644
index 0000000000000..a8bd8504fc4c0
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-split-coroend.ll
@@ -0,0 +1,42 @@
+; Check that we do not lower coro.end to 'ret void' if its return value is used
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S | FileCheck %s
+
+define ptr @f1() presplitcoroutine {
+  ; CHECK-LABEL: define internal fastcc void @f1.resume(
+  ; CHECK-NOT: call void @a()
+entry:
+  %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+  %hdl = call ptr @llvm.coro.begin(token %id, ptr null)
+  %0 = ...
[truncated]

@ChuanqiXu9
Copy link
Member

I don't understand this:

Do not lower coro.end to ret void if its return value will be used. This allows front end separate coro start section and resume section without early return from resume function.

Could elaborate it? Especially why it is safe to not lower coro.end to ret void?

@NewSigma
Copy link
Contributor Author

NewSigma commented Aug 1, 2025

Could elaborate it? Especially why it is safe to not lower coro.end to ret void?

The idea is: if we lower coro.end to ret void, users of coro.end will become unreachable. Thus, I assume that if the frontend uses the return value of coro.end, the intention is to determine whether we are in the start or resume part instead of returning.

Personally, I think the problem is that coro.end mixes two functionalities: querying where we are and lowering to some code. I think it might be reasonable to introduce a new intrinsic, llvm.coro.where, and drop the return value of coro.end. However, I’d prefer to hear others’ thoughts before actually doing it.

@ChuanqiXu9
Copy link
Member

Could elaborate it? Especially why it is safe to not lower coro.end to ret void?

The idea is: if we lower coro.end to ret void, users of coro.end will become unreachable. Thus, I assume that if the frontend uses the return value of coro.end, the intention is to determine whether we are in the start or resume part instead of returning.

Personally, I think the problem is that coro.end mixes two functionalities: querying where we are and lowering to some code. I think it might be reasonable to introduce a new intrinsic, llvm.coro.where, and drop the return value of coro.end. However, I’d prefer to hear others’ thoughts before actually doing it.

Since I didn't have time to look into details, the instinct reaction to my mind is, then why do we change the lowering of coro.end conditionally? I mean, if it is good, it will be better and simpler to do it unconditionally.

@NewSigma
Copy link
Contributor Author

NewSigma commented Aug 7, 2025

then why do we change the lowering of coro.end conditionally? I mean, if it is good, it will be better and simpler to do it unconditionally.

It is reasonable. I will use a new intrinsic instead of lower coro.end conditionally.

ChuanqiXu9 pushed a commit that referenced this pull request Aug 25, 2025
…oro.end (#153404)

As mentioned in #151067, current design of `llvm.coro.end` mixes two
functionalities: querying where we are and lowering to some code. This
patch separate these functionalities into independent intrinsics by
introducing a new intrinsic `llvm.coro.is_in_ramp`.
NewSigma added a commit that referenced this pull request Sep 17, 2025
…of llvm.coro.end #153404" (#155339)

As mentioned in #151067, current design of llvm.coro.end mixes two
functionalities: querying where we are and lowering to some code. This
patch separate these functionalities into independent intrinsics by
introducing a new intrinsic llvm.coro.is_in_ramp.
NewSigma added a commit that referenced this pull request Sep 24, 2025
…of llvm.coro.end (#155339)" (#159278)

As mentioned in #151067, current design of llvm.coro.end mixes two functionalities: querying where we are and lowering to some code. This patch separate these functionalities into independent intrinsics by introducing a new intrinsic llvm.coro.is_in_ramp.

Update a test in inline/ML, Reapply #155339
@NewSigma
Copy link
Contributor Author

NewSigma commented Oct 13, 2025

CC some people who viewed the previous patch

@github-actions
Copy link

github-actions bot commented Oct 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

// pass expects that there is at most one fallthrough coro.end.
if (cast<AnyCoroEndInst>(&I)->isFallthrough())
CB->setCannotDuplicate();
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be some opt -passes=coro-early testcase for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not sure. It doesn't seem straightforward to test that a feature does not exist.

@ChuanqiXu9
Copy link
Member

I don't understand the title and the cited issue. The cited issue is about the point of GRO. How does that relates to the lifetime of promise? IIUC, if there is problem, we need to promote the point of GRO.

@NewSigma
Copy link
Contributor Author

I don't understand the title and the cited issue. The cited issue is about the point of GRO. How does that relates to the lifetime of promise?

In the cited issue, the GRO conversion function reads coro promise. However, since control has flowed off final_suspend, the promise has already been destroyed. We will encounter a heap-use-after-free at -O0. This patch defers the destruction after GRO conversion, thereby extending the lifetime of the promise.

IIUC, if there is problem, we need to promote the point of GRO.

I thought it is more natural to promote the point of GRO and tried it yesterday. But it appears tricky to properly place GRO. On one side, coro.end must dominate GRO and ret so that ramp function and resume function have different return types. On the other side, coro.end must not dominate coro.cleanup, which would prevent ramp function and resume function from sharing it.

@ChuanqiXu9
Copy link
Member

I don't understand the title and the cited issue. The cited issue is about the point of GRO. How does that relates to the lifetime of promise?

In the cited issue, the GRO conversion function reads coro promise. However, since control has flowed off final_suspend, the promise has already been destroyed. We will encounter a heap-use-after-free at -O0. This patch defers the destruction after GRO conversion, thereby extending the lifetime of the promise.

IIUC, if there is problem, we need to promote the point of GRO.

I thought it is more natural to promote the point of GRO and tried it yesterday. But it appears tricky to properly place GRO. On one side, coro.end must dominate GRO and ret so that ramp function and resume function have different return types. On the other side, coro.end must not dominate coro.cleanup, which would prevent ramp function and resume function from sharing it.

I had the same impression. I forgot details but I roughly remembered I tried to move the point of GRO several years ago. But I still feel if we had the chance, it is better to do the right things.

Why coro.end must dominate GRO? I think we should be able to call GRO early and return it in the end.

@NewSigma
Copy link
Contributor Author

I had the same impression. I forgot details but I roughly remembered I tried to move the point of GRO several years ago. But I still feel if we had the chance, it is better to do the right things.

Sounds good.

Why coro.end must dominate GRO? I think we should be able to call GRO early and return it in the end.

Thank you for your comment. After some thought, I guess the following might work:

any.suspend:
  %suspend = call i8 @llvm.coro.suspend(token %8, i1 true)
  switch i8 %suspend, label %pre.gro.conv [
    i8 0, label %ready
    i8 1, label %pre.coro.cleanup
  ]

ready:
  ; ...

pre.gro.conv:
  %suspended = phi i1 [ true, %any.suspend ], [ false, %any.ready ]
  %InRamp = call i1 @llvm.coro.is_in_ramp()
  br i1 %InRamp, label %gro.conv, label %pre.coro.cleanup

gro.conv:
  ; GRO conversion
  br label %pre.coro.cleanup

pre.coro.cleanup:
  %suspended.1 = phi i1 [ true, %any.suspend ], [ %suspended, %pre.gro.conv ], [ %suspended, %gro.conv ]
  %coro.destroy = phi i1 [ true, %any.suspend ], [ false, %pre.gro.conv ], [ false, %gro.conv ]
  ; should.cleanup equals to 1 if
  ; (a) completes without suspend
  ; (b) control flows off final_suspend
  ; (c) coro.destroy is invoked
  %should_cleanup = or i1 %suspended.1, %coro.destroy
  br i1 %should_cleanup, label %coro.cleanup, label %coro.ret

coro.cleanup:
  ; cleanup logics
  br %coro.ret

coro.ret:
  call void @llvm.coro.end(ptr null, i1 false, token none)
  ; ret GRO or void

@NewSigma NewSigma removed clang Clang issues not falling into any other category llvm:transforms labels Oct 30, 2025
@NewSigma NewSigma changed the title [clang][CodeGen] Extends lifetime of coroutine promise(CWG2563) [clang][CodeGen] Promote point of GRO(CWG2563) Oct 30, 2025
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Oct 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category coroutines C++20 coroutines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[clang] [coroutine] Initialize return value before destruction of coroutine state

4 participants