- 
                Notifications
    You must be signed in to change notification settings 
- Fork 14.9k
[Coroutines] fix coroutines + std::unique_ptr with async exceptions validation errors #149691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Coroutines] fix coroutines + std::unique_ptr with async exceptions validation errors #149691
Conversation
…a new cleanupret instruction was added, make it unwind to the same edge of the genuine cleanupret
…Begin on async-exception ready function, it created instruction dominance issues. the fix is to clone those bbs and keep unwinding path that happens before CoroBegin
| Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using  If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. | 
| @llvm/pr-subscribers-coroutines @llvm/pr-subscribers-llvm-transforms Author: None (tzuralon) ChangesFixes #148035 Patch is 502.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/149691.diff 7 Files Affected: 
 diff --git a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
index a65d0fb54c212..a3ec5d3cb06ea 100644
--- a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -35,6 +35,7 @@
 #include "llvm/Transforms/Coroutines/SpillUtils.h"
 #include "llvm/Transforms/Coroutines/SuspendCrossingInfo.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
+#include "llvm/Transforms/Utils/Cloning.h"
 #include "llvm/Transforms/Utils/Local.h"
 #include "llvm/Transforms/Utils/PromoteMemToReg.h"
 #include <algorithm>
@@ -974,6 +975,159 @@ static StructType *buildFrameType(Function &F, coro::Shape &Shape,
   return FrameTy;
 }
 
+// Fixer for the "Instruction does not dominate all uses!" bug
+// The fix consists of mapping problematic paths (where CoroBegin does not
+// dominate cleanup BBs) and clones them to 2 flows - the one that insertSpills
+// intended to create (using the spill) and another one, preserving the logics
+// of pre-splitting, which would be triggered if unwinding happened before
+// CoroBegin
+static void
+splitBasicBlocksNotDominatedByCoroBegin(const FrameDataInfo &FrameData,
+                                        coro::Shape &Shape, Function *F,
+                                        DominatorTree &DT) {
+  ValueToValueMapTy VMap;
+  DenseMap<BasicBlock *, BasicBlock *>
+ProcessedSpillBlockToAlternativeUnspilledBlockMap;
+  bool FunctionHasSomeBlockNotDominatedByCoroBegin;
+  SmallVector<BasicBlock *> SpillUserBlocks;
+  
+  for (const auto &E : FrameData.Spills) {
+    for (auto *U : E.second) {
+      if (std::find(SpillUserBlocks.begin(), SpillUserBlocks.end(),
+U->getParent()) == SpillUserBlocks.end()) {
+        SpillUserBlocks.push_back(U->getParent());
+      }
+    }
+  }
+  SpillUserBlocks.push_back(Shape.AllocaSpillBlock);
+
+  do {
+    FunctionHasSomeBlockNotDominatedByCoroBegin = false;
+    // We want to traverse the function post-order (predecessors first),
+    // and check dominance starting CoroBegin
+    bool HaveTraversedCoroBegin = false;
+    for (BasicBlock *CurrentBlock : ReversePostOrderTraversal<Function *>(F)) {
+      if (!HaveTraversedCoroBegin &&
+CurrentBlock != Shape.CoroBegin->getParent()) {
+        continue;
+      }
+      HaveTraversedCoroBegin = true;
+      
+      // Focus on 2 types of users that produce errors - those in
+      // FrameData.Spills, and decendants of Shape.AllocaSpillBlocks
+      if (!DT.dominates(Shape.CoroBegin, CurrentBlock) &&
+std::find(SpillUserBlocks.begin(), SpillUserBlocks.end(),
+CurrentBlock) != SpillUserBlocks.end()) {
+        // Mark another iteration of the loop is needed, to verify that no more
+        // dominance issues after current run
+        FunctionHasSomeBlockNotDominatedByCoroBegin = true;
+
+        // Clone (preserve) the current basic block, before it will be modified
+        // by insertSpills
+        auto UnspilledAlternativeBlock =
+CloneBasicBlock(CurrentBlock, VMap, ".unspilled_alternative", F);
+
+        // Remap the instructions, VMap here aggregates instructions across
+        // multiple BasicBlocks, and we assume that traversal is post-order,
+        // therefore successor blocks (for example instructions having funclet
+        // tags) will be mapped correctly to the new cloned cleanuppad
+        for (Instruction &I : *UnspilledAlternativeBlock) {
+          RemapInstruction(&I, VMap,
+RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
+        }
+
+        // Keep track between the processed spill basic block and the cloned
+        // alternative unspilled basic block Will help us fix instructions that
+        // their context is complex (for example cleanuppad of funclet is
+        // defined in another BB)
+        ProcessedSpillBlockToAlternativeUnspilledBlockMap[CurrentBlock] =
+UnspilledAlternativeBlock;
+
+        SmallVector<Instruction *> FixUpPredTerminators;
+
+        // Find the specific predecessors that does not dominated by
+        // Shape.CoroBegin We don't fix them here but later because it's not
+        // safe while using predecessors as iterator function
+        for (BasicBlock *Pred : predecessors(CurrentBlock)) {
+          if (!DT.dominates(Shape.CoroBegin, Pred)) {
+            FixUpPredTerminators.push_back(Pred->getTerminator());
+          }
+        }
+
+        // Fixups for current block terminator
+        const auto &CurrentBlockTerminator = CurrentBlock->getTerminator();
+
+        // If it's cleanupret, find the correspondant cleanuppad (use the map to
+        // find it)
+        if (auto CurrentBlockCleanupReturnTerminator =
+dyn_cast<CleanupReturnInst>(CurrentBlockTerminator)) {
+          BasicBlock *CBCPBB =
+CurrentBlockCleanupReturnTerminator->getCleanupPad()->getParent();
+          CleanupReturnInst *DBT = dyn_cast<CleanupReturnInst>(
+UnspilledAlternativeBlock->getTerminator());
+
+          // Again assuming post-order traversal - if we mapped the predecessing
+          // cleanuppad block before, we should find it here If not, do nothing
+          if (ProcessedSpillBlockToAlternativeUnspilledBlockMap.contains(
+CBCPBB)) {
+            Instruction *DCPr =
+&ProcessedSpillBlockToAlternativeUnspilledBlockMap[CBCPBB]
+->front();           
+            CleanupPadInst *DCP = cast<CleanupPadInst>(DCPr);
+            DBT->setCleanupPad(DCP);
+          }
+
+        // If it's a branch/invoke, keep track of its successors, we want to
+          // calculate dominance between CoroBegin and them also They might need
+          // clone as well
+        } else if (auto CurrentBlockBranchTerminator =
+dyn_cast<BranchInst>(CurrentBlockTerminator)) {
+          for (unsigned int successorIdx = 0;
+successorIdx < CurrentBlockBranchTerminator->getNumSuccessors();
+++successorIdx) {
+            SpillUserBlocks.push_back(
+CurrentBlockBranchTerminator->getSuccessor(successorIdx));
+          }
+        } else if (auto CurrentBlockInvokeTerminator =
+dyn_cast<InvokeInst>(CurrentBlockTerminator)) {
+          SpillUserBlocks.push_back(
+CurrentBlockInvokeTerminator->getUnwindDest());
+        } else {
+          report_fatal_error("Not implemented terminator for this specific "
+                             "instruction fixup in current block fixups");
+        }
+
+        // Fixups on the predecessors terminator - direct them to out untouched
+        // alternative block to break dominance error.
+        for (auto FixUpPredTerminator : FixUpPredTerminators) {
+          if (auto FixUpPredTerminatorInvoke =
+dyn_cast<InvokeInst>(FixUpPredTerminator)) {
+            FixUpPredTerminatorInvoke->setUnwindDest(UnspilledAlternativeBlock);
+          } else if (auto FixUpPredTerminatorBranch =
+dyn_cast<BranchInst>(FixUpPredTerminator)) {
+            for (unsigned int successorIdx = 0;
+                successorIdx < FixUpPredTerminatorBranch->getNumSuccessors();
+                ++successorIdx) {
+              if (CurrentBlock ==
+FixUpPredTerminatorBranch->getSuccessor(successorIdx)) {
+                FixUpPredTerminatorBranch->setSuccessor(
+successorIdx, UnspilledAlternativeBlock);
+              }
+            }
+          } else {
+            report_fatal_error("Not implemented terminator for this specific "
+                               "instruction in pred fixups");
+          }
+        }
+        
+        // We changed dominance tree, so recalculate.
+        DT.recalculate(*F);
+        continue;
+      }
+    }
+  } while (FunctionHasSomeBlockNotDominatedByCoroBegin);
+}
+
 // Replace all alloca and SSA values that are accessed across suspend points
 // with GetElementPointer from coroutine frame + loads and stores. Create an
 // AllocaSpillBB that will become the new entry block for the resume parts of
@@ -1053,6 +1207,8 @@ static void insertSpills(const FrameDataInfo &FrameData, coro::Shape &Shape) {
     return GEP;
   };
 
+  splitBasicBlocksNotDominatedByCoroBegin(FrameData, Shape, F, DT);
+
   for (auto const &E : FrameData.Spills) {
     Value *Def = E.first;
     auto SpillAlignment = Align(FrameData.getAlign(Def));
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 64b33e46404f0..37b3d61c9f8f2 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -376,7 +376,16 @@ static void replaceUnwindCoroEnd(AnyCoroEndInst *End, const coro::Shape &Shape,
   // If coro.end has an associated bundle, add cleanupret instruction.
   if (auto Bundle = End->getOperandBundle(LLVMContext::OB_funclet)) {
     auto *FromPad = cast<CleanupPadInst>(Bundle->Inputs[0]);
-    auto *CleanupRet = Builder.CreateCleanupRet(FromPad, nullptr);
+
+    // If the terminator is an invoke, 
+    // set the cleanupret unwind destination the same as the other edges, to
+    // avoid validation errors
+    BasicBlock *UBB = nullptr;
+    if (auto II = dyn_cast<InvokeInst>(FromPad->getParent()->getTerminator())) {
+      UBB = II->getUnwindDest();
+    }
+
+    auto *CleanupRet = Builder.CreateCleanupRet(FromPad, UBB);
     End->getParent()->splitBasicBlock(End);
     CleanupRet->getParent()->getTerminator()->eraseFromParent();
   }
diff --git a/llvm/test/Transforms/Coroutines/pr148035_0_coroutine.ll b/llvm/test/Transforms/Coroutines/pr148035_0_coroutine.ll
new file mode 100644
index 0000000000000..2457bc3045cec
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/pr148035_0_coroutine.ll
@@ -0,0 +1,7609 @@
+; This is the cppreference example for coroutines, in llvm IR form, built with async exceptions flag.
+; crashed before fix because of the validation mismatch of Unwind edges out of a funclet pad must have the same unwind dest
+; RUN: opt < %s -passes=coro-split -S | FileCheck %s
+; CHECK: define
+
+; ModuleID = 'coroutine.cpp'
+source_filename = "coroutine.cpp"
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc19.38.33135"
+
+%"class.std::basic_ostream" = type { ptr, [4 x i8], i32, %"class.std::basic_ios" }
+%"class.std::basic_ios" = type { %"class.std::ios_base", ptr, ptr, i8 }
+%"class.std::ios_base" = type { ptr, i64, i32, i32, i32, i64, i64, ptr, ptr, ptr }
+%rtti.TypeDescriptor23 = type { ptr, ptr, [24 x i8] }
+%eh.CatchableType = type { i32, i32, i32, i32, i32, i32, i32 }
+%rtti.TypeDescriptor19 = type { ptr, ptr, [20 x i8] }
+%eh.CatchableTypeArray.2 = type { i32, [2 x i32] }
+%eh.ThrowInfo = type { i32, i32, i32, i32 }
+%rtti.CompleteObjectLocator = type { i32, i32, i32, i32, i32, i32 }
+%rtti.ClassHierarchyDescriptor = type { i32, i32, i32, i32 }
+%rtti.BaseClassDescriptor = type { i32, i32, i32, i32, i32, i32, i32 }
+%"struct.std::nostopstate_t" = type { i8 }
+%rtti.TypeDescriptor26 = type { ptr, ptr, [27 x i8] }
+%rtti.TypeDescriptor22 = type { ptr, ptr, [23 x i8] }
+%eh.CatchableTypeArray.5 = type { i32, [5 x i32] }
+%"union.std::error_category::_Addr_storage" = type { i64 }
+%rtti.TypeDescriptor35 = type { ptr, ptr, [36 x i8] }
+%rtti.TypeDescriptor24 = type { ptr, ptr, [25 x i8] }
+%"struct.std::_Fake_allocator" = type { i8 }
+%rtti.TypeDescriptor30 = type { ptr, ptr, [31 x i8] }
+%eh.CatchableTypeArray.3 = type { i32, [3 x i32] }
+%struct.awaitable = type { ptr }
+%struct.task = type { i8 }
+%"struct.task::promise_type" = type { i8 }
+%"struct.std::suspend_never" = type { i8 }
+%"class.std::thread::id" = type { i32 }
+%"struct.std::coroutine_handle" = type { ptr }
+%"struct.std::coroutine_handle.0" = type { ptr }
+%"class.std::basic_ostream<char>::sentry" = type { %"class.std::basic_ostream<char>::_Sentry_base", i8 }
+%"class.std::basic_ostream<char>::_Sentry_base" = type { ptr }
+%"class.std::runtime_error" = type { %"class.std::exception" }
+%"class.std::exception" = type { ptr, %struct.__std_exception_data }
+%struct.__std_exception_data = type { ptr, i8 }
+%"class.std::jthread" = type { %"class.std::thread", %"class.std::stop_source" }
+%"class.std::thread" = type { %struct._Thrd_t }
+%struct._Thrd_t = type { ptr, i32 }
+%"class.std::stop_source" = type { ptr }
+%class.anon = type { %"struct.std::coroutine_handle" }
+%"class.std::unique_ptr" = type { %"class.std::_Compressed_pair" }
+%"class.std::_Compressed_pair" = type { ptr }
+%"struct.std::_Stop_state" = type { %"struct.std::atomic", %"struct.std::atomic", %"class.std::_Locked_pointer", %"struct.std::atomic.6", i32 }
+%"struct.std::atomic" = type { %"struct.std::_Atomic_integral_facade" }
+%"struct.std::_Atomic_integral_facade" = type { %"struct.std::_Atomic_integral" }
+%"struct.std::_Atomic_integral" = type { %"struct.std::_Atomic_storage" }
+%"struct.std::_Atomic_storage" = type { %"struct.std::_Atomic_padded" }
+%"struct.std::_Atomic_padded" = type { i32 }
+%"class.std::_Locked_pointer" = type { %"struct.std::atomic.1" }
+%"struct.std::atomic.1" = type { %"struct.std::_Atomic_integral_facade.2" }
+%"struct.std::_Atomic_integral_facade.2" = type { %"struct.std::_Atomic_integral.3" }
+%"struct.std::_Atomic_integral.3" = type { %"struct.std::_Atomic_storage.4" }
+%"struct.std::_Atomic_storage.4" = type { %"struct.std::_Atomic_padded.5" }
+%"struct.std::_Atomic_padded.5" = type { i64 }
+%"struct.std::atomic.6" = type { %"struct.std::_Atomic_pointer" }
+%"struct.std::_Atomic_pointer" = type { %"struct.std::_Atomic_storage.7" }
+%"struct.std::_Atomic_storage.7" = type { %"struct.std::_Atomic_padded.8" }
+%"struct.std::_Atomic_padded.8" = type { ptr }
+%"struct.std::_Exact_args_t" = type { i8 }
+%"struct.std::_Zero_then_variadic_args_t" = type { i8 }
+%"class.std::tuple" = type { %"struct.std::_Tuple_val" }
+%"struct.std::_Tuple_val" = type { %class.anon }
+%"class.std::_Stop_callback_base" = type { ptr, ptr, ptr, ptr }
+%"class.std::basic_streambuf" = type { ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, i32, i32, ptr, ptr, ptr }
+%"class.std::ios_base::failure" = type { %"class.std::system_error" }
+%"class.std::system_error" = type { %"class.std::_System_error" }
+%"class.std::_System_error" = type { %"class.std::runtime_error", %"class.std::error_code" }
+%"class.std::error_code" = type { i32, ptr }
+%"class.std::basic_string" = type { %"class.std::_Compressed_pair.10" }
+%"class.std::_Compressed_pair.10" = type { %"class.std::_String_val" }
+%"class.std::_String_val" = type { %"union.std::_String_val<std::_Simple_types<char>>::_Bxty", i64, i64 }
+%"union.std::_String_val<std::_Simple_types<char>>::_Bxty" = type { ptr, [8 x i8] }
+%"class.std::error_condition" = type { i32, ptr }
+%"struct.std::_Fake_proxy_ptr_impl" = type { i8 }
+%"class.std::bad_array_new_length" = type { %"class.std::bad_alloc" }
+%"class.std::bad_alloc" = type { %"class.std::exception" }
+%"class.std::error_category" = type { ptr, %"union.std::error_category::_Addr_storage" }
+%"class.std::allocator" = type { i8 }
+%"struct.std::_One_then_variadic_args_t" = type { i8 }
+%class.anon.11 = type { i8 }
+
+$"?get_return_object@promise_type@task@@QEAA?AU2@XZ" = comdat any
+
+$"?initial_suspend@promise_type@task@@QEAA?AUsuspend_never@std@@XZ" = comdat any
+
+$"?await_ready@suspend_never@std@@QEBA_NXZ" = comdat any
+
+$"?await_suspend@suspend_never@std@@QEBAXU?$coroutine_handle@X@2@@Z" = comdat any
+
+$"?from_address@?$coroutine_handle@Upromise_type@task@@@std@@SA?AU12@QEAX@Z" = comdat any
+
+$"??B?$coroutine_handle@Upromise_type@task@@@std@@QEBA?AU?$coroutine_handle@X@1@XZ" = comdat any
+
+$"?await_resume@suspend_never@std@@QEBAXXZ" = comdat any
+
+$"??$?6U?$char_traits@D@std@@@std@@YAAEAV?$basic_ostream@DU?$char_traits@D@std@@@0@AEAV10@D@Z" = comdat any
+
+$"??$?6DU?$char_traits@D@std@@@std@@YAAEAV?$basic_ostream@DU?$char_traits@D@std@@@0@AEAV10@Vid@thread@0@@Z" = comdat any
+
+$"??$?6U?$char_traits@D@std@@@std@@YAAEAV?$basic_ostream@DU?$char_traits@D@std@@@0@AEAV10@PEBD@Z" = comdat any
+
+$"?get_id@this_thread@std@@YA?AVid@thread@2@XZ" = comdat any
+
+$"?return_void@promise_type@task@@QEAAXXZ" = comdat any
+
+$"?unhandled_exception@promise_type@task@@QEAAXXZ" = comdat any
+
+$"?final_suspend@promise_type@task@@QEAA?AUsuspend_never@std@@XZ" = comdat any
+
+$"??0jthread@std@@QEAA@XZ" = comdat any
+
+$"??1jthread@std@@QEAA@XZ" = comdat any
+
+$"??0?$coroutine_handle@Upromise_type@task@@@std@@QEAA@XZ" = comdat any
+
+$"?from_address@?$coroutine_handle@X@std@@SA?AU12@QEAX@Z" = comdat any
+
+$"??0?$coroutine_handle@X@std@@QEAA@XZ" = comdat any
+
+$"??0id@thread@std@@AEAA@I@Z" = comdat any
+
+$"?joinable@jthread@std@@QEBA_NXZ" = comdat any
+
+$"??0runtime_error@std@@QEAA@PEBD@Z" = comdat any
+
+$"??0runtime_error@std@@QEAA@AEBV01@@Z" = comdat any
+
+$"??0exception@std@@QEAA@AEBV01@@Z" = comdat any
+
+$"??1runtime_error@std@@UEAA@XZ" = comdat any
+
+$"??4jthread@std@@QEAAAEAV01@$$QEAV01@@Z" = comdat any
+
+$"?get_id@jthread@std@@QEBA?AVid@thread@2@XZ" = comdat any
+
+$"?joinable@thread@std@@QEBA_NXZ" = comdat any
+
+$"??0exception@std@@QEAA@QEBD@Z" = comdat any
+
+$"??1exception@std@@UEAA@XZ" = comdat any
+
+$"??_Gruntime_error@std@@UEAAPEAXI@Z" = comdat any
+
+$"?what@exception@std@@UEBAPEBDXZ" = comdat any
+
+$"??_Gexception@std@@UEAAPEAXI@Z" = comdat any
+
+$"??0thread@std@@QEAA@XZ" = comdat any
+
+$"??0stop_source@std@@QEAA@XZ" = comdat any
+
+$"??1stop_source@std@@QEAA@XZ" = comdat any
+
+$"??1thread@std@@QEAA@XZ" = comdat any
+
+$"??0_Stop_state@std@@QEAA@XZ" = comdat any
+
+$"??0?$_Locked_pointer@V_Stop_callback_base@std@@@std@@QEAA@XZ" = comdat any
+
+$"??0?$_Atomic_storage@I$03@std@@QEAA@I@Z" = comdat any
+
+$"??0?$atomic@_K@std@@QEAA@XZ" = comdat any
+
+$"??0?$_Atomic_storage@PEBV_Stop_callback_base@std@@$07@std@@QEAA@QEBV_Stop_callback_base@1@@Z" = comdat any
+
+$"??$?0U_Exact_args_t@std@@$0A@@?$tuple@$$V@std@@QEAA@U_Exact_args_t@1@@Z" = comdat any
+
+$"?resume@?$coroutine_handle@X@std@@QEBAXXZ" = comdat any
+
+$"?fetch_sub@?$_Atomic_integral_facade@I@std@@QEAAIIW4memory_order@2@@Z" = comdat any
+
+$"?fetch_add@?$_Atomic_integral@I$03@std@@QEAAIIW4memory_order@2@@Z" = comdat any
+
+$"?_Negate@?$_Atomic_integral_facade@I@std@@SAII@Z" = comdat any
+
+$_Check_memory_order = comdat any
+
+$"??$_Atomic_address_as@JU?$_Atomic_padded@I@std@@@std@@YAPECJAEAU?$_Atomic_padded@I@0@@Z" = comdat any
+
+$"?_Try_cancel_and_join@jthread@std@@AEAAXXZ" = comdat any
+
+$"??4thread@std@@QEAAAEAV01@$$QEAV01@@Z" = comdat any
+
+$"??4stop_source@std@@QEAAAEAV01@$$QEAV01@@Z" = comdat any
+
+$"?request_stop@stop_source@std@@QEAA_NXZ" = comdat any
+
+$"?join@thread@std@@QEAAXXZ" = comdat any
+
+$"?_Request_stop@_Stop_state@std@@QEAA_NXZ" = comdat any
+
+$"?fetch_or@?$_Atomic_integral@I$03@std@@QEAAIIW4memory_order@2@@Z" = comdat any
+
+$"?_Lock_and_load@?$_Locked_pointer@V_Stop_callback_base@std@@@std@@QEAAPEAV_Stop_callback_base@2@XZ" = comdat any
+
+$"?store@?$_Atomic_storage@PEBV_Stop_callback_base@std@@$07@std@@QEAAXQEBV_Stop_callback_base@2@W4memory_order@2@@Z" = comdat any
+
+$"?notify_all@?$_Atomic_storage@PEBV_Stop_callback_base@std@@$07@std@@QEAAXXZ" = comdat any
+
+$"?_Store_and_unlock@?$_Locked_pointer@V_Stop_callback_base@std@@@std@@QEAAXQEAV_Stop_callback_base@2@@Z" = comdat any
+
+$"??$exchange@PEAV_Stop_callback_base@std@@$$T@std@@YAPEAV_Stop_callback_base@0@AEAPEAV10@$$QEA$$T@Z" = comdat any
+
+$"?load@?$_Atomic_storage@_K$07@std@@QEBA_KW4memory_order@2@@Z" = comdat any
+
+$"?compare_exchange_weak@?$atomic@_K@std@@QEAA_NAEA_K_K@Z" = comdat any
+
+$"?wait@?$_Atomic_storage@_K$07@std@@QEBAX_KW4memory_order@2@@Z" = comdat any
+
+$"??$_Atomic_address_as@_JU?$_Atomic_padded@_K@std@@@std@@YAPED_JAEBU?$_Atomic_padded@_K@0@@Z" = comdat any
+
+$"?compare_exchange_strong@?$_Atomic_storage@_K$07@std@@QEAA_NAEA_K_KW4memory_order@2@@Z" = comdat any
+
+$"??$_Atomic_reinterpret_as@_J_K@std@@YA_JAEB_K@Z" = comdat any
+
+$"??$_Atomic_address_as@_JU?$_Atomic_padded@_K@std@@@std@@YAPEC_JAEAU?$_Atomic_padded@_K@0@@Z" = comdat any
+
+$"??$_Atomic_wait_direct@_K_J@std@@YAXQEBU?$_Atomic_storage@_K$07@0@_JW4memory_order@0@@Z" = comdat any
+
+$"??$_Atomic_address_as@_JU?$_Atomic_padded@PEBV_Stop_callback_base@std@@@std@@@std@@YAPEC_JAEAU?$_Atomic_padded@PEBV_Stop_callback_base@std@@@0@@Z" = comdat any
+
+$"??$_Atomic_reinterpret_as@_JPEBV_Stop_callback_base@std@@@std@@YA_JAEBQEBV_Stop_callback_base@0@@Z" = comdat any
+
+$"?store@?$_Atomic_storage@PEBV_Stop_callback_base@std@@$07@std@@QEAAXQEBV_Stop_callback_base@2@@Z" = comdat any
+
+$"?exchange@?$_Atomic_storage@_K$07@std@@QEAA_K_KW4memory_order@2@@Z" = comdat any
+
+$"?notify_all@?$_Atomic_storage@_K...
[truncated]
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some comments on styles. Could you please explain the problem and your solution in the pr description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please further reduce this test? A test file of ~7000 lines is much larger than the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a non-reduced version, I guess I can remove it
| continue; | ||
| } | ||
| } | ||
| } while (FunctionHasSomeBlockNotDominatedByCoroBegin); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks expensive. Is it possible to avoid this do-while?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, this while was redundant and I removed it.
|  | ||
| ; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: read) | ||
| declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr) #2 | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intrinsic function declarations are redundant. Could you please drop them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
… unwind paths not handled correctly
…split performance
…eir reduced versions
… not fixed at first
| You can test this locally with the following command:git-clang-format --diff HEAD~1 HEAD --extensions cpp -- llvm/lib/Transforms/Coroutines/CoroFrame.cpp llvm/lib/Transforms/Coroutines/CoroSplit.cppView the diff from clang-format here.diff --git a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
index bfe69f589..133f3b400 100644
--- a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -987,21 +987,21 @@ static void finalizeBasicBlockCloneAndTrackSuccessors(
   //  it will use VMap to do so
   // in addition, it will add the node successors to SuccessorBlocksSet
 
-        // Remap the instructions, VMap here aggregates instructions across
-        // multiple BasicBlocks, and we assume that traversal is reversed post-order,
-        // therefore successor blocks (for example instructions having funclet
-        // tags) will be mapped correctly to the new cloned cleanuppad
-        for (Instruction &ClonedBlockInstruction : *ClonedBlock) {
-          RemapInstruction(&ClonedBlockInstruction, VMap,
-RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
-        }
+  // Remap the instructions, VMap here aggregates instructions across
+  // multiple BasicBlocks, and we assume that traversal is reversed post-order,
+  // therefore successor blocks (for example instructions having funclet
+  // tags) will be mapped correctly to the new cloned cleanuppad
+  for (Instruction &ClonedBlockInstruction : *ClonedBlock) {
+    RemapInstruction(&ClonedBlockInstruction, VMap,
+                     RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
+  }
 
-        const auto &InitialBlockTerminator = InitialBlock->getTerminator();
+  const auto &InitialBlockTerminator = InitialBlock->getTerminator();
 
-        // If it's cleanupret, find the correspondant cleanuppad (use the VMap to
-        // find it).
-        if (auto *InitialBlockTerminatorCleanupReturn =
-dyn_cast<CleanupReturnInst>(InitialBlockTerminator)) {
+  // If it's cleanupret, find the correspondant cleanuppad (use the VMap to
+  // find it).
+  if (auto *InitialBlockTerminatorCleanupReturn =
+          dyn_cast<CleanupReturnInst>(InitialBlockTerminator)) {
     // if none found do nothing
     if (VMap.find(InitialBlockTerminatorCleanupReturn->getCleanupPad()) ==
         VMap.end()) {
@@ -1009,32 +1009,32 @@ dyn_cast<CleanupReturnInst>(InitialBlockTerminator)) {
     }
 
     auto *ClonedBlockTerminatorCleanupReturn =
-cast<CleanupReturnInst>(ClonedBlock->getTerminator());
+        cast<CleanupReturnInst>(ClonedBlock->getTerminator());
 
-          // Assuming reversed post-order traversal
+    // Assuming reversed post-order traversal
     llvm::Value *ClonedBlockCleanupPadValue =
         VMap[InitialBlockTerminatorCleanupReturn->getCleanupPad()];
     auto *ClonedBlockCleanupPad =
-cast<CleanupPadInst>(ClonedBlockCleanupPadValue);
-            ClonedBlockTerminatorCleanupReturn->setCleanupPad(ClonedBlockCleanupPad);
-          
-        // If it's a branch/invoke, keep track of its successors, we want to
-          // calculate dominance between CoroBegin and them also
-        } else if (auto *InitialBlockTerminatorBranch =
-dyn_cast<BranchInst>(InitialBlockTerminator)) {
-          for (unsigned int successorIdx = 0;
-successorIdx < InitialBlockTerminatorBranch->getNumSuccessors();
-++successorIdx) {
-            SuccessorBlocksSet.insert(
+        cast<CleanupPadInst>(ClonedBlockCleanupPadValue);
+    ClonedBlockTerminatorCleanupReturn->setCleanupPad(ClonedBlockCleanupPad);
+
+    // If it's a branch/invoke, keep track of its successors, we want to
+    // calculate dominance between CoroBegin and them also
+  } else if (auto *InitialBlockTerminatorBranch =
+                 dyn_cast<BranchInst>(InitialBlockTerminator)) {
+    for (unsigned int successorIdx = 0;
+         successorIdx < InitialBlockTerminatorBranch->getNumSuccessors();
+         ++successorIdx) {
+      SuccessorBlocksSet.insert(
           InitialBlockTerminatorBranch->getSuccessor(successorIdx));
-          }
-        } else if (auto *InitialBlockTerminatorInvoke =
-dyn_cast<InvokeInst>(InitialBlockTerminator)) {
+    }
+  } else if (auto *InitialBlockTerminatorInvoke =
+                 dyn_cast<InvokeInst>(InitialBlockTerminator)) {
     SuccessorBlocksSet.insert(InitialBlockTerminatorInvoke->getUnwindDest());
   } else if (isa<ReturnInst>(InitialBlockTerminator)) {
     // No action needed
-        } else {
-          InitialBlockTerminator->print(dbgs());
+  } else {
+    InitialBlockTerminator->print(dbgs());
     report_fatal_error("Terminator is not implemented in "
                        "finalizeBasicBlockCloneAndTrackSuccessors");
   }
@@ -1051,12 +1051,12 @@ void splitIfBasicBlockPredecessors(
   std::copy_if(Predecessors.begin(), Predecessors.end(),
                std::back_inserter(InitialBlockPredecessors), Predicate);
 
-        // Fixups on the predecessors terminator - point them to ReplacementBlock.
+  // Fixups on the predecessors terminator - point them to ReplacementBlock.
   for (auto *InitialBlockPredecessor : InitialBlockPredecessors) {
     auto *InitialBlockPredecessorTerminator =
         InitialBlockPredecessor->getTerminator();
     if (auto *InitialBlockPredecessorTerminatorInvoke =
-dyn_cast<InvokeInst>(InitialBlockPredecessorTerminator)) {
+            dyn_cast<InvokeInst>(InitialBlockPredecessorTerminator)) {
       if (InitialBlock ==
           InitialBlockPredecessorTerminatorInvoke->getUnwindDest()) {
         InitialBlockPredecessorTerminatorInvoke->setUnwindDest(
@@ -1066,16 +1066,16 @@ dyn_cast<InvokeInst>(InitialBlockPredecessorTerminator)) {
             ReplacementBlock);
       }
     } else if (auto *InitialBlockPredecessorTerminatorBranch =
-dyn_cast<BranchInst>(InitialBlockPredecessorTerminator)) {
-            for (unsigned int successorIdx = 0;
-                successorIdx <
+                   dyn_cast<BranchInst>(InitialBlockPredecessorTerminator)) {
+      for (unsigned int successorIdx = 0;
+           successorIdx <
            InitialBlockPredecessorTerminatorBranch->getNumSuccessors();
-                ++successorIdx) {
-              if (InitialBlock ==
-InitialBlockPredecessorTerminatorBranch->getSuccessor(
-successorIdx)) {
-                InitialBlockPredecessorTerminatorBranch->setSuccessor(
-successorIdx, ReplacementBlock);
+           ++successorIdx) {
+        if (InitialBlock ==
+            InitialBlockPredecessorTerminatorBranch->getSuccessor(
+                successorIdx)) {
+          InitialBlockPredecessorTerminatorBranch->setSuccessor(
+              successorIdx, ReplacementBlock);
         }
       }
     } else if (auto *InitialBlockPredecessorTerminatorCleanupReturn =
@@ -1083,8 +1083,8 @@ successorIdx, ReplacementBlock);
                        InitialBlockPredecessorTerminator)) {
       InitialBlockPredecessorTerminatorCleanupReturn->setUnwindDest(
           ReplacementBlock);
-          } else {
-            InitialBlockPredecessorTerminator->print(dbgs());
+    } else {
+      InitialBlockPredecessorTerminator->print(dbgs());
       report_fatal_error(
           "Terminator is not implemented in splitIfBasicBlockPredecessors");
     }
@@ -1111,15 +1111,15 @@ splitBasicBlocksNotDominatedByCoroBegin(const FrameDataInfo &FrameData,
       if (!DT.dominates(Shape.CoroBegin, CurrentBlock)) {
         SpillUserBlocksSet.insert(CurrentBlock);
       }
-          }
-        }
-        
-        // Run is in reversed post order, to enforce visiting predecessors before
+    }
+  }
+
+  // Run is in reversed post order, to enforce visiting predecessors before
   // successors
   for (BasicBlock *CurrentBlock : ReversePostOrderTraversal<Function *>(F)) {
     if (!SpillUserBlocksSet.contains(CurrentBlock)) {
-        continue;
-      }
+      continue;
+    }
     SpillUserBlocksSet.erase(CurrentBlock);
 
     // Preserve the current node. the duplicate will become the unspilled
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index f95d214f5..ceac1cc71 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -377,7 +377,7 @@ static void replaceUnwindCoroEnd(AnyCoroEndInst *End, const coro::Shape &Shape,
   if (auto Bundle = End->getOperandBundle(LLVMContext::OB_funclet)) {
     auto *FromPad = cast<CleanupPadInst>(Bundle->Inputs[0]);
 
-    // If the terminator is an invoke, 
+    // If the terminator is an invoke,
     // set the cleanupret unwind destination the same as the other edges, to
     // avoid validation errors
     BasicBlock *UBB = nullptr;
 | 
| Fixed formatter comments | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. It looks much better. Just a few questions about implementation.
| } | ||
|  | ||
| // Dominance issue fixer for each predecessor satisfying predicate function | ||
| void splitIfBasicBlockPredecessors( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static void?
| // mapping to their corresponding clones | ||
| static void finalizeBasicBlockCloneAndTrackSuccessors( | ||
| BasicBlock *InitialBlock, BasicBlock *ClonedBlock, ValueToValueMapTy &VMap, | ||
| SmallSet<BasicBlock *, 20> &SuccessorBlocksSet) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use SmallSetImpl to avoid explicitly passing 20.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you meant SmallPtrSetImpl? It is also bound to storage (in ctor), it expects the caller to pass a storage buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, SmallPtrSetImpl is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually I changed to SmallPtrSet (which was implicitly used anyway in practice when I used SmallSet, as the Set is of ptrs), and not SmallPtrSetImpl, because SmallPtrSetImpl does not spare the storage management as thought initially.
| const coro::Shape &Shape, Function *F, | ||
| DominatorTree &DT) { | ||
| ValueToValueMapTy VMap; | ||
| SmallSet<BasicBlock *, 20> SpillUserBlocksSet; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the value 20 intentional? I suspect the number of basic blocks that are not dominated by coro.begin is small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all my tests, I found <=2, so I will change it to 3
|  | ||
| // If the terminator is an invoke, | ||
| // set the cleanupret unwind destination the same as the other edges, to | ||
| // avoid validation errors | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "validation errors" is confusing to maintainers that not familiar to this bug. Maybe we could drop it.
BTW, there are some similar comments that are context-dependent. It would be great if you could reword them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewritten comments
| // Preserve the current node. the duplicate will become the unspilled | ||
| // alternative | ||
| auto UnspilledAlternativeBlock = | ||
| CloneBasicBlock(CurrentBlock, VMap, ".unspilled_alternative", F); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we clear VMap before reuse it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it's mandatory to keep it across traverse (when tested it previously), but I don't find an actual example to prove it, so I will clear it for now.
| ; In coro-split, this coroutine code reduced IR, produced using clang with async-exceptions | ||
| ; crashed before fix because of the validation mismatch of Instruction does not dominate all uses! | ||
| ; RUN: opt < %s -passes='coro-split' -S 2>&1 | FileCheck %s --implicit-check-not="Instruction does not dominate all uses!" | ||
| ; RUN: opt < %s -passes='default<Os>,coro-split' -S 2>&1 | FileCheck %s --implicit-check-not="Instruction does not dominate all uses!" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why do we need coro-split after default<Os>?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need, it reproduces also just with coro-split, I removed the default<Os>.
| ; In coro-split, this coroutine code reduced IR, produced using clang with async-exceptions | ||
| ; crashed before fix because of the validation mismatch of Instruction does not dominate all uses! | ||
| ; RUN: opt < %s -passes='coro-split' -S 2>&1 | FileCheck %s --implicit-check-not="Instruction does not dominate all uses!" | ||
| ; RUN: opt < %s -passes='default<Os>,coro-split' -S 2>&1 | FileCheck %s --implicit-check-not="Instruction does not dominate all uses!" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking for crashes/errors is already handled implicitly, so it doesn't need explicit mention. Perhaps we could go further by using utils/update_test_checks.py to generate more robust tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix handles 2 different bugs, I wanted to make sure that each IR test file validates that it's correspondent bug is being fixed, but it is the situation also after simplifying the RUN line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a impression that this patch tries to fix problematic patterns after the conversion. This is generally unwanted. We prefer to make the checks before the conversion. e.g., in certain cases, let's avoid adding something as the spills or vice versa.
It will be helpful to me to understand the problem if you can give more description to the issue with IR examples.
…t traversing func if no violating spill users
| The fix is done right before the conversion, so no logics was changed in the conversion itself. 
 
 | 
| Ready for CR | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reading your patch carefully and noticed you did not split llvm.seh.scope.begin into resume/destroy functions? I suspect that unhandled_exception would not be invoked if an exception were thrown within the resume/destroy function.
|  | ||
| 14: ; preds = %12, %1 | ||
| %15 = cleanuppad within none [] | ||
| store i32 0, ptr %0, align 4 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you are clearing pointer in unhandled_exception(). Could you please confirm it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pr148035_inst_does_not_dominate.ll is generated using llvm-reduce of a basic cpp file provided in the Issue.
It should only be an input test file to reproduce the potential bug, the fix I implemented takes this instruction as given.
Anyway, I think that this store is a product of the llvm-reduce, reduction of the default_delete of the std::unique_ptr
  %108 = cleanuppad within none []
  call void @"??1?$unique_ptr@HU?$default_delete@H@std@@@std@@QEAA@XZ"(ptr noundef nonnull align 8 dereferenceable(8) %0) #4 [ "funclet"(token %108) ]
  cleanupret from %108 unwind to caller
| My goal is to fix the coroutine functions nodes dominance violations. 
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing dominance violations is necessary, but not sufficient. We must ensure the input IR is semantically correct and that transformations are semantics-preserving. Specific to llvm.seh.scope.begin/end, we must keep them paired.
I agree with @ChuanqiXu9 that CoroFrame.cpp may not be the ideal place to put these logics. Given the patch is done before coro frame construction, it would be better to place it earlier, e.g., in CoroSplit.cpp.
FWIW, I think we need a clearer motivation beyond focusing on the validation error, which could potentially hide the root issues.
Fixes #148035
The problem
3 cpp files from #148035 that are based on cppreference coroutines tutorial (the actual code, and 2 minor changes on top of it - of using std::unique_ptr as byval argument[s] to coroutine) are failing to compile under (+assertions) clang-cl with the parameters:
/c /EHa /std:c++latest /OsWhy?
It seems that there are 2 broken invariants during the coroutine split pass
How?
The solution here aims to restore those 2 invariants: