Fix a SimplifyCFG typo that leads to unbounded optimization

atrick · atrick · commit e3e2849ca309 · 2021-01-26T23:31:26.000-08:00
Fixes rdar://73357726 ([SR-14068]: Compiling with optimisation runs indefinitely for grpc-swift)

The root cause of this problem is that SimplifyCFG::tryJumpThreading jump threads into loops, effectively peeling loops. This is not the right way to implement loop peeling. That belongs in a loop optimization pass. There is simply no sane way to control jump threading if it is allowed across loop boundaries, both from the standpoint of requiring optimizations to terminate and from the standpoint of reducing senseless code bloat.

SimplifyCFG does have a mechanism to avoid jump threading into loops in most cases. That mechanism would actually prevent the infinite loop peeling in this particular case if it were implemented correctly. But the original implementation circa 2014 appears to have a typo: when a trampoline block that is a loop header is removed, the branching block is recorded as a loop header instead of the trampoline's destination. This commit fixes that obvious bug. I do not think it is sufficient to ensure we never see the bad behavior. I will file separate bugs for the broader issue.

This bad behavior was exposed incidentally by splitting critical edges. Without edge splitting, SimplifyCFG::simplifyBlocks only performs "jump threading" once, creating a critical edge to the loop header. Because simplifyBlocks works under the assumption that there are no critical edges, it never attempts to perform jump threading again. In other words, the presence of the critical edge "breaks" the optimization, preventing it from continuing as intended.

With edge splitting, the simplifyBlocks worklist performs "jump threading" followed by "jump to trampoline" removal, which creates a new loop-back edge to the original loop header. This is fine. However, simplifyBlocks iteratively attempts all optimizations up to a fixed point, and it does not stop at loop headers! So, splitting the critical edge causes simplifyBlocks to work as intended, which leads to infinite loop peeling.

The end result is an infinite sequence of nested loops. Each peeled iteration is actually within the parent loop.
(cherry picked from commit 8948f75)
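
To make the typo concrete, here is a minimal standalone sketch of the loop-header bookkeeping around trampoline removal. It is not the SimplifyCFG code: blocks are modeled as plain strings, and `TrampolineRemoval`, `propagateLoopHeader`, and the `fixed` flag are hypothetical names introduced only for illustration. The only part taken from the patch is which block gets inserted into the loop-header set.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a basic block; the real pass uses SILBasicBlock*.
using Block = std::string;

// The three blocks involved when a branch to a trampoline is retargeted.
struct TrampolineRemoval {
  Block pred;        // plays the role of BB: the block whose branch is rewritten
  Block trampoline;  // plays the role of DestBB: a block that only forwards the branch
  Block finalDest;   // plays the role of trampolineDest.destBB
};

// If the trampoline was a loop header, some block must inherit that status
// once the trampoline is bypassed. `fixed == false` models the old behavior
// (marks the predecessor); `fixed == true` models the patched behavior
// (marks the block that is now the branch target).
void propagateLoopHeader(std::set<Block> &loopHeaders,
                         const TrampolineRemoval &r, bool fixed) {
  if (loopHeaders.count(r.trampoline))
    loopHeaders.insert(fixed ? r.finalDest : r.pred);
}
```

In this model, the old behavior leaves `finalDest` unmarked, so a later check of the form "is the jump target a loop header?" would not fire for it. The patched behavior marks it, which is what lets the existing guard against threading into loops apply.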
diff --git a/lib/SILOptimizer/Transforms/SimplifyCFG.cpp b/lib/SILOptimizer/Transforms/SimplifyCFG.cpp
@@ -1401,7 +1401,7 @@ bool SimplifyCFG::simplifyBranchBlock(BranchInst *BI) {
     // Eliminating the trampoline can expose opportunities to improve the
     // new block we branch to.
     if (LoopHeaders.count(DestBB))
-      LoopHeaders.insert(BB);
+      LoopHeaders.insert(trampolineDest.destBB);
 
     addToWorklist(trampolineDest.destBB);
     BI->eraseFromParent();
diff --git a/test/SILOptimizer/simplify_cfg_simple.sil b/test/SILOptimizer/simplify_cfg_simple.sil
@@ -7,6 +7,11 @@ sil_stage canonical
 import Builtin
 import Swift
 
+internal enum Enum {
+  case one
+  case two
+}
+
 // CHECK-LABEL: sil @simple_test : $@convention(thin) () -> () {
 // CHECK: bb0:
 // CHECK-NEXT: tuple
@@ -27,3 +32,64 @@ bb3:
   %9999 = tuple ()
   return %9999 : $()
 }
+
+// Test that SimplifyCFG::simplifyBlocks, tryJumpThreading does not
+// perform unbounded loop peeling.
+//
+// rdar://73357726 ([SR-14068]: Compiling with optimisation runs indefinitely for grpc-swift)
+// CHECK-LABEL: sil @testInfinitePeeling : $@convention(method) (Builtin.Int64, Enum) -> () {
+//
+// There is only one switch_enum blocks, and it is no longer in a loop.
+// CHECK: bb0(%0 : $Builtin.Int64, %1 : $Enum):
+// CHECK:   switch_enum %1 : $Enum, case #Enum.one!enumelt: bb3, case #Enum.two!enumelt: bb4
+// CHECK: bb1:
+// CHECK:   br bb8
+// CHECK: bb2:
+// CHECK:   br bb5(%{{.*}} : $Enum)
+//
+// This is the original cond_br block
+// CHECK: bb3:
+// CHECK:   cond_br %{{.*}}, bb2, bb1
+// CHECK: bb4:
+// CHECK:   br bb5(%1 : $Enum)
+//
+// This is the cond_br block after jump-threading.
+// CHECK: bb5(%{{.*}} : $Enum):
+// CHECK:   cond_br %{{.*}}, bb6, bb7
+// CHECK: bb6:
+// CHECK:   br bb5(%{{.*}} : $Enum)
+// CHECK: bb7:
+// CHECK:   br bb8
+// CHECK: bb8:
+// CHECK:   return %19 : $()
+// CHECK-LABEL: } // end sil function 'testInfinitePeeling'
+sil @testInfinitePeeling : $@convention(method) (Builtin.Int64, Enum) -> () {
+bb0(%0 : $Builtin.Int64, %1 : $Enum):
+  %2 = integer_literal $Builtin.Int64, 99999999
+  br bb1(%0 : $Builtin.Int64, %1 : $Enum)
+
+bb1(%4 : $Builtin.Int64, %5 : $Enum):
+  switch_enum %5 : $Enum, case #Enum.one!enumelt: bb4, default bb5
+
+bb2(%7 : $Builtin.Int64, %8 : $Enum):
+  %9 = builtin "cmp_slt_Int64"(%2 : $Builtin.Int64, %7 : $Builtin.Int64) : $Builtin.Int1
+  cond_br %9, bb3, bb6
+
+bb3:
+  br bb1(%7 : $Builtin.Int64, %8 : $Enum)
+
+bb4:
+  %12 = integer_literal $Builtin.Int64, 1
+  %13 = integer_literal $Builtin.Int1, -1
+  %14 = builtin "sadd_with_overflow_Int64"(%4 : $Builtin.Int64, %12 : $Builtin.Int64, %13 : $Builtin.Int1) : $(Builtin.Int64, Builtin.Int1)
+  %15 = tuple_extract %14 : $(Builtin.Int64, Builtin.Int1), 0
+  %16 = enum $Enum, #Enum.two!enumelt
+  br bb2(%15 : $Builtin.Int64, %16 : $Enum)
+
+bb5:
+  br bb2(%2 : $Builtin.Int64, %5 : $Enum)
+
+bb6:
+  %19 = tuple ()
+  return %19 : $()
+}