Use all unsolved pieces on the offending line as expand pieces. (#1494)

munificent · web-flow · commit 0a785297b6c8 · 2024-05-20T15:22:19.000-07:00
The solver works by incrementally building up a solution by binding pieces to states one at a time. To avoid wasting time exploring solutions that are pointless, it only looks at unbound pieces that were in play when the first line containing overflow characters or an invalid newline was written.

Before this PR, it only looked at the *first* unbound piece on that line. Often, the first piece on the line is not the one that actually needs to split. For example, in:

```dart
//                    |
variable = function(argument);
```

Here, the first piece on the overflowing line is the AssignPiece for the `=`, but the piece we actually want to split at is the ListPiece for the argument list.

To handle that, the solver (before this PR) tries binding the first piece to all values, including State.unsplit, even though that's effectively the value the current solution used, since unbound pieces behave like they have State.unsplit. The only reason it makes a new solution and binds the piece to State.unsplit is that if that piece turns out to *not* be the one that needs to split, we can now find the *next* piece on that same line. Now that the first piece is *bound* to State.unsplit, the second piece will be the first *unbound* one.

But the end result is that we end up generating a lot of more or less redundant solutions that just bind a bunch of pieces to State.unsplit and then produce the exact same formatted result.
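
A rough sketch of that pre-PR behavior, using a toy model rather than the real classes. `ToyPiece`, `expandFirst`, and the state name `hypotheticalSplit` are made up for illustration; only the idea of trying every state, including unsplit, on the first unbound piece comes from the description above.

```dart
// Toy stand-in for the formatter's pieces. Not the real Piece class.
class ToyPiece {
  final String name;
  final List<String> additionalStates;
  const ToyPiece(this.name, this.additionalStates);
}

const unsplit = 'unsplit';

/// Pre-PR shape (assumed): expand only the first unbound piece on the line,
/// trying every state, including unsplit.
List<Map<String, String>> expandFirst(
    Map<String, String> bound, List<ToyPiece> linePieces) {
  for (var piece in linePieces) {
    if (bound.containsKey(piece.name)) continue;
    return [
      for (var state in [unsplit, ...piece.additionalStates])
        {...bound, piece.name: state}
    ];
  }
  return const [];
}

void main() {
  const line = [
    ToyPiece('assign', ['hypotheticalSplit']),
    ToyPiece('argList', ['hypotheticalSplit']),
  ];

  // First expansion only touches 'assign'.
  var first = expandFirst({}, line);
  print(first);

  // The child that binds 'assign' to unsplit formats the same as its parent.
  // It exists only so that 'argList' becomes the first unbound piece.
  var second = expandFirst(first.first, line);
  print(second);
}
```

In this toy run, reaching the argument list takes an extra solution whose only job is to bind `assign` to unsplit. With more pieces on the line, those filler solutions pile up.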

Instead, this PR collects *all* of the unbound pieces on the first overflowing line. When it expands, it expands them all, but *doesn't* bind any of them to State.unsplit. The old formatter works the same way, but it wasn't clear to me that doing so was important for perf. It is!
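
A minimal sketch of the new expansion shape, again on a toy model. `ToyPiece`, `expandAll`, and the state names are hypothetical; the loop structure follows the `Solution` change in the diff, where pieces earlier in the list are bound to unsplit alongside the piece being expanded, so no solution exists solely to bind a piece to unsplit.

```dart
// Toy stand-in for the formatter's pieces. Not the real Piece class.
class ToyPiece {
  final String name;
  final List<String> additionalStates;
  const ToyPiece(this.name, this.additionalStates);
}

const unsplit = 'unsplit';

/// Expands every piece on the problematic line in one pass. Each new
/// solution binds one piece to a non-default state and binds the pieces
/// before it to unsplit, since their other states were already covered by
/// earlier iterations of the outer loop.
List<Map<String, String>> expandAll(
    Map<String, String> bound, List<ToyPiece> expandPieces) {
  var solutions = <Map<String, String>>[];
  for (var i = 0; i < expandPieces.length; i++) {
    for (var state in expandPieces[i].additionalStates) {
      var newStates = {...bound};
      for (var j = 0; j < i; j++) {
        newStates[expandPieces[j].name] = unsplit;
      }
      newStates[expandPieces[i].name] = state;
      solutions.add(newStates);
    }
  }
  return solutions;
}

void main() {
  const line = [
    ToyPiece('assign', ['hypotheticalA', 'hypotheticalB']),
    ToyPiece('argList', ['hypotheticalSplit']),
  ];

  // One expansion yields a solution per non-default state of every piece.
  for (var solution in expandAll({}, line)) {
    print(solution);
  }
}
```

Every solution produced here changes at least one piece away from unsplit, so each one can format differently from its parent. The real code also tracks cost and discards solutions that hit constraint violations, which this sketch leaves out.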

```

Benchmark (tall)                fastest   median  slowest  average  baseline
-----------------------------  --------  -------  -------  -------  --------
block                             0.065    0.067    0.131    0.070     96.3%
chain                             1.515    1.540    1.629    1.547    172.2%
collection                        0.169    0.173    0.189    0.175     98.6%
collection_large                  0.896    0.926    0.959    0.925     96.5%
conditional                       0.088    0.089    0.104    0.091    179.9%
curry                             1.651    1.667    1.689    1.668    147.9%
flutter_popup_menu_test           0.409    0.422    0.448    0.423    116.8%
flutter_scrollbar_test            0.154    0.159    0.176    0.159     94.2%
function_call                     1.428    1.457    1.625    1.463     98.1%
infix_large                       0.680    0.702    0.776    0.708    144.4%
infix_small                       0.163    0.167    0.193    0.169     97.5%
interpolation                     0.090    0.093    0.120    0.094    107.0%
large                             4.552    4.631    4.987    4.650     90.6%
top_level                         0.180    0.183    0.214    0.185    135.0%
```

My goal is to get the new formatter as fast as the old one on a real-world corpus. Here are the results for formatting the Flutter repo:

```
Current formatter    15.890 ========================================
Optimized            14.309 ====================================
Old formatter         7.131 =================
The current formatter is 55.12% slower than the old formatter.
The optimized is 11.05% faster than the current formatter.
The optimized is 50.16% slower than the old formatter.
The optimization gets the formatter 18.05% of the way to the old one.
```
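
For readers checking the arithmetic, the percentages above can be reproduced from the three timings. The benchmark script itself isn't shown here, so the formulas below are inferred from the printed numbers, and the function names are made up for this sketch.

```dart
// Timings copied from the results above.
const current = 15.890;
const optimized = 14.309;
const old = 7.131;

/// "X is N% slower than Y": the gap relative to the slower time (assumed).
double slowerThan(double slow, double fast) => (slow - fast) / slow;

/// "X is N% faster than Y": the gap relative to the faster time (assumed).
double fasterThan(double fast, double slow) => (slow - fast) / fast;

/// Fraction of the current-to-old gap that the optimization closes.
double progress(double current, double optimized, double old) =>
    (current - optimized) / (current - old);

/// Formats a fraction as a percentage with two decimal places.
String percent(double fraction) => (fraction * 100).toStringAsFixed(2);

void main() {
  print('current vs old:       ${percent(slowerThan(current, old))}% slower');
  print('optimized vs current: ${percent(fasterThan(optimized, current))}% faster');
  print('optimized vs old:     ${percent(slowerThan(optimized, old))}% slower');
  print('progress toward old:  ${percent(progress(current, optimized, old))}%');
}
```

Note that the "slower" and "faster" figures use different denominators under this reading, so they are not directly comparable with each other.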

So not a huge improvement, but a big step in the right direction.
diff --git a/lib/src/back_end/code_writer.dart b/lib/src/back_end/code_writer.dart
@@ -72,22 +72,18 @@ class CodeWriter {
   /// this line.
   bool _foundExpandLine = false;
 
-  /// The first solvable piece on the first overflowing or invalid line, if
-  /// we've found one.
+  /// The solvable pieces on the first overflowing or invalid line, if we've
+  /// found any.
   ///
   /// A piece is "solvable" if we haven't already bound it to a state and there
   /// are multiple states it accepts. This is the piece whose states will be
   /// bound when we expand the [Solution] that this [CodeWriter] is building
   /// into further solutions.
   ///
-  /// If [_foundExpandLine] is `false`, then this is the first solvable piece
-  /// that has written text to the current line. It may not actually be an
-  /// expand piece. We don't know until we reach the end of the line to see if
-  /// it overflows or is invalid. If the line is OK, then [_nextPieceToExpand]
-  /// is cleared when the next line begins. If [_foundExpandLine] is `true`,
-  /// then this known to be the piece that will be expanded next for this
-  /// solution.
-  Piece? _nextPieceToExpand;
+  /// If [_foundExpandLine] is `true`, then this contains the list of unsolved
+  /// pieces that were being formatted when text was written to the first
+  /// problematic line.
+  final List<Piece> _expandPieces = [];
 
   /// The stack of solvable pieces currently being formatted.
   ///
@@ -96,6 +92,10 @@ class CodeWriter {
   /// solution if the line ends up overflowing.
   final List<Piece> _currentUnsolvedPieces = [];
 
+  /// The set of unsolved pieces that were being formatted when text was
+  /// written to the current line.
+  final Set<Piece> _currentLinePieces = {};
+
   /// [leadingIndent] is the number of spaces of leading indentation at the
   /// beginning of each line independent of indentation created by pieces being
   /// written.
@@ -106,12 +106,12 @@ class CodeWriter {
     _pendingIndent = leadingIndent;
   }
 
-  /// Returns the final formatted text and the next piece that can be expanded
+  /// Returns the final formatted text and the next pieces that can be expanded
   /// from the solution this [CodeWriter] is writing, if any.
-  (String, Piece?) finish() {
+  (String, List<Piece>) finish() {
     _finishLine();
 
-    return (_buffer.toString(), _nextPieceToExpand);
+    return (_buffer.toString(), _expandPieces);
   }
 
   /// Appends [text] to the output.
@@ -128,11 +128,9 @@ class CodeWriter {
     _column += text.length;
 
     // If we haven't found an overflowing line yet, then this line might be one
-    // so keep track of the pieces we've encountered.
-    if (!_foundExpandLine &&
-        _nextPieceToExpand == null &&
-        _currentUnsolvedPieces.isNotEmpty) {
-      _nextPieceToExpand = _currentUnsolvedPieces.first;
+    // so keep track of the unsolved pieces we've encountered on it.
+    if (!_foundExpandLine) {
+      _currentLinePieces.addAll(_currentUnsolvedPieces);
     }
   }
 
@@ -377,17 +375,16 @@ class CodeWriter {
       _solution.addOverflow(_column - _pageWidth);
     }
 
-    // If we found a problematic line, and there is a piece on the line that
-    // we can try to split, then remember that piece so that the solution will
-    // expand it next.
-    if (!_foundExpandLine &&
-        _nextPieceToExpand != null &&
-        (_column > _pageWidth || !_solution.isValid)) {
-      // We found a problematic line, so remember it and the piece on it.
+    // If we found a problematic line, and there are pieces on the line that
+    // we can try to split, then remember them so that the solution will expand
+    // them next.
+    if (!_foundExpandLine && (_column > _pageWidth || !_solution.isValid)) {
+      // We found a problematic line, so remember the pieces on it.
       _foundExpandLine = true;
+      _expandPieces.addAll(_currentLinePieces);
     } else if (!_foundExpandLine) {
       // This line was OK, so we don't need to expand the piece on it.
-      _nextPieceToExpand = null;
+      _currentLinePieces.clear();
     }
   }
 }
diff --git a/lib/src/back_end/solution.dart b/lib/src/back_end/solution.dart
@@ -72,20 +72,20 @@ class Solution implements Comparable<Solution> {
   ///
   /// So we skip past any pieces that aren't on overflowing lines or on lines
   /// whose newline led to an invalid solution. Further, it's also the case
-  /// that splitting an earlier pieces will often reshuffle the formatting of
-  /// much of the code following it.
+  /// that splitting earlier pieces will often reshuffle the formatting of much
+  /// of the code following it.
   ///
-  /// Thus we only worry about the *first* unsolved piece on the first
-  /// problematic line when expanding. If selecting states for that piece still
-  /// doesn't help, the solver will work its way through later pieces from those
-  /// subsequenct partial solutions.
+  /// Thus we only worry about unsolved pieces on the *first* problematic line
+  /// when expanding. If selecting states for those pieces still doesn't help,
+  /// the solver will work its way through later pieces from those subsequent
+  /// partial solutions.
   ///
   /// This lets us efficiently skip through almost all of the pieces that don't
   /// need to be touched in order to find a valid solution.
   ///
-  /// If this is `null`, then there are no further solutions to generate from
+  /// If this is empty, then there are no further solutions to generate from
   /// this one. It's either a dead end or a winner.
-  late final Piece? _nextPieceToExpand;
+  late final List<Piece> _expandPieces;
 
   /// The offset in [text] where the selection starts, or `null` if there is
   /// no selection.
@@ -125,10 +125,10 @@ class Solution implements Comparable<Solution> {
 
     var writer = CodeWriter(pageWidth, leadingIndent, cache, this);
     writer.format(root);
-    var (text, nextPieceToExpand) = writer.finish();
+    var (text, expandPieces) = writer.finish();
 
     _text = text;
-    _nextPieceToExpand = nextPieceToExpand;
+    _expandPieces = expandPieces;
   }
 
   /// Attempt to eagerly bind [piece] to a state given that it must fit within
@@ -215,8 +215,8 @@ class Solution implements Comparable<Solution> {
     _invalidPiece = piece;
   }
 
-  /// Derives new potential solutions from this one by binding
-  /// [_nextPieceToExpand] to all of its possible states.
+  /// Derives new potential solutions from this one by binding [_expandPieces]
+  /// to all of their possible states.
   ///
   /// If there is no potential piece to expand, or all attempts to expand it
   /// fail, returns an empty list.
@@ -227,27 +227,48 @@ class Solution implements Comparable<Solution> {
     // the same way, so discard the whole solution tree hanging off this one.
     if (_invalidPiece case var piece? when isBound(piece)) return const [];
 
-    var expandPiece = _nextPieceToExpand;
-
     // If there is no piece that we can expand on this solution, it's a dead
     // end (or a winner).
-    if (expandPiece == null) return const [];
+    if (_expandPieces.isEmpty) return const [];
 
-    // For each state that the expanding piece can be in, create a new solution
-    // that inherits all of the bindings of this one, and binds the expanding
-    // piece to that state (along with any further pieces constrained by that
-    // one).
     var solutions = <Solution>[];
-    for (var state in expandPiece.states) {
-      var newStates = {..._pieceStates};
-
-      var additionalCost = _tryBind(newStates, expandPiece, state);
-
-      // Discard the solution if we hit a constraint violation.
-      if (additionalCost == null) continue;
-
-      solutions.add(Solution._(cache, root, pageWidth, leadingIndent,
-          cost + additionalCost, newStates));
+    for (var i = 0; i < _expandPieces.length; i++) {
+      // For each non-default state that the expanding piece can be in, create
+      // a new solution that inherits all of the bindings of this one, and binds
+      // the expanding piece to that state (along with any further pieces
+      // constrained by that one).
+      var expandPiece = _expandPieces[i];
+      for (var state in expandPiece.additionalStates) {
+        var newStates = {..._pieceStates};
+
+        // Bind all preceding expand pieces to their unsplit state. Their
+        // other states have already been expanded by earlier iterations of
+        // the outer for loop.
+        var valid = true;
+        var additionalCost = 0;
+        for (var j = 0; j < i; j++) {
+          if (_tryBind(newStates, _expandPieces[j], State.unsplit)
+              case var cost?) {
+            additionalCost += cost;
+          } else {
+            valid = false;
+            break;
+          }
+        }
+
+        // Discard the solution if we hit a constraint violation.
+        if (!valid) continue;
+
+        if (_tryBind(newStates, expandPiece, state) case var cost?) {
+          additionalCost += cost;
+        } else {
+          // Discard the solution if we hit a constraint violation.
+          continue;
+        }
+
+        solutions.add(Solution._(cache, root, pageWidth, leadingIndent,
+            cost + additionalCost, newStates));
+      }
     }
 
     return solutions;
diff --git a/lib/src/back_end/solver.dart b/lib/src/back_end/solver.dart
@@ -62,7 +62,7 @@ class Solver {
     if (debug.traceSolver) {
       var unsolved = <Piece>[];
       void traverse(Piece piece) {
-        if (piece.states.length > 1) unsolved.add(piece);
+        if (piece.additionalStates.isNotEmpty) unsolved.add(piece);
 
         piece.forEachChild(traverse);
       }
diff --git a/lib/src/piece/piece.dart b/lib/src/piece/piece.dart
@@ -15,15 +15,6 @@ typedef Constrain = void Function(Piece other, State constrainedState);
 /// formatting and line splitting. The final output is then determined by
 /// deciding which pieces split and how.
 abstract class Piece {
-  /// The ordered list of ways this piece may split.
-  ///
-  /// This is [State.unsplit], which all pieces support, followed by any other
-  /// [additionalStates].
-  List<State> get states {
-    if (_pinnedState case var pinned?) return [pinned];
-    return [State.unsplit, ...additionalStates];
-  }
-
   /// The ordered list of all possible ways this piece could split.
   ///
   /// Piece subclasses should override this if they support being split in
