Skip to content

Commit cb16515

Browse files
authored
Simplify subtree merging. (#1500)
A key optimization in the new formatting is formatting entire branches of the piece tree separately and then merging in the results. This lets us reuse the results of solving and formatting that subtree across multiple solutions. When merging the results back, it's important that the resulting solution has the same cost and other metrics as if the subtree had not been formatted separately. Otherwise, the optimization can change the formatting behavior. When I first implemented subtree merging, I ran into some issues where it seemed like the resulting cost wasn't right. I wasn't able to figure out exactly why. To fix it, I made sure we didn't double count the cost of any given bound piece by merging each piece state in the subtree back into the parent solution and only counting its cost if the parent didn't already have that piece bound. After a lot of debugging, I figured out why bound pieces would sometimes get double counted otherwise. The problem goes like this: 1. We create a solution. When creating it, we format it which traverses the piece tree and ends up separately formatting some subtrees and merging those into this solution. 2. Later, we expand this solution into a series of child solutions. When we do, we copy the parent solution's bound pieces and its cost. 3. When we go to format each of these expanded child solutions, we again traverse the piece tree which ends up separately formatting many of the same subtrees. For each one, we add in the cost of the subtree solution. But we've already added in that cost from the parent solution, which the expanded child inherited, so now we've double counted. This PR fixes that: For each solution, we separately track the cost of the pieces directly bound by the solution and the cost it accumulated from merging subtrees. When expanding a solution, we inherit the former cost, but not the latter. That way, when we go to format the expanded solution and merge the subtrees in again, we only count those subtree costs once, when the child merges them. This is a small but noticeable speed improvement: ``` Benchmark (tall) fastest median slowest average baseline ----------------------------- -------- ------- ------- ------- -------- block 0.063 0.068 0.120 0.071 109.4% chain 0.627 0.641 0.674 0.643 103.1% collection 0.161 0.173 0.197 0.173 108.9% collection_large 0.856 0.908 0.975 0.908 107.7% conditional 0.064 0.066 0.076 0.067 103.6% curry 0.585 0.596 0.652 0.600 101.7% flutter_popup_menu_test 0.271 0.280 0.294 0.279 108.2% flutter_scrollbar_test 0.128 0.129 0.146 0.132 124.3% function_call 1.307 1.370 1.486 1.369 110.4% infix_large 0.659 0.696 0.727 0.694 106.3% infix_small 0.168 0.179 0.204 0.179 103.6% interpolation 0.091 0.095 0.115 0.097 99.5% large 3.431 3.598 3.761 3.610 102.4% top_level 0.140 0.142 0.169 0.144 104.0% ``` And when formatting flutter/packages/flutter: ``` Current formatter 8.624 ======================================== Optimized 8.233 ====================================== Old formatter 4.455 ==================== The current formatter is 48.34% slower than the old formatter. The optimized is 4.75% faster than the current formatter. The optimized is 45.89% slower than the old formatter. The optimization gets the formatter 9.38% of the way to the old one. ``` But, maybe more importantly, my hope is that this is a step towards untangling cost calculation from formatting, so that I can move towards calculating cost eagerly and then only formatting a solution once we know it's the winner.
1 parent cedd052 commit cb16515

File tree

1 file changed

+14
-16
lines changed

1 file changed

+14
-16
lines changed

lib/src/back_end/solution.dart

Lines changed: 14 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -35,9 +35,19 @@ class Solution implements Comparable<Solution> {
3535
final Map<Piece, List<State>> _allowedStates;
3636

3737
/// The amount of penalties applied based on the chosen line splits.
38-
int get cost => _cost;
38+
int get cost => _cost + _subtreeCost;
39+
40+
/// The cost of this solution based on pieces it has bound to states itself,
41+
/// excluding pieces from separately formatted subtrees.
3942
int _cost;
4043

44+
/// The cost of this solution from branches of the piece tree that were
45+
/// separately formatted and merged in using [mergeSubtree()].
46+
///
47+
/// We track this separately so that when expanding a solution, we don't
48+
/// double count the cost of separately formatted branches.
49+
int _subtreeCost = 0;
50+
4151
/// The formatted code.
4252
String get text => _text;
4353
late final String _text;
@@ -173,20 +183,8 @@ class Solution implements Comparable<Solution> {
173183
/// This is called when a subtree of a Piece tree is solved separately and
174184
/// the resulting solution is being merged with this one.
175185
void mergeSubtree(Solution subtreeSolution) {
176-
Profile.begin('Solution.mergeSubtree()');
177-
178186
_overflow += subtreeSolution._overflow;
179-
180-
// Add the subtree's bound pieces to this one. Make sure to not double
181-
// count costs for pieces that are already bound in this one.
182-
subtreeSolution._pieceStates.forEach((piece, state) {
183-
_pieceStates.putIfAbsent(piece, () {
184-
_cost += piece.stateCost(state);
185-
return state;
186-
});
187-
});
188-
189-
Profile.end('Solution.mergeSubtree()');
187+
_subtreeCost += subtreeSolution.cost;
190188
}
191189

192190
/// Sets [selectionStart] to be [start] code units into the output.
@@ -237,7 +235,7 @@ class Solution implements Comparable<Solution> {
237235
var expandPiece = _expandPieces[i];
238236
for (var state
239237
in _allowedStates[expandPiece] ?? expandPiece.additionalStates) {
240-
var expanded = Solution._(cache, root, pageWidth, leadingIndent, cost,
238+
var expanded = Solution._(cache, root, pageWidth, leadingIndent, _cost,
241239
{..._pieceStates}, {..._allowedStates});
242240

243241
// Bind all preceding expand pieces to their unsplit state. Their
@@ -308,7 +306,7 @@ class Solution implements Comparable<Solution> {
308306
];
309307

310308
return [
311-
'\$$cost',
309+
'\$$cost ($_cost + $_subtreeCost)',
312310
if (overflow > 0) '($overflow over)',
313311
if (!isValid) '(invalid)',
314312
states.join(' '),

0 commit comments

Comments
 (0)