Commit c42eee5

Create a second IR between Pieces and output string. (#1517)
* Create a second IR between Pieces and output string.

Before this PR, CodeWriter eagerly built a complete string of formatted code using a StringBuffer. This is simple and works OK, but it's a little slow when separate formatting gets involved. Since a subtree of a Piece tree might be formatted separately, it means we often build a string using a StringBuffer only to then append that result to some surrounding StringBuffer, recursively.

This PR introduces a slightly more abstract representation for "formatted code but not quite a single string". It lets us compose separately formatted subtrees by just adding a single existing object to a list.

The performance gain isn't huge, but is measurable:

```
Benchmark (tall)                fastest   median  slowest  average  baseline
-----------------------------  --------  -------  -------  -------  --------
block                             0.063    0.065    0.114    0.069    101.0%
chain                             0.605    0.618    0.645    0.619    103.6%
collection                        0.159    0.163    0.175    0.164    101.4%
collection_large                  0.856    0.896    3.295    0.997    100.1%
conditional                       0.065    0.067    0.086    0.069     99.7%
curry                             0.547    0.565    0.596    0.565    106.8%
ffi                               0.144    0.150    0.170    0.151    110.0%
flutter_popup_menu_test           0.262    0.272    0.295    0.272    103.4%
flutter_scrollbar_test            0.123    0.125    0.142    0.128    103.7%
function_call                     1.287    1.339    1.474    1.342    101.1%
infix_large                       0.619    0.638    0.669    0.642    107.2%
infix_small                       0.159    0.171    0.719    0.228    105.5%
interpolation                     0.090    0.091    0.114    0.093    100.9%
large                             3.456    3.499    3.537    3.499    101.7%
top_level                         0.141    0.145    0.168    0.147    102.2%
```

And when run on a large repo:

```
Current formatter     8.536  ========================================
Optimized             8.226  ======================================
Old formatter         4.441  ====================
The current formatter is 47.97% slower than the old formatter.
The optimized is 3.77% faster than the current formatter.
The optimized is 46.01% slower than the old formatter.
The optimization gets the formatter 7.57% of the way to the old one.
```

The other motivation for this change is that I'm starting to work on being able to opt a region of code out of formatting and I suspect (but am not sure yet) that this representation will make it easier to slot the existing formatted code into the output as it's being lowered to a string.

* Fix typo.
1 parent 1536c11 commit c42eee5

5 files changed, +201 -74 lines changed

lib/src/back_end/code.dart

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
+// Copyright (c) 2024, the Dart project authors. Please see the AUTHORS file
+// for details. All rights reserved. Use of this source code is governed by a
+// BSD-style license that can be found in the LICENSE file.
+
+/// Base class for an object that represents fully formatted code.
+///
+/// We use this instead of immediately generating a string for the resulting
+/// formatted code because of separate formatting. Often, a subtree of the
+/// [Piece] tree can be solved and formatted separately. The resulting
+/// [Solution] may be used by multiple different surrounding solutions while
+/// the [Solver] works its magic looking for the best solution. When a
+/// separately formatted child solution is merged into its parent, we want that
+/// to be fast. Appending strings to a [StringBuffer] is fairly fast, but not
+/// as fast as simply appending a single [GroupCode] to the parent solution's
+/// [GroupCode].
+sealed class Code {}
+
+/// A [Code] object which can be written to and contain other child [Code]
+/// objects.
+final class GroupCode extends Code {
+  /// The child [Code] objects contained in this group.
+  final List<Code> _children = [];
+
+  /// Appends [text] to this code.
+  void write(String text) {
+    _children.add(_TextCode(text));
+  }
+
+  /// Writes a newline and the subsequent indentation to this code.
+  ///
+  /// If [blank] is `true`, then a blank line is written. Otherwise, only a
+  /// single newline is written. The [indent] parameter is the number of spaces
+  /// of leading indentation on the next line after the newline.
+  void newline({required bool blank, required int indent}) {
+    _children.add(_NewlineCode(blank: blank, indent: indent));
+  }
+
+  /// Adds an entire existing code [group] as a child of this one.
+  void group(GroupCode group) {
+    _children.add(group);
+  }
+
+  /// Mark the selection start as occurring [offset] characters after the code
+  /// that has already been written.
+  void startSelection(int offset) {
+    _children.add(_MarkerCode(_Marker.start, offset));
+  }
+
+  /// Mark the selection end as occurring [offset] characters after the code
+  /// that has already been written.
+  void endSelection(int offset) {
+    _children.add(_MarkerCode(_Marker.end, offset));
+  }
+
+  /// Traverse the [Code] tree and build the final formatted string.
+  ///
+  /// Returns the formatted string and the selection markers if there are any.
+  ({String code, int? selectionStart, int? selectionEnd}) build() {
+    var buffer = StringBuffer();
+    int? selectionStart;
+    int? selectionEnd;
+
+    _build(buffer, (marker, offset) {
+      if (marker == _Marker.start) {
+        selectionStart = offset;
+      } else {
+        selectionEnd = offset;
+      }
+    });
+
+    return (
+      code: buffer.toString(),
+      selectionStart: selectionStart,
+      selectionEnd: selectionEnd
+    );
+  }
+
+  void _build(StringBuffer buffer,
+      void Function(_Marker marker, int offset) markSelection) {
+    for (var i = 0; i < _children.length; i++) {
+      var child = _children[i];
+      switch (child) {
+        case _NewlineCode():
+          // Don't write any leading newlines at the top of the buffer.
+          if (i > 0) {
+            buffer.writeln();
+            if (child._blank) buffer.writeln();
+          }
+
+          buffer.write(_indents[child._indent] ?? (' ' * child._indent));
+
+        case _TextCode():
+          buffer.write(child._text);
+
+        case GroupCode():
+          child._build(buffer, markSelection);
+
+        case _MarkerCode():
+          markSelection(child._marker, buffer.length + child._offset);
+      }
+    }
+  }
+}
+
+/// A [Code] object for a newline followed by any leading indentation.
+final class _NewlineCode extends Code {
+  final bool _blank;
+  final int _indent;
+
+  _NewlineCode({required bool blank, required int indent})
+      : _indent = indent,
+        _blank = blank;
+}
+
+/// A [Code] object for literal source text.
+final class _TextCode extends Code {
+  final String _text;
+
+  _TextCode(this._text);
+}
+
+/// Marks the location of the beginning or end of a selection as occurring
+/// [_offset] characters past the point where this marker object appears in the
+/// list of [Code] objects.
+final class _MarkerCode extends Code {
+  /// What kind of selection endpoint is being marked.
+  final _Marker _marker;
+
+  /// The number of characters past this object where the marker should appear
+  /// in the resulting code.
+  final int _offset;
+
+  _MarkerCode(this._marker, this._offset);
+}
+
+/// Which selection marker is pointed to by a [_MarkerCode].
+enum _Marker { start, end }
+
+/// Pre-calculated whitespace strings for various common levels of indentation.
+///
+/// Generating these ahead of time is faster than concatenating multiple spaces
+/// at runtime.
+const _indents = {
+  2: '  ',
+  4: '    ',
+  6: '      ',
+  8: '        ',
+  10: '          ',
+  12: '            ',
+  14: '              ',
+  16: '                ',
+  18: '                  ',
+  20: '                    ',
+  22: '                      ',
+  24: '                        ',
+  26: '                          ',
+  28: '                            ',
+  30: '                              ',
+  32: '                                ',
+  34: '                                  ',
+  36: '                                    ',
+  38: '                                      ',
+  40: '                                        ',
+  42: '                                          ',
+  44: '                                            ',
+  46: '                                              ',
+  48: '                                                ',
+  50: '                                                  ',
+  52: '                                                    ',
+  54: '                                                      ',
+  56: '                                                        ',
+  58: '                                                          ',
+  60: '                                                            ',
+};
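
The composition idea in this file can be sketched in isolation. The block below is a simplified stand-in, not the code from this commit: `Node`, `Text`, `Group`, and `composeDemo` are hypothetical names, and newlines and selection markers are left out. It only shows that a child group formatted once can be added to several candidate parents with a list append, and that the string is built once at the end.

```dart
// Simplified stand-in for the Code tree in this commit. The names Node, Text,
// and Group are hypothetical; the real classes are Code, _TextCode, GroupCode.
sealed class Node {}

final class Text extends Node {
  final String text;
  Text(this.text);
}

final class Group extends Node {
  final List<Node> children = [];

  void write(String text) => children.add(Text(text));

  // Composing a finished subtree is a single list append, not a string copy.
  void group(Group child) => children.add(child);

  // Flatten the tree into one string, once, at the very end.
  String build() {
    var buffer = StringBuffer();
    _build(buffer);
    return buffer.toString();
  }

  void _build(StringBuffer buffer) {
    for (var child in children) {
      switch (child) {
        case Text():
          buffer.write(child.text);
        case Group():
          child._build(buffer);
      }
    }
  }
}

/// Reuses one separately formatted child in two candidate parents.
List<String> composeDemo() {
  var args = Group()..write('(a, b)');

  var first = Group()
    ..write('foo')
    ..group(args);
  var second = Group()
    ..write('longerName')
    ..group(args);

  return [first.build(), second.build()];
}
```

Calling `composeDemo()` builds both parents from the same `args` object; neither parent copies the child's text until `build()` runs.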

lib/src/back_end/code_writer.dart

Lines changed: 14 additions & 31 deletions
@@ -5,6 +5,7 @@ import 'dart:math';
 
 import '../piece/piece.dart';
 import '../profile.dart';
+import 'code.dart';
 import 'solution.dart';
 import 'solution_cache.dart';
 
@@ -28,8 +29,8 @@ class CodeWriter {
   /// The solution this [CodeWriter] is generating code for.
   final Solution _solution;
 
-  /// Buffer for the code being written.
-  final StringBuffer _buffer = StringBuffer();
+  /// The code being written.
+  final GroupCode _code = GroupCode();
 
   /// What whitespace should be written before the next non-whitespace text.
   ///
@@ -106,12 +107,12 @@ class CodeWriter {
     _pendingIndent = leadingIndent;
   }
 
-  /// Returns the final formatted text and the next pieces that can be expanded
+  /// Returns the final formatted code and the next pieces that can be expanded
   /// from the solution this [CodeWriter] is writing, if any.
-  (String, List<Piece>) finish() {
+  (GroupCode, List<Piece>) finish() {
     _finishLine();
 
-    return (_buffer.toString(), _expandPieces);
+    return (_code, _expandPieces);
   }
 
   /// Appends [text] to the output.
@@ -124,7 +125,7 @@ class CodeWriter {
   /// selections inside lexemes are correctly updated.
   void write(String text) {
     _flushWhitespace();
-    _buffer.write(text);
+    _code.write(text);
     _column += text.length;
 
     // If we haven't found an overflowing line yet, then this line might be one
@@ -253,20 +254,7 @@ class CodeWriter {
     _flushWhitespace();
 
     _solution.mergeSubtree(solution);
-
-    // If a selection marker was in the child piece, set it in this piece,
-    // relative to where the child's code is appended.
-    if (solution.selectionStart case var start?) {
-      _solution.startSelection(_buffer.length + start);
-    }
-
-    if (solution.selectionEnd case var end?) {
-      _solution.endSelection(_buffer.length + end);
-    }
-
-    Profile.begin('CodeWriter.format() write separate piece text');
-    _buffer.write(solution.text);
-    Profile.end('CodeWriter.format() write separate piece text');
+    _code.group(solution.code);
   }
 
   /// Format [piece] writing directly into this [CodeWriter].
@@ -334,13 +322,13 @@ class CodeWriter {
   /// Sets [selectionStart] to be [start] code units into the output.
   void startSelection(int start) {
     _flushWhitespace();
-    _solution.startSelection(_buffer.length + start);
+    _code.startSelection(start);
   }
 
   /// Sets [selectionEnd] to be [end] code units into the output.
   void endSelection(int end) {
     _flushWhitespace();
-    _solution.endSelection(_buffer.length + end);
+    _code.endSelection(end);
   }
 
   /// Write any pending whitespace.
@@ -355,18 +343,13 @@ class CodeWriter {
 
       case Whitespace.newline:
       case Whitespace.blankLine:
-        // Don't write any leading newlines at the top of the buffer.
-        if (_buffer.isNotEmpty) {
-          _finishLine();
-          _buffer.writeln();
-          if (_pendingWhitespace == Whitespace.blankLine) _buffer.writeln();
-        }
-
+        _finishLine();
         _column = _pendingIndent;
-        _buffer.write(' ' * _column);
+        _code.newline(
+            blank: _pendingWhitespace == Whitespace.blankLine, indent: _column);
 
       case Whitespace.space:
-        _buffer.write(' ');
+        _code.write(' ');
         _column++;
     }
 
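
The selection changes in this file replace an eagerly computed absolute offset (`_buffer.length + start`) with a marker whose offset is resolved while the final string is built. A minimal sketch of that deferral follows, under the assumption of a flat list rather than a tree; `Part`, `Chunk`, `Mark`, `resolve`, and `markerDemo` are hypothetical names that do not appear in the commit.

```dart
// Hypothetical minimal model: a list of parts where a marker records an offset
// relative to its own position, like _MarkerCode in this commit.
sealed class Part {}

final class Chunk extends Part {
  final String text;
  Chunk(this.text);
}

final class Mark extends Part {
  final int offset;
  Mark(this.offset);
}

/// Joins [parts] and resolves the marker (if any) to an absolute offset.
(String, int?) resolve(List<Part> parts) {
  var buffer = StringBuffer();
  int? selection;
  for (var part in parts) {
    switch (part) {
      case Chunk():
        buffer.write(part.text);
      case Mark():
        // Absolute position = text emitted so far + the relative offset.
        selection = buffer.length + part.offset;
    }
  }
  return (buffer.toString(), selection);
}

/// The same child parts land at different absolute offsets per parent.
List<int?> markerDemo() {
  var child = <Part>[Mark(1), Chunk('value')];
  var (_, shallow) = resolve([Chunk('x = '), ...child]);
  var (_, deep) = resolve([Chunk('longName = '), ...child]);
  return [shallow, deep];
}
```

Because the marker stays relative until the string is built, the child needs no adjustment when it is placed after prefixes of different lengths.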

lib/src/back_end/solution.dart

Lines changed: 5 additions & 30 deletions
@@ -3,6 +3,7 @@
 // BSD-style license that can be found in the LICENSE file.
 import '../piece/piece.dart';
 import '../profile.dart';
+import 'code.dart';
 import 'code_writer.dart';
 import 'solution_cache.dart';
 
@@ -49,8 +50,8 @@ class Solution implements Comparable<Solution> {
   int _subtreeCost = 0;
 
   /// The formatted code.
-  String get text => _text;
-  late final String _text;
+  GroupCode get code => _code;
+  late final GroupCode _code;
 
   /// False if this Solution contains a newline where one is prohibited.
   ///
@@ -109,16 +110,6 @@ class Solution implements Comparable<Solution> {
   /// this one. It's either a dead end or a winner.
   late final List<Piece> _expandPieces;
 
-  /// The offset in [text] where the selection starts, or `null` if there is
-  /// no selection.
-  int? get selectionStart => _selectionStart;
-  int? _selectionStart;
-
-  /// The offset in [text] where the selection ends, or `null` if there is
-  /// no selection.
-  int? get selectionEnd => _selectionEnd;
-  int? _selectionEnd;
-
   /// Creates a new [Solution] with no pieces set to any state (which
   /// implicitly means they have state [State.unsplit] unless they're pinned to
   /// another state).
@@ -187,22 +178,6 @@ class Solution implements Comparable<Solution> {
     _subtreeCost += subtreeSolution.cost;
   }
 
-  /// Sets [selectionStart] to be [start] code units into the output.
-  ///
-  /// This should only be called by [CodeWriter].
-  void startSelection(int start) {
-    assert(_selectionStart == null);
-    _selectionStart = start;
-  }
-
-  /// Sets [selectionEnd] to be [end] code units into the output.
-  ///
-  /// This should only be called by [CodeWriter].
-  void endSelection(int end) {
-    assert(_selectionEnd == null);
-    _selectionEnd = end;
-  }
-
   /// Mark this solution as having a newline where none is permitted by [piece]
   /// and is thus not a valid solution.
   ///
@@ -319,9 +294,9 @@ class Solution implements Comparable<Solution> {
       SolutionCache cache, Piece root, int pageWidth, int leadingIndent) {
     var writer = CodeWriter(pageWidth, leadingIndent, cache, this);
     writer.format(root);
-    var (text, expandPieces) = writer.finish();
 
-    _text = text;
+    var (code, expandPieces) = writer.finish();
+    _code = code;
     _expandPieces = expandPieces;
   }
 
lib/src/back_end/solver.dart

Lines changed: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ class Solver {
 
       if (debug.traceSolver) {
         debug.log(debug.bold('Try #$attempts $solution'));
-        debug.log(solution.text);
+        debug.log(solution.code.build().code);
         debug.log('');
       }
 
@@ -134,7 +134,7 @@ class Solver {
     if (debug.traceSolver) {
       debug.unindent();
       debug.log(debug.bold('Solved $root to $best:'));
-      debug.log(best.text);
+      debug.log(best.code.build().code);
       debug.log('');
     }
 