Skip to content

Commit a350882

Browse files
authored
Refactor Chunk to represent the split preceding the chunk's text. (#1110)
* Refactor Chunk to represent the split preceding the chunk's text. The internal representation that the formatter users is a series of chunks of code that can't contain any internal line breaks separated by information describing the split between each one: .----. .-----. .----. .-----. .----. .-----. ... |text| |split| |text| |split| |text| |split| ... '----' '-----' '----' '-----' '----' '-----' To have a homogeneous list of objects, each text/split pair is wrapped into a single Chunk class. For historical reasons, they were paired up so that each Chunk has the text and the split that follows it: .--------------. .--------------. .--------------. | Chunk | | Chunk | | Chunk | |.----. .-----.| |.----. .-----.| |.----. .-----.| ... ||text| |split|| ||text| |split|| ||text| |split|| ... |'----' '-----'| |'----' '-----'| |'----' '-----'| '--------------' '--------------' '--------------' When all we had was a flat list of chunks, this mostly didn't matter. However, in order to improve the performance of line-splitting large expressions containing nested function expressions and collection literals, I moved to a more tree-like representation where some Chunks contain a list of child Chunks. This made having each Chunk terminate with a split more annoying to work with because now the first child Chunk doesn't actually contain the information to describe where it begins on its own line. In order to be able to format child blocks independently and cache them, it makes more sense if each Chunk describes an independent line: the leading indentation followed by the text. I have tried this refactoring several times in the past, but it causes an unbelievable amount of off-by-one errors and other problems. I finally pushed through and got everything working again. This change doesn't actually simplify the codebase very much right now. My goal was to get the Chunk representation changed with as little changes to the rest of the formatter as possible. My hope is that this will enable future refactoring and simplifications. The external behavior is almost completely unchanged. On a corpus of 2,000 Pub packages (10+MLOC), there were <100 differences compared to the previous formatter behavior. All of those differences were actually (very rare) bugs, which this refactoring incidentally fixes. * Apply review feedback.
1 parent 7b3b600 commit a350882

25 files changed

+1169
-603
lines changed

CHANGELOG.md

Lines changed: 22 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,27 @@
1+
# 2.2.4-dev
2+
3+
* Refactor Chunk to store split before text instead of after. This mostly does
4+
not affect the visible behavior of the formatter, but a few edge cases are
5+
handled slightly differently. These are all bug fixes where the previous
6+
behavior was unintentional. The changes are:
7+
8+
* Consistently discard blank lines between a `{` or `[` and a subsequent
9+
comment. It used to do this before the `{` in type bodies, but not switch
10+
bodies, optional parameter sections, or named parameter sections.
11+
12+
* Don't allow splitting an empty class body.
13+
14+
* Allow splitting after an inline block comment in some places where it makes
15+
sense.
16+
17+
* Don't allow a line comment in an argument list to cause preceding arguments
18+
to be misformatted.
19+
20+
* Remove blank lines after a line comment at the end of a body.
21+
122
# 2.2.3
223

3-
- Allow the latest version of `package:analyzer`.
24+
* Allow the latest version of `package:analyzer`.
425

526
# 2.2.2
627

example/format.dart

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,8 @@
11
// Copyright (c) 2015, the Dart project authors. Please see the AUTHORS file
22
// for details. All rights reserved. Use of this source code is governed by a
33
// BSD-style license that can be found in the LICENSE file.
4+
library dart_style.example.format;
5+
46
import 'dart:io';
57
import 'dart:mirrors';
68

lib/src/chunk.dart

Lines changed: 66 additions & 115 deletions
Original file line numberDiff line numberDiff line change
@@ -42,16 +42,17 @@ abstract class Selection {
4242
}
4343
}
4444

45-
/// A chunk of non-breaking output text terminated by a hard or soft newline.
45+
/// A chunk of non-breaking output text that may begin on a newline.
4646
///
47-
/// Chunks are created by [LineWriter] and fed into [LineSplitter]. Each
48-
/// contains some text, along with the data needed to tell how the next line
49-
/// should be formatted and how desireable it is to split after the chunk.
47+
/// Chunks are created by [ChunkBuilder] and fed into [LineSplitter]. Each
48+
/// contains the data describing where the chunk should appear when starting a
49+
/// new line, how desireable it is to split, and the subsequent text for that
50+
/// line.
5051
///
51-
/// Line splitting after chunks comes in a few different forms.
52+
/// Line splitting before a chunk comes in a few different forms.
5253
///
5354
/// * A "hard" split is a mandatory newline. The formatted output will contain
54-
/// at least one newline after the chunk's text.
55+
/// at least one newline before the chunk's text.
5556
/// * A "soft" split is a discretionary newline. If a line doesn't fit within
5657
/// the page width, one or more soft splits may be turned into newlines to
5758
/// wrap the line to fit within the bounds. If a soft split is not turned
@@ -61,8 +62,8 @@ abstract class Selection {
6162
/// blank line in the output. Hard or soft splits may be doubled. This is
6263
/// determined by [isDouble].
6364
///
64-
/// A split controls the leading spacing of the subsequent line, both
65-
/// block-based [indent] and expression-wrapping-based [nesting].
65+
/// A split controls the leading spacing of the line before the chunk's text,
66+
/// both block-based [indent] and expression-wrapping-based [nesting].
6667
class Chunk extends Selection {
6768
/// The literal text output for the chunk.
6869
@override
@@ -74,21 +75,19 @@ class Chunk extends Selection {
7475
///
7576
/// For top level chunks that are not inside any block, this also includes
7677
/// leading indentation.
77-
int? get indent => _indent;
78-
int? _indent;
78+
final int indent;
7979

80-
/// The expression nesting level following this chunk.
80+
/// The expression nesting level preceding this chunk.
8181
///
8282
/// This is used to determine how much to increase the indentation when a
83-
/// line starts after this chunk. A single statement may be indented multiple
83+
/// line starts at this chunk. A single statement may be indented multiple
8484
/// times if the splits occur in more deeply nested expressions, for example:
8585
///
8686
/// // 40 columns |
8787
/// someFunctionName(argument, argument,
8888
/// argument, anotherFunction(argument,
8989
/// argument));
90-
NestingLevel? get nesting => _nesting;
91-
NestingLevel? _nesting;
90+
final NestingLevel nesting;
9291

9392
/// If this chunk marks the beginning of a block, this contains the child
9493
/// chunks and other data about that nested block.
@@ -100,52 +99,28 @@ class Chunk extends Selection {
10099
/// Whether this chunk has a [block].
101100
bool get isBlock => _block != null;
102101

103-
/// Whether it's valid to add more text to this chunk or not.
104-
///
105-
/// Chunks are built up by adding text and then "capped off" by having their
106-
/// split information set by calling [handleSplit]. Once the latter has been
107-
/// called, no more text should be added to the chunk since it would appear
108-
/// *before* the split.
109-
bool get canAddText => _rule == null;
110-
111-
/// The [Rule] that controls when a split should occur after this chunk.
102+
/// The [Rule] that controls when a split should occur before this chunk.
112103
///
113104
/// Multiple splits may share a [Rule].
114-
Rule? get rule => _rule;
115-
Rule? _rule;
105+
final Rule rule;
116106

117-
/// Whether or not an extra blank line should be output after this chunk if
118-
/// it's split.
119-
///
120-
/// Internally, this can be either `true`, `false`, or `null`. The latter is
121-
/// an indeterminate state that lets later modifications to the split decide
122-
/// whether it should be double or not.
123-
///
124-
/// However, this getter does not expose that. It will return `false` if the
125-
/// chunk is still indeterminate.
126-
bool get isDouble => _isDouble ?? false;
127-
bool? _isDouble;
107+
/// Whether or not an extra blank line should be output before this chunk if
108+
/// it splits.
109+
bool get isDouble => _isDouble;
110+
bool _isDouble;
128111

129-
/// If `true`, then the line after this chunk should always be at column
130-
/// zero regardless of any indentation or expression nesting.
112+
/// If `true`, then the line beginning with this chunk should always be at
113+
/// column zero regardless of any indentation or expression nesting.
131114
///
132115
/// Used for multi-line strings and commented out code.
133116
bool get flushLeft => _flushLeft;
134-
bool _flushLeft = false;
135-
136-
/// If `true`, then the line after this chunk and its contained block should
137-
/// be flush left.
138-
bool get flushLeftAfter {
139-
if (!isBlock) return _flushLeft;
117+
bool _flushLeft;
140118

141-
return block.chunks.last.flushLeftAfter;
142-
}
143-
144-
/// Whether this chunk should append an extra space if it does not split.
119+
/// Whether this chunk should prepend an extra space if it does not split.
145120
///
146-
/// This is `true`, for example, in a chunk that ends with a ",".
121+
/// This is `true`, for example, in a chunk following a ",".
147122
bool get spaceWhenUnsplit => _spaceWhenUnsplit;
148-
bool _spaceWhenUnsplit = false;
123+
bool _spaceWhenUnsplit;
149124

150125
/// Whether this chunk marks the end of a range of chunks that can be line
151126
/// split independently of the following chunks.
@@ -155,7 +130,7 @@ class Chunk extends Selection {
155130
late final bool _canDivide;
156131

157132
/// The number of characters in this chunk when unsplit.
158-
int get length => _text.length + (spaceWhenUnsplit ? 1 : 0);
133+
int get length => (_spaceWhenUnsplit ? 1 : 0) + _text.length;
159134

160135
/// The unsplit length of all of this chunk's block contents.
161136
///
@@ -175,59 +150,49 @@ class Chunk extends Selection {
175150
/// The [Span]s that contain this chunk.
176151
final spans = <Span>[];
177152

178-
/// Creates a new chunk starting with [_text].
179-
Chunk(this._text);
153+
/// Creates a new empty chunk with the given split properties.
154+
Chunk(this.rule, this.indent, this.nesting,
155+
{required bool space, required bool flushLeft, required bool isDouble})
156+
: _text = '',
157+
_flushLeft = flushLeft,
158+
_isDouble = isDouble,
159+
_spaceWhenUnsplit = space;
180160

181161
/// Creates a dummy chunk.
182162
///
183163
/// This is returned in some places by [ChunkBuilder] when there is no useful
184164
/// chunk to yield and it will not end up being used by the caller anyway.
185-
Chunk.dummy() : _text = '(dummy)';
186-
187-
/// Discard the split for the chunk and put it back into the state where more
188-
/// text can be appended.
189-
void allowText() {
190-
_rule = null;
191-
}
192-
193-
/// Append [text] to the end of the split's text.
165+
Chunk.dummy()
166+
: _text = '(dummy)',
167+
rule = Rule.dummy,
168+
indent = 0,
169+
nesting = NestingLevel(),
170+
_spaceWhenUnsplit = false,
171+
_flushLeft = false,
172+
_isDouble = false;
173+
174+
/// Append [text] to the end of the chunk's text.
194175
void appendText(String text) {
195-
assert(canAddText);
196176
_text += text;
197177
}
198178

199-
/// Finishes off this chunk with the given [rule] and split information.
200-
///
201-
/// This may be called multiple times on the same split since the splits
202-
/// produced by walking the source and the splits coming from comments and
203-
/// preserved whitespace often overlap. When that happens, this has logic to
204-
/// combine that information into a single split.
205-
void applySplit(Rule rule, int indent, NestingLevel nesting,
206-
{bool? flushLeft, bool? isDouble, bool? space}) {
207-
flushLeft ??= false;
208-
space ??= false;
209-
if (rule.isHardened) {
210-
// A hard split always wins.
211-
_rule = rule;
212-
} else {
213-
_rule ??= rule;
214-
}
179+
/// Updates the split information for a previously created chunk in response
180+
/// to a split from a comment.
181+
void updateSplit({bool? flushLeft, bool isDouble = false, bool? space}) {
182+
assert(text.isEmpty);
215183

216-
// Last split settings win.
217-
_flushLeft = flushLeft;
218-
_nesting = nesting;
219-
_indent = indent;
184+
if (flushLeft != null) _flushLeft = flushLeft;
220185

221-
_spaceWhenUnsplit = space;
186+
// Don't discard an already known blank newline, but do potentially add one.
187+
if (isDouble) _isDouble = true;
222188

223-
// Pin down the double state, if given and we haven't already.
224-
_isDouble ??= isDouble;
189+
if (space != null) _spaceWhenUnsplit = space;
225190
}
226191

227192
/// Turns this chunk into one that can contain a block of child chunks.
228-
void makeBlock(Chunk? blockArgument, {required bool indent}) {
193+
void makeBlock(Chunk? blockArgument) {
229194
assert(_block == null);
230-
_block = ChunkBlock(blockArgument, indent);
195+
_block = ChunkBlock(blockArgument);
231196
}
232197

233198
/// Returns `true` if the block body owned by this chunk should be expression
@@ -243,7 +208,7 @@ class Chunk extends Selection {
243208
// There may be no rule if the block occurs inside a string interpolation.
244209
// In that case, it's not clear if anything will look particularly nice, but
245210
// expression nesting is probably marginally better.
246-
if (rule == null) return true;
211+
if (rule == Rule.dummy) return true;
247212

248213
return rule.isSplit(getValue(rule), argument);
249214
}
@@ -255,28 +220,17 @@ class Chunk extends Selection {
255220

256221
@override
257222
String toString() {
258-
var parts = [];
259-
260-
if (text.isNotEmpty) parts.add(text);
261-
262-
if (_indent != null) parts.add('indent:$_indent');
263-
if (spaceWhenUnsplit == true) parts.add('space');
264-
if (_isDouble == true) parts.add('double');
265-
if (_flushLeft == true) parts.add('flush');
266-
267-
var rule = _rule;
268-
if (rule == null) {
269-
parts.add('(no split)');
270-
} else {
271-
parts.add(rule.toString());
272-
if (rule.isHardened) parts.add('(hard)');
273-
274-
if (rule.constrainedRules.isNotEmpty) {
275-
parts.add("-> ${rule.constrainedRules.join(' ')}");
276-
}
277-
}
278-
279-
return parts.join(' ');
223+
var parts = [
224+
'indent:$indent',
225+
if (spaceWhenUnsplit) 'space',
226+
if (isDouble) 'double',
227+
if (flushLeft) 'flush',
228+
'$rule${rule.isHardened ? '!' : ''}',
229+
if (rule.constrainedRules.isNotEmpty)
230+
"-> ${rule.constrainedRules.join(' ')}"
231+
];
232+
233+
return '[${parts.join(' ')}] `$text`';
280234
}
281235
}
282236

@@ -290,13 +244,10 @@ class ChunkBlock {
290244
/// may need extra expression-level indentation.
291245
final Chunk? argument;
292246

293-
/// Whether the first chunk should have a level of indentation before it.
294-
final bool indent;
295-
296247
/// The child chunks in this block.
297248
final List<Chunk> chunks = [];
298249

299-
ChunkBlock(this.argument, this.indent);
250+
ChunkBlock(this.argument);
300251
}
301252

302253
/// Constants for the cost heuristics used to determine which set of splits is

0 commit comments

Comments
 (0)