-
Notifications
You must be signed in to change notification settings - Fork 12
Parametric: formatting #677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
79abb8f to
f0930d1
Compare
rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/ParametricTextDocumentService.java
Outdated
Show resolved
Hide resolved
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this is a bit of a challenging API, especially if we want to make this more user-friendly for DSL users.
I have the feeling we should involve @jurgenvinju who has been on the topic of formatting/pretty-printing for quite some time already. I suspect that we're missing some opportunities of matching this up with something already in the standard library, or something that we might should add to the standard library first.
rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc
Outdated
Show resolved
Hide resolved
|
This PR is missing the adjacent: |
4b58c98 to
4aca661
Compare
bac08fb to
2b7fb47
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is going in the right direction, but I'm not sure what I should have been reviewing, the implementation or the design?
| @@ -0,0 +1,157 @@ | |||
| module util::Format | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should think about moving a version of this to stdlib, I don't want us to have custom vs code only std lib like modules.
So as you suggested, some might be nice to migrate there? I don't think all of them are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we merge this PR, we'll have to get rid of this module. So that would be a reason to wait with merging this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code or something that does the same belongs in Box2Text. This has the same abstraction level, and also this is the same code for all DSLs. It should reside in a reusable generic component like Box2Text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want quick reuse, we could move these functions to Box2Text and call them in the top-level format function. Later some of the features could be fused into the layout algorithm to avoid revisiting the string again and again.
rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc
Outdated
Show resolved
Hide resolved
...-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/InterpretedLanguageContributions.java
Outdated
Show resolved
Hide resolved
2b7fb47 to
e05f700
Compare
9e7fe1a to
523c2fe
Compare
523c2fe to
ab448fc
Compare
rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/util/locations/impl/ArrayLineOffsetMap.java
Outdated
Show resolved
Hide resolved
rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/util/locations/impl/ArrayLineOffsetMap.java
Outdated
Show resolved
Hide resolved
|
Is there an opportunity to use the Focus abstraction for this contribution? Especially for the range alternative. That way language engineers don't have to go search for the right trees anymore, and they could easily support only a few top-level types for starters, and they could easily recover the required indentation level from the parent tree's layout siblings. |
@jurgenvinju I thought about this, but the tricky part there is that a range does not necessarily correspond to an exact tree. Even if we give a focus tree that ends at the largest tree encapsulating the range, there would still be the need to filter the edits so they are not outside of the given range. |
7a9d722 to
01d9853
Compare
a9d7907 to
ce46294
Compare
|
| </dependencies> | ||
| <configuration> | ||
| <systemPropertyVariables> | ||
| <forkMode>always</forkMode> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this change in the pom.xml has to do with this PR?
| final ILanguageContributions contribs = contributions(uri); | ||
|
|
||
| // call the `formatting` implementation of the relevant language contribution | ||
| var fileState = getFile(uri); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: fileState can be inlined in the next line.
| // just a range | ||
| var start = Locations.toRascalPosition(uri, range.getStart(), columns); | ||
| var end = Locations.toRascalPosition(uri, range.getEnd(), columns); | ||
| // compute the focus list at the end of the range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment is not accurate?
| var optionsType = tf.abstractDataType(typeStore, "FormattingOptions"); | ||
| var consType = tf.constructor(typeStore, optionsType, "formattingOptions"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be stored in a field, and only looked-up at the start?
| logger.trace("Common focus suffix length: {}", commonSuffix.length()); | ||
| // The range spans multiple subtrees. The easy way out is not to focus farther down than | ||
| // their smallest common subtree (i.e. `commonSuffix`) - let's see if we can do any better. | ||
| if (TreeAdapter.isList((ITree) commonSuffix.get(0))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, this is quite clean 👍
| final var selected = elements.stream() | ||
| .map(ITree.class::cast) | ||
| .dropWhile(t -> { | ||
| final var l = TreeAdapter.getLocation(t); | ||
| // only include layout if the element before it is selected as well | ||
| return TreeAdapter.isLayout(t) | ||
| ? rightOfBegin(l, startLine, startColumn) | ||
| : rightOfEnd(l, startLine, startColumn); | ||
| }) | ||
| .takeWhile(t -> { | ||
| final var l = TreeAdapter.getLocation(t); | ||
| // only include layout if the element after it is selected as well | ||
| return TreeAdapter.isLayout(t) | ||
| ? rightOfEnd(l, endLine, endColumn) | ||
| : rightOfBegin(l, endLine, endColumn); | ||
| }) | ||
| .collect(VF.listWriter()); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternative:
var result = VF.listWriter();
boolean inside = false;
for (var e : elements) {
var t = (ITree)e;
var l = TreeAdapter.getLocation(t);
var isLayout = TreeAdapter.isLayout(t);
if (!inside) {
inside = isLayout ? rightOfBegin(l, startLine, startColumn) : rightOfEnd(l, startLine, startColumn)
} else if (isLayout ? rightOfEnd(l, endLine, endColumn) : rightOfBegin(l, endLine, endColumn)) {
break;
}
if (inside) {
result.add(t);
}
}
var selected = result.done();not sure if better, was just wondering if it got more compact by turning it into a loop.
| begin | ||
| declare | ||
| input : natural, | ||
| output : natural, | ||
| repnr : natural, | ||
| rep : natural; | ||
|
|
||
| begin | ||
| declare | ||
| input : natural, output : natural, repnr : natural, rep : natural | ||
| ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this change break the UI test that have some line numbers from this example?
also, shouldn't we commit the unformatted version, and then run a test that triggers the formatter and see that indeed it worked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an error/incompleteness in the Pico toBox function. It should pretty print like the original does. The reason is that the default heuristics of toBox pick HOV layout for comma-separated lists.
| @@ -0,0 +1,157 @@ | |||
| module util::Format | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we merge this PR, we'll have to get rid of this module. So that would be a reason to wait with merging this PR.
| * The optional `prepareRename` service argument discovers places in the editor where a ((util::LanguageServer::rename)) is possible. If renameing the location is not supported, it should throw an exception. | ||
| * The ((didRenameFiles)) service collects ((DocumentEdit))s corresponding to renamed files (e.g. to rename a class when the class file was renamed). The IDE applies the edits after moving the files. It might fail and report why in diagnostics. | ||
| * The ((selectionRange)) service discovers selections around a cursor, that a user might want to select. It expects the list of source locations to be in ascending order of size (each location should be contained by the next) - similar to ((Focus)) trees. | ||
| * The ((formatting)) service determines what edits to do to format (part of) a file. The `range` parameter determines what part of the file to format. For whole-file formatting, `_tree.top == range`. ((FormattingOptions)) influence how formatting treats whitespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a problem with tree.top since that drops the comments at the start & end of the file. shouldn't tree == range? So including the start part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no range parameter anymore since we have a Focus list now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indeed outdated now
| import org.rascalmpl.values.parsetrees.TreeAdapter; | ||
| import org.rascalmpl.vscode.lsp.util.RascalServices; | ||
|
|
||
| public class TreeSearchTests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great that we have some actual tests for this piece of code 👍🏼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| str original = "<input[-1]>"; | ||
| box = toBox(input[-1]); | ||
| box = visit (box) { case i:I(_) => i[is=opts.tabSize] } | ||
| formatted = format(box); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code would look better if the options were passed directly to format and it would implement them.
For me the code drops under a certain level of abstraction (under the syntax-directed level) with all the string-level operations like replaceLast, perLine etc, while format and box2text which indeed take care of these details normally.
| list[TextEdit] picoFormattingService(Focus input, FormattingOptions opts) { | ||
| str original = "<input[-1]>"; | ||
| box = toBox(input[-1]); | ||
| box = visit (box) { case i:I(_) => i[is=opts.tabSize] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a good example for users on how to parameterize the indentation level. The reason is that not all I boxes will correspond to the default indentation level in a typical language. Some are different.
So indentation level should be a parameter to toBox where the spec writer can decide where to use it and where not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to discover this now and here. A few minor additions to toBox should help. We have to think hard how to make passing the parameters simple and easy, because toBox is highly recursive and we might skip accidentally to pass it on..
|
|
||
| // instead of computing all edits and filtering, we can be more efficient by only formatting certain trees. | ||
| loc range = input[0]@\loc; | ||
| filteredEdits = [e | e <- edits, isContainedIn(e.range, range)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some troubles with this filtering, for the Pico example and in general. The idea would be that the Focus decides already which part to format and which part to leave alone. So here we take the range of the smallest focus while we format the entire file.. but that doesn't seem natural (also it could break parsing).
Filtering the edits on a much larger tree is expensive but it could also be wrong. Maybe you only wrote a formatter for statements and not for expressions, so you selected the smallest statement in the focus, then filtering only the edits in the range will possibly even break the code (make it not parseable due to connecting parts which used to be separated by whitespace or even worse).
So I think we should not demonstrate filtering of edits here. Instead we should only call layoutDiff on the focused element, and think about recovering indentation of a nested tree automatically with layoutDiff (treeDiff already has this feature so layoutDiff could have it too).
|
|
||
| list[TextEdit] picoFormattingService(Focus input, FormattingOptions opts) { | ||
| str original = "<input[-1]>"; | ||
| box = toBox(input[-1]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we formatting the whole file?
| } | ||
| @synopsis{Determine the most-used newline character in a string.} | ||
| str mostUsedNewline(str input, list[str] lineseps = newLineCharacters, str(list[str]) tieBreaker = getFirstFrom) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we hide this under the hood as one of the formatting options?
| * `trimFinalNewlines`; if `true`, and the file ends in one or more new lines, remove them. | ||
| Note: formatting with `insertFinalNewline && trimFinalNewlines` is expected to return a file that ends in a single newline. | ||
| } | ||
| data FormattingOptions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add newlineCharacter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
although we don't want people to write conditional code based on windows vs unix or something... should we hide this too under Box2Text functonality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should align a bit more with toBox and format (Box2Text) and map the LSP parameters to their implementations. This way we can hide the string manipulation details from the Pico formatter contrubuton.



Implementing the formatting and rangeFormatting LSP APIs. Both use the same contribution, where for (whole-file) formatting
_range == _input@\loc.Design:
Closes #130.