Parametric: formatting #677

toinehartman · 2025-07-11T08:50:24Z

Implementing the formatting and rangeFormatting LSP APIs. Both use the same contribution, where for (whole-file) formatting _range == _input@\loc.

Design:

data LanguageService = formatting(list[TextEdit](Focus _focus, FormattingOptions _opts) formattingService);

data FormattingOptions(
        int tabSize = 4
      , bool insertSpaces = true
      , bool trimTrailingWhitespace = false
      , bool insertFinalNewline = false
      , bool trimFinalNewlines = false
) = formattingOptions();

Closes #130.

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/ParametricTextDocumentService.java

rascal-lsp/src/main/rascal/library/util/LanguageServer.rsc

sonarqubecloud · 2025-07-15T08:33:26Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

DavyLandman

I see this is a bit of a challenging API, especially if we want to make this more user-friendly for DSL users.

I have the feeling we should involve @jurgenvinju, who has been working on formatting/pretty-printing for quite some time already. I suspect that we're missing some opportunities to match this up with something already in the standard library, or with something that we should perhaps add to the standard library first.

rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc

rascal-lsp/src/main/rascal/library/util/LanguageServer.rsc

DavyLandman · 2025-08-11T11:17:50Z

This PR is missing the adjacent rangeFormatting functionality.

rascal-lsp/src/main/rascal/library/util/Format.rsc

DavyLandman

I think this is going in the right direction, but I'm not sure what I should have been reviewing, the implementation or the design?

DavyLandman · 2025-08-19T15:17:15Z

rascal-lsp/src/main/rascal/library/util/Format.rsc

@@ -0,0 +1,157 @@
+module util::Format


We should think about moving a version of this to the stdlib; I don't want us to have custom, VS Code-only, stdlib-like modules.

So as you suggested, some might be nice to migrate there? I don't think all of them are.

usethesource/rascal#2373

Before we merge this PR, we'll have to get rid of this module. So that would be a reason to wait with merging this PR.

I think this code or something that does the same belongs in Box2Text. This has the same abstraction level, and also this is the same code for all DSLs. It should reside in a reusable generic component like Box2Text.

If we want quick reuse, we could move these functions to Box2Text and call them in the top-level format function. Later some of the features could be fused into the layout algorithm to avoid revisiting the string again and again.

rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc

rascal-lsp/src/main/rascal/library/util/LanguageServer.rsc

...-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/InterpretedLanguageContributions.java

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/util/locations/impl/ArrayLineOffsetMap.java

jurgenvinju · 2025-08-29T12:46:41Z

Is there an opportunity to use the Focus abstraction for this contribution? Especially for the range alternative.

That way language engineers don't have to go search for the right trees anymore, and they could easily support only a few top-level types for starters, and they could easily recover the required indentation level from the parent tree's layout siblings.

toinehartman · 2025-08-29T12:52:34Z

> Is there an opportunity to use the Focus abstraction for this contribution? Especially for the range alternative.

@jurgenvinju I thought about this, but the tricky part there is that a range does not necessarily correspond to an exact tree. Even if we give a focus tree that ends at the largest tree encapsulating the range, there would still be the need to filter the edits so they are not outside of the given range.
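The filtering step described here can be sketched roughly as follows. This is only an illustration in Java: `Span`, `Edit` and `restrictTo` are hypothetical names that use plain character offsets, not types from rascal-lsp or LSP4J, and the PR itself does this on Rascal `TextEdit` values.

```java
import java.util.ArrayList;
import java.util.List;

final class EditFilterSketch {
    /** Hypothetical half-open character span [begin, end); not a type from rascal-lsp. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        /** True if `other` lies completely inside this span. */
        boolean contains(Span other) {
            return begin <= other.begin && other.end <= end;
        }
    }

    /** Hypothetical text edit: replace the text covered by `span` with `newText`. */
    static final class Edit {
        final Span span;
        final String newText;

        Edit(Span span, String newText) {
            this.span = span;
            this.newText = newText;
        }
    }

    /** Keep only the edits that fall completely inside the requested range. */
    static List<Edit> restrictTo(Span range, List<Edit> edits) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : edits) {
            if (range.contains(e.span)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

In this sketch an edit that straddles the range boundary is dropped as a whole, so the result depends on how fine-grained the computed edits are.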

sonarqubecloud · 2025-09-15T11:30:26Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
50.7% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

DavyLandman · 2025-09-15T14:25:31Z

rascal-lsp/pom.xml

        </dependencies>
+        <configuration>
+          <systemPropertyVariables>
+            <forkMode>always</forkMode>


What does this change in the pom.xml have to do with this PR?

DavyLandman · 2025-09-15T14:27:54Z

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/ParametricTextDocumentService.java

+        final ILanguageContributions contribs = contributions(uri);
+
+        // call the `formatting` implementation of the relevant language contribution
+        var fileState = getFile(uri);


nit: fileState can be inlined in the next line.

DavyLandman · 2025-09-15T14:28:15Z

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/ParametricTextDocumentService.java

+                // just a range
+                var start = Locations.toRascalPosition(uri, range.getStart(), columns);
+                var end = Locations.toRascalPosition(uri, range.getEnd(), columns);
+                // compute the focus list at the end of the range


I think this comment is not accurate?

DavyLandman · 2025-09-15T14:30:24Z

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/ParametricTextDocumentService.java

+        var optionsType = tf.abstractDataType(typeStore, "FormattingOptions");
+        var consType = tf.constructor(typeStore, optionsType, "formattingOptions");


Can these be stored in a field and only looked up once at the start?

DavyLandman · 2025-09-15T14:33:30Z

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/util/locations/impl/TreeSearch.java

+        logger.trace("Common focus suffix length: {}", commonSuffix.length());
+        // The range spans multiple subtrees. The easy way out is not to focus farther down than
+        // their smallest common subtree (i.e. `commonSuffix`) - let's see if we can do any better.
+        if (TreeAdapter.isList((ITree) commonSuffix.get(0))) {


nice, this is quite clean 👍

DavyLandman · 2025-09-15T14:41:47Z

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/util/locations/impl/TreeSearch.java

+        final var selected = elements.stream()
+            .map(ITree.class::cast)
+            .dropWhile(t -> {
+                final var l = TreeAdapter.getLocation(t);
+                // only include layout if the element before it is selected as well
+                return TreeAdapter.isLayout(t)
+                    ? rightOfBegin(l, startLine, startColumn)
+                    : rightOfEnd(l, startLine, startColumn);
+            })
+            .takeWhile(t -> {
+                final var l = TreeAdapter.getLocation(t);
+                // only include layout if the element after it is selected as well
+                return TreeAdapter.isLayout(t)
+                    ? rightOfEnd(l, endLine, endColumn)
+                    : rightOfBegin(l, endLine, endColumn);
+            })
+            .collect(VF.listWriter());
+


alternative:

var result = VF.listWriter();
boolean inside = false;
for (var e : elements) {
    var t = (ITree) e;
    var l = TreeAdapter.getLocation(t);
    var isLayout = TreeAdapter.isLayout(t);
    if (!inside) {
        // mirrors dropWhile: skip elements for as long as the predicate holds
        var beforeRange = isLayout
            ? rightOfBegin(l, startLine, startColumn)
            : rightOfEnd(l, startLine, startColumn);
        if (beforeRange) {
            continue;
        }
        inside = true;
    }
    // mirrors takeWhile: stop at the first element for which the predicate fails
    var withinRange = isLayout
        ? rightOfEnd(l, endLine, endColumn)
        : rightOfBegin(l, endLine, endColumn);
    if (!withinRange) {
        break;
    }
    result.append(t);
}
var selected = result.done();

not sure if better, was just wondering if it got more compact by turning it into a loop.

DavyLandman · 2025-09-15T14:44:35Z

rascal-lsp/src/main/rascal/library/demo/lang/pico/examples/fac.pico

-begin 
-    declare 
-    input   : natural,  
-    output  : natural,           
-    repnr   : natural,
-    rep     : natural;
-
+begin
+    declare
+        input : natural, output : natural, repnr : natural, rep : natural
+    ;


Doesn't this change break the UI tests that rely on line numbers from this example?

also, shouldn't we commit the unformatted version, and then run a test that triggers the formatter and see that indeed it worked?

This is an error/incompleteness in the Pico toBox function. It should pretty print like the original does. The reason is that the default heuristics of toBox pick HOV layout for comma-separated lists.

DavyLandman · 2025-09-15T14:46:08Z

rascal-lsp/src/main/rascal/library/util/Format.rsc

@@ -0,0 +1,157 @@
+module util::Format


Before we merge this PR, we'll have to get rid of this module. So that would be a reason to wait with merging this PR.

DavyLandman · 2025-09-15T14:47:34Z

rascal-lsp/src/main/rascal/library/util/LanguageServer.rsc

   * The optional `prepareRename` service argument discovers places in the editor where a ((util::LanguageServer::rename)) is possible. If renameing the location is not supported, it should throw an exception.
 * The ((didRenameFiles)) service collects ((DocumentEdit))s corresponding to renamed files (e.g. to rename a class when the class file was renamed). The IDE applies the edits after moving the files. It might fail and report why in diagnostics.
 * The ((selectionRange)) service discovers selections around a cursor, that a user might want to select. It expects the list of source locations to be in ascending order of size (each location should be contained by the next) - similar to ((Focus)) trees.
+* The ((formatting)) service determines what edits to do to format (part of) a file. The `range` parameter determines what part of the file to format. For whole-file formatting, `_tree.top == range`. ((FormattingOptions)) influence how formatting treats whitespace.


There is a problem with tree.top, since that drops the comments at the start and end of the file. Shouldn't it be tree == range, so including the start part?

There is no range parameter anymore since we have a Focus list now.

This is indeed outdated now

DavyLandman · 2025-09-15T14:48:18Z

rascal-lsp/src/test/java/engineering/swat/rascal/lsp/util/TreeSearchTests.java

+import org.rascalmpl.values.parsetrees.TreeAdapter;
+import org.rascalmpl.vscode.lsp.util.RascalServices;
+
+public class TreeSearchTests {


great that we have some actual tests for this piece of code 👍🏼

jurgenvinju · 2025-09-16T12:23:02Z

rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc

+    str original = "<input[-1]>";
+    box = toBox(input[-1]);
+    box = visit (box) { case i:I(_) => i[is=opts.tabSize] }
+    formatted = format(box);


this code would look better if the options were passed directly to format and it would implement them.

For me the code drops below a certain level of abstraction (below the syntax-directed level) with all the string-level operations like replaceLast, perLine, etc., while format and Box2Text normally take care of these details.

jurgenvinju · 2025-09-16T12:24:48Z

rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc

+list[TextEdit] picoFormattingService(Focus input, FormattingOptions opts) {
+    str original = "<input[-1]>";
+    box = toBox(input[-1]);
+    box = visit (box) { case i:I(_) => i[is=opts.tabSize] }


This is not a good example for users on how to parameterize the indentation level. The reason is that not all I boxes will correspond to the default indentation level in a typical language. Some are different.

So indentation level should be a parameter to toBox where the spec writer can decide where to use it and where not.

It's good to discover this now and here. A few minor additions to toBox should help. We have to think hard about how to make passing the parameters simple and easy, because toBox is highly recursive and we might accidentally forget to pass them on.

jurgenvinju · 2025-09-16T12:31:16Z

rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc

+
+    // instead of computing all edits and filtering, we can be more efficient by only formatting certain trees.
+    loc range = input[0]@\loc;
+    filteredEdits = [e | e <- edits, isContainedIn(e.range, range)];


I have some trouble with this filtering, both for the Pico example and in general. The idea would be that the Focus already decides which part to format and which part to leave alone. Here we take the range of the smallest focus while we format the entire file, but that doesn't seem natural (and it could also break parsing).

Filtering the edits on a much larger tree is expensive but it could also be wrong. Maybe you only wrote a formatter for statements and not for expressions, so you selected the smallest statement in the focus, then filtering only the edits in the range will possibly even break the code (make it not parseable due to connecting parts which used to be separated by whitespace or even worse).

So I think we should not demonstrate filtering of edits here. Instead we should only call layoutDiff on the focused element, and think about recovering indentation of a nested tree automatically with layoutDiff (treeDiff already has this feature so layoutDiff could have it too).

jurgenvinju · 2025-09-16T12:32:48Z

rascal-lsp/src/main/rascal/library/demo/lang/pico/LanguageServer.rsc


+list[TextEdit] picoFormattingService(Focus input, FormattingOptions opts) {
+    str original = "<input[-1]>";
+    box = toBox(input[-1]);


Why are we formatting the whole file?

jurgenvinju · 2025-09-16T13:01:38Z

rascal-lsp/src/main/rascal/library/util/Format.rsc

+}
+
+@synopsis{Determine the most-used newline character in a string.}
+str mostUsedNewline(str input, list[str] lineseps = newLineCharacters, str(list[str]) tieBreaker = getFirstFrom) {


Can we hide this under the hood as one of the formatting options?
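For readers unfamiliar with the helper under discussion, the idea of picking the most-used newline sequence can be sketched as below. This is a minimal Java illustration, not the Rascal implementation from `util::Format`: the fixed candidate set (`\n`, `\r\n`, `\r`) and the tie-breaking order are assumptions, whereas the Rascal function takes both as parameters.

```java
final class NewlineSketch {
    /**
     * Returns the newline sequence that occurs most often in `input`.
     * Assumption: ties (including "no newlines at all") prefer "\n", then "\r\n".
     */
    static String mostUsedNewline(String input) {
        int lf = 0;
        int crlf = 0;
        int cr = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\r') {
                if (i + 1 < input.length() && input.charAt(i + 1) == '\n') {
                    crlf++;
                    i++; // skip the '\n' that belongs to this "\r\n" pair
                } else {
                    cr++;
                }
            } else if (c == '\n') {
                lf++;
            }
        }
        if (lf >= crlf && lf >= cr) {
            return "\n";
        }
        return crlf >= cr ? "\r\n" : "\r";
    }
}
```

Counting `\r\n` as one unit avoids reporting a Windows-style file as a mix of `\r` and `\n`.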

jurgenvinju · 2025-09-16T13:03:45Z

rascal-lsp/src/main/rascal/library/util/LanguageServer.rsc

+* `trimFinalNewlines`; if `true`, and the file ends in one or more new lines, remove them.
+  Note: formatting with `insertFinalNewline && trimFinalNewlines` is expected to return a file that ends in a single newline.
+}
+data FormattingOptions(


add newlineCharacter?

Although we don't want people to write conditional code based on Windows vs. Unix or something... should we hide this too under Box2Text functionality?
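The interaction between the two final-newline options in the documentation snippet above can be illustrated with a small sketch. This is Java, written only from the documented behaviour ("`insertFinalNewline && trimFinalNewlines` is expected to return a file that ends in a single newline"); the class, the method name and the explicit `newline` parameter are hypothetical and not part of the PR.

```java
final class FinalNewlineSketch {
    /**
     * Applies the two final-newline options to `text`.
     * Assumption: trimming runs first, so that enabling both options
     * leaves exactly one trailing newline.
     */
    static String applyFinalNewlineOptions(
            String text, boolean insertFinalNewline, boolean trimFinalNewlines, String newline) {
        String result = text;
        if (trimFinalNewlines) {
            int end = result.length();
            // strip every trailing '\n' and '\r'
            while (end > 0 && (result.charAt(end - 1) == '\n' || result.charAt(end - 1) == '\r')) {
                end--;
            }
            result = result.substring(0, end);
        }
        if (insertFinalNewline && !result.endsWith("\n") && !result.endsWith("\r")) {
            result = result + newline;
        }
        return result;
    }
}
```

Passing the newline sequence explicitly is one way to keep platform-specific choices out of the formatter itself, which is the concern raised in this comment.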

jurgenvinju

I think we should align a bit more with toBox and format (Box2Text) and map the LSP parameters to their implementations. This way we can hide the string manipulation details from the Pico formatter contribution.

toinehartman force-pushed the feature/lsp-formatting branch from 79abb8f to f0930d1 Compare July 11, 2025 09:07

toinehartman commented Jul 11, 2025

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/parametric/ParametricTextDocumentService.java (outdated, resolved)

toinehartman commented Jul 11, 2025

rascal-lsp/src/main/rascal/library/util/LanguageServer.rsc (outdated, resolved)

rodinaarssen reviewed Jul 11, 2025

rascal-lsp/src/main/rascal/library/util/LanguageServer.rsc (outdated, resolved)

toinehartman requested a review from DavyLandman July 14, 2025 17:19

DavyLandman requested changes Jul 15, 2025

toinehartman mentioned this pull request Aug 7, 2025

Box2Text formatting of I behaves unexpectedly usethesource/rascal#2336

Closed

toinehartman force-pushed the feature/lsp-formatting branch 3 times, most recently from 4b58c98 to 4aca661 Compare August 18, 2025 14:50

toinehartman commented Aug 18, 2025

rascal-lsp/src/main/rascal/library/util/Format.rsc (resolved)

toinehartman mentioned this pull request Aug 18, 2025

Optionally preserve trailing spaces in layoutDiff usethesource/rascal#2360

Open

toinehartman force-pushed the feature/lsp-formatting branch from bac08fb to 2b7fb47 Compare August 19, 2025 08:55

toinehartman requested review from DavyLandman and jurgenvinju August 19, 2025 08:56

DavyLandman reviewed Aug 19, 2025

toinehartman force-pushed the feature/lsp-formatting branch from 2b7fb47 to e05f700 Compare August 25, 2025 15:02

toinehartman changed the base branch from main to migrate-dap August 25, 2025 15:02

toinehartman force-pushed the feature/lsp-formatting branch 2 times, most recently from 9e7fe1a to 523c2fe Compare August 26, 2025 09:41

Base automatically changed from migrate-dap to main August 27, 2025 09:25

toinehartman force-pushed the feature/lsp-formatting branch from 523c2fe to ab448fc Compare August 27, 2025 12:06

DavyLandman reviewed Aug 28, 2025

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/util/locations/impl/ArrayLineOffsetMap.java (outdated, resolved)

rascal-lsp/src/main/java/org/rascalmpl/vscode/lsp/util/locations/impl/ArrayLineOffsetMap.java (outdated, resolved)

toinehartman marked this pull request as ready for review August 29, 2025 12:35

toinehartman force-pushed the feature/lsp-formatting branch 2 times, most recently from 7a9d722 to 01d9853 Compare September 1, 2025 12:05

toinehartman added 18 commits September 15, 2025 13:19

Document & fix string/formatting utils.

84c5a35

Add rangeFormatting, reuse formatting.

c4b3561

Match contribution sig in implementation.

1748678

Add missing license header.

1d56010

Document formatting service.

20a71ae

Use formatter from stdlib.

cda1531

Change defaults based on discussion with @DavyLandman.

206a57a

Fix import to renamed module.

6cdab51

Format Pico examples.

05b2955

Inline function.

9561883

Use focus instead of Tree+loc.

7b92033

Implement range focus with special case for lists.

244da24

Fix comment about Rascal charcter encoding.

69307a7

Add basic TreeSearch tests.

c1b08a8

Consider layout as well.

6f5657f

Include partially selected element at end of range.

49cf9be

Simplify prepending element.

b23a5ea

Test & fix list focus.

ce46294

toinehartman force-pushed the feature/lsp-formatting branch from a9d7907 to ce46294 Compare September 15, 2025 11:19

toinehartman marked this pull request as ready for review September 15, 2025 11:25

DavyLandman reviewed Sep 15, 2025

jurgenvinju reviewed Sep 16, 2025

toinehartman marked this pull request as draft September 17, 2025 07:32
