Behaviour of begin/end and while patterns do not match TextMate #241

@DanTup

I originally raised this as microsoft/vscode#189940 but it seems like it should be moved here.

The original report is as follows:


This was reported at dart-lang/dart-syntax-highlight#11 (comment). Dart highlighting on GitHub doesn't handle unterminated triple-backticks as expected. VS Code does handle it as expected.

However, while debugging this, I've become less certain that GitHub is wrong, and feel like VS Code might be.

Here's a trimmed down version of the grammar that shows the problem. It defines triple-slash comments, and supports triple-backtick code blocks inside:

It renders like this:

image

If the triple backticks are unclosed, it looks reasonable:

image

However, it's not clear why the variable.other.source.dart scope was exited, because the "end" condition was never found. On GitHub, this does not happen and the rest of the document is consumed (note the first void here is red, but the second one is not because the variable context eats the rest of the document):

image

I can't find anything in the spec for TextMate grammars to explain VS Code's behaviour. The most information I've found on it is here:

https://macromates.com/manual/en/language_grammars

The other type of match is the one used by the second rule (lines 9-17). Here two regular expressions are given using the begin and end keys. [...] If there is no match for the end pattern, the end of the document is used.

https://www.apeth.com/nonblog/stories/textmatebundle.html

With begin/end, if the end pattern is not found, the overall match does not fail: rather, once the begin pattern is matched, the overall match runs to the end pattern or to the end of the document, whichever comes first.
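
To make the rule quoted above concrete, here is a minimal sketch of how I read it. This is not how VS Code or GitHub actually tokenize: `begin_end_span` is a hypothetical helper, it uses Python's `re` rather than the Oniguruma engine real grammars run on, and the sample documents are made up for illustration.

```python
import re


def begin_end_span(text, begin, end):
    """Return (start, stop) of the region a begin/end rule would cover.

    Models only the documented rule: after `begin` matches, the region
    runs to the first `end` match, or to the end of the document if
    `end` never matches. Returns None when `begin` does not match.
    """
    begin_match = re.search(begin, text)
    if begin_match is None:
        return None
    # Look for the end pattern only after the begin match.
    end_match = re.compile(end).search(text, begin_match.end())
    stop = end_match.end() if end_match else len(text)
    return (begin_match.start(), stop)


# Hypothetical sample documents: a doc comment with a closed and an
# unclosed triple-backtick block, followed by ordinary code.
doc_closed = "/// ```\n/// var a;\n/// ```\nvoid main() {}\n"
doc_unclosed = "/// ```\n/// var a;\nvoid main() {}\n"

closed_span = begin_end_span(doc_closed, r"```", r"```")
unclosed_span = begin_end_span(doc_unclosed, r"```", r"```")

# Closed block: the region stops at the closing backticks.
print(repr(doc_closed[closed_span[0]:closed_span[1]]))
# Unclosed block: the region swallows the rest of the document,
# including the `void main() {}` line.
print(repr(doc_unclosed[unclosed_span[0]:unclosed_span[1]]))
```

Under this reading, the unclosed case covers everything up to the end of the document, which is what GitHub shows and not what VS Code shows.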

While VS Code's behaviour is convenient for me (because I'm not sure how to handle these unclosed triple-backticks if it behaved like GitHub), it doesn't seem correct, and it's more inconvenient if VS Code and GitHub disagree on what the behaviour should be because it makes it more difficult to author a grammar.
