m4rch3n1ng
Contributor

fixes #226 as outlined in #226 (comment)

rust currently only strips shebangs that are on the same line (i.e. not separated by a \n, see strip_shebang in rustc_lexer), but the previous regex allowed newlines and didn't correctly parse empty shebangs or shebangs consisting of only a / character.

now the regex explicitly disallows \n characters inside the shebang: it parses until it finds a newline, or fails to parse if it finds a [. sadly this also introduces a small new regression: tree-sitter can no longer parse rust files that consist of just a shebang and nothing else, with no trailing newline. the fix would be to replace the \n at the end of the regex with (\n|$), but when i do that tree-sitter complains with Error processing rule shebang: Grammar error: Unexpected rule ExpandRegex(Assertion), so i left it as is. i don't think that should be much of an issue.
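for illustration, here is a rough hand-written sketch of the matching rules described above, in plain rust without a regex. this is not the grammar's actual regex and not rustc's strip_shebang. the function name shebang_len and the exact whitespace set (space, tab, \r, form feed, vertical tab) are assumptions made for this example only.

```rust
/// Hypothetical helper: returns the byte length of a leading shebang line
/// (including its terminating `\n`), or `None` if the input does not start
/// with a shebang under the rules sketched in this PR description.
fn shebang_len(src: &str) -> Option<usize> {
    // A shebang must start with `#!` at the very beginning of the file.
    let rest = src.strip_prefix("#!")?;

    // Skip ASCII whitespace other than `\n` (assumed set, see lead-in).
    let after_ws = rest.trim_start_matches(|c: char| {
        matches!(c, ' ' | '\t' | '\r' | '\x0C' | '\x0B')
    });

    // `#![...]` on the same line is an inner attribute, not a shebang.
    if after_ws.starts_with('[') {
        return None;
    }

    // Require a terminating newline. This mirrors the regression mentioned
    // above: a file that is only a shebang with no `\n` does not match.
    let newline = src.find('\n')?;
    Some(newline + 1)
}
```

under these assumptions, shebang_len("#!\nfn main() {}") and shebang_len("#!/\n") both match (empty shebang and a lone /), while shebang_len("#![allow(unused)]\n") and shebang_len("#!/bin/sh") (no trailing newline) both return None.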

technically speaking the rust lexer treats all of these characters as whitespace, but i thought the current selection of ascii whitespace characters is probably enough.

@maxbrunsfeld maxbrunsfeld merged commit 5ebbba1 into tree-sitter:master Mar 5, 2025
4 checks passed
@m4rch3n1ng m4rch3n1ng deleted the fix-shebangs branch March 5, 2025 21:44
Development

Successfully merging this pull request may close these issues.

bug: shebang does not require semicolon
