@kubouch kubouch commented Dec 28, 2024

The on-demand lexer is replaced with a standalone lexing pass using the Logos crate. The lexing code is refactored away from the parser into a separate data structure. The Parser is refactored and simplified accordingly.

The new lexer correctly recognizes more syntactic shapes than the previous one.

The new lexer+parser seems slightly faster (5-6 ms together on the combined1000.nu benchmark) than the previous version, but the main benefit is that the lexer is now disentangled from the parser code. This makes the lexer easier to change, including turning it back into an on-demand lexer if need be.

The main shortcomings are:

  • Difficulty lexing string interpolation, e.g., $"foo(1 + 2)bar". We'd need to switch to another lexer. It might be easier to use on-demand lexing for it.
  • Unmatched delimiters. If the file ends with an unmatched delimiter, an extremely unhelpful error is emitted. This should be solvable by using callbacks.
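
To illustrate the kind of diagnostic that would help with unmatched delimiters, here is a minimal sketch. It is not the Logos callback approach mentioned above and not code from this PR: it is a plain stack-based scan over the source text. `DelimError` and `check_delimiters` are hypothetical names, and the sketch ignores strings and comments, which a real lexer would have to skip.

```rust
/// Hypothetical error type: records which delimiter is at fault and where.
#[derive(Debug, PartialEq, Eq)]
pub enum DelimError {
    /// An opening delimiter was never closed before the end of input.
    Unclosed { open: char, offset: usize },
    /// A closing delimiter did not match the most recent opening one.
    Unexpected { close: char, offset: usize },
}

/// Scans `src` and reports the first delimiter problem, if any.
/// Assumption: delimiters inside strings and comments are not special-cased.
pub fn check_delimiters(src: &str) -> Result<(), DelimError> {
    // Stack of (opening delimiter, byte offset) pairs still waiting for a close.
    let mut stack: Vec<(char, usize)> = Vec::new();

    for (offset, ch) in src.char_indices() {
        match ch {
            '(' | '[' | '{' => stack.push((ch, offset)),
            ')' | ']' | '}' => {
                let expected_open = match ch {
                    ')' => '(',
                    ']' => '[',
                    _ => '{',
                };
                match stack.pop() {
                    Some((open, _)) if open == expected_open => {}
                    _ => return Err(DelimError::Unexpected { close: ch, offset }),
                }
            }
            _ => {}
        }
    }

    // Anything left on the stack was opened but never closed;
    // report the innermost one together with its position.
    match stack.pop() {
        Some((open, offset)) => Err(DelimError::Unclosed { open, offset }),
        None => Ok(()),
    }
}
```

With the offset of the offending delimiter in hand, an error at end of file can point back at the opening delimiter instead of at the end of the input.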

Notable API changes:

  • self.peek() -> self.tokens.peek()
  • self.next() -> gone; use self.tokens.advance() to move to the next token, then self.tokens.peek() to read it.
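
The peek/advance split can be sketched as follows. This is a simplified illustration, not the data structure from this PR: the `Token` variants, the fields of `Tokens`, the end-of-input behavior, and the `count_numbers` helper are all assumptions made for the example. Only the method names `peek()` and `advance()` come from the list above.

```rust
/// Hypothetical token kinds; the real lexer defines many more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Token {
    Number,
    Plus,
    LParen,
    RParen,
    Eof,
}

/// Hypothetical container for the output of a standalone lexing pass.
pub struct Tokens {
    tokens: Vec<Token>,
    pos: usize,
}

impl Tokens {
    pub fn new(tokens: Vec<Token>) -> Self {
        Tokens { tokens, pos: 0 }
    }

    /// Returns the current token without consuming it.
    /// Assumption: reading past the end yields `Token::Eof`.
    pub fn peek(&self) -> Token {
        self.tokens.get(self.pos).copied().unwrap_or(Token::Eof)
    }

    /// Moves to the next token; does nothing once the end is reached.
    pub fn advance(&mut self) {
        if self.pos < self.tokens.len() {
            self.pos += 1;
        }
    }
}

/// Stand-in for a parser routine that previously called `self.next()`:
/// it now inspects the current token with `peek()` and steps with `advance()`.
pub fn count_numbers(tokens: &mut Tokens) -> usize {
    let mut count = 0;
    while tokens.peek() != Token::Eof {
        if tokens.peek() == Token::Number {
            count += 1;
        }
        tokens.advance();
    }
    count
}
```

Because `peek()` does not consume anything, calling it repeatedly returns the same token until `advance()` is called, so parser code that relied on `next()` both returning and consuming a token needs the two calls in sequence.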

@kubouch kubouch merged commit 6899325 into nushell:main Dec 28, 2024
4 checks passed