v2 was released in November 2020. It contains the following changes, some of which are backwards-incompatible:
- Added optional `LexString()` and `LexBytes()` methods that lexer definitions can implement to fast-path lexing of bytes and strings.
- A new stateful lexer has been added.
- A `filename` must now be passed to all `Parse*()` and `Lex*()` methods.
- The `text/scanner` lexer no longer automatically unquotes strings or supports arbitrary-length single-quoted strings. The tokens it produces are identical to those of the `text/scanner` package. Use `Unquote()` to remove quotes.
- `Tok` and `EndTok` will no longer be populated.
- If a field named `Token []lexer.Token` exists, it will be populated with the raw tokens that the node parsed from the lexer.
- Added support for capturing directly into `lexer.Token` fields, e.g.

      type ast struct {
          Head lexer.Token   `@Ident`
          Tail []lexer.Token `@(Ident*)`
      }

- Added `experimental/codegen` for stateful lexers. This provides a ~10x performance improvement with zero garbage when lexing strings.
- The `regex` lexer has been removed.
- The `ebnf` lexer has been removed.
- All future work on lexing will be put into the stateful lexer.
- The need for `DropToken` has been removed.