Hey @anderay, our underlying parser/lexer library (Chevrotain) performs first-fit lexing, i.e. it returns the first terminal type that matches the input at a given position. To match longer token sequences, Chevrotain requires declaring a `longer_alt` [...] The problem is pretty difficult, as identifying whether a regex is a longer alternative of another regex isn't as simple as in the keyword case (and this property might be symmetrical, making it potentially ambiguous).

As for why it works in a datatype rule: a datatype rule isn't evaluated during the lexing phase, so it is handled by the algorithm mentioned above.
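To make the first-fit behaviour concrete, here is a minimal sketch of a toy lexer in TypeScript. It is not Chevrotain's implementation: `TokenType`, `longerAlt`, `matchAt`, `tokenize` and `render` are hypothetical names invented for this illustration, and `longerAlt` only loosely mimics what a longer-alternative declaration does. It shows why a `Problem` token declared before `ID` splits `ProblemPerson` in two, and how a longer alternative avoids that.

```typescript
// Toy first-fit lexer for illustration only; this is not Chevrotain's code.
interface TokenType {
  name: string;
  pattern: RegExp;
  hidden?: boolean;
  longerAlt?: TokenType; // hypothetical stand-in for a longer-alternative declaration
}

interface Token {
  type: string;
  image: string;
}

// Try to match `type` exactly at `offset` (the sticky flag anchors the match there).
function matchAt(type: TokenType, text: string, offset: number): string | undefined {
  const re = new RegExp(type.pattern.source, "y");
  re.lastIndex = offset;
  const m = re.exec(text);
  return m !== null && m[0].length > 0 ? m[0] : undefined;
}

function tokenize(text: string, types: TokenType[]): Token[] {
  const tokens: Token[] = [];
  let offset = 0;
  while (offset < text.length) {
    let matched = false;
    for (const type of types) {
      const image = matchAt(type, text, offset);
      if (image === undefined) continue;
      // First fit: the first type that matches wins, unless it declares
      // a longer alternative that matches more input at this position.
      let winner = type;
      let winnerImage = image;
      const alt = type.longerAlt;
      if (alt !== undefined) {
        const altImage = matchAt(alt, text, offset);
        if (altImage !== undefined && altImage.length > image.length) {
          winner = alt;
          winnerImage = altImage;
        }
      }
      if (!winner.hidden) tokens.push({ type: winner.name, image: winnerImage });
      offset += winnerImage.length;
      matched = true;
      break;
    }
    if (!matched) throw new Error(`Unexpected character at offset ${offset}`);
  }
  return tokens;
}

const WS: TokenType = { name: "WS", pattern: /\s+/, hidden: true };
const ID: TokenType = { name: "ID", pattern: /[_a-zA-Z][\w_]*/ };
// Like `terminal PROBLEM: 'Problem';` declared before ID, with no longer alternative.
const problemTerminal: TokenType = { name: "PROBLEM", pattern: /Problem/ };
// Like a keyword that knows ID may be a longer match.
const problemKeyword: TokenType = { name: "PROBLEM", pattern: /Problem/, longerAlt: ID };

function render(tokens: Token[]): string {
  return tokens.map((t) => `${t.type}:${t.image}`).join(" ");
}

console.log(render(tokenize("person ProblemPerson", [WS, problemTerminal, ID])));
// prints "ID:person PROBLEM:Problem ID:Person"
console.log(render(tokenize("person ProblemPerson", [WS, problemKeyword, ID])));
// prints "ID:person ID:ProblemPerson"
```

In the first call the name after `person` is lexed as `PROBLEM` followed by `ID`, so a rule expecting a single `ID` there cannot match. In the second call the longer `ID` match wins, which is consistent with the inline-keyword grammar working.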
Can someone explain why:

    grammar HelloWorld

    entry Model:
        (persons+=Person | greetings+=Greeting)*;

    Person:
        'person' name=ID;

    Greeting:
        'Hello' problem=PROBLEM person=[Person:ID] '!';

    hidden terminal WS: /\s+/;
    terminal PROBLEM: 'Problem';
    terminal ID: /[_a-zA-Z][\w_]*/;
    terminal INT returns number: /[0-9]+/;
    terminal STRING: /"(\\.|[^"\\])*"|'(\\.|[^'\\])*'/;

    hidden terminal ML_COMMENT: /\/\*[\s\S]*?\*\//;
    hidden terminal SL_COMMENT: /\/\/[^\n\r]*/;

introduces a problem, e.g. for the following input:

    person ProblemPerson

whereas the grammar with the terminal as an inline token gives no problem:

    grammar HelloWorld

    entry Model:
        (persons+=Person | greetings+=Greeting)*;

    Person:
        'person' name=ID;

    Greeting:
        'Hello' 'Problem' person=[Person:ID] '!';

    hidden terminal WS: /\s+/;
    terminal ID: /[_a-zA-Z][\w_]*/;
    terminal INT returns number: /[0-9]+/;
    terminal STRING: /"(\\.|[^"\\])*"|'(\\.|[^'\\])*'/;

    hidden terminal ML_COMMENT: /\/\*[\s\S]*?\*\//;
    hidden terminal SL_COMMENT: /\/\/[^\n\r]*/;

Note that a data type rule also works fine:

    grammar HelloWorld

    entry Model:
        (persons+=Person | greetings+=Greeting)*;

    Person:
        'person' name=ID;

    Greeting:
        'Hello' problem=PROBLEM person=[Person:ID] '!';

    hidden terminal WS: /\s+/;
    PROBLEM returns string: 'Problem';
    terminal ID: /[_a-zA-Z][\w_]*/;
    terminal INT returns number: /[0-9]+/;
    terminal STRING: /"(\\.|[^"\\])*"|'(\\.|[^'\\])*'/;

    hidden terminal ML_COMMENT: /\/\*[\s\S]*?\*\//;
    hidden terminal SL_COMMENT: /\/\/[^\n\r]*/;