LL(2) parsing error: parser commits to subrule and is not able to get out #2095
Hey @jitsedesmet, unlike ANTLR4, Chevrotain does not look into the outer context when constructing the lookahead table. I.e. when constructing the lookahead for Note that the LL(2) statement only holds true for ANTLR in particular - Chevrotain would say that your grammar is LL(1) but with the outer context issue mentioned above. Solutions to this consist mainly of using
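To make the outer-context point concrete, here is a minimal hand-written sketch in plain TypeScript. It does not use any Chevrotain API. It assumes the grammar has the shape `main: gramB gramD F`, `gramB: C (gramD E)*`, `gramD: D?`; the shape of `gramD` in particular is an assumption, since the original grammar is not reproduced in this thread. The names `recognizes`, `localLookahead` and `withFollowCheck` are hypothetical and exist only for illustration.

```typescript
// Hand-rolled recognizer, NOT Chevrotain. Assumed grammar (gramD's shape is a guess):
//   main  : gramB gramD 'F'
//   gramB : 'C' (gramD 'E')*
//   gramD : 'D'?
type Decide = (input: string, pos: number) => boolean;

// Local decision: enter the loop whenever the next token could start `gramD 'E'`.
// It ignores that a 'D' might instead belong to the `gramD` that follows in `main`.
const localLookahead: Decide = (input, pos) =>
  input[pos] === "D" || input[pos] === "E";

// Gate-like decision: additionally require that an 'E' really follows the optional 'D'.
const withFollowCheck: Decide = (input, pos) =>
  input[pos] === "E" || (input[pos] === "D" && input[pos + 1] === "E");

function recognizes(input: string, enterLoop: Decide): boolean {
  let pos = 0;
  const consume = (tok: string): boolean => {
    if (input[pos] !== tok) return false;
    pos++;
    return true;
  };
  const optionalD = (): void => {
    if (input[pos] === "D") pos++;
  };

  // gramB
  if (!consume("C")) return false;
  while (enterLoop(input, pos)) {
    optionalD();
    if (!consume("E")) return false; // committed to the iteration, no way back
  }

  // remainder of main
  optionalD();
  return consume("F") && pos === input.length;
}
```

Under these assumptions, `recognizes("CDF", localLookahead)` commits to the loop on seeing `D` and then fails on the missing `E`, while `recognizes("CDF", withFollowCheck)` leaves the `D` to the trailing `gramD` in `main` and succeeds.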
Hi @msujew, thank you for the swift reply! It does indeed look like adding a gate resolves the issue (EDIT: previous one was wrong):

    this.gramB = this.RULE('gramB', () => {
      this.CONSUME(lexC);
      this.MANY({
        GATE: () => this.LA(2).tokenType === lexE || this.LA(3).tokenType === lexE,
        DEF: () => {
          this.SUBRULE(this.gramD, undefined);
          this.CONSUME(lexE);
        },
      });
      return <const> 'gramB';
    });

Using backtracking or gates is not ideal in this use case because the optional rule (ruleD) that I use in my grammar is called before every CONSUME. That means that this case will happen often. (And using gates often might make debugging a nightmare.)
In theory you can use it. I'm currently using it for chevrotain-allstar, but as outlined in TypeFox/chevrotain-allstar#1, it is also (currently) incapable of taking the outer context into account.
Depending on how optimized your solution is :)
Those are two very interesting resources, thank you! Luckily, that should mean that in my use case of parsing the otherwise ignored tokens, I can resolve the issue by changing the semantics of my consumptions. Instead of requiring a consumption to consume the ignored tokens before it, it should consume the ignored tokens after it. That way my parser will not commit to a subrule that it cannot complete (which I think also reduces the complexity to LL(1) again). I wonder whether we could add some kind of warning about this behavior to the Chevrotain documentation? Anyway, thank you very much!

Test code verifying that moving the subrule to the end solves the issue:

    this.gramB = this.RULE('gramB', () => {
      this.CONSUME(lexC);
      this.MANY({
        // GATE: () => this.LA(2).tokenType === lexE || this.LA(3).tokenType === lexE,
        DEF: () => {
          this.CONSUME(lexE);
          this.SUBRULE(this.gramD, undefined);
        },
      });
      return <const> 'gramB';
    });

And the main rule is still the same:

    this.gramMain = this.RULE('main', () => {
      this.SUBRULE(this.gramB, undefined);
      this.SUBRULE(this.gramD, undefined);
      this.CONSUME(lexF);
      return <const> 'gramMain';
    });

The grammar works as intended, now parsing what previously didn't work.
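As a rough illustration of why the reordered loop only needs one token of lookahead, here is a hand-written recognizer in plain TypeScript (again not Chevrotain). It assumes the restructured shape `main: gramB gramD F`, `gramB: C (E gramD)*`, `gramD: D?`; the definition of `gramD` is an assumption, and `recognizesSwapped` is a hypothetical name used only for this sketch.

```typescript
// Hand-rolled recognizer, NOT Chevrotain. Assumed restructured grammar:
//   main  : gramB gramD 'F'
//   gramB : 'C' ('E' gramD)*
//   gramD : 'D'?            (assumption, the real rule is not shown in the thread)
function recognizesSwapped(input: string): boolean {
  let pos = 0;
  const consume = (tok: string): boolean => {
    if (input[pos] !== tok) return false;
    pos++;
    return true;
  };
  const optionalD = (): void => {
    if (input[pos] === "D") pos++;
  };

  // gramB: the loop decision looks at a single token, 'E'.
  if (!consume("C")) return false;
  while (input[pos] === "E") {
    pos++;       // the mandatory 'E' comes first, so entering the loop is always safe
    optionalD(); // the optional part trails the mandatory token
  }

  // remainder of main
  optionalD();
  return consume("F") && pos === input.length;
}
```

Because every iteration starts with the mandatory `E`, a `D` directly after `C` is never claimed by the loop and falls through to the `gramD` in `main`. Note that under these assumptions the accepted language differs from the original shape: `CDF` and `CEDF` are accepted, whereas `CDEF` is not.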
@jitsedesmet you wrote:
My understanding is that taking the outer context into account is a special ANTLR feature (ANTLR being an LL(star) / adaptive LL(star) parser generator), and not a common LL(k) capability. Perhaps a runtime validation that identifies these scenarios and outputs a useful error / warning message would be the most useful approach. If you want to try implementing such a validation, a good place to start would be here:
Hi! This library has been amazing so far. I've been trying to create a round-tripping parser that includes round tripping in the syntax.
I am in need of a rule that parses everything that is otherwise skipped.
I came across the issue where the following grammar works in ANTLR4, but not in Chevrotain:
I expect (and using ANTLR4 this works) the following to be in the grammar:
CDF and CDEDF (do not work in Chevrotain), but also CF and CDEF (do work in Chevrotain).
When using Chevrotain, it looks like the parser gets stuck in the gramB rule, forgetting that a parse of the gramD rule also allows it to continue in the compilationUnit rule.
I am fairly sure the grammar above is LL(2).
I wonder whether this is a mistake on my part, or whether this is a genuine bug. I have no issue with attempting a PR myself. Maybe you could give some pointers :D
(The issue is also present when the gramB rule uses gramC ( gramD gramE )?;)
Chevrotain code I used:
Source files: ANTLR4 and Chevrotain.
Using the IntelliJ plugin for ANTLR 4, I get the following out of ANTLR4:
A parse tree and what I think is a confirmation that it is LL(2).