How to change or add a new custom parser error message
Stanc3 uses Menhir's Incremental API to provide custom error messages when the parser detects an error. Menhir and the Incremental API are documented in the reference manual. We write the custom error messages in the src/frontend/parser.messages file, which follows Menhir's messages format (documented here). Those messages are integrated into the parser automatically by the build rules defined in src/frontend/dune.
Each rule in parser.messages indicates the error state it corresponds to by specifying any stack of tokens which results in that error state. There may be many such stacks of tokens which could correspond to the same error, and Menhir will point out if there are two rules defined for the same error.
That collision check gives me a simple workflow for changing an existing message:
- Add a new rule to parser.messages corresponding to any stack of tokens which could trigger the error I'm interested in
- Let Menhir tell me which existing rule the new rule collides with
- Update the message of the existing rule
Suppose I want to change the error message for when the program is missing a '{' after the 'model' keyword.
I add the following lines to the end of parser.messages:
program: MODELBLOCK REAL
Model blocks should start with a '{' symbol.
The first line shows one possible stack of tokens that could trigger the error I'm interested in - maybe the user started declaring something with `real` after skipping the '{'. The second line is my new message.
Now, I compile stanc3. I get the following message:
File "parser.messages", line 3921, characters 0-25:
File "parser.messages", line 13573, characters 0-24:
Error: these sentences both cause an error in state 629.
make: *** [Makefile:2: all] Error 1
This tells me that the rule I added (apparently on line 13573) corresponds to the same error state as a rule already defined on line 3921. At line 3921 of parser.messages I find:
program: MODELBLOCK WHILE
##
## Ends in an error in state: 625.
##
## model_block -> MODELBLOCK . LBRACE list(vardecl_or_statement) RBRACE [ GENERATEDQUANTITIESBLOCK EOF ]
##
## The known suffix of the stack is as follows:
## MODELBLOCK
##
Expected "{" after "model".
It makes sense that `MODELBLOCK WHILE` and `MODELBLOCK REAL` give the same error, since they're both missing '{'. I can now replace this message with my new version, and remove my redundant `MODELBLOCK REAL` rule.
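The lookup step can also be scripted. Below is a minimal sketch, not part of stanc3, that reads Menhir's collision report and prints the entries it points at. The function names are hypothetical, and the sketch assumes the usual layout of an entry in parser.messages: the sentence and its `##` comment lines, then one or more blank lines, then the message, which itself contains no blank line.

```python
import re
import sys
from pathlib import Path

# Matches the location lines in the collision report, e.g.
#   File "parser.messages", line 3921, characters 0-25:
LOCATION = re.compile(r'File "[^"]+", line (\d+), characters \d+-\d+:')


def colliding_lines(build_output):
    """Return the line numbers named in the collision report, in order."""
    return [int(m.group(1)) for m in LOCATION.finditer(build_output)]


def entry_at(lines, line_number):
    """Split the entry starting at a 1-indexed line into (header, message).

    Assumption: the header (sentence plus '##' comments) ends at the first
    blank line, and the message runs until the next blank line.
    """
    i = line_number - 1
    header = []
    while i < len(lines) and lines[i].strip():
        header.append(lines[i])
        i += 1
    # Skip the blank line(s) between the header and the message.
    while i < len(lines) and not lines[i].strip():
        i += 1
    message = []
    while i < len(lines) and lines[i].strip():
        message.append(lines[i])
        i += 1
    return header, "\n".join(message)


if __name__ == "__main__":
    # Build output is read from stdin; the path is the one used in this page.
    report = sys.stdin.read()
    text = Path("src/frontend/parser.messages").read_text().splitlines()
    for n in colliding_lines(report):
        header, message = entry_at(text, n)
        sentence = header[0] if header else "<no entry found>"
        print(f"line {n}: {sentence}")
        print(f"  message: {message}")
```

Saved under a name of your choice (say `find_collision.py`, a hypothetical name), it could be run from the repository root as `make 2>&1 | python find_collision.py`. The first entry printed is the existing rule whose message needs updating; the second is the rule that was just added and can be removed.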
Suppose I actually wanted to add a new message that would only show when a more specific error is made; maybe I want a special message for when `model` is followed by a declaration. To do this I need to change the parser grammar in src/frontend/parser.mly to split off a new error state for the situation I want to catch. However, I need to make sure not to change the behavior of the parser, so I should guarantee that the new rule can never be built successfully by terminating it with a token that can never occur. For example, I could change `model = MODELBLOCK LBRACE …` to `model = MODELBLOCK LBRACE … | MODELBLOCK REAL UNREACHABLE`. This is enough to create a new parser state with a different error message.