This repository was archived by the owner on Dec 21, 2025. It is now read-only.

Commit ab00964

bugengineyngwe@fry
authored and committed
Implemented counter-example support in Sly.
When conflicts are encountered, Sly will output examples of sequences of symbols and how the parser could interpret them. A couple of examples are provided and a small explanation is added in the documentation.
1 parent 4000988 commit ab00964

7 files changed

+2655
-43
lines changed


CHANGES

Lines changed: 9 additions & 1 deletion
@@ -1,5 +1,13 @@
1+
Version 0.6
2+
-----------
3+
4+
01/22/2023 Experimental support for counterexamples. SLY will now give
5+
counterexamples for shift/reduce and reduce/reduce conflicts.
6+
7+
18
Version 0.5
29
-----------
10+
311
10/25/2022 ***IMPORTANT NOTE*** This is the last release to be made
412
on PyPi. If you want the latest version go to
513
https://github.com/dabeaz/sly.
@@ -19,7 +27,7 @@ Version 0.5
1927
index of the matching text. This is used to do more
2028
precise location tracking for the purpose of issuing
2129
more useful error messages.
22-
30+
2331
05/09/2020 Experimental support for EBNF choices. For example:
2432

2533
@('term { PLUS|MINUS term }')

docs/sly.rst

Lines changed: 108 additions & 0 deletions
@@ -1174,6 +1174,114 @@ also be stressed that not all shift-reduce conflicts are bad.
11741174
However, the only way to be sure that they are resolved correctly is
11751175
to look at the debugging file.
11761176

1177+
Conflict counterexamples
1178+
^^^^^^^^^^^^^^^^^^^^^^^^
1179+
1180+
To help track down conflicts, SLY generates counterexamples in the debug file.
1181+
For each conflict, SLY will generate one or more examples for each
1182+
possibility in a shift/reduce or reduce/reduce conflict. The examples are
1183+
a sequence of terminals and nonterminals that the grammar could interpret
1184+
in two ways. SLY will show the different derivations, showing clearly what
1185+
the ambiguities were. The counterexamples are listed at the end of the debug
1186+
file and look like this::
1187+
1188+
shift/reduce conflict for ELSE in state 11 resolved as shift
1189+
shift using rule if_statement -> IF LPAREN expr RPAREN statement . ELSE statement
1190+
╭╴
1191+
│ IF LPAREN expr RPAREN IF LPAREN expr RPAREN statement ♦ ELSE statement
1192+
│ ╰if_statement──────────────────────────────────╯
1193+
│ ╰statement─────────────────────────────────────╯
1194+
│ ╰if_statement────────────────────────────────────────────────────────╯
1195+
│ ╰statement───────────────────────────────────────────────────────────╯
1196+
╰╴
1197+
1198+
reduce using rule if_statement -> IF LPAREN expr RPAREN statement .
1199+
╭╴
1200+
│ IF LPAREN expr RPAREN IF LPAREN expr RPAREN statement ♦ ELSE statement
1201+
│ ╰if_statement───────────────────╯
1202+
│ ╰statement──────────────────────╯
1203+
│ ╰if_statement────────────────────────────────────────────────────────╯
1204+
│ ╰statement───────────────────────────────────────────────────────────╯
1205+
╰╴
1206+
1207+
For each counterexample, the display starts with the list of symbols that cause
1208+
an ambiguity. The diamond shows the current location of the parser, and the
1209+
symbol following the diamond is the lookahead.
1210+
The lines below the symbol sequence show the possible reductions according to the
1211+
grammar rules. The problem displayed here is the `dangling else
1212+
<https://en.wikipedia.org/wiki/Dangling_else>`_ issue;
1213+
the first counterexample shows the reduction sequence if the shift path is taken;
1214+
the ``ELSE`` is attached to the rightmost ``if_statement``. In the second example,
1215+
SLY shows that another interpretation could be to reduce the second
1216+
``if_statement`` early and attach the ``ELSE`` sequence to the leftmost
1217+
``if_statement`` instead.
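
The two derivations above can be reproduced with a small brute-force sketch in
plain Python. This is not how SLY computes counterexamples; the
``parse_statement`` and ``all_parses`` helpers, and the ``OTHER`` token standing
in for any non-``if`` statement, are hypothetical names introduced only for
illustration. The sketch enumerates every way to read the token sequence under
the two ``if_statement`` rules:

```python
def parse_statement(tokens, i):
    """Yield (tree, next_index) for every way to read a statement at tokens[i:].

    Illustrative grammar (an assumption, simplified from the example above):
        statement    -> if_statement | OTHER
        if_statement -> IF LPAREN expr RPAREN statement
                      | IF LPAREN expr RPAREN statement ELSE statement
    """
    if tokens[i:i + 1] == ["OTHER"]:
        yield "OTHER", i + 1
    if tokens[i:i + 4] == ["IF", "LPAREN", "expr", "RPAREN"]:
        for body, j in parse_statement(tokens, i + 4):
            # Reduce now: the if_statement ends without an ELSE branch.
            yield ("if", body), j
            # Shift ELSE: the if_statement continues with an ELSE branch.
            if tokens[j:j + 1] == ["ELSE"]:
                for alternative, k in parse_statement(tokens, j + 1):
                    yield ("if", body, "else", alternative), k


def all_parses(tokens):
    """Return every parse tree that consumes the whole token list."""
    return [tree for tree, end in parse_statement(tokens, 0) if end == len(tokens)]


tokens = "IF LPAREN expr RPAREN IF LPAREN expr RPAREN OTHER ELSE OTHER".split()
for tree in all_parses(tokens):
    print(tree)
```

Two trees come back: one attaches the ``ELSE`` to the outer ``if`` (the reduce
choice) and one attaches it to the inner ``if`` (the shift choice, which is how
the conflict above is resolved).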
1218+
1219+
Here is an example of a reduce/reduce conflict that occurs in the C language::
1220+
1221+
reduce/reduce conflict for ) in state 21 resolved using rule expr -> IDENTIFIER
1222+
rejected rule (declarator -> IDENTIFIER) in state 21
1223+
reduce using expr -> IDENTIFIER with lookahead )
1224+
╭╴
1225+
│ TYPENAME ( IDENTIFIER ♦ )
1226+
│ ╰expr──────╯
1227+
│ ╰expr───────────────────╯
1228+
╰╴
1229+
1230+
reduce using declarator -> IDENTIFIER with lookahead )
1231+
╭╴
1232+
│ TYPENAME ( IDENTIFIER ♦ ) ;
1233+
│ ╰declarator╯
1234+
│ ╰declarator────╯
1235+
│ ╰decl─────────────────────╯
1236+
╰╴
1237+
1238+
As with the shift/reduce conflict, SLY shows here the two ways of
1239+
understanding the sequence. It will always backtrack far enough to find the lookahead
1240+
symbol after a reduction.
1241+
1242+
Sometimes, it can be hard to understand why SLY encounters a conflict in the
1243+
first place. Consider this example taken from the C11 grammar::
1244+
1245+
shift/reduce conflict for [ in state 561 resolved as shift
1246+
shift using rule attribute_specifier -> . [ [ attribute_list ] ]
1247+
╭╴
1248+
│ identifier ♦ [ [ attribute_list ] ]
1249+
│ ╰attribute_specifier───╯
1250+
│ ╰attribute_specifier_sequence╯
1251+
│ ╰direct_declarator──────────────────────╯
1252+
╰╴
1253+
1254+
reduce using rule direct_declarator -> identifier .
1255+
╭╴
1256+
│ identifier ♦ [ ]
1257+
│ ╰direct_declarator╯
1258+
│ ╰array_declarator─────╯
1259+
╰╴
1260+
1261+
Here, the two sequences are not ambiguous when considered in their entirety;
1262+
it is clear that the symbol following the first ``[`` will determine if the
1263+
input was supposed to be interpreted as an ``array_declarator`` or as a
1264+
``direct_declarator`` including an ``attribute_specifier``. While the grammar
1265+
is not ambiguous in this example, it is not LR(1) (in this case, LR(2) would
1266+
have removed the conflict) and the state machine does not have enough information
1267+
at the time it encounters the lookahead symbol to unambiguously determine what
1268+
to do next.
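
The lookahead argument can be checked with a toy comparison of the two
counterexample sequences. This is only an illustration of why one token of
lookahead is not enough here, not a general LR(k) test; the
``lookahead_needed`` helper is a hypothetical name introduced for this sketch:

```python
def lookahead_needed(seq_a, seq_b, pos):
    """Return the smallest k such that the k tokens after pos differ.

    Returns None if the two sequences never differ after pos.
    """
    longest = max(len(seq_a), len(seq_b)) - pos
    for k in range(1, longest + 1):
        if seq_a[pos:pos + k] != seq_b[pos:pos + k]:
            return k
    return None


# The two symbol sequences from the counterexamples above.
attribute_form = "identifier [ [ attribute_list ] ]".split()
array_form = "identifier [ ]".split()

# Position 1 is where the parser stands (right after ``identifier``).
print(lookahead_needed(attribute_form, array_form, 1))  # prints 2
```

With a single token of lookahead both sequences show ``[``, so the parser
cannot choose; the second token (``[`` versus ``]``) is the first one that
tells the two readings apart.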
1269+
1270+
Viewing the state machine
1271+
^^^^^^^^^^^^^^^^^^^^^^^^^
1272+
1273+
SLY can save the state machine in a file in the DOT format that can be used with
1274+
graphing tools such as `graphviz <https://graphviz.org/>`_ or even viewed online
1275+
in `Edotor <https://edotor.net/>`_. The graph can help visualize how the state
1276+
machine is built. Such graphs can become very large for big grammars and are not
1277+
always practical, but building such a graph for a small grammar can be instructive.
1278+
1279+
In order to generate such a file, add a ``dotfile`` attribute to your
1280+
class like this::
1281+
1282+
class CalcParser(Parser):
1283+
dotfile = 'parser.gv'
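
Once the file exists, it can be rendered with the Graphviz ``dot`` command-line
tool. The following is a minimal sketch, assuming Graphviz is installed and that
the file is named ``parser.gv`` as above; the ``dot_command`` and ``render``
helpers are hypothetical names, not part of SLY:

```python
import shutil
import subprocess
from pathlib import Path


def dot_command(dotfile, fmt="svg"):
    """Build the Graphviz command line that renders dotfile to the given format."""
    output = str(Path(dotfile).with_suffix("." + fmt))
    return ["dot", f"-T{fmt}", dotfile, "-o", output]


def render(dotfile, fmt="svg"):
    """Render dotfile with Graphviz; return the output path, or None if dot is missing."""
    if shutil.which("dot") is None:
        return None  # Graphviz is not installed or not on PATH
    command = dot_command(dotfile, fmt)
    subprocess.run(command, check=True)
    return command[-1]


# Show the command that would be run for the file generated above.
print(dot_command("parser.gv"))
```

Calling ``render('parser.gv')`` after the parser class has been defined should
then produce ``parser.svg`` next to the DOT file.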
1284+
11771285
Syntax Error Handling
11781286
^^^^^^^^^^^^^^^^^^^^^
11791287
