book/src/formality_core/parse.md
+24 -1 (24 additions, 1 deletion)
@@ -25,9 +25,23 @@ When parsing an enum there will be multiple possibilities. We will attempt to pa
* Explicit precedence: By default, every variant has precedence 0, but you can override this by annotating a variant with `#[precedence(N)]` (where `N` is some integer). Variants with higher precedence are preferred.
* Reduction prefix: When parsing, we track the list of things we had to parse. If there are two variants at the same precedence level, but one of them had to parse strictly more things than the other and in the same way, we'll prefer the longer one. So for example if one variant parsed a `Ty` and the other parsed a `Ty Ty`, we'd take the `Ty Ty`.
+* When considering whether a reduction is "significant", we take casts into account. See `ActiveVariant::mark_as_cast_variant` for a more detailed explanation and set of examples.

Otherwise, the parser will panic and report ambiguity. The parser panics rather than returning an error because ambiguity doesn't mean that there is no way to parse the given text as the nonterminal -- rather that there are multiple ways. An error, by contrast, means that the text does not match the grammar for that nonterminal.
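
The two rules above can be sketched in standalone Rust. This is not formality_core's implementation: `Candidate`, `is_strict_prefix`, and `resolve` are hypothetical names invented for illustration. The sketch assumes precedence is compared first and the reduction-prefix rule only breaks ties, it ignores the cast handling mentioned above, and it returns `None` at the point where the real parser would panic.

```rust
/// Hypothetical record of one variant that parsed successfully.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    variant: &'static str,
    precedence: i32,
    /// Names of the things parsed along the way, in order.
    reductions: Vec<&'static str>,
}

/// True if `short` is a strict prefix of `long`.
fn is_strict_prefix(short: &[&'static str], long: &[&'static str]) -> bool {
    short.len() < long.len() && long.starts_with(short)
}

/// Pick a winner, or return `None` when the candidates are ambiguous
/// (assumption: the real parser panics here instead).
fn resolve(candidates: &[Candidate]) -> Option<&Candidate> {
    // Rule 1: keep only the candidates with the highest precedence.
    let top = candidates.iter().map(|c| c.precedence).max()?;
    let best: Vec<&Candidate> = candidates
        .iter()
        .filter(|c| c.precedence == top)
        .collect();
    // Rule 2: the winner must strictly extend every other remaining candidate.
    let winner = best.iter().copied().find(|winner| {
        best.iter().all(|other| {
            std::ptr::eq(*winner, *other)
                || is_strict_prefix(&other.reductions, &winner.reductions)
        })
    });
    winner
}
```

With two candidates at precedence 0 whose reductions are `Ty` and `Ty Ty`, `resolve` picks the longer one; raising the precedence of the shorter one makes it win instead; two candidates with unrelated reductions yield `None`.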

+### Left-recursive grammars
+
+We permit left-recursive grammars like:
+
+```
+Expr = Expr + Expr
+     | integer
+```
+
+We *always* bias towards greedy parses, so `a + b + c` parses as `(a + b) + c`.
+This might occasionally not be what you want.
+Sorry.
+
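
To make the greedy bias concrete, here is a minimal standalone Rust sketch. It is not formality_core's parser: the `Expr` enum and the `parse_expr` function are hypothetical names invented for illustration, and the sketch handles only integers separated by `+`. It folds operands to the left, which yields the same shape as the `(a + b) + c` grouping described above.

```rust
/// Hypothetical AST for the grammar `Expr = Expr + Expr | integer`.
#[derive(Debug, PartialEq)]
enum Expr {
    Int(u64),
    Add(Box<Expr>, Box<Expr>),
}

/// Parse integers separated by `+`, grouping to the left: each further
/// `+ integer` wraps the expression built so far as the left operand.
/// Returns `None` if any operand is not an integer.
fn parse_expr(text: &str) -> Option<Expr> {
    let mut operands = text.split('+').map(|tok| tok.trim().parse::<u64>().ok());
    let mut acc = Expr::Int(operands.next()??);
    for operand in operands {
        acc = Expr::Add(Box::new(acc), Box::new(Expr::Int(operand?)));
    }
    Some(acc)
}
```

For the input `1 + 2 + 3` this produces `Add(Add(Int(1), Int(2)), Int(3))`, never `Add(Int(1), Add(Int(2), Int(3)))`.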
### Symbols

A grammar consists of a series of *symbols*. Each symbol matches some text in the input string. Symbols come in two varieties:
@@ -39,11 +53,20 @@ A grammar consists of a series of *symbols*. Each symbol matches some text in th
* If fields have names, then `$field` should name the field.
* For positional fields (e.g., the two `Expr` fields in `Mul(Expr, Expr)`), use `$v0`, `$v1`, etc.
* Exception: `$$` is treated as the terminal `'$'`.
-* Nonterminals can also accept modes:
+* Nonterminals have various modes:
    * `$field` -- just parse the field's type.
    * `$*field` -- the field must be a `Vec<T>` -- parse any number of `T` instances. Something like `[ $*field ]` would parse `[f1 f2 f3]`, assuming `f1`, `f2`, and `f3` are valid values for `field`.
    * `$,field` -- similar to the above, but uses a comma-separated list (with optional trailing comma). So `[ $,field ]` will parse something like `[f1, f2, f3]`.
    * `$?field` -- parse `field` if present; otherwise use the `Default::default()` value.
+    * `$<field>` -- parse `<E1, E2, E3>`, where `field: Vec<E>`.
+    * `$<?field>` -- parse `<E1, E2, E3>`, where `field: Vec<E>`, but accept the empty string as an empty vector.
+    * `$(field)` -- parse `(E1, E2, E3)`, where `field: Vec<E>`.
+    * `$(?field)` -- parse `(E1, E2, E3)`, where `field: Vec<E>`, but accept the empty string as an empty vector.
+    * `$[field]` -- parse `[E1, E2, E3]`, where `field: Vec<E>`.
+    * `$[?field]` -- parse `[E1, E2, E3]`, where `field: Vec<E>`, but accept the empty string as an empty vector.
+    * `${field}` -- parse `{E1, E2, E3}`, where `field: Vec<E>`.
+    * `${?field}` -- parse `{E1, E2, E3}`, where `field: Vec<E>`, but accept the empty string as an empty vector.
+    * `$:guard <nonterminal>` -- parse `<nonterminal>`, but only if the keyword `guard` is present. For example, `$:where $,where_clauses` would parse `where WhereClause1, WhereClause2, WhereClause3`.
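
The difference between a delimited mode such as `$<field>` and its optional form `$<?field>` can be sketched in standalone Rust. This is only an illustration, not how formality_core implements these modes: `parse_delimited` is a hypothetical helper, items are treated as plain strings rather than parsed nonterminals, nested delimiters are not handled, and tolerating a trailing comma inside the delimiters is an assumption.

```rust
/// Parse `open item, item, ... close` from the start of `text`, returning the
/// items and the remaining text. With `optional` set (the `?` forms), input
/// that does not start with `open` yields an empty vector and is not consumed.
fn parse_delimited(
    text: &str,
    open: char,
    close: char,
    optional: bool,
) -> Option<(Vec<String>, &str)> {
    let text = text.trim_start();
    let Some(body) = text.strip_prefix(open) else {
        // No opening delimiter: an empty list if optional, a failure otherwise.
        return optional.then(|| (Vec::new(), text));
    };
    // Simplification: the first `close` ends the list, so nesting is unsupported.
    let (inner, rest) = body.split_once(close)?;
    let items = inner
        .split(',')
        .map(str::trim)
        // Dropping empty pieces tolerates a trailing comma (and, as a
        // simplification, any other empty item).
        .filter(|item| !item.is_empty())
        .map(String::from)
        .collect();
    Some((items, rest))
}
```

Here `parse_delimited("<a, b, c> rest", '<', '>', false)` returns the three items and leaves `" rest"`, while the same call on `"rest"` fails unless `optional` is `true`, in which case it returns an empty vector and leaves the input untouched.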