Commit 96f7077

document and add some more tests
1 parent 3cf2831 commit 96f7077

2 files changed, +47 -15 lines changed

book/src/formality_core/parse.md

Lines changed: 24 additions & 1 deletion
@@ -25,9 +25,23 @@ When parsing an enum there will be multiple possibilities. We will attempt to pa
 
 * Explicit precedence: By default, every variant has precedence 0, but you can override this by annotating variants with `#[precedence(N)]` (where `N` is some integer). This will override the precedence for that variant. Variants with higher precedences are preferred.
 * Reduction prefix: When parsing, we track the list of things we had to parse. If there are two variants at the same precedence level, but one of them had to parse strictly more things than the other and in the same way, we'll prefer the longer one. So for example if one variant parsed a `Ty` and the other parsed a `Ty Ty`, we'd take the `Ty Ty`.
+* When considering whether a reduction is "significant", we take casts into account. See `ActiveVariant::mark_as_cast_variant` for a more detailed explanation and set of examples.
 
 Otherwise, the parser will panic and report ambiguity. The parser panics rather than returning an error because ambiguity doesn't mean that there is no way to parse the given text as the nonterminal -- rather that there are multiple ways. Errors mean that the text does not match the grammar for that nonterminal.
 
+### Left-recursive grammars
+
+We permit left recursive grammars like:
+
+```
+Expr = Expr + Expr
+| integer
+```
+
+We *always* bias towards greedy parses, so `a + b + c` parses as `(a + b) + c`.
+This might occasionally not be what you want.
+Sorry.
+
 ### Symbols
 
 A grammar consists of a series of *symbols*. Each symbol matches some text in the input string. Symbols come in two varieties:
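The "Left-recursive grammars" text in this hunk says that `a + b + c` parses as `(a + b) + c`. The following is a minimal, hand-written Rust sketch of that left-associative result. It is not formality_core's parser and does not use its left-recursion machinery: it swaps in a simple left fold over `+`-separated identifiers, and the names `Expr`, `parse_expr`, and `show` are hypothetical, introduced only for illustration.

```rust
// Hypothetical illustration only; not the formality_core implementation.
#[derive(Debug, PartialEq)]
enum Expr {
    Id(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Parses `id (+ id)*`, folding each new operand onto the left so that
/// the tree grows as `((a + b) + c)`.
fn parse_expr(input: &str) -> Option<Expr> {
    let mut operands = input.split('+').map(str::trim);

    // The first operand seeds the accumulator.
    let first = operands.next()?;
    if first.is_empty() {
        return None;
    }
    let mut acc = Expr::Id(first.to_string());

    // Every later operand becomes the right child of a new `Add` node
    // whose left child is everything parsed so far.
    for operand in operands {
        if operand.is_empty() {
            return None;
        }
        acc = Expr::Add(Box::new(acc), Box::new(Expr::Id(operand.to_string())));
    }
    Some(acc)
}

/// Renders the tree with explicit parentheses so the grouping is visible.
fn show(expr: &Expr) -> String {
    match expr {
        Expr::Id(name) => name.clone(),
        Expr::Add(lhs, rhs) => format!("({} + {})", show(lhs), show(rhs)),
    }
}
```

Calling `show(&parse_expr("a + b + c").unwrap())` yields `((a + b) + c)`, matching the grouping the documentation describes. A grammar that needs right-associative operators would have to build the tree in the opposite direction, which is the case the "might occasionally not be what you want" remark appears to warn about.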
@@ -39,11 +53,20 @@ A grammar consists of a series of *symbols*. Each symbol matches some text in th
 * If fields have names, then `$field` should name the field.
 * For position fields (e.g., the T and U in `Mul(Expr, Expr)`), use `$v0`, `$v1`, etc.
 * Exception: `$$` is treated as the terminal `'$'`.
-* Nonterminals can also accept modes:
+* Nonterminals have various modes:
 * `$field` -- just parse the field's type
 * `$*field` -- the field must be a `Vec<T>` -- parse any number of `T` instances. Something like `[ $*field ]` would parse `[f1 f2 f3]`, assuming `f1`, `f2`, and `f3` are valid values for `field`.
 * `$,field` -- similar to the above, but uses a comma separated list (with optional trailing comma). So `[ $,field ]` will parse something like `[f1, f2, f3]`.
 * `$?field` -- will parse `field` and use `Default::default()` value if not present.
+* `$<field>` -- parse `<E1, E2, E3>`, where `field: Vec<E>`
+* `$<?field>` -- parse `<E1, E2, E3>`, where `field: Vec<E>`, but accept empty string as empty vector
+* `$(field)` -- parse `(E1, E2, E3)`, where `field: Vec<E>`
+* `$(?field)` -- parse `(E1, E2, E3)`, where `field: Vec<E>`, but accept empty string as empty vector
+* `$[field]` -- parse `[E1, E2, E3]`, where `field: Vec<E>`
+* `$[?field]` -- parse `[E1, E2, E3]`, where `field: Vec<E>`, but accept empty string as empty vector
+* `${field}` -- parse `{E1, E2, E3}`, where `field: Vec<E>`
+* `${?field}` -- parse `{E1, E2, E3}`, where `field: Vec<E>`, but accept empty string as empty vector
+* `$:guard <nonterminal>` -- parses `<nonterminal>` but only if the keyword `guard` is present. For example, `$:where $,where_clauses` would parse `where WhereClause1, WhereClause2, WhereClause3`
 
 ### Greediness
 
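The delimited modes added in this hunk (`$<field>`, `$(field)`, `$[field]`, `${field}` and their `?` variants) can be pictured with a small Rust sketch. It rests on several assumptions that are not confirmed by the diff: elements are treated as plain strings rather than parsed nonterminals, nested delimiters are not handled, the optional trailing comma documented for `$,field` is assumed to carry over, and "accept empty string as empty vector" is read as "if the opening delimiter is absent, produce an empty vector and consume nothing". The function `parse_delimited` is hypothetical and is not part of formality_core.

```rust
/// Hypothetical helper, not formality_core's API.
/// Parses `open E1, E2, E3 close` from the front of `input` and returns the
/// elements plus the unconsumed remainder. With `optional = true` (the `?`
/// modes), a missing opening delimiter yields an empty vector instead of
/// a failure.
fn parse_delimited<'a>(
    input: &'a str,
    open: char,
    close: char,
    optional: bool,
) -> Option<(Vec<String>, &'a str)> {
    let trimmed = input.trim_start();

    let Some(body) = trimmed.strip_prefix(open) else {
        // Assumed reading of the `?` modes: no delimiter means an empty list,
        // and nothing is consumed.
        return if optional { Some((Vec::new(), input)) } else { None };
    };

    // Find the closing delimiter (nesting is deliberately not handled).
    let end = body.find(close)?;
    let inner = &body[..end];
    let rest = &body[end + close.len_utf8()..];

    let mut pieces: Vec<&str> = inner.split(',').map(str::trim).collect();

    // Tolerate one trailing comma (or a completely empty list) by dropping
    // a final empty piece.
    if pieces.last() == Some(&"") {
        pieces.pop();
    }

    // Any remaining empty piece means something like `a,,b`, which is rejected.
    if pieces.iter().any(|piece| piece.is_empty()) {
        return None;
    }

    Some((pieces.into_iter().map(String::from).collect(), rest))
}
```

Under these assumptions, `parse_delimited("<E1, E2, E3> rest", '<', '>', false)` returns the three elements with `" rest"` left over, while `parse_delimited("rest", '<', '>', true)` returns an empty vector and leaves `"rest"` untouched.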

tests/parser-torture-tests/precedence.rs

Lines changed: 23 additions & 14 deletions
@@ -1,24 +1,33 @@
 use formality_core::term;
 use std::sync::Arc;
 
-#[test]
-fn precedence() {
-    #[term]
-    pub enum Root {
-        #[cast]
-        Id(Id),
+#[term]
+pub enum Expr {
+    #[cast]
+    Id(Id),
 
-        #[grammar($v0 + $v1)]
-        Add(Arc<Root>, Arc<Root>),
+    #[grammar($v0 + $v1)]
+    Add(Arc<Expr>, Arc<Expr>),
 
-        #[grammar($v0 * $v1)]
-        #[precedence(1)]
-        Mul(Arc<Root>, Arc<Root>),
-    }
+    #[grammar($v0 * $v1)]
+    #[precedence(1)]
+    Mul(Arc<Expr>, Arc<Expr>),
+}
 
-    formality_core::id!(Id);
+formality_core::id!(Id);
 
-    let term: Root = crate::ptt::term("a + b * c");
+#[test]
+fn mul_is_higher_precedence() {
+    let term: Expr = crate::ptt::term("a + b * c");
+    expect_test::expect![[r#"
+        a + b * c
+    "#]]
+    .assert_debug_eq(&term);
+}
+
+#[test]
+fn equal_precedence_panics() {
+    let term: Expr = crate::ptt::term("a + b * c");
     expect_test::expect![[r#"
         a + b * c
     "#]]

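The tests in this file exercise the explicit-precedence rule stated in parse.md: variants with higher precedence are preferred, and an unresolved tie is reported as ambiguity. A minimal Rust sketch of just that selection step follows. It is an assumption-laden illustration, not formality_core's code: `Candidate` and `pick` are hypothetical names, the reduction-prefix and cast rules are ignored, and the sketch returns an `Err` where the documentation says the real parser panics.

```rust
/// Hypothetical stand-in for one successful parse of an enum variant.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    /// Name of the variant that produced this parse, e.g. "Add" or "Mul".
    variant: &'static str,
    /// Value of `#[precedence(N)]`; the documented default is 0.
    precedence: i32,
}

/// Picks the unique candidate with the highest precedence.
/// A tie at the top precedence is reported as ambiguity.
fn pick(candidates: &[Candidate]) -> Result<&Candidate, String> {
    // Highest precedence among all successful parses.
    let best = candidates
        .iter()
        .map(|candidate| candidate.precedence)
        .max()
        .ok_or_else(|| "no variant matched".to_string())?;

    // Collect the candidates sitting at that precedence level.
    let mut winners = candidates.iter().filter(|c| c.precedence == best);
    let first = winners.next().expect("max exists, so at least one winner");

    match winners.next() {
        None => Ok(first),
        Some(second) => Err(format!(
            "ambiguous parse: `{}` vs `{}`",
            first.variant, second.variant
        )),
    }
}
```

With `Add` at precedence 0 and `Mul` at precedence 1, `pick` returns the `Mul` candidate; if both sit at precedence 0, it returns an ambiguity error, which corresponds to the situation the `equal_precedence_panics` test name describes.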