@@ -366,10 +366,75 @@ Expr(:ncat)
366
366
367
367
## Tree differences between GreenNode and Expr
368
368
369
-
Wherever possible, the tree structure of `GreenNode`/`SyntaxNode` is 1:1 with
370
-
`Expr`. There are, however, some exceptions. First, `GreenNode` inherently
371
-
stores source position, so there's no need for the `LineNumberNode`s used by
372
-
`Expr`. There's also a small number of other differences
369
+
The tree structure of `GreenNode`/`SyntaxNode` is similar to Julia's `Expr`
370
+
data structure but there are various differences:
371
+
372
+
### Source ordered children
373
+
374
+
The children of our trees are strictly in source order. This has many
375
+
consequences in places where `Expr` reorders child expressions.
376
+
377
+
* Infix and postfix operator calls have the operator name in the *second* child position. `a + b` is parsed as `(call-i a + b)` - where the infix `-i` flag indicates infix child position - rather than `Expr(:call, :+, :a, :b)`.
378
+
* Flattened generators are represented in source order
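For comparison, the reordering on the `Expr` side can be inspected with Base's `Meta.parse`. This is a small sketch that only looks at `Expr` and does not call JuliaSyntax itself; the variable names are illustrative.

```julia
# Infix call: Expr moves the operator to the front of the argument list.
infix = Meta.parse("a + b")
@assert infix == Expr(:call, :+, :a, :b)

# Flattened generator: Expr nests the generators, so the first `for`
# clause in the source ends up last in the tree.
gen = Meta.parse("(xy for x in xs for y in ys)")
outer = gen.args[1]
@assert gen.head == :flatten
@assert outer.args[1].head == :generator  # inner generator over `y in ys`
@assert outer.args[2] == :(x = xs)        # outer iteration over `x in xs`
```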
379
+
380
+
### No `LineNumberNode`s
381
+
382
+
Our syntax nodes inherently store source position, so there's no need for the
383
+
`LineNumberNode`s used by `Expr`.
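As a quick illustration of what is being avoided, the sketch below counts the `LineNumberNode`s that Base's parser interleaves into an `Expr` block. It uses only `Meta.parse`; the variable names are illustrative.

```julia
# A two-statement block parsed to Expr.
ex = Meta.parse("begin\n    a\n    b\nend")

# Each statement is preceded by a LineNumberNode in `ex.args`.
n_linenodes = count(arg -> arg isa LineNumberNode, ex.args)
println(n_linenodes)
```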
384
+
385
+
### More consistent / less redundant `block`s
386
+
387
+
Sometimes `Expr` needs redundant block constructs to store `LineNumberNode`s,
388
+
but we don't need these. Also in cases which do use blocks we try to use them
389
+
consistently.
390
+
391
+
* No block is used on the right hand side of short form function syntax
392
+
* No block is used for the conditional in `elseif`
393
+
* No block is used for the body of anonymous functions after the `->`
394
+
* `let` argument lists always use a block regardless of number or form of bindings
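The `Expr` behavior these bullets contrast against can be checked with Base's `Meta.parse`. This is a sketch of the `Expr` side only, assuming a Julia 1.x parser; the variable names are illustrative.

```julia
# Short form function: Expr wraps the right hand side in a block.
short_rhs = Meta.parse("f(x) = x").args[2]
@assert short_rhs.head == :block

# Anonymous function: Expr wraps the body after `->` in a block.
anon_body = Meta.parse("x -> x").args[2]
@assert anon_body.head == :block

# `elseif`: Expr wraps the condition in a block.
cond = Meta.parse("if a\n    x\nelseif b\n    y\nend")
elseif_cond = cond.args[3].args[1]
@assert elseif_cond.head == :block

# `let`: a single binding has no block, several bindings do.
let_one = Meta.parse("let x = 1\nend")
let_two = Meta.parse("let x = 1, y = 2\nend")
@assert let_one.args[1].head == :(=)
@assert let_two.args[1].head == :block
```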
395
+
396
+
### Faithful representation of the source text / avoid premature lowering
397
+
398
+
Some cases of "premature lowering" have been removed, preferring to represent
399
+
the source text more closely.
400
+
401
+
* `K"macrocall"` - allow users to easily distinguish macrocalls with parentheses from those without them (#218)
402
+
* Grouping parentheses are represented with a node of kind `K"parens"` (#222)
403
+
* Ternary syntax is not immediately lowered to an `if` node: `a ? b : c` parses as `(? a b c)` rather than `Expr(:if, :a, :b, :c)` (#85)
404
+
* `global const` and `const global` are not normalized by the parser. This is done in `Expr` conversion (#130)
405
+
* The AST for `do` is flatter and not lowered to a lambda by the parser: `f(x) do y ; body end` is parsed as `(do (call f x) (tuple y) (block body))` (#98)
406
+
* `@.` is not lowered to `@__dot__` inside the parser (#146)
407
+
* Docstrings use the `K"doc"` kind, and are not lowered to `Core.@doc` until later (#217)
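A few of the "premature lowering" cases can be seen directly in the `Expr` produced by Base's `Meta.parse`. The sketch below shows only the `Expr` side, not JuliaSyntax's own trees; the variable names are illustrative.

```julia
# Ternary syntax is lowered to an `if` expression.
ternary = Meta.parse("a ? b : c")
@assert ternary == Expr(:if, :a, :b, :c)

# Grouping parentheses leave no trace in the Expr.
grouped = Meta.parse("(a)")
@assert grouped === :a

# `@.` is renamed to `@__dot__`.
dotmacro = Meta.parse("@. x + y")
@assert dotmacro.args[1] == Symbol("@__dot__")

# `do` syntax is lowered to a lambda in the second argument.
doblock = Meta.parse("f(x) do y\n    body\nend")
@assert doblock.args[2].head == :->
```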
408
+
409
+
### Containers for string-like constructs
410
+
411
+
String-like constructs always come within a container node, not as a single
412
+
token. These are useful for tooling which works with the tokens of the source
413
+
text. Also separating the delimiters from the text they delimit removes a whole
414
+
class of tokenization errors and lets the parser deal with them.
415
+
416
+
* strings always use `K"string"` as a wrapper, even when they only contain a single string chunk (#94)
417
+
* char literals are wrapped in the `K"char"` kind, containing the character literal string along with their delimiters (#121)
418
+
* backticks use the `K"cmdstring"` kind
419
+
* `var""` syntax uses `K"var"` as the head (#127)
420
+
* The parser splits triple quoted strings into string chunks interspersed with whitespace trivia
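By contrast, `Expr` represents most simple string-like constructs as bare values with no wrapper. The sketch below checks this with Base's `Meta.parse`, assuming Julia 1.3 or later for the `var""` syntax; the variable names are illustrative.

```julia
# A plain string literal parses to a bare String.
plain = Meta.parse("\"abc\"")
@assert plain == "abc"

# Only interpolation produces a :string wrapper.
interp = Meta.parse("\"a \$x\"")
@assert interp == Expr(:string, "a ", :x)

# Char literals and var"" syntax also parse to bare values.
chr = Meta.parse("'a'")
varsym = Meta.parse("var\"x y\"")
@assert chr === 'a'
@assert varsym === Symbol("x y")
```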
421
+
422
+
### Improvements for AST inconsistencies
423
+
424
+
* Dotted call syntax like `f.(a,b)` and `a .+ b` has been made consistent with the `K"dotcall"` head (#90)
425
+
* Standalone dotted operators are always parsed as `(. op)`. For example `.*(x,y)` is parsed as `(call (. *) x y)` (#240)
426
+
* The `K"="` kind is used for keyword syntax rather than `kw`, to avoid various inconsistencies and ambiguities (#103)
427
+
* Unadorned postfix adjoint is parsed as `call` rather than as a syntactic operator for consistency with suffixed versions like `x'ᵀ` (#124)
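The inconsistencies on the `Expr` side are visible with Base's `Meta.parse`: the two dotted call forms get different heads, and keyword arguments use `kw`. This sketch assumes a Julia 1.x parser and shows only `Expr`; the variable names are illustrative.

```julia
# Dotted function call: head is `.` with a tuple of arguments.
dot_fn = Meta.parse("f.(a, b)")
@assert dot_fn == Expr(:., :f, Expr(:tuple, :a, :b))

# Dotted operator call: an ordinary call to the symbol `.+`.
dot_op = Meta.parse("a .+ b")
@assert dot_op == Expr(:call, Symbol(".+"), :a, :b)

# Keyword argument in a call: uses the `kw` head rather than `=`.
kwcall = Meta.parse("f(a=1)")
@assert kwcall == Expr(:call, :f, Expr(:kw, :a, 1))

# Unadorned postfix adjoint: a syntactic operator with head `'`.
adj = Meta.parse("x'")
@assert adj == Expr(Symbol("'"), :x)
```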
428
+
429
+
### Improvements to awkward AST forms
430
+
431
+
* Frankentuples with multiple parameter blocks like `(a=1, b=2; c=3; d=4)` are flattened into the parent tuple instead of using nested `K"parameters"` nodes (#133)
432
+
* `try catch else finally end` is parsed with `K"catch"`, `K"else"` and `K"finally"` children to avoid the awkwardness of the optional child nodes in the `Expr` representation (#234)
433
+
* The dotted import path syntax as in `import A.b.c` is parsed with a `K"importpath"` kind rather than `K"."`, because a bare `A.b.c` has a very different nested/quoted expression representation (#244)
434
+
* We use flags rather than child nodes to represent the difference between `struct` and `mutable struct`, `module` and `baremodule` (#220)
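Some of the awkward `Expr` forms mentioned above can be reproduced with Base's `Meta.parse`. This is a sketch of the `Expr` side only, assuming a Julia 1.x parser; the variable names are illustrative.

```julia
# Import paths are flat, while a bare `A.b.c` is nested and quoted.
imp = Meta.parse("import A.b.c")
bare = Meta.parse("A.b.c")
@assert imp == Expr(:import, Expr(:., :A, :b, :c))
@assert bare == Expr(:., Expr(:., :A, QuoteNode(:b)), QuoteNode(:c))

# Mutability is stored as a Bool child node rather than a flag.
immutable_def = Meta.parse("struct A end")
mutable_def = Meta.parse("mutable struct A end")
@assert immutable_def.args[1] === false
@assert mutable_def.args[1] === true

# `try ... finally` with no `catch` pads the missing children with `false`.
tryfinally = Meta.parse("try\n    a\nfinally\n    b\nend")
@assert tryfinally.args[2] === false
@assert tryfinally.args[3] === false
```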