[V2] Changes to language definition #1516
base: teofr/node_checker
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -27,3 +27,54 @@ We should consider adding validation for these at a later stage if needed: | |
| - They were only enabled after `0.7.0`. | ||
| - `HexLiteral` and `YulHexLiteral` and `DecimalLiteral` and `YulDecimalLiteral`: | ||
| - It was illegal for them to be followed by `IdentifierStart`. Now we will produce two separate tokens rather than rejecting it. | ||
|
|
||
| ## Grammar | ||
|
|
||
| The following changes modify the language definition to support the new parser and resolve grammar ambiguities. | ||
| In some cases we also try to simplify the model. | ||
|
|
||
| ### AddressKeyword | ||
|
|
||
| - Made the `address` keyword reserved in all versions, handling the few cases where it can be used as an identifier separately. | ||
| - `IdentifierPathElement` handles the cases where `address` can be used as an `Identifier`, either in an `IdentifierPath` or a `MemberAccessExpression`. | ||
|
|
||
| ### IdentifierPathElement | ||
|
|
||
| New enum added to allow the reserved `address` keyword in identifier paths and member access expressions | ||
| (from Solidity 0.6.0): | ||
|
|
||
| - Variants: `Identifier` | `AddressKeyword` (enabled from 0.6.0) | ||
| - Used in `MemberAccessExpression` and in `IdentifierPath` | ||
|
|
||
| ### IdentifierPath | ||
|
|
||
| Changed from a `Separated` list of `Identifier`, to a list of `IdentifierPathElement`, to capture the reserved | ||
| `AddressKeyword` as part of the path. | ||
|
|
||
| - **Before**: `Separated(name = IdentifierPath, reference = Identifier, separator = Period)` | ||
| - **After**: `Separated(name = IdentifierPath, reference = IdentifierPathElement, separator = Period)` | ||
|
|
||
| ### TupleDeconstructionStatement | ||
|
|
||
| Major restructuring to resolve ambiguities with tuple expressions and assignment expressions: | ||
|
|
||
| - **Before**: Single struct with optional `var_keyword`, `open_paren`, `elements: TupleDeconstructionElements`, `close_paren`, `equal`, `expression`, `semicolon`. | ||
| - **After**: Split into typed and untyped (var) variants: | ||
| - `TupleDeconstructionStatement`: Contains `target: TupleDeconstructionTarget`, `equal`, `expression`, `semicolon` | ||
| - `TupleDeconstructionTarget`: Enum with `VarTupleDeconstructionTarget` (till 0.5.0) | `TypedTupleDeconstructionTarget` | ||
| - `VarTupleDeconstructionTarget`: For `var (a, b) = ...` syntax (till 0.5.0) | ||
| - `TypedTupleDeconstructionTarget`: For `(uint a, uint b) = ...` syntax | ||
|
|
||
| This disallows in V2 certain cases that were previously allowed, in particular mixing untyped and typed elements (like `(a, bool b) = ...`) | ||
| or combining typed elements with `var` (like `var (a, bool b) = ...`). | ||
| Cases that use empty tuples are still ambiguous: `(,,,) = ...` can be either a `TupleDeconstructionStatement` or | ||
| an `AssignmentExpression` with a `TupleExpression` on the lhs. | ||
|
Comment on lines +70 to +71
Contributor
Which one? I wonder if we have existing
Contributor
Also, how about
Contributor
Author
This is a tricky question since there's still no parser on V2, I'll try to answer it:
We have some cases, I added a few more (they're only testing V1 for now)
This is legal with the new definition as well, since the elements of the Struct(
name = UntypedTupleDeconstructionElement,
enabled = Till("0.5.0"),
fields = (name = Optional(reference = Identifier))
), |
||
|
|
||
| Removed types: `TupleDeconstructionElements`, `TupleDeconstructionElement`, `TupleMember`, `TypedTupleMember`, `UntypedTupleMember` | ||
|
|
||
| Added types: `TupleDeconstructionTarget`, `VarTupleDeconstructionTarget`, `UntypedTupleDeconstructionElements`, `UntypedTupleDeconstructionElement`, `TypedTupleDeconstructionTarget`, `TypedTupleDeconstructionElements`, `TypedTupleDeconstructionElement`, `TypedTupleDeconstructionMember` | ||
|
|
||
| ### NamedArgumentsDeclaration | ||
|
|
||
| - Changed `arguments` field from `Optional(NamedArgumentGroup)` to `Required(NamedArgumentGroup)`. | ||
| - This avoids ambiguity with empty argument lists `()` which could be either positional or named. | ||
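To illustrate the `NamedArgumentsDeclaration` change, here is a minimal Rust sketch. The types are simplified stand-ins, not the generated Slang types: `ArgumentsDeclaration`, the string-based fields, and the `classify_empty_parens` helper are all assumptions made for this example. The point it shows is that once the named form requires a `NamedArgumentGroup`, a bare `()` has only one possible reading.

```rust
/// Simplified stand-in for a `{ name: value, ... }` group (hypothetical shape).
#[derive(Debug, PartialEq)]
pub struct NamedArgumentGroup {
    pub arguments: Vec<(String, String)>,
}

/// Simplified stand-in for the two argument-list forms (hypothetical enum).
#[derive(Debug, PartialEq)]
pub enum ArgumentsDeclaration {
    /// `(a, b)`; the list may be empty, i.e. `()`.
    Positional(Vec<String>),
    /// `({x: a, y: b})`; the group is required, so `()` never lands here.
    Named(NamedArgumentGroup),
}

/// With a required group there is exactly one way to model `()`:
/// an empty positional list.
pub fn classify_empty_parens() -> ArgumentsDeclaration {
    ArgumentsDeclaration::Positional(Vec::new())
}
```

If the group were still optional, `()` could equally be represented as a named declaration with no group, which is the ambiguity the change removes.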
I think this distinction (declaring new vars vs. assigning to existing ones) is worth having, both syntactically and semantically. WDYT of having the following structure, if it works with LALR?
`VariableDeclarationStatement` for any syntax that declares a new name:
- `var x = ...` already supported
- `int x = ...` already supported
- change the `VariableDeclarationStatement::name` field to an enum with two variants:
  - `name: Identifier` -> existing
  - `elements` -> a struct holding `LeftParen` + `Separated(elements)` + `RightParen`

`AssignmentExpression` for any syntax that just assigns values to the LHS:
- `x = ....`
- `(x, y) = ....`
- `(,,,) = ....`
I completely agree
The problem I see with the proposed structure is that currently the `int` and the `var` in the first 2 cases are captured by the same definition, so you could end up with a language accepting something like `int (a, b, c) = ...`.

But also, since you want to allow for the elements of the tuple to have the type within it, you'd maybe want to make the `var`/`int` struct optional, to allow `(bool a, uint b) = ...`, but that would also parse `x = ...` as valid.

Coming from a perspective of appeasing LALRPOP, I'd say the distinction has to be a bit stronger, so allowing `VariableDeclarationStatement` to be an enum over:
- `SingleExplicitDeclaration`: `int x = ...`
- `MultiExplicitDeclaration`: `(bool a, , int b) = ...`
- `ImplicitDeclaration` (until `0.5.0`): `var a = ...` and `var (a, , b) = ...` (so this one would have the enum allowing either a single `Identifier` or a tuple of `Identifier`)

This could be simplified when lowering it to an AST.
What do you think? I'll try to push a single commit with these changes so you can review them.
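For illustration only, the enum proposed above could be modelled roughly like this in Rust. This is a sketch under assumptions, not the actual Slang definition: the variant names come from the comment, but `ImplicitTarget`, the `String` payloads, and the `declared_names` helper are invented here to show how the three shapes stay distinct while still being easy to flatten later.

```rust
/// Hypothetical model of the proposed `VariableDeclarationStatement` enum.
#[derive(Debug, PartialEq)]
pub enum VariableDeclarationStatement {
    /// `int x = ...`
    SingleExplicitDeclaration { type_name: String, name: String },
    /// `(bool a, , int b) = ...`; `None` marks a skipped slot.
    MultiExplicitDeclaration(Vec<Option<(String, String)>>),
    /// `var a = ...` or `var (a, , b) = ...` (until 0.5.0).
    ImplicitDeclaration(ImplicitTarget),
}

/// Hypothetical helper enum: a single identifier or a tuple of identifiers.
#[derive(Debug, PartialEq)]
pub enum ImplicitTarget {
    Single(String),
    Tuple(Vec<Option<String>>),
}

impl VariableDeclarationStatement {
    /// Names introduced by the statement, in source order, skipping empty slots.
    pub fn declared_names(&self) -> Vec<&str> {
        match self {
            Self::SingleExplicitDeclaration { name, .. } => vec![name.as_str()],
            Self::MultiExplicitDeclaration(members) => members
                .iter()
                .flatten()
                .map(|(_, name)| name.as_str())
                .collect(),
            Self::ImplicitDeclaration(ImplicitTarget::Single(name)) => vec![name.as_str()],
            Self::ImplicitDeclaration(ImplicitTarget::Tuple(names)) => names
                .iter()
                .flatten()
                .map(|name| name.as_str())
                .collect(),
        }
    }
}
```

In this model, `int (a, b, c) = ...` and `var (a, bool b) = ...` have no representation at all, and a helper like `declared_names` hints at how the variants could be collapsed when lowering to an AST.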