@teofr teofr commented Jan 28, 2026

Changes to the V2 language definition, mainly to facilitate creating an LR(1) parser. The more complex ones are:

  • Changes to TupleDeconstructionStatement, making it stricter and clearly separating var-style declarations from explicit ones.
  • Changes to IdentifierPath, because address is now a reserved keyword.

For another PR/discussion, we considered merging TupleDeconstructionStatement and VariableDeclaration, to merge all variable declarations together, however I think this will look a bit artificial since their shape is quite different. We can still force it if we consider there's value in it, but I don't think it's worth it for now; they'll probably be joined in one of the passes that simplify the AST.

@teofr teofr requested review from a team as code owners January 28, 2026 10:12
changeset-bot bot commented Jan 28, 2026

⚠️ No Changeset found

Latest commit: 516671f

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

@OmarTawfik OmarTawfik left a comment

Left a few questions/suggestions. Thanks!

- `HexLiteral`, `YulHexLiteral`, `DecimalLiteral`, and `YulDecimalLiteral`:
  - It was illegal for them to be followed by `IdentifierStart`. Now we will produce two separate tokens rather than rejecting it.
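
A minimal sketch of the behavior described above, assuming a plain longest-match lexer with no lookahead check after a literal. The `Token` type and the `lex` and `is_identifier_start` functions are hypothetical names for illustration, not Slang's actual implementation, and only decimal digits are modeled (the hex case would be analogous):

```rust
/// Token kinds for this sketch only; the names are hypothetical.
#[derive(Debug, PartialEq)]
enum Token {
    DecimalLiteral(String),
    Identifier(String),
}

/// Rough stand-in for the grammar's `IdentifierStart` (assumption).
fn is_identifier_start(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_' || c == '$'
}

/// Splits `input` into tokens using longest match. A literal is NOT
/// rejected when an identifier character follows it; the identifier
/// simply becomes the next token. Returns `None` on any character
/// this sketch does not handle.
fn lex(input: &str) -> Option<Vec<Token>> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let start = i;
        if chars[i].is_ascii_digit() {
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            tokens.push(Token::DecimalLiteral(chars[start..i].iter().collect()));
        } else if is_identifier_start(chars[i]) {
            while i < chars.len()
                && (is_identifier_start(chars[i]) || chars[i].is_ascii_digit())
            {
                i += 1;
            }
            tokens.push(Token::Identifier(chars[start..i].iter().collect()));
        } else if chars[i].is_whitespace() {
            i += 1;
        } else {
            return None;
        }
    }
    Some(tokens)
}
```

Under these assumptions, `lex("123abc")` yields a `DecimalLiteral` followed by an `Identifier`, so rejecting such input would be left to the parser or a later pass.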

## Language Definition Changes

I suggest renaming this section to Grammar, since the rest of the doc also lists language definition changes:

## Grammar


### IdentifierPath

Changed from a simple `Separated` list to a structured format to allow the reserved `address` keyword to appear in identifier paths (but not as the head):

I wonder if we could just use `Separated(MemberAccessIdentifier, Period)` for simplicity? Then we wouldn't need the extra type, given how commonly `IdentifierPath` is used.

Also, WDYT of `IdentifierPathElement` instead of `MemberAccessIdentifier`? The latter conflicts with the fact that one of its two variants is no longer an identifier.
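
A rough sketch of what the flat shape could look like, assuming Rust-style types. The variant names and the `parse_identifier_path` helper are hypothetical, identifier characters are not validated, and the "no `address` at the head" rule from the PR is modeled as a separate check because a flat separated list cannot express it by itself:

```rust
/// One segment of a path. Variant names are assumptions for this sketch.
#[derive(Debug, PartialEq)]
enum IdentifierPathElement {
    Identifier(String),
    AddressKeyword,
}

/// Flat shape: every segment has the same type, separated by periods.
#[derive(Debug, PartialEq)]
struct IdentifierPath {
    elements: Vec<IdentifierPathElement>,
}

/// Splits `source` on periods and classifies each segment.
/// Returns `None` for empty segments or when `address` is the head.
fn parse_identifier_path(source: &str) -> Option<IdentifierPath> {
    let mut elements = Vec::new();
    for segment in source.split('.') {
        if segment.is_empty() {
            return None;
        }
        elements.push(if segment == "address" {
            IdentifierPathElement::AddressKeyword
        } else {
            IdentifierPathElement::Identifier(segment.to_string())
        });
    }
    // The flat list allows `address` anywhere, so the head restriction
    // has to be enforced outside the type.
    if matches!(elements.first(), Some(IdentifierPathElement::AddressKeyword)) {
        return None;
    }
    Some(IdentifierPath { elements })
}
```

The trade-off in this sketch is that the simpler type pushes the head restriction into validation, whereas a structured head-plus-tail shape would encode it in the grammar itself.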

Comment on lines +68 to +69
Cases that use empty tuples are still ambiguous: `(,,,) = ...` can still be either a `TupleDeconstructionStatement` or
an `AssignmentExpression` with a `TupleExpression` on the LHS.

can still be a TupleDeconstructionStatement or an AssignmentExpression

Which one? I wonder if we have existing cst_output tests for this case?

Also, how about `var (,,,,)`? This is legal AFAICT.


This disallows in V2 certain cases that were previously allowed, in particular untyped declarations (like `(a, bool b) = ...`)
or typed elements combined with `var` (like `var (a, bool b) = ...`).
The cases where using empty tuples are still ambiguous, `(,,,) = ...` can still be a `TupleDeconstructionStatement` or a

For another PR/discussion, we considered merging TupleDeconstructionStatement and VariableDeclaration, to merge all variable declarations together, however I think this will look a bit artificial since their shape is quite different.

I think this distinction (declaring new vars vs. assigning to existing ones) is worth keeping, both syntactically and semantically. WDYT of the following structure, if it works with LALR?

  • use VariableDeclarationStatement for any syntax that declares a new name:
    • var x = ... already supported
    • int x = ... already supported
    • change VariableDeclarationStatement::name field to an enum with two variants:
      • name: Identifier -> existing
      • elements -> a struct holding LeftParen + Separated(elements) + RightParen
  • use AssignmentExpression for any syntax that just assigns values to the LHS:
    • x = ....
    • (x, y) = ....
    • (,,,) = ....
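
To make the proposed split concrete, here is a minimal sketch in Rust. All type, variant, and function names are hypothetical; types, initializers, and tokens such as the parentheses are omitted; and empty tuple slots are modeled as `None`:

```rust
/// What a declaration introduces: a single name or a tuple of elements.
/// Mirrors the proposed two-variant enum; names are assumptions.
#[derive(Debug, PartialEq)]
enum DeclarationTarget {
    Name(String),
    /// `None` marks an empty slot, as in `var (a, , b) = ...`.
    Elements(Vec<Option<String>>),
}

#[derive(Debug, PartialEq)]
enum Statement {
    /// Any syntax that declares new names.
    VariableDeclaration { target: DeclarationTarget },
    /// Any syntax that only assigns to existing names, e.g. `(x, y) = ...`.
    Assignment { lhs: Vec<Option<String>> },
}

/// Returns the names a statement introduces into scope.
/// Assignments never introduce names, which is the semantic
/// distinction the split is meant to capture.
fn declared_names(statement: &Statement) -> Vec<&str> {
    match statement {
        Statement::VariableDeclaration { target } => match target {
            DeclarationTarget::Name(name) => vec![name.as_str()],
            DeclarationTarget::Elements(elements) => {
                elements.iter().flatten().map(String::as_str).collect()
            }
        },
        Statement::Assignment { .. } => Vec::new(),
    }
}
```

With this shape, a later pass that needs to know which names are new only has to look at one statement kind, regardless of whether the declaration was a single name or a tuple.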
