-
Notifications
You must be signed in to change notification settings - Fork 22
Scopes: Spec text for the "scopes" proposal #196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
26d4baf
to
115f0b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I only went through the data structures so far)
spec.emu
Outdated
</tr> | ||
</table> | ||
</emu-table> | ||
<p>A root Decoded Generated Record describes a distinct part of the generated code. Each Generated Range Record may be associated with a Original Scope Record as indicated by the presence of the [[Definition]] field.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Decoded Generated Record" is not auto-linked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the ecmarkup gets confused as there is currently a definition for both the terms "generated range" and "generated range record". Maybe we have to rename it to "range record", or remove the definition for "generated range". Wdyt?
* Combine 'IsStackFrame' and 'IsHidden' of generated range into a single enum field. * Get rid of Sub-Range Binding Record in favor of a list of list of binding records.
<emu-alg> | ||
1. Let _relativeName_ be the VLQSignedValue of |Vlq|. | ||
1. Set _state_.[[ScopeNameIndex]] to _state_.[[ScopeNameIndex]] + _relativeName_. | ||
1. Set _scope_.[[Name]] to _names_[_state_.[[ScopeNameIndex]]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check names
length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on how lists are defined. If out-of-bound reads results in *null*
, this would be fine. But we probably want to "optionally report an error" in such cases anyway.
1. If |RangeLine| is present, perform DecodeGeneratedRangeItem of |RangeLine| with arguments _range_, _state_ and _names_. | ||
1. Set _range_.[[Start]].[[Line]] to _state_.[[RangeLine]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto this creates ambiguity in the parser.
spec.emu
Outdated
1. Assert: |ScopeName| is present if and only if, _flags_ & 0x1 = 0x1. | ||
1. If |ScopeName| is present, perform DecodeOriginalScopeItem of |ScopeName| with arguments _scope_, _state_ and _names_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we can assert this because ScopeName
and ScopeKind
are ambiguous. We could pass different args based on the value, though, if both were defined as Vlq
instead of special named productions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the grammar currently is not context-free. If we define both as Vlq
, how do I reference them from the SDO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can reference them as the first Vlq
or the second Vlq
(eg https://tc39.es/ecma262/multipage/abstract-operations.html#sec-runtime-semantics-stringnumericvalue:~:text=Let%20a%20be%20the%20MV%20of%20the%20first). We'd need to call a differently named SDO instead of reusing DecodeOriginalScopeItem
. Or we could define an AO that operates on Parse Nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave it a shot. See commit f15b006 for the diff. I'm not a huge fan as it makes the production/decoding more complicated. Does it really need to be unambiguous?
Note that we should probably turn these assertions into "optionally report an error".
Co-authored-by: Justin Ridgewell <[email protected]>
* Include 'scopes' in the overall source map example * Clarify pre-order for scope/range start/end items * Clarify ~none~ for StackFrametype * Introduce 'BindingExpression' as a dedicated name for sub-range binding VLQs * Don't reserve 'A' for as an item tag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grammar review
Tag | ||
Tag UnknownVlqList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tag | |
Tag UnknownVlqList | |
Tag UnknownVlqList? |
Scopes : | ||
OriginalScopeTreeList | ||
TopLevelItemList | ||
OriginalScopeTreeList `,` TopLevelItemList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to have either a comment saying that OriginalScopeTreeList
is guaranteed to only start with ,
/B
or to be empty, and that TopLevelItemList
is guaranteed to not be empty and not start with ,
/B
.
It's not obvious that there is no ambiguity here otherwise, since it requires going down into the grammar multiple levels across different sections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to that, I don't really like that an eager parser cannot parse this Scopes
production. When parsing the string ,,X
, a parser would:
- parse
,,
as aOriginalScopeTreeList
- see that
,,
ends with a,
is followed by something that is a validTopLevelItemList
, and backtrack by 1 so that now theOriginalScopeTreeList
is only,
- Consume the now available
,
- Parse
TopLevelItemList
I also slightly dislike that in ,,X
we have two original scope trees, while in ,,
we have three.
What if the last trailing comma for OriginalScopeTreeList
was always ignored, like it is inside array literals in JS, so that the grammar for Scope
would become
Scope :
OriginalScopeTreeList ScopeCommas?
OriginalScopeTreeList ScopeCommas TopLevelItemList
OriginalScopeTreeList :
[empty]
OriginalScopeTree
OriginalScopeTreeList ScopeCommas OriginalScopeTree
ScopeCommas :
`,`
ScopeCommas `,`
and:
OriginalScopeTreeList
and noScopeCommas
,
would have an emptyOriginalScopeTreeList
followed by aScopeCommas
with one comma,,
would have an emptyOriginalScopeTreeList
followed by aScopeCommas
with two commas,,X
would have an emptyOriginalScopeTreeList
followed by aScopeCommas
with two commasB...,,B...,,X
would have aB...,,B...
OriginalScopeTreeList
followed by aScopeCommas
with two commas and thenTopLevelItemList
When decoding, for we would first decode OriginalScopeTreeList
and then add as many nulls as the number of commas in Scope
's ScopeCommas
minus 1.
<p>In grammar form:</p> | ||
<emu-grammar type="definition"> | ||
OriginalScopeTree : | ||
OriginalScopeStart OriginalScopeVariablesItem? OriginalScopeItemList? `,` OriginalScopeEnd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very slight preference here for doing OriginalScopeStart `,` OriginalScopeVariablesItem? OriginalScopeItemList? OriginalScopeEnd
and giving OriginalScopeVariablesItem
and OriginalScopeItemList
a trailing rather than leading comma, so that the disambiguation for whether they are present or not is on the first character and not on the second.
ScopeVariableList : | ||
ScopeVariable | ||
ScopeVariableList ScopeVariable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't checked yet the SDOs, but it might make sense to have a generic VlqList
production that we can reuse in multiple places.
<p>In grammar form:</p> | ||
<emu-grammar type="definition"> | ||
GeneratedRangeTree : | ||
GeneratedRangeStart GeneratedRangeBindingsItem? GeneratedRangeCallSiteItem? GeneratedRangeItemList? `,` GeneratedRangeEnd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here about using trailing rather than leading commas
</h1> | ||
<dl class="header"></dl> | ||
<emu-alg> | ||
1. Let _scope_ be a new Original Scope Record with all fields default initialized. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things like [[Start]] do not have a clear default, it'd be better to explicitly initialize the record.
I find the decoding algorithms very hard to follow, with multiple mutable data structures used at the same time. I'm gong to submit a PR to your branch with a potential alternative, if I can find a way to make it clearer. |
For ScopeFlags and ScopeNameOrKind/ScopeKind, what if instead of having a bit that says "is the first one present" and one that says "is the second one present" (leading to potentially invalid states, such as saying that both are present but then they are not), we had a single bit that says "if both are present, then it's the name/kind". By doing so if neither or both are present we don't even need to check the bit. |
1. Set _scope_.[[Start]].[[Column]] to _state_.[[ScopeColumn]]. | ||
1. Let _flags_ be the VLQUnsignedValue of |ScopeFlags|. | ||
1. If _flags_ & 0x3 = 0x1, then | ||
1. Assert: |ScopeNameOrKind| is present and |ScopeKind| is not present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually guaranteed? What if I set 0x1
in the flags, and then put both |ScopeNameOrKind| and |ScopeKind| in it? Should we optionally report an error and return instead? Or maybe have an "early error" in the grammar that ensures it?
1. Let _flags_ be the VLQUnsignedValue of |ScopeFlags|. | ||
1. If _flags_ & 0x3 = 0x1, then | ||
1. Assert: |ScopeNameOrKind| is present and |ScopeKind| is not present. | ||
1. DecodeOriginalScopeName(|ScopeNameOrKind|, _scope_, _state_, _names_). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1. DecodeOriginalScopeName(|ScopeNameOrKind|, _scope_, _state_, _names_). | |
1. Perform DecodeOriginalScopeName(|ScopeNameOrKind|, _scope_, _state_, _names_). |
(and the others)
Preview
Rebased on top of #193 and #194.
This is an early draft of the "scopes" proposal spec text. Sharing it already to get some early feedback and to write it out in the open :)
Done
ToDo