Rationalize name-char #1008
@@ -54,13 +54,39 @@ match = %s".match"
 identifier = [namespace ":"] name
 namespace = name
 name = [bidi] name-start *name-char [bidi]
-name-start = ALPHA / "_"
-           / %xC0-D6 / %xD8-F6 / %xF8-2FF
-           / %x370-37D / %x37F-61B / %x61D-1FFF / %x200C-200D
-           / %x2070-218F / %x2C00-2FEF / %x3001-D7FF
-           / %xF900-FDCF / %xFDF0-FFFC / %x10000-EFFFF
-name-char = name-start / DIGIT / "-" / "."
-          / %xB7 / %x300-36F / %x203F-2040
+name-start = ALPHA
+           / %x2B ; 【+】 omit Cc %x0-1F, Whitespace %20, Ascii 【!"#$%&'()*】
+           / %x5F ; 【_】 omit Ascii 【,-./0123456789:;<=>?@】 【[\]^】
+           / %xA1-61B ; omit Cc %x7F-9F, Whitespace %xA0, Ascii 【`】 【{|}~】
+           / %x61D-167F ; omit BidiControl %x61C
+           / %x1681-1FFF ; omit Whitespace %x1680
Suggested change:
-           / %xA1-61B ; omit Cc %x7F-9F, Whitespace %xA0, Ascii 【`】 【{|}~】
-           / %x61D-167F ; omit BidiControl %x61C
-           / %x1681-1FFF ; omit Whitespace %x1680
+           / %xA1-61B ; omit BidiControl %x61C
+           / %x61D-167F ; omit Whitespace %x1680
+           / %x1681-1FFF ; omit Whitespace %x2000-200A
The same style should be used in all these comments.
I don't see a difference on my screen. Do you want an additional space before or after 'omit', or a space deleted before or after 'omit'?
I changed the EA brackets to guillemets, since they line up better for monospace.
I meant offset in a vertical direction, so a comment like "omit BidiControl %x61C" should follow the range %xA1-61B, rather than the range %x61D-167F.
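The convention requested here, where each "omit" comment describes the gap that follows its own range, can be checked mechanically. The following is a minimal sketch, not part of the PR: it uses only the three ranges visible in the truncated hunk, and the helper name `gaps_after` is hypothetical.

```python
# Ranges copied from the new name-start lines visible in the diff.
# The hunk is truncated, so this list is deliberately incomplete.
RANGES = [(0xA1, 0x61B), (0x61D, 0x167F), (0x1681, 0x1FFF)]


def gaps_after(ranges):
    """Map each range to the code points skipped before the next range starts."""
    result = {}
    for (lo, hi), (next_lo, _next_hi) in zip(ranges, ranges[1:]):
        if next_lo > hi + 1:
            result[(lo, hi)] = (hi + 1, next_lo - 1)
    return result


for (lo, hi), (gap_lo, gap_hi) in gaps_after(RANGES).items():
    gap = f"%x{gap_lo:X}" if gap_lo == gap_hi else f"%x{gap_lo:X}-{gap_hi:X}"
    print(f"/ %x{lo:X}-{hi:X} ; omit {gap}")
```

Under this reading, %x61C is attached to the range %xA1-61B and %x1680 to the range %x61D-167F, which matches the suggested change. The gap after %x1681-1FFF cannot be derived here because the next range is not shown.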
Ok, will work on that.
This set is ZWSP, ZWNJ, and ZWJ. Should they really be included in name-start? That seems surprising to me, and with no positive utility.
We will need ZWNJ and ZWJ within names, though, so maybe it's fine for them to be here. But why ZWSP?
We still have name-char. Why not put the joiners in there?
I kind of also question ZWSP
There is no particular utility to having ZWSP start a name-char, nor any utility to having it end a name-char.
It doesn't hurt to move that one (ZWSP) to name-char, but it doesn't really make a dent either — and we really wouldn't want to go too far down the very long and slippery slope. That's for linters and guidance.
That being said, if people want it out I can remove it.
The question here is why put these characters in name-start, where they have no utility? At least in name-char they would be enclosed or at the end?
What I'm afraid of is that if we move it to name-char, it will just open us up to people endlessly complaining:
"ZWSP is in name-char instead of name-start: why is XXX in name-start when it should also just be in name-char???"
That is, the basic difference between name-char and name-start is that
- name is used in identifiers and variables, and can't start with a digit, -, or .
- name-char is used in literals, and can start with a digit, -, or .
The syntactic motivation is clear: to make sure that identifiers and variables are distinguishable from numbers. That is a clear syntactic need.
ZWSP certainly isn't needed at the start of an identifier or variable, but there is a large and complicated list of characters that are also not needed at the start of identifiers and variables, and plucking just one of those characters out, without any syntactic need, doesn't actually provide much value.
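To make the name-start versus name-char distinction concrete, here is a minimal sketch that encodes the pre-change ranges quoted in the diff. It is an illustration only: the optional [bidi] prefix and suffix are ignored, and the function names are hypothetical rather than anything defined by the spec or the PR.

```python
# Pre-change name-start ranges, copied from the removed lines of the diff.
NAME_START_RANGES = [
    (0x41, 0x5A), (0x61, 0x7A),            # ALPHA
    (0x5F, 0x5F),                          # "_"
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x61B), (0x61D, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFC), (0x10000, 0xEFFFF),
]

# Characters that name-char allows in addition to name-start.
NAME_CHAR_EXTRA_RANGES = [
    (0x30, 0x39),                          # DIGIT
    (0x2D, 0x2D), (0x2E, 0x2E),            # "-" and "."
    (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040),
]


def in_ranges(ch, ranges):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start(ch):
    return in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start(ch) or in_ranges(ch, NAME_CHAR_EXTRA_RANGES)


def is_name(s):
    """name = name-start *name-char (the optional [bidi] marks are ignored)."""
    return bool(s) and is_name_start(s[0]) and all(is_name_char(c) for c in s[1:])


def all_name_chars(s):
    """True if s is a non-empty run of name-char, as the comment describes for literals."""
    return bool(s) and all(is_name_char(c) for c in s)
```

With these definitions, `is_name("foo-1")` holds while `is_name("1foo")`, `is_name("-x")`, and `is_name(".x")` do not, even though all four strings consist solely of name-char characters. That is the syntactic need described above: a name can never be mistaken for a number.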
Given that we do not allow space characters or control characters, I'd prefer not allowing zero-width spaces in names or unquoted literals.
A zero-width space is not a space; that is just a name used for familiarity. It is a Format character, like many others.
https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7Bgc%3Dformat%7D&g=&i=
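The General_Category claim can be checked locally with Python's standard `unicodedata` module. This is a small sketch; the selection of code points is my own, taken from characters mentioned in this thread and in the diff comments, and the results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def general_category(cp: int) -> str:
    """Return the two-letter Unicode General_Category of a code point."""
    return unicodedata.category(chr(cp))


# Code points discussed in the thread or listed in the "omit" comments.
for cp in (0x0020, 0x00A0, 0x1680, 0x061C, 0x200B, 0x200C, 0x200D):
    name = unicodedata.name(chr(cp), "<unnamed>")
    print(f"U+{cp:04X} {name}: {general_category(cp)}")
```

On a recent Python, ZWSP (U+200B), ZWNJ (U+200C), and ZWJ (U+200D) report `Cf` (Format), while U+0020, U+00A0, and U+1680 report `Zs` (Space_Separator).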