Rationalize name-char #1008
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -786,11 +786,9 @@ has been applied to both. | |
| > implementations can often substitute checking for actually applying normalization | ||
| > to _name_ values. | ||
|
|
||
| Valid content for _names_ is based on <cite>Namespaces in XML 1.0</cite>'s | ||
| [NCName](https://www.w3.org/TR/xml-names/#NT-NCName). | ||
| This is different from XML's [Name](https://www.w3.org/TR/xml/#NT-Name) | ||
| in that it MUST NOT contain a U+003A COLON `:`. | ||
| Otherwise, the set of characters allowed in a _name_ is large. | ||
| The _names_ are [immutable identifiers](https://www.unicode.org/reports/tr31/#Immutable_Identifier_Syntax). | ||
| They are similar to <cite>Namespaces in XML 1.0</cite>'s [NCName](https://www.w3.org/TR/xml-names/#NT-NCName), | ||
| but have been updated to be more consistent. | ||
|
||
|
|
||
| > [!NOTE] | ||
| > _External variables_ can be passed in that are not valid _names_. | ||
|
|
@@ -843,15 +841,66 @@ option = identifier o "=" o (literal / variable) | |
| identifier = [namespace ":"] name | ||
| namespace = name | ||
| name = [bidi] name-start *name-char [bidi] | ||
| name-start = ALPHA / "_" | ||
| / %xC0-D6 / %xD8-F6 / %xF8-2FF | ||
| / %x370-37D / %x37F-61B / %x61D-1FFF / %x200C-200D | ||
| / %x2070-218F / %x2C00-2FEF / %x3001-D7FF | ||
| / %xF900-FDCF / %xFDF0-FFFC / %x10000-EFFFF | ||
| name-char = name-start / DIGIT / "-" / "." | ||
| / %xB7 / %x300-36F / %x203F-2040 | ||
| name-start = ALPHA | ||
| / %x2B ; «+» omit Cc: %x0-1F, Whitespace: « », Ascii: «!"#$%&'()*» | ||
| / %x5F ; «_» omit Ascii: «,-./0123456789:;<=>?@» «[\]^» | ||
| / %xA1-61B ; omit Cc: %x7F-9F, Whitespace: %xA0, Ascii: «`» «{|}~» | ||
| / %x61D-167F ; omit BidiControl: %x61C | ||
| / %x1681-1FFF ; omit Whitespace: %x1680 | ||
| / %x200B-200D ; omit Whitespace: %x2000-200A | ||
| / %x2010-2027 ; omit BidiControl: %x200E-200F | ||
| / %x2030-205E ; omit Whitespace: %x2028-2029 %x202F, BidiControl: %x202A-202E | ||
| / %x2060-2065 ; omit Whitespace: %x205F | ||
| / %x206A-2FFF ; omit BidiControl: %x2066-2069 | ||
| / %x3001-D7FF ; omit Whitespace: %x3000 | ||
| / %xE000-FDCF ; omit Cs: %xD800-DFFF | ||
| / %xFDF0-FFFD ; omit NChar: %xFDD0-FDEF | ||
| / %x10000-1FFFD ; omit NChar: %xFFFE-FFFF | ||
| / %x20000-2FFFD ; omit NChar: %x1FFFE-1FFFF | ||
| / %x30000-3FFFD ; omit NChar: %x2FFFE-2FFFF | ||
| / %x40000-4FFFD ; omit NChar: %x3FFFE-3FFFF | ||
| / %x50000-5FFFD ; omit NChar: %x4FFFE-4FFFF | ||
| / %x60000-6FFFD ; omit NChar: %x5FFFE-5FFFF | ||
| / %x70000-7FFFD ; omit NChar: %x6FFFE-6FFFF | ||
| / %x80000-8FFFD ; omit NChar: %x7FFFE-7FFFF | ||
| / %x90000-9FFFD ; omit NChar: %x8FFFE-8FFFF | ||
| / %xA0000-AFFFD ; omit NChar: %x9FFFE-9FFFF | ||
| / %xB0000-BFFFD ; omit NChar: %xAFFFE-AFFFF | ||
| / %xC0000-CFFFD ; omit NChar: %xBFFFE-BFFFF | ||
| / %xD0000-DFFFD ; omit NChar: %xCFFFE-CFFFF | ||
| / %xE0000-EFFFD ; omit NChar: %xDFFFE-DFFFF | ||
| / %xF0000-FFFFD ; omit NChar: %xEFFFE-EFFFF | ||
| / %x100000-10FFFD ; omit NChar: %xFFFFE-FFFFF | ||
| ; omit NChar: %x10FFFE-10FFFF | ||
|
|
||
| name-char = name-start / DIGIT | ||
| / %x2D-2E ; «-.» omit Cc: %x0-1F, Whitespace: « », Ascii: «!"#$%&'()*+,» | ||
| ``` | ||
|
|
||
| > [!NOTE] | ||
| > Syntactically, the definitions of `identifier` and `name-char` stay backwards compatible over time | ||
| > because they allow a stable, wide range of characters. | ||
| > As a result, a character added in a new version of Unicode can be used in a _name_ by any conformant implementation of MessageFormat. | ||
| > The definition currently excludes: | ||
| > * Most ASCII except for letters and characters used for numbers | ||
| > * This avoids conflicts with syntax characters, and reserves some characters for future syntax. | ||
| > * Bidirectional controls (`Bidi_C`) | ||
| > * Control characters (`GC=Cc`, but not Format characters: `GC=Cf`) | ||
| > * Whitespace characters (`WSpace`) | ||
| > * Surrogate code points (`GC=Cs`) | ||
| > * Non-Characters (`NChar`) | ||
|
|
||
| This syntax allows a wide range of characters in _names_ and _identifiers_. | ||
| Implementers and authors of _functions_ and _messages_ SHOULD avoid creating _names_, | ||
| including the names of _functions_, _options_, and _operands_ (variable names), | ||
| that could produce confusion or harm usability. | ||
| They can do so by choosing _names_ consistent with the following guidelines. | ||
| MessageFormat tools, such as linters, SHOULD warn when _names_ chosen by users | ||
| violate these guidelines. | ||
| > | ||
| > 1. [Unicode Default Identifier Syntax](https://www.unicode.org/reports/tr31/#Default_Identifier_Syntax) | ||
| > 2. [Unicode General Security Profile for Identifiers](https://www.unicode.org/reports/tr39/#General_Security_Profile) | ||
|
|
||
| ### Escape Sequences | ||
|
|
||
| An **_<dfn>escape sequence</dfn>_** is a two-character sequence starting with | ||
|
|
||
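
The added `name-start` and `name-char` rules in this diff can be read as a plain code point range check. The following is a minimal Python sketch of that reading, not part of the PR. The helper names (`is_name_start`, `is_name_char`, `is_name`) are hypothetical, the ranges are transcribed by hand from the added lines above, and the optional `[bidi]` wrappers in `name` are deliberately not handled because `bidi` is defined outside this hunk.

```python
# Sketch only: ranges transcribed by hand from the added ABNF in this diff.
# All function names here are hypothetical, not taken from any implementation.

NAME_START_RANGES = [
    (0x41, 0x5A), (0x61, 0x7A),  # ALPHA
    (0x2B, 0x2B),                # "+"
    (0x5F, 0x5F),                # "_"
    (0xA1, 0x61B),
    (0x61D, 0x167F),
    (0x1681, 0x1FFF),
    (0x200B, 0x200D),
    (0x2010, 0x2027),
    (0x2030, 0x205E),
    (0x2060, 0x2065),
    (0x206A, 0x2FFF),
    (0x3001, 0xD7FF),
    (0xE000, 0xFDCF),
    (0xFDF0, 0xFFFD),
]
# Planes 1 through 16: %xN0000-NFFFD, skipping the two noncharacters
# at the end of each plane.
NAME_START_RANGES += [(p << 16, (p << 16) | 0xFFFD) for p in range(1, 0x11)]

# name-char adds DIGIT and %x2D-2E ("-" and ".") on top of name-start.
NAME_CHAR_EXTRA_RANGES = [(0x30, 0x39), (0x2D, 0x2E)]


def _in_ranges(cp, ranges):
    """Return True if code point cp falls inside any inclusive range."""
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start(ch):
    return _in_ranges(ord(ch), NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start(ch) or _in_ranges(ord(ch), NAME_CHAR_EXTRA_RANGES)


def is_name(s):
    """Check the `name-start *name-char` core of `name` (no bidi handling)."""
    return bool(s) and is_name_start(s[0]) and all(is_name_char(c) for c in s[1:])
```

Under these ranges, `is_name("a-b.c")` is accepted, while `is_name("1abc")`, `is_name("a:b")`, and a string containing U+200E are rejected. A linear scan over the range list keeps the sketch close to the grammar; a real parser would more likely use a precomputed lookup.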