From 4c3cf9b0d77ac225d443eb2a9bb32199a63201ab Mon Sep 17 00:00:00 2001 From: Nigel-Ecma Date: Mon, 10 Mar 2025 14:24:13 +1300 Subject: [PATCH 1/3] =?UTF-8?q?Addresses=20issue=20#1283.=20Updates=20?= =?UTF-8?q?=C2=A76.2.5=20as=20per=20issue=20and=20also=20changes=20the=20d?= =?UTF-8?q?escriptive=20style=20to=20the=20prescriptive=20requirement=20of?= =?UTF-8?q?=20the=20Standard?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- standard/expressions.md | 2 +- standard/lexical-structure.md | 96 ++++++++++++++++++++--------------- 2 files changed, 55 insertions(+), 43 deletions(-) diff --git a/standard/expressions.md b/standard/expressions.md index d87c5338b..2c7bc5a03 100644 --- a/standard/expressions.md +++ b/standard/expressions.md @@ -4161,7 +4161,7 @@ equality_expression ; ``` -> *Note*: Lookup for the right operand of the `is` operator must first test as a *type*, then as an *expression* which may span multiple tokens. In the case where the operand is an *expreesion*, the pattern expression must have precedence at least as high as *shift_expression*. *end note* +> *Note*: Lookup for the right operand of the `is` operator must first test as a *type*, then as an *expression* which may span multiple tokens. In the case where the operand is an *expression*, the pattern expression must have precedence at least as high as *shift_expression*. *end note* The `is` operator is described in [§12.12.12](expressions.md#121212-the-is-operator) and the `as` operator is described in [§12.12.13](expressions.md#121213-the-as-operator). 
diff --git a/standard/lexical-structure.md b/standard/lexical-structure.md index 49c0a1008..3c2bab45a 100644 --- a/standard/lexical-structure.md +++ b/standard/lexical-structure.md @@ -52,7 +52,18 @@ Every compilation unit in a C# program shall conform to the *compilation_unit* ### 6.2.5 Grammar ambiguities -The productions for *simple_name* ([§12.8.4](expressions.md#1284-simple-names)) and *member_access* ([§12.8.7](expressions.md#1287-member-access)) can give rise to ambiguities in the grammar for expressions. +The productions for: + +- *simple_name* ([§12.8.4](expressions.md#1284-simple-names)), +- *member_access* ([§12.8.7](expressions.md#1287-member-access)), +- *null_conditional_member_access* ([§12.8.8](expressions.md#1288-null-conditional-member-access)), +- *dependent_access* ([§12.8.8](expressions.md#1288-dependent-access)), +- *base_access* ([§12.8.15](expressions.md#12815-base-access)) and +- *pointer_member_access* ([§23.6.3](unsafe-code.md#2363-pointer-member-access)); + +(the “disambiguated productions”) can give rise to ambiguities in the grammar for expressions. + +These productions occur in contexts where a value can occur in an expression, and have one or more alternatives that end with the grammar “`identifier type_argument_list?`”. It is the optional *type_argument_list* which results in the possible ambiguity. 
> *Example*: The statement: > @@ -65,16 +76,17 @@ The productions for *simple_name* ([§12.8.4](expressions.md#1284-simple-names)) > > *end example* -If a sequence of tokens can be parsed (in context) as a *simple_name* ([§12.8.4](expressions.md#1284-simple-names)), *member_access* ([§12.8.7](expressions.md#1287-member-access)), or *pointer_member_access* ([§23.6.3](unsafe-code.md#2363-pointer-member-access)) ending with a *type_argument_list* ([§8.4.2](types.md#842-type-arguments)), the token immediately following the closing `>` token is examined, to see if it is +If a sequence of tokens can be parsed, in context, as one of the disambiguated productions +including an optional *type_argument_list* ([§8.4.2](types.md#842-type-arguments)), then +the token immediately following the closing `>` token shall be examined and if it is: -- One of `( ) ] } : ; , . ? == != | ^ && || & [`; or -- One of the relational operators `< <= >= is as`; or -- A contextual query keyword appearing inside a query expression; or -- In certain contexts, *identifier* is treated as a disambiguating token. Those contexts are where the sequence of tokens being disambiguated is immediately preceded by one of the keywords `is`, `case` or `out`, or arises while parsing the first element of a tuple literal (in which case the tokens are preceded by `(` or `:` and the identifier is followed by a `,`) or a subsequent element of a tuple literal. +- one of `( ) ] } : ; , . ? == != | ^ && || & [`; or +- one of the relational operators `< <= >= is as`; or +- a contextual query keyword appearing inside a query expression. -If the following token is among this list, or an identifier in such a context, then the *type_argument_list* is retained as part of the *simple_name*, *member_access* or *pointer_member-access* and any other possible parse of the sequence of tokens is discarded. 
Otherwise, the *type_argument_list* is not considered to be part of the *simple_name*, *member_access* or *pointer_member_access*, even if there is no other possible parse of the sequence of tokens. +then the *type_argument_list* shall be retained as part of the disambiguated production and any other possible parse of the sequence of tokens discarded. Otherwise, the tokens parsed as a *type_argument_list* shall not be considered to be part of the disambiguated production, even if there is no other possible parse of those tokens. -> *Note*: These rules are not applied when parsing a *type_argument_list* in a *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)). *end note* +> *Note*: These disambiguation rules shall not be applied when parsing other productions even if they similarly end in “`identifier type_argument_list?`”; such productions shall be parsed as normal. Examples include: *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)); *named_entity* ([§12.8.23](expressions.md#12823-named-entity)); *null_conditional_projection_initializer* ([§12.8.8](expressions.md#§1288-null_conditional-projection-initializer)); and *qualified_alias_member* ([§14.8.1](namespaces.md#1481-qualified-alias-member)). *end note* @@ -124,7 +136,7 @@ If the following token is among this list, or an identifier in such a context, t > > *end example* -When recognising a *relational_expression* ([§12.12.1](expressions.md#12121-general)) if both the “*relational_expression* `is` *type*” and “*relational_expression* `is` *constant_pattern*” alternatives are applicable, and *type* resolves to an accessible type, then the “*relational_expression* `is` *type*” alternative shall be chosen. 
+When recognising a *relational_expression* ([§12.12.1](expressions.md#12121-general)) if both the “*relational_expression* `is` *type*” and “*relational_expression* `is` *pattern*” alternatives are applicable, and *type* resolves to an accessible type, then the “*relational_expression* `is` *type*” alternative shall be chosen. ## 6.3 Lexical analysis @@ -189,7 +201,7 @@ Line terminators divide the characters of a C# compilation unit into lines. ```ANTLR New_Line : New_Line_Character - | '\u000D\u000A' // carriage return, line feed + | '\u000D\u000A' // carriage return, line feed ; ``` @@ -262,7 +274,7 @@ fragment Input_Character // anything but New_Line_Character : ~('\u000D' | '\u000A' | '\u0085' | '\u2028' | '\u2029') ; - + fragment New_Line_Character : '\u000D' // carriage return | '\u000A' // line feed @@ -270,11 +282,11 @@ fragment New_Line_Character | '\u2028' // line separator | '\u2029' // paragraph separator ; - + fragment Delimited_Comment : '/*' Delimited_Comment_Section* ASTERISK+ '/' ; - + fragment Delimited_Comment_Section : SLASH | ASTERISK* Not_Slash_Or_Asterisk @@ -428,7 +440,7 @@ fragment Available_Identifier fragment Escaped_Identifier // Includes keywords and contextual keywords prefixed by '@'. // See note below. - : '@' Basic_Identifier + : '@' Basic_Identifier ; fragment Basic_Identifier @@ -664,16 +676,16 @@ fragment Decimal_Integer_Literal fragment Decorated_Decimal_Digit : '_'* Decimal_Digit ; - + fragment Decimal_Digit : '0'..'9' ; - + fragment Integer_Type_Suffix : 'U' | 'u' | 'L' | 'l' | 'UL' | 'Ul' | 'uL' | 'ul' | 'LU' | 'Lu' | 'lU' | 'lu' ; - + fragment Hexadecimal_Integer_Literal : ('0x' | '0X') Decorated_Hex_Digit+ Integer_Type_Suffix? ; @@ -681,11 +693,11 @@ fragment Hexadecimal_Integer_Literal fragment Decorated_Hex_Digit : '_'* Hex_Digit ; - + fragment Hex_Digit : '0'..'9' | 'A'..'F' | 'a'..'f' ; - + fragment Binary_Integer_Literal : ('0b' | '0B') Decorated_Binary_Digit+ Integer_Type_Suffix? 
; @@ -693,7 +705,7 @@ fragment Binary_Integer_Literal fragment Decorated_Binary_Digit : '_'* Binary_Digit ; - + fragment Binary_Digit : '0' | '1' ; @@ -723,14 +735,14 @@ To permit the smallest possible `int` and `long` values to be written as integer > 1_2__3___4____5 // decimal, int > _123 // not a numeric literal; identifier due to leading _ > 123_ // invalid; no trailing _allowed -> +> > 0xFf // hex, int > 0X1b_a0_44_fEL // hex, long > 0x1ade_3FE1_29AaUL // hex, ulong > 0x_abc // hex, int > _0x123 // not a numeric literal; identifier due to leading _ > 0xabc_ // invalid; no trailing _ allowed -> +> > 0b101 // binary, int > 0B1001_1010u // binary, uint > 0b1111_1111_0000UL // binary, ulong @@ -774,7 +786,7 @@ If no *Real_Type_Suffix* is specified, the type of the *Real_Literal* is `double - A real literal suffixed by `D` or `d` is of type `double`. > *Example*: The literals `1d`, `1.5d`, `1e10d`, and `123.456D` are all of type `double`. *end example* - A real literal suffixed by `M` or `m` is of type `decimal`. - > *Example*: The literals `1m`, `1.5m`, `1e10m`, and `123.456M` are all of type `decimal`. *end example* + > *Example*: The literals `1m`, `1.5m`, `1e10m`, and `123.456M` are all of type `decimal`. *end example* This literal is converted to a `decimal` value by taking the exact value, and, if necessary, rounding to the nearest representable value using banker’s rounding ([§8.3.8](types.md#838-the-decimal-type)). Any scale apparent in the literal is preserved unless the value is rounded. > *Note*: Hence, the literal `2.900m` will be parsed to form the `decimal` with sign `0`, coefficient `2900`, and scale `3`. 
*end note* @@ -812,24 +824,24 @@ A character literal represents a single character, and consists of a character i Character_Literal : '\'' Character '\'' ; - + fragment Character : Single_Character | Simple_Escape_Sequence | Hexadecimal_Escape_Sequence | Unicode_Escape_Sequence ; - + fragment Single_Character // anything but ', \, and New_Line_Character : ~['\\\u000D\u000A\u0085\u2028\u2029] ; - + fragment Simple_Escape_Sequence : '\\\'' | '\\"' | '\\\\' | '\\0' | '\\a' | '\\b' | '\\f' | '\\n' | '\\r' | '\\t' | '\\v' ; - + fragment Hexadecimal_Escape_Sequence : '\\x' Hex_Digit Hex_Digit? Hex_Digit? Hex_Digit? ; @@ -890,11 +902,11 @@ String_Literal : Regular_String_Literal | Verbatim_String_Literal ; - + fragment Regular_String_Literal : '"' Regular_String_Literal_Character* '"' ; - + fragment Regular_String_Literal_Character : Single_Regular_String_Literal_Character | Simple_Escape_Sequence @@ -910,16 +922,16 @@ fragment Single_Regular_String_Literal_Character fragment Verbatim_String_Literal : '@"' Verbatim_String_Literal_Character* '"' ; - + fragment Verbatim_String_Literal_Character : Single_Verbatim_String_Literal_Character | Quote_Escape_Sequence ; - + fragment Single_Verbatim_String_Literal_Character : ~["] // anything but quotation mark (U+0022) ; - + fragment Quote_Escape_Sequence : '""' ; @@ -1102,7 +1114,7 @@ Pre-processing directives are not part of the syntactic grammar of C#. However, > #endif > #if B > void H() {} -> #else +> #else > void I() {} > #endif > } @@ -1155,11 +1167,11 @@ Pre-processing expressions can occur in `#if` and `#elif` directives. The operat fragment PP_Expression : PP_Whitespace? PP_Or_Expression PP_Whitespace? ; - + fragment PP_Or_Expression : PP_And_Expression (PP_Whitespace? '||' PP_Whitespace? PP_And_Expression)* ; - + fragment PP_And_Expression : PP_Equality_Expression (PP_Whitespace? '&&' PP_Whitespace? PP_Equality_Expression)* @@ -1169,12 +1181,12 @@ fragment PP_Equality_Expression : PP_Unary_Expression (PP_Whitespace? 
('==' | '!=') PP_Whitespace? PP_Unary_Expression)* ; - + fragment PP_Unary_Expression : PP_Primary_Expression | '!' PP_Whitespace? PP_Unary_Expression ; - + fragment PP_Primary_Expression : TRUE | FALSE @@ -1282,15 +1294,15 @@ fragment PP_Conditional fragment PP_If_Section : 'if' PP_Whitespace PP_Expression ; - + fragment PP_Elif_Section : 'elif' PP_Whitespace PP_Expression ; - + fragment PP_Else_Section : 'else' ; - + fragment PP_Endif : 'endif' ; @@ -1488,11 +1500,11 @@ fragment PP_Line_Indicator | DEFAULT | 'hidden' ; - + fragment PP_Compilation_Unit_Name : '"' PP_Compilation_Unit_Name_Character* '"' ; - + fragment PP_Compilation_Unit_Name_Character // Any Input_Character except " : ~('\u000D' | '\u000A' | '\u0085' | '\u2028' | '\u2029' | '"') From 4ed61a83cc8ab410c908addcf9044ef4f032ad99 Mon Sep 17 00:00:00 2001 From: Nigel-Ecma Date: Mon, 10 Mar 2025 14:50:55 +1300 Subject: [PATCH 2/3] Fix x-refs. --- standard/lexical-structure.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/standard/lexical-structure.md b/standard/lexical-structure.md index 3c2bab45a..da93a845e 100644 --- a/standard/lexical-structure.md +++ b/standard/lexical-structure.md @@ -57,7 +57,7 @@ The productions for: - *simple_name* ([§12.8.4](expressions.md#1284-simple-names)), - *member_access* ([§12.8.7](expressions.md#1287-member-access)), - *null_conditional_member_access* ([§12.8.8](expressions.md#1288-null-conditional-member-access)), -- *dependent_access* ([§12.8.8](expressions.md#1288-dependent-access)), +- *dependent_access* ([§12.8.8](expressions.md#1288-null-conditional-member-access)), - *base_access* ([§12.8.15](expressions.md#12815-base-access)) and - *pointer_member_access* ([§23.6.3](unsafe-code.md#2363-pointer-member-access)); @@ -86,7 +86,7 @@ the token immediately following the closing `>` token shall be examined and if then the *type_argument_list* shall be retained as part of the disambiguated production and any other possible parse of the sequence of 
tokens discarded. Otherwise, the tokens parsed as a *type_argument_list* shall not be considered to be part of the disambiguated production, even if there is no other possible parse of those tokens. -> *Note*: These disambiguation rules shall not be applied when parsing other productions even if they similarly end in “`identifier type_argument_list?`”; such productions shall be parsed as normal. Examples include: *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)); *named_entity* ([§12.8.23](expressions.md#12823-named-entity)); *null_conditional_projection_initializer* ([§12.8.8](expressions.md#§1288-null_conditional-projection-initializer)); and *qualified_alias_member* ([§14.8.1](namespaces.md#1481-qualified-alias-member)). *end note* +> *Note*: These disambiguation rules shall not be applied when parsing other productions even if they similarly end in “`identifier type_argument_list?`”; such productions shall be parsed as normal. Examples include: *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)); *named_entity* ([§12.8.23](expressions.md#12823-the-nameof-operator)); *null_conditional_projection_initializer* ([§12.8.8](expressions.md#§1288-null-conditional-member-access)); and *qualified_alias_member* ([§14.8.1](namespaces.md#1481-general)). *end note* From e55a9b5196fd88800f120d6617f0f3f342c33dd3 Mon Sep 17 00:00:00 2001 From: Nigel-Ecma Date: Mon, 10 Mar 2025 19:24:05 +1300 Subject: [PATCH 3/3] Fix x-refs. 
--- standard/lexical-structure.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/standard/lexical-structure.md b/standard/lexical-structure.md index da93a845e..cdc01ce91 100644 --- a/standard/lexical-structure.md +++ b/standard/lexical-structure.md @@ -76,9 +76,7 @@ These productions occur in contexts where a value can occur in an expression, an > > *end example* -If a sequence of tokens can be parsed, in context, as one of the disambiguated productions -including an optional *type_argument_list* ([§8.4.2](types.md#842-type-arguments)), then -the token immediately following the closing `>` token shall be examined and if it is: +If a sequence of tokens can be parsed, in context, as one of the disambiguated productions including an optional *type_argument_list* ([§8.4.2](types.md#842-type-arguments)), then the token immediately following the closing `>` token shall be examined and if it is: - one of `( ) ] } : ; , . ? == != | ^ && || & [`; or - one of the relational operators `< <= >= is as`; or @@ -86,7 +84,7 @@ the token immediately following the closing `>` token shall be examined and if then the *type_argument_list* shall be retained as part of the disambiguated production and any other possible parse of the sequence of tokens discarded. Otherwise, the tokens parsed as a *type_argument_list* shall not be considered to be part of the disambiguated production, even if there is no other possible parse of those tokens. -> *Note*: These disambiguation rules shall not be applied when parsing other productions even if they similarly end in “`identifier type_argument_list?`”; such productions shall be parsed as normal. 
Examples include: *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)); *named_entity* ([§12.8.23](expressions.md#12823-the-nameof-operator)); *null_conditional_projection_initializer* ([§12.8.8](expressions.md#§1288-null-conditional-member-access)); and *qualified_alias_member* ([§14.8.1](namespaces.md#1481-general)). *end note* +> *Note*: These disambiguation rules shall not be applied when parsing other productions even if they similarly end in “`identifier type_argument_list?`”; such productions shall be parsed as normal. Examples include: *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)); *named_entity* ([§12.8.23](expressions.md#12823-the-nameof-operator)); *null_conditional_projection_initializer* ([§12.8.8](expressions.md#1288-null-conditional-member-access)); and *qualified_alias_member* ([§14.8.1](namespaces.md#1481-general)). *end note*
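
The revised §6.2.5 rule reduces to a one-token lookahead on whatever follows the closing `>`. The following is a minimal sketch of that check, assuming the three bullets as worded after this patch series. It is not taken from any compiler; the function name, the set names and the list of contextual query keywords are illustrative assumptions.

```python
# Illustrative sketch only: every name here is hypothetical, not part of the Standard.

# Tokens from the first bullet of the rule.
PUNCTUATOR_TOKENS = frozenset("( ) ] } : ; , . ? == != | ^ && || & [".split())

# Relational operators from the second bullet (note that `>` is not among them).
RELATIONAL_TOKENS = frozenset({"<", "<=", ">=", "is", "as"})

# Assumed list of contextual query keywords; they only count inside a query expression.
QUERY_KEYWORDS = frozenset({
    "from", "where", "join", "on", "equals", "into", "let",
    "orderby", "ascending", "descending", "select", "group", "by",
})


def retains_type_argument_list(next_token: str, in_query_expression: bool = False) -> bool:
    """Return True if the token after the closing `>` keeps the type_argument_list."""
    if next_token in PUNCTUATOR_TOKENS or next_token in RELATIONAL_TOKENS:
        return True
    # Third bullet: a query keyword disambiguates only within a query expression.
    return in_query_expression and next_token in QUERY_KEYWORDS
```

Under this sketch, the `(` that follows `G<A, B>` in `F(G<A, B>(7))` keeps the *type_argument_list*, whereas the `7` in `F(G<A, B>7)` and the second `>` in `F(G<A, B>>7)` do not, so the latter two are read as relational and shift expressions. The identifier-based contexts (`is`, `case`, `out`, tuple literals) that this series removes from the list are deliberately not modelled.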