12 changes: 4 additions & 8 deletions src/identifiers.md
@@ -3,21 +3,19 @@ r[ident]

r[ident.syntax]
```grammar,lexer
IDENTIFIER_OR_KEYWORD ->
XID_Start XID_Continue*
| `_` XID_Continue+
IDENTIFIER_OR_KEYWORD -> ( XID_Start | `_` ) XID_Continue*

XID_Start -> <`XID_Start` defined by Unicode>

XID_Continue -> <`XID_Continue` defined by Unicode>

RAW_IDENTIFIER -> `r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self`_
RAW_IDENTIFIER -> `r#` IDENTIFIER_OR_KEYWORD

NON_KEYWORD_IDENTIFIER -> IDENTIFIER_OR_KEYWORD _except a [strict][lex.keywords.strict] or [reserved][lex.keywords.reserved] keyword_

Contributor

RESERVED_RAW_IDENTIFIER a few lines below may be redundant now.

Contributor Author

Hm, I spent some time thinking about this and how to correctly express that these keywords are rejected. The current except clause didn't express that the way I intended. I pushed a commit that, instead of removing the reserved rule, moves the except part into the reserved rule.

Or, to put it another way, `r#crate` is a token; it's just rejected as an error. The previous lexical grammar wasn't really conveying that.

IDENTIFIER -> NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER

RESERVED_RAW_IDENTIFIER -> `r#_`
RESERVED_RAW_IDENTIFIER -> `r#` (`_` | `crate` | `self` | `Self` | `super`)
```
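As a rough illustration of how the revised rules partition inputs, the sketch below classifies a complete string as an IDENTIFIER_OR_KEYWORD, a RAW_IDENTIFIER, or a RESERVED_RAW_IDENTIFIER. The names `Kind`, `classify`, `is_start`, and `is_continue` are hypothetical. `char::is_alphabetic` and `char::is_alphanumeric` are only stand-ins for Unicode `XID_Start` and `XID_Continue`, which the standard library does not expose. This is not how rustc's lexer is implemented.

```rust
/// Hypothetical classification of a complete input string.
#[derive(Debug, PartialEq)]
enum Kind {
    IdentifierOrKeyword,
    RawIdentifier,
    ReservedRawIdentifier,
    NotAnIdentifier,
}

// Assumption: rough approximations of XID_Start / XID_Continue.
fn is_start(c: char) -> bool {
    c.is_alphabetic()
}

fn is_continue(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// IDENTIFIER_OR_KEYWORD -> ( XID_Start | `_` ) XID_Continue*
fn is_identifier_or_keyword(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if is_start(c) || c == '_' => chars.all(is_continue),
        _ => false,
    }
}

fn classify(s: &str) -> Kind {
    if let Some(rest) = s.strip_prefix("r#") {
        // Assumption: the reserved forms are checked before RAW_IDENTIFIER,
        // so `r#crate` is recognized as a token and then rejected.
        if matches!(rest, "_" | "crate" | "self" | "Self" | "super") {
            return Kind::ReservedRawIdentifier;
        }
        if is_identifier_or_keyword(rest) {
            return Kind::RawIdentifier;
        }
        return Kind::NotAnIdentifier;
    }
    if is_identifier_or_keyword(s) {
        Kind::IdentifierOrKeyword
    } else {
        Kind::NotAnIdentifier
    }
}
```

Under these assumptions, `classify("_")` yields `Kind::IdentifierOrKeyword` (a lone underscore now matches the first rule), while `classify("r#crate")` and `classify("r#_")` both yield `Kind::ReservedRawIdentifier`.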

<!-- When updating the version, update the UAX links, too. -->
@@ -37,8 +35,6 @@ The profile used from UAX #31 is:
* Continue := [`XID_Continue`]
* Medial := empty

with the additional constraint that a single underscore character is not an identifier.

> [!NOTE]
> Identifiers starting with an underscore are typically used to indicate an identifier that is intentionally unused, and will silence the unused warning in `rustc`.

@@ -76,7 +72,7 @@ Unlike a normal identifier, a raw identifier may be any strict or reserved
keyword except the ones listed above for `RAW_IDENTIFIER`.

r[ident.raw.reserved]
It is an error to use the [RESERVED_RAW_IDENTIFIER] token `r#_` in order to avoid confusion with the [WildcardPattern].
It is an error to use the [RESERVED_RAW_IDENTIFIER] token.

[`extern crate`]: items/extern-crates.md
[`no_mangle`]: abi.md#the-no_mangle-attribute
1 change: 1 addition & 0 deletions src/keywords.md
@@ -26,6 +26,7 @@ be used as the names of:
r[lex.keywords.strict.list]
The following keywords are in all editions:

- `_`
- `as`
- `break`
- `const`
4 changes: 2 additions & 2 deletions src/macros-by-example.md
@@ -25,7 +25,7 @@ MacroMatcher ->
MacroMatch ->
Token _except `$` and [delimiters][lex.token.delim]_
| MacroMatcher
| `$` ( IDENTIFIER_OR_KEYWORD _except `crate`_ | RAW_IDENTIFIER | `_` ) `:` MacroFragSpec
| `$` ( IDENTIFIER_OR_KEYWORD _except `crate`_ | RAW_IDENTIFIER ) `:` MacroFragSpec
| `$` `(` MacroMatch+ `)` MacroRepSep? MacroRepOp

MacroFragSpec ->
@@ -134,7 +134,7 @@ Valid fragment specifiers are:
* `block`: a [BlockExpression]
* `expr`: an [Expression]
* `expr_2021`: an [Expression] except [UnderscoreExpression] and [ConstBlockExpression] (see [macro.decl.meta.edition2024])
* `ident`: an [IDENTIFIER_OR_KEYWORD], [RAW_IDENTIFIER], or [`$crate`]
* `ident`: an [IDENTIFIER_OR_KEYWORD] except `_`, [RAW_IDENTIFIER], or [`$crate`]
Contributor

Pre-existing, but this doesn't seem correct: `ident` accepts raw identifiers and expanded `$crate` (and unexpanded `$crate` is two tokens, not an IDENTIFIER_OR_KEYWORD).

Contributor Author

Yeah, we have a few issues related to this (like #588 and #587). I was also uneasy about adding this in the first place.

I don't know the right way to document that the expanded `$crate` can be accepted by `ident`. Perhaps this just needs to be more explicit about what it means?

* `item`: an [Item]
* `lifetime`: a [LIFETIME_TOKEN]
* `literal`: matches `-`<sup>?</sup>[LiteralExpression]
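The `ident` entry in the list above can be observed with a small macro. This is a minimal sketch: `is_ident!` and `ident_examples` are hypothetical names, and the second macro arm is just a catch-all so that a failed `ident` match yields `false` rather than a compile error.

```rust
// Hypothetical probe: the first arm matches only a single `ident` fragment,
// and the second arm accepts any other token sequence.
macro_rules! is_ident {
    ($i:ident) => {
        true
    };
    ($($t:tt)*) => {
        false
    };
}

// An ordinary identifier and a raw identifier match `ident`; a lone `_` does not.
fn ident_examples() -> (bool, bool, bool) {
    (is_ident!(foo), is_ident!(r#match), is_ident!(_))
}
```

Calling `ident_examples()` returns `(true, true, false)`, which matches the "except `_`" wording in the `ident` entry.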
30 changes: 13 additions & 17 deletions src/tokens.md
@@ -4,7 +4,7 @@ r[lex.token]
r[lex.token.syntax]
```grammar,lexer
Token ->
IDENTIFIER_OR_KEYWORD
RESERVED_TOKEN
| RAW_IDENTIFIER
| CHAR_LITERAL
| STRING_LITERAL
@@ -18,7 +18,7 @@ Token ->
| FLOAT_LITERAL
| LIFETIME_TOKEN
| PUNCTUATION
| RESERVED_TOKEN
| IDENTIFIER_OR_KEYWORD
```

r[lex.token.intro]
@@ -116,7 +116,7 @@ A suffix is a sequence of characters following the primary part of a literal (wi

r[lex.token.literal.suffix.syntax]
```grammar,lexer
SUFFIX -> IDENTIFIER_OR_KEYWORD
SUFFIX -> IDENTIFIER_OR_KEYWORD _except `_`_

SUFFIX_NO_E -> SUFFIX _not beginning with `e` or `E`_
```
@@ -762,17 +762,16 @@ r[lex.token.life.syntax]
```grammar,lexer
LIFETIME_TOKEN ->
`'` IDENTIFIER_OR_KEYWORD _not immediately followed by `'`_
| `'_` _not immediately followed by `'`_
| RAW_LIFETIME

LIFETIME_OR_LABEL ->
`'` NON_KEYWORD_IDENTIFIER _not immediately followed by `'`_
| RAW_LIFETIME

RAW_LIFETIME ->
`'r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self` and not immediately followed by `'`_
`'r#` IDENTIFIER_OR_KEYWORD _not immediately followed by `'`_

RESERVED_RAW_LIFETIME -> `'r#_` _not immediately followed by `'`_
RESERVED_RAW_LIFETIME -> `'r#` (`_` | `crate` | `self` | `Self` | `super`) _not immediately followed by `'`_
```

r[lex.token.life.intro]
@@ -787,7 +786,7 @@ r[lex.token.life.raw.allowed]
Unlike a normal lifetime, a raw lifetime may be any strict or reserved keyword except the ones listed above for `RAW_LIFETIME`.

r[lex.token.life.raw.reserved]
It is an error to use the RESERVED_RAW_LIFETIME token `'r#_` in order to avoid confusion with the [placeholder lifetime].
It is an error to use the [RESERVED_RAW_LIFETIME] token.

r[lex.token.life.raw.edition2021]
> [!EDITION-2021]
@@ -845,7 +844,6 @@ PUNCTUATION ->
| `#`
| `$`
| `?`
| `_`
| `{`
| `}`
| `[`
@@ -891,7 +889,6 @@ usages and meanings are defined in the linked pages.
| `>=` | Ge | [Greater than or equal to][comparison], [Generics]
| `<=` | Le | [Less than or equal to][comparison]
| `@` | At | [Subpattern binding]
| `_` | Underscore | [Wildcard patterns], [Inferred types], Unnamed items in [constants], [extern crates], [use declarations], and [destructuring assignment]
| `.` | Dot | [Field access][field], [Tuple index]
| `..` | DotDot | [Range][range], [Struct expressions], [Patterns], [Range Patterns][rangepat]
| `...` | DotDotDot | [Variadic functions][extern], [Range patterns]
@@ -925,7 +922,7 @@ r[lex.token.reserved]
## Reserved tokens

r[lex.token.reserved.intro]
Several token forms are reserved for future use. It is an error for the source input to match one of these forms.
Several token forms are reserved for future use or to avoid confusion. It is an error for the source input to match one of these forms.

r[lex.token.reserved.syntax]
```grammar,lexer
@@ -947,23 +944,23 @@ r[lex.token.reserved-prefix]
r[lex.token.reserved-prefix.syntax]
```grammar,lexer
RESERVED_TOKEN_DOUBLE_QUOTE ->
( IDENTIFIER_OR_KEYWORD _except `b` or `c` or `r` or `br` or `cr`_ | `_` ) `"`
IDENTIFIER_OR_KEYWORD _except `b` or `c` or `r` or `br` or `cr`_ `"`

RESERVED_TOKEN_SINGLE_QUOTE ->
( IDENTIFIER_OR_KEYWORD _except `b`_ | `_` ) `'`
IDENTIFIER_OR_KEYWORD _except `b`_ `'`

RESERVED_TOKEN_POUND ->
( IDENTIFIER_OR_KEYWORD _except `r` or `br` or `cr`_ | `_` ) `#`
IDENTIFIER_OR_KEYWORD _except `r` or `br` or `cr`_ `#`

RESERVED_TOKEN_LIFETIME ->
`'` ( IDENTIFIER_OR_KEYWORD _except `r`_ | `_` ) `#`
`'` IDENTIFIER_OR_KEYWORD _except `r`_ `#`
```

r[lex.token.reserved-prefix.intro]
Some lexical forms known as _reserved prefixes_ are reserved for future use.

r[lex.token.reserved-prefix.id]
Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword or `_`) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.
Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.
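The exceptions in the three non-lifetime reserved-prefix rules can be summarized as a small predicate. This is a sketch only: `is_reserved_prefix` is a hypothetical helper, it assumes `ident` has already been lexed as an IDENTIFIER_OR_KEYWORD, and it ignores editions and the lifetime form.

```rust
/// Hypothetical helper: does `ident`, immediately followed by `next`,
/// form a reserved prefix under the rules in the grammar above?
/// Assumption: `ident` is already known to be an IDENTIFIER_OR_KEYWORD.
fn is_reserved_prefix(ident: &str, next: char) -> bool {
    match next {
        // RESERVED_TOKEN_DOUBLE_QUOTE: all prefixes except the string-literal ones.
        '"' => !matches!(ident, "b" | "c" | "r" | "br" | "cr"),
        // RESERVED_TOKEN_SINGLE_QUOTE: all prefixes except `b` (byte literals).
        '\'' => ident != "b",
        // RESERVED_TOKEN_POUND: all prefixes except the raw-literal ones.
        '#' => !matches!(ident, "r" | "br" | "cr"),
        // Any other following character does not create a reserved prefix.
        _ => false,
    }
}
```

For example, `is_reserved_prefix("k", '#')` is `true`, while `is_reserved_prefix("r", '#')` is `false`. Because `_` is now an IDENTIFIER_OR_KEYWORD, `is_reserved_prefix("_", '"')` is `true` without a separate case.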

r[lex.token.reserved-prefix.raw-token]
Note that raw identifiers, raw string literals, and raw byte string literals may contain a `#` character but are not interpreted as containing a reserved prefix.
@@ -972,7 +969,7 @@ r[lex.token.reserved-prefix.strings]
Similarly the `r`, `b`, `br`, `c`, and `cr` prefixes used in raw string literals, byte literals, byte string literals, raw byte string literals, C string literals, and raw C string literals are not interpreted as reserved prefixes.

r[lex.token.reserved-prefix.life]
Source input which would otherwise be lexically interpreted as a non-raw lifetime (or a keyword or `_`) which is immediately followed by a `#` character (without intervening whitespace) is identified as a reserved lifetime prefix.
Source input which would otherwise be lexically interpreted as a non-raw lifetime (or a keyword) which is immediately followed by a `#` character (without intervening whitespace) is identified as a reserved lifetime prefix.

r[lex.token.reserved-prefix.edition2021]
> [!EDITION-2021]
@@ -1061,7 +1058,6 @@ r[lex.token.reserved-guards.edition2024]
[numeric types]: types/numeric.md
[paths]: paths.md
[patterns]: patterns.md
[placeholder lifetime]: lifetime-elision.md
[question]: expressions/operator-expr.md#the-try-propagation-expression
[range]: expressions/range-expr.md
[rangepat]: patterns.md#range-patterns