8 changes: 2 additions & 6 deletions src/identifiers.md
@@ -3,15 +3,13 @@ r[ident]

r[ident.syntax]
```grammar,lexer
- IDENTIFIER_OR_KEYWORD ->
- XID_Start XID_Continue*
- | `_` XID_Continue+
+ IDENTIFIER_OR_KEYWORD -> ( XID_Start | `_` ) XID_Continue*

XID_Start -> <`XID_Start` defined by Unicode>

XID_Continue -> <`XID_Continue` defined by Unicode>

- RAW_IDENTIFIER -> `r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self`_
+ RAW_IDENTIFIER -> `r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self`, `_`_

NON_KEYWORD_IDENTIFIER -> IDENTIFIER_OR_KEYWORD _except a [strict][lex.keywords.strict] or [reserved][lex.keywords.reserved] keyword_

Contributor

RESERVED_RAW_IDENTIFIER a few lines below may be redundant now.

Contributor Author

Hm, I spent some time thinking about this and about how to correctly express that these keywords are rejected. The current except clause didn't express that in the way I intended. I pushed a commit that, instead of removing the reserved rule, moves the except part into it.

Or, to put it another way, `r#crate` is a token; it's just rejected as an error. The previous lexical grammar wasn't really conveying that.
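
As a small illustration of the distinction above, the sketch below uses raw identifiers for names that are otherwise strict keywords. The function and binding names are made up for this example, and the commented-out line is only expected to be rejected by `rustc`; it is not a statement about how the lexer is implemented.

```rust
// `match` is a strict keyword, but `r#match` is a RAW_IDENTIFIER and can
// name a function. (Hypothetical helper, for illustration only.)
fn r#match(x: u32) -> u32 {
    x + 1
}

fn main() {
    // `type` is also a strict keyword; `r#type` is an ordinary binding name.
    let r#type = r#match(1);
    assert_eq!(r#type, 2);

    // Expected to be rejected at compile time: `crate`, `self`, `super`,
    // `Self`, and (with this change) `_` cannot be used as raw identifiers.
    // let r#crate = 0;
}
```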

@@ -37,8 +35,6 @@ The profile used from UAX #31 is:
* Continue := [`XID_Continue`]
* Medial := empty

- with the additional constraint that a single underscore character is not an identifier.

> [!NOTE]
> Identifiers starting with an underscore are typically used to indicate an identifier that is intentionally unused, and will silence the unused warning in `rustc`.
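
A minimal sketch of the convention described in the note, with hypothetical names. The underscore-prefixed parameter and binding are never read, and `rustc` is expected not to emit an unused-variable warning for them.

```rust
// `_scale` starts with an underscore, so leaving it unused should not
// trigger the `unused_variables` lint. (Hypothetical function.)
fn width_of(width: u32, _scale: u32) -> u32 {
    width
}

fn main() {
    // An intentionally unused binding; the leading underscore marks it as such.
    let _scratch = 10;

    let used = width_of(3, 7);
    assert_eq!(used, 3);
}
```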

1 change: 1 addition & 0 deletions src/keywords.md
@@ -26,6 +26,7 @@ be used as the names of:
r[lex.keywords.strict.list]
The following keywords are in all editions:

+ - `_`
- `as`
- `break`
- `const`
4 changes: 2 additions & 2 deletions src/macros-by-example.md
@@ -25,7 +25,7 @@ MacroMatcher ->
MacroMatch ->
Token _except `$` and [delimiters][lex.token.delim]_
| MacroMatcher
- | `$` ( IDENTIFIER_OR_KEYWORD _except `crate`_ | RAW_IDENTIFIER | `_` ) `:` MacroFragSpec
+ | `$` ( IDENTIFIER_OR_KEYWORD _except `crate`_ | RAW_IDENTIFIER ) `:` MacroFragSpec
| `$` `(` MacroMatch+ `)` MacroRepSep? MacroRepOp

MacroFragSpec ->
@@ -134,7 +134,7 @@ Valid fragment specifiers are:
* `block`: a [BlockExpression]
* `expr`: an [Expression]
* `expr_2021`: an [Expression] except [UnderscoreExpression] and [ConstBlockExpression] (see [macro.decl.meta.edition2024])
- * `ident`: an [IDENTIFIER_OR_KEYWORD], [RAW_IDENTIFIER], or [`$crate`]
+ * `ident`: an [IDENTIFIER_OR_KEYWORD] except `_`, [RAW_IDENTIFIER], or [`$crate`]
Contributor

Pre-existing, but this doesn't seem correct: `ident` accepts raw identifiers and expanded `$crate` (and unexpanded `$crate` is two tokens, not an IDENTIFIER_OR_KEYWORD).

Contributor Author

Yeah, we have a few issues related to this (like #588 and #587). I was also uneasy about adding this in the first place.

I don't know the right way to document that the expanded `$crate` can be accepted by `ident`. Perhaps this just needs to be more explicit about what it means?
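
To make the cases in this thread concrete, here is a small probe, offered as a sketch rather than as wording for the reference. The macro and function names are hypothetical. The first arm of `is_ident!` matches a single token accepted by the `ident` fragment specifier, and the fallback arm catches anything else.

```rust
// Hypothetical probe macro: `true` if the input is one token accepted by
// the `ident` fragment specifier, `false` otherwise.
macro_rules! is_ident {
    ($i:ident) => { true };
    ($($t:tt)*) => { false };
}

// Hypothetical helper that forwards an already-expanded `$crate` to the probe.
macro_rules! crate_is_ident {
    () => { is_ident!($crate) };
}

fn plain_matches() -> bool { is_ident!(foo) }
fn keyword_matches() -> bool { is_ident!(match) }
fn raw_matches() -> bool { is_ident!(r#match) }
fn underscore_matches() -> bool { is_ident!(_) }
fn expanded_crate_matches() -> bool { crate_is_ident!() }

fn main() {
    assert!(plain_matches());          // ordinary identifier
    assert!(keyword_matches());        // keywords are IDENTIFIER_OR_KEYWORD
    assert!(raw_matches());            // RAW_IDENTIFIER
    assert!(!underscore_matches());    // `_` is not accepted by `ident`
    assert!(expanded_crate_matches()); // expanded `$crate` is accepted
}
```

The `$crate` case only arises when the token reaches `ident` through another macro's expansion, which is why it goes through the `crate_is_ident!` wrapper here.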

* `item`: an [Item]
* `lifetime`: a [LIFETIME_TOKEN]
* `literal`: matches `-`<sup>?</sup>[LiteralExpression]
19 changes: 8 additions & 11 deletions src/tokens.md
@@ -116,7 +116,7 @@ A suffix is a sequence of characters following the primary part of a literal (wi

r[lex.token.literal.suffix.syntax]
```grammar,lexer
- SUFFIX -> IDENTIFIER_OR_KEYWORD
+ SUFFIX -> IDENTIFIER_OR_KEYWORD _except `_`_

SUFFIX_NO_E -> SUFFIX _not beginning with `e` or `E`_
```
@@ -762,15 +762,14 @@ r[lex.token.life.syntax]
```grammar,lexer
LIFETIME_TOKEN ->
`'` IDENTIFIER_OR_KEYWORD _not immediately followed by `'`_
- | `'_` _not immediately followed by `'`_
| RAW_LIFETIME

LIFETIME_OR_LABEL ->
`'` NON_KEYWORD_IDENTIFIER _not immediately followed by `'`_
| RAW_LIFETIME

RAW_LIFETIME ->
- `'r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self` and not immediately followed by `'`_
+ `'r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self`, `_` and not immediately followed by `'`_

RESERVED_RAW_LIFETIME -> `'r#_` _not immediately followed by `'`_
```
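
The sketch below probes which inputs the `lifetime` fragment specifier (a LIFETIME_TOKEN) accepts, including `'_`, which this hunk folds into the `'` IDENTIFIER_OR_KEYWORD alternative. The macro and function names are hypothetical and exist only for this illustration.

```rust
// Hypothetical probe macro: `true` if the input is one token accepted by
// the `lifetime` fragment specifier, `false` otherwise.
macro_rules! is_lifetime {
    ($l:lifetime) => { true };
    ($($t:tt)*) => { false };
}

fn named_matches() -> bool { is_lifetime!('a) }
fn anonymous_matches() -> bool { is_lifetime!('_) }
fn static_matches() -> bool { is_lifetime!('static) }
fn ident_matches() -> bool { is_lifetime!(a) }

// `'_` used where a lifetime is expected; the compiler infers it from `s`.
fn first_word(s: &str) -> &'_ str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    assert!(named_matches());
    assert!(anonymous_matches());
    assert!(static_matches());
    assert!(!ident_matches()); // a bare identifier is not a lifetime token
    assert_eq!(first_word("hello world"), "hello");
}
```
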
@@ -845,7 +844,6 @@ PUNCTUATION ->
| `#`
| `$`
| `?`
- | `_`
| `{`
| `}`
| `[`
@@ -891,7 +889,6 @@ usages and meanings are defined in the linked pages.
| `>=` | Ge | [Greater than or equal to][comparison], [Generics]
| `<=` | Le | [Less than or equal to][comparison]
| `@` | At | [Subpattern binding]
- | `_` | Underscore | [Wildcard patterns], [Inferred types], Unnamed items in [constants], [extern crates], [use declarations], and [destructuring assignment]
| `.` | Dot | [Field access][field], [Tuple index]
| `..` | DotDot | [Range][range], [Struct expressions], [Patterns], [Range Patterns][rangepat]
| `...` | DotDotDot | [Variadic functions][extern], [Range patterns]
@@ -947,23 +944,23 @@ r[lex.token.reserved-prefix]
r[lex.token.reserved-prefix.syntax]
```grammar,lexer
RESERVED_TOKEN_DOUBLE_QUOTE ->
- ( IDENTIFIER_OR_KEYWORD _except `b` or `c` or `r` or `br` or `cr`_ | `_` ) `"`
+ IDENTIFIER_OR_KEYWORD _except `b` or `c` or `r` or `br` or `cr`_ `"`

RESERVED_TOKEN_SINGLE_QUOTE ->
- ( IDENTIFIER_OR_KEYWORD _except `b`_ | `_` ) `'`
+ IDENTIFIER_OR_KEYWORD _except `b`_ `'`

RESERVED_TOKEN_POUND ->
- ( IDENTIFIER_OR_KEYWORD _except `r` or `br` or `cr`_ | `_` ) `#`
+ IDENTIFIER_OR_KEYWORD _except `r` or `br` or `cr`_ `#`

RESERVED_TOKEN_LIFETIME ->
- `'` ( IDENTIFIER_OR_KEYWORD _except `r`_ | `_` ) `#`
+ `'` IDENTIFIER_OR_KEYWORD _except `r`_ `#`
```

r[lex.token.reserved-prefix.intro]
Some lexical forms known as _reserved prefixes_ are reserved for future use.

r[lex.token.reserved-prefix.id]
- Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword or `_`) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.
+ Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.

r[lex.token.reserved-prefix.raw-token]
Note that raw identifiers, raw string literals, and raw byte string literals may contain a `#` character but are not interpreted as containing a reserved prefix.
@@ -972,7 +969,7 @@ r[lex.token.reserved-prefix.strings]
Similarly the `r`, `b`, `br`, `c`, and `cr` prefixes used in raw string literals, byte literals, byte string literals, raw byte string literals, C string literals, and raw C string literals are not interpreted as reserved prefixes.

r[lex.token.reserved-prefix.life]
- Source input which would otherwise be lexically interpreted as a non-raw lifetime (or a keyword or `_`) which is immediately followed by a `#` character (without intervening whitespace) is identified as a reserved lifetime prefix.
+ Source input which would otherwise be lexically interpreted as a non-raw lifetime (or a keyword) which is immediately followed by a `#` character (without intervening whitespace) is identified as a reserved lifetime prefix.
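
A minimal sketch of how reserved prefixes show up in practice, using a hypothetical `tagged!` macro that takes an identifier followed by a string literal. With whitespace between the two tokens the call is accepted in every edition. The unspaced form in the comment is expected to be rejected in the 2021 edition and later, and is left commented out so that the example compiles.

```rust
// Hypothetical macro: an identifier followed by a string literal.
macro_rules! tagged {
    ($tag:ident $text:literal) => {
        concat!(stringify!($tag), ":", $text)
    };
}

fn spaced() -> &'static str {
    // Whitespace separates the identifier from the literal, so these are
    // two ordinary tokens.
    tagged!(note "hello")
}

fn main() {
    assert_eq!(spaced(), "note:hello");

    // Expected to be an error in the 2021 edition and later, because
    // `note"` is lexed as a reserved prefix:
    // let _ = tagged!(note"hello");
}
```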

r[lex.token.reserved-prefix.edition2021]
> [!EDITION-2021]