Commit 1009ea9

Disallow U+3000 as the last character of a reserved-body. (#727)
In all places where a 'reserved-body' can occur:
- in a reserved-statement,
- after a reserved-annotation or private-use-annotation, in a literal-expression, variable-expression, or annotation-expression,

it can be followed by whitespace ('s' nonterminal).

A syntax ambiguity exists because, as reported in #721 and #725, a U+3000 character can occur as the last character of a 'reserved-body' (via a 'reserved-char') and also as the first character of whitespace ('s' nonterminal).

According to the principles explained in #725, it is not desired that a 'reserved-body' ends with a U+3000 character; rather, the U+3000 character is meant to be interpreted as part of the following whitespace.

Test cases (written with \u escapes, for legibility):
- For reserved-statement: .regex /foo/\u3000\u3000{xyz}{{hello}}
- For reserved-annotation: { % foo bar \u3000\u3000 }
- For private-use-annotation: { & foo bar \u3000\u3000 @x }

This patch removes this ambiguity by disallowing U+3000 as the last character of a 'reserved-body'. It thus fixes #725 and the second part of #721.

Details:
- U+3000 gets removed from 'content-char' and 'reserved-char'.
- simple-start-char, text-char, and quoted-char stay the same (since U+3000 is already part of 's').
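The ambiguity described in the commit message can be illustrated with a small Python sketch. It is a toy model, not the full grammar in spec/message.abnf: it treats a 'reserved-body' as a non-empty run of characters and the following whitespace as a run of U+3000 only, and it counts how many ways the reserved-statement test input can be split between the two. The helper name `count_splits` and the flag `body_may_contain_u3000` are hypothetical names introduced for this illustration.

```python
IDEOGRAPHIC_SPACE = "\u3000"


def count_splits(text, body_may_contain_u3000):
    """Count the ways to split text into (body, trailing whitespace).

    Toy model (an assumption, not the real grammar): the body is any
    non-empty prefix, and the whitespace is a possibly empty suffix made
    only of U+3000. When body_may_contain_u3000 is False, the body must
    not contain U+3000 at all, mirroring its removal from reserved-char.
    """
    count = 0
    for i in range(len(text) + 1):
        body, whitespace = text[:i], text[i:]
        whitespace_ok = all(ch == IDEOGRAPHIC_SPACE for ch in whitespace)
        body_ok = body != "" and (
            body_may_contain_u3000 or IDEOGRAPHIC_SPACE not in body
        )
        if whitespace_ok and body_ok:
            count += 1
    return count


# The body-plus-whitespace part of the reserved-statement test case.
sample = "/foo/\u3000\u3000"
before = count_splits(sample, body_may_contain_u3000=True)
after = count_splits(sample, body_may_contain_u3000=False)
print(before, after)
```

In this toy model the sample admits three splits while U+3000 is allowed in the body and exactly one once it is excluded, which is the sense in which the patch removes the ambiguity.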
1 parent afc7ff8 commit 1009ea9

2 files changed: 4 additions, 2 deletions

spec/message.abnf

Lines changed: 2 additions & 1 deletion
@@ -95,7 +95,8 @@ content-char = %x01-08 ; omit NULL (%x00), HTAB (%x09) and LF (%x0A)
  / %x2F-3F ; omit @ (%x40)
  / %x41-5B ; omit \ (%x5C)
  / %x5D-7A ; omit { | } (%x7B-7D)
- / %x7E-D7FF ; omit surrogates
+ / %x7E-2FFF ; omit IDEOGRAPHIC SPACE (%x3000)
+ / %x3001-D7FF ; omit surrogates
  / %xE000-10FFFF

  ; Character escapes
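
The effect of splitting the range can be checked with a short Python sketch. It models only the 'content-char' ranges that are visible in the hunk above (the ranges between %x08 and %x2F are not shown in the diff and are deliberately left out), so it is a partial illustration rather than a full implementation of the rule. The names `VISIBLE_RANGES_BEFORE`, `VISIBLE_RANGES_AFTER`, and `in_ranges` are hypothetical.

```python
# Ranges of content-char visible in the hunk, before the patch.
# Assumption: ranges not shown in the diff context are omitted here.
VISIBLE_RANGES_BEFORE = [
    (0x01, 0x08),
    (0x2F, 0x3F),
    (0x41, 0x5B),
    (0x5D, 0x7A),
    (0x7E, 0xD7FF),
    (0xE000, 0x10FFFF),
]

# Same ranges after the patch: %x7E-D7FF is split around %x3000.
VISIBLE_RANGES_AFTER = [
    (0x01, 0x08),
    (0x2F, 0x3F),
    (0x41, 0x5B),
    (0x5D, 0x7A),
    (0x7E, 0x2FFF),
    (0x3001, 0xD7FF),
    (0xE000, 0x10FFFF),
]


def in_ranges(code_point, ranges):
    """Return True if code_point falls inside any inclusive (low, high) range."""
    return any(low <= code_point <= high for low, high in ranges)


for cp in (0x2FFF, 0x3000, 0x3001):
    print(
        f"U+{cp:04X}",
        in_ranges(cp, VISIBLE_RANGES_BEFORE),
        in_ranges(cp, VISIBLE_RANGES_AFTER),
    )
```

Only U+3000 changes status between the two lists; its neighbours U+2FFF and U+3001 remain matched, as do the surrogate and '@' exclusions.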

spec/syntax.md

Lines changed: 2 additions & 1 deletion
@@ -311,7 +311,8 @@ content-char = %x01-08 ; omit NULL (%x00), HTAB (%x09) and LF (%x0A)
  / %x2F-3F ; omit @ (%x40)
  / %x41-5B ; omit \ (%x5C)
  / %x5D-7A ; omit { | } (%x7B-7D)
- / %x7E-D7FF ; omit surrogates
+ / %x7E-2FFF ; omit IDEOGRAPHIC SPACE (%x3000)
+ / %x3001-D7FF ; omit surrogates
  / %xE000-10FFFF
  ```
