Commit 4e0b7cb

update unicode section in toc.md
1 parent c1f75b8 commit 4e0b7cb

1 file changed: +6 -5 lines changed

docs/toc.md

Lines changed: 6 additions & 5 deletions
@@ -89,13 +89,14 @@ can bring down the service, both lexer and parser are configurable to mitigate t
 - Input string:
 - Accepted encoding for input string are UTF-8.
 - Escaped unicode in quoted string take the form of UTF-16 BE:
-- Fixed 4 digit hex: e.g. `\u000A`
-- variable length: `\u{1F4A9}` with range (>= 0x0000 and <= 0xD7FF or >= 0xE000 and <= 0x10FFFF)
+- Fixed length notation using 4 digit hex: e.g. `\u000A`.
+- Variable length notation using curly braces `\u{1F4A9}` with range (0x0000..0xD7FF, 0xE000..0x10FFFF).
 - Escape sequences are only meaningful within a single-quoted string.
-In multiline string, unicode char must be encoded using UTF-8.
-- SurrogatePair: "\uD83D\uDCA9" is equal to "\u{1F4A9}"
+- In multiline string, unicode char must be encoded using UTF-8.
+- Surrogate pair using fixed length notation "\uD83D\uDCA9" is equal to variable length notation "\u{1F4A9}".
+- Orphaned surrogate will result in error.

 - Output string:
 - Output string subject to output serialization format specification.
 - For example, output using json as serialization format will result in UTF-8 encoded string.
-- Or if the escape flag is set, it will use UTF-16 BE 4 digit hex fixed length similar to GraphQL escape sequence.
+- If the escape flag is set, it will use UTF-16 BE 4 digit hex fixed length similar to GraphQL escape sequence.
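
The escape rules described in this diff can be illustrated with a small sketch. It is not the project's lexer: the names `decode_escapes`, `encode_escaped`, and `is_scalar_value` are hypothetical, the choice to escape only non-ASCII characters on output is an assumption, and escaped backslashes (`\\u`) are not handled. It only shows how a surrogate pair in fixed length notation maps to the same code point as variable length notation, and how an orphaned surrogate can be rejected.

```python
import re

# Hypothetical helpers: the commit documents the rules, not an API.
# Matches \u{...} (variable length) or \uXXXX (fixed 4 digit hex).
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]+)\}|\\u([0-9A-Fa-f]{4})")


def is_scalar_value(cp: int) -> bool:
    """True for code points in 0x0000..0xD7FF or 0xE000..0x10FFFF."""
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


def decode_escapes(text: str) -> str:
    """Replace unicode escapes with the characters they denote."""
    out = []
    pos = 0
    pending_high = None  # high surrogate still waiting for its low half
    for m in _ESCAPE.finditer(text):
        literal = text[pos:m.start()]
        if pending_high is not None and literal:
            raise ValueError("orphaned high surrogate")
        out.append(literal)
        pos = m.end()

        if m.group(1) is not None:  # variable length: \u{...}
            if pending_high is not None:
                raise ValueError("orphaned high surrogate")
            cp = int(m.group(1), 16)
            if not is_scalar_value(cp):
                raise ValueError(f"code point out of range: {cp:#x}")
            out.append(chr(cp))
            continue

        unit = int(m.group(2), 16)  # fixed length: one UTF-16 code unit
        if pending_high is not None:
            if not 0xDC00 <= unit <= 0xDFFF:
                raise ValueError("orphaned high surrogate")
            cp = 0x10000 + ((pending_high - 0xD800) << 10) + (unit - 0xDC00)
            out.append(chr(cp))
            pending_high = None
        elif 0xD800 <= unit <= 0xDBFF:
            pending_high = unit
        elif 0xDC00 <= unit <= 0xDFFF:
            raise ValueError("orphaned low surrogate")
        else:
            out.append(chr(unit))

    if pending_high is not None:
        raise ValueError("orphaned high surrogate")
    out.append(text[pos:])
    return "".join(out)


def encode_escaped(text: str) -> str:
    """Assumed behaviour of an escape flag: emit non-ASCII characters as
    UTF-16 BE code units in fixed 4 digit hex notation."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        raw = ch.encode("utf-16-be")
        for i in range(0, len(raw), 2):
            out.append("\\u%04X" % int.from_bytes(raw[i:i + 2], "big"))
    return "".join(out)
```

Under these assumptions, `decode_escapes(r"\uD83D\uDCA9")` and `decode_escapes(r"\u{1F4A9}")` yield the same single character, and `decode_escapes(r"\uD83D")` raises `ValueError`.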
