Commit b572f60

feat(core): limit max marker to 0xD7FF 🙀
- updates per review, off by one in max count
- updated the spec
1 parent e010b6a commit b572f60

3 files changed: +6 -4 lines changed

core/include/ldml/keyboardprocessor_ldml.h

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@
 #define LDML_LENGTH_VKEY_ITEM 0x8
 #define LDML_MARKER_ANY_INDEX 0xD7FF
 #define LDML_MARKER_CODE 0x8
-#define LDML_MARKER_MAX_COUNT 0xD7FD
+#define LDML_MARKER_MAX_COUNT 0xD7FE
 #define LDML_MARKER_MAX_INDEX 0xD7FE
 #define LDML_MARKER_MIN_INDEX 0x1
 #define LDML_META_SETTINGS_FALLBACK_OMIT 0x1

core/include/ldml/keyboardprocessor_ldml.ts

Lines changed: 1 addition & 1 deletion
@@ -626,7 +626,7 @@ class Constants {
 /** maximum marker index prior to the 'any' value */
 readonly marker_max_index = this.marker_any_index - 1;
 /** maximum count of markers (not including 'any') */
-readonly marker_max_count = this.marker_max_index - this.marker_min_index;
+readonly marker_max_count = this.marker_max_index - this.marker_min_index + 1;
 
 };
 
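
The off-by-one corrected in this hunk comes from counting an inclusive range: the number of integers from `min` through `max` is `max - min + 1`, not `max - min`. The following is a minimal standalone sketch of that arithmetic, using the values from the header diff above. The helper `inclusiveCount` and the upper-case constant names are hypothetical and are not part of the `Constants` class.

```typescript
// Values mirror the constants in this commit; names here are illustrative only.
const MARKER_MIN_INDEX = 0x0001;
const MARKER_ANY_INDEX = 0xd7ff;
const MARKER_MAX_INDEX = MARKER_ANY_INDEX - 1; // 0xD7FE, last usable marker index

/** Number of integers in the inclusive range [min, max] (hypothetical helper). */
function inclusiveCount(min: number, max: number): number {
  return max - min + 1;
}

// Count of usable markers, not including the reserved 'any' value.
const MARKER_MAX_COUNT = inclusiveCount(MARKER_MIN_INDEX, MARKER_MAX_INDEX);

// The old formula (max - min) would come out one short.
const OLD_MARKER_MAX_COUNT = MARKER_MAX_INDEX - MARKER_MIN_INDEX;
```

Because the minimum index is 1, the count equals the maximum index, which is why `LDML_MARKER_MAX_COUNT` and `LDML_MARKER_MAX_INDEX` are both `0xD7FE` after this change.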

core/src/ldml/C9134_ldml_markers.md

Lines changed: 4 additions & 2 deletions
@@ -34,8 +34,10 @@ Markers can appear in both 'emitting' and 'matching-only' areas:
 - Keyman already uses `UC_SENTINEL` `U+FFFF` (noncharacter), with `CODE_DEADKEY` (0x0008)
 - The general proposal here is to use the sequence `U+FFFF U+0008 U+XXXX` to represent marker #XXXX (starting with `U+0001`)
 - `U+FFFF` cannot otherwise occur in text, so it is unique
-- `U+FFFF U+0008 U+FFFE` to indicate 'any marker' corresponds to `\m{.}`
-- This scheme allows for 65,533 (0xFFFD) unique markers, from `U+FFFF U+0008 U+0001` through `U+FFFF U+0008 U+FFFD`
+- `U+FFFF U+0008 U+D7FF` to indicate 'any marker' corresponds to `\m{.}`
+- The max marker identifier will be `0xD7FE`, with `0xD7FF` reserved to represent 'any marker' if that is needed in the text stream.
+- This scheme allows for 55,294 unique markers, from `U+FFFF U+0008 U+0001` through `U+FFFF U+0008 U+D7FE` inclusive.
+- This scheme avoids the Unicode surrogate space beginning at `U+D800` and other noncharacters.
 
 ## Terminology
 - A marker's "number" is its position in the `markers` list, starting at index 1 (U+0001) being the first element in that list.
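
The three-code-unit scheme described in the spec hunk above (`U+FFFF U+0008 U+XXXX`) can be sketched as follows. This is an illustration only, assuming the text stream is a UTF-16 string; the functions `encodeMarker`, `encodeAnyMarker`, and `decodeMarker` are hypothetical names and do not come from the Keyman source. The constant values are taken from the header diff in this commit.

```typescript
// Values from the spec and header in this commit; function names are hypothetical.
const UC_SENTINEL = 0xffff;      // noncharacter that introduces a marker sequence
const CODE_DEADKEY = 0x0008;     // same value as LDML_MARKER_CODE
const MARKER_MIN_INDEX = 0x0001;
const MARKER_MAX_INDEX = 0xd7fe; // last usable marker index
const MARKER_ANY_INDEX = 0xd7ff; // reserved for 'any marker', i.e. \m{.}

/** Encode marker number `index` as the sequence U+FFFF U+0008 U+XXXX. */
function encodeMarker(index: number): string {
  const isInteger = Math.floor(index) === index;
  if (!isInteger || index < MARKER_MIN_INDEX || index > MARKER_MAX_INDEX) {
    throw new RangeError(`marker index out of range: ${index}`);
  }
  return String.fromCharCode(UC_SENTINEL, CODE_DEADKEY, index);
}

/** Encode the reserved 'any marker' sequence U+FFFF U+0008 U+D7FF. */
function encodeAnyMarker(): string {
  return String.fromCharCode(UC_SENTINEL, CODE_DEADKEY, MARKER_ANY_INDEX);
}

/**
 * Read a marker sequence starting at `offset`.
 * Returns the marker index (or MARKER_ANY_INDEX), or null if no valid
 * marker sequence starts there.
 */
function decodeMarker(text: string, offset: number = 0): number | null {
  if (text.charCodeAt(offset) !== UC_SENTINEL) return null;
  if (text.charCodeAt(offset + 1) !== CODE_DEADKEY) return null;
  const index = text.charCodeAt(offset + 2); // NaN if the string is too short
  return index >= MARKER_MIN_INDEX && index <= MARKER_ANY_INDEX ? index : null;
}
```

Keeping every index at or below `0xD7FF` means the third code unit is never a surrogate (`U+D800` and above), so in this sketch each marker always occupies exactly three UTF-16 code units and never pairs up with a neighboring code unit.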
