Commit 0ec4052

Compound ˈ&ˌ (#1052)
* UnicodeData line from the proposal
* lb=BB like ˈˌ
* Common
* Diacritic
* Regenerate UCD
* Failing test
* Change proposed Sk to Lm
* Regenerate UCD
* Ignore IDNA2008_Category
1 parent b4ba46c commit 0ec4052

17 files changed, 81 insertions(+), 52 deletions(-)

unicodetools/data/ucd/dev/DerivedAge.txt

Lines changed: 3 additions & 2 deletions
@@ -1,5 +1,5 @@
 # DerivedAge-18.0.0.txt
-# Date: 2025-11-27, 17:33:04 GMT
+# Date: 2025-11-27, 17:49:12 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2128,6 +2128,7 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
 0984 ; 18.0 # BENGALI SIGN COMBINING ANUSVARA ABOVE
 1ADE..1ADF ; 18.0 # [2] COMBINING GRAVE-DOT..COMBINING DOT-ACUTE
 1AEC..1AF0 ; 18.0 # [5] COMBINING CARON-ACUTE..COMBINING DOUBLE COMMA ABOVE
+208F ; 18.0 # MODIFIER LETTER HIGH AND LOW VERTICAL LINE
 209D..209F ; 18.0 # [3] LATIN SUBSCRIPT SMALL LETTER W..LATIN SUBSCRIPT SMALL LETTER Z
 20C2..20C3 ; 18.0 # [2] RUFIYAA SIGN..UAE DIRHAM SIGN
 107BB..107BF ; 18.0 # [5] MODIFIER LETTER SMALL TURNED T..MODIFIER LETTER SMALL ESH WITH DOUBLE BAR
@@ -2149,6 +2150,6 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
 2B81E ; 18.0 # CJK UNIFIED IDEOGRAPH-2B81E
 3D000..3FC3F ; 18.0 # [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 11859
+# Total code points: 11860

 # EOF

unicodetools/data/ucd/dev/DerivedCoreProperties.txt

Lines changed: 15 additions & 15 deletions
@@ -1,5 +1,5 @@
 # DerivedCoreProperties-18.0.0.txt
-# Date: 2025-11-27, 17:33:28 GMT
+# Date: 2025-11-27, 17:49:36 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -752,7 +752,7 @@ FFE9..FFEC ; Math # Sm [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS A
 1FF6..1FFC ; Alphabetic # L& [7] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
 2071 ; Alphabetic # Lm SUPERSCRIPT LATIN SMALL LETTER I
 207F ; Alphabetic # Lm SUPERSCRIPT LATIN SMALL LETTER N
-2090..209F ; Alphabetic # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; Alphabetic # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 2102 ; Alphabetic # L& DOUBLE-STRUCK CAPITAL C
 2107 ; Alphabetic # L& EULER CONSTANT
 210A..2113 ; Alphabetic # L& [10] SCRIPT SMALL G..SCRIPT SMALL L
@@ -1476,7 +1476,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
 31350..33479 ; Alphabetic # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; Alphabetic # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 159243
+# Total code points: 159244

 # ================================================

@@ -3294,7 +3294,7 @@ FF41..FF5A ; Cased # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 2066..206F ; Case_Ignorable # Cf [10] LEFT-TO-RIGHT ISOLATE..NOMINAL DIGIT SHAPES
 2071 ; Case_Ignorable # Lm SUPERSCRIPT LATIN SMALL LETTER I
 207F ; Case_Ignorable # Lm SUPERSCRIPT LATIN SMALL LETTER N
-2090..209F ; Case_Ignorable # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; Case_Ignorable # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 20D0..20DC ; Case_Ignorable # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
 20DD..20E0 ; Case_Ignorable # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH
 20E1 ; Case_Ignorable # Mn COMBINING LEFT RIGHT ARROW ABOVE
@@ -3575,7 +3575,7 @@ E0001 ; Case_Ignorable # Cf LANGUAGE TAG
 E0020..E007F ; Case_Ignorable # Cf [96] TAG SPACE..CANCEL TAG
 E0100..E01EF ; Case_Ignorable # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 2865
+# Total code points: 2866

 # ================================================

@@ -6594,7 +6594,7 @@ FF41..FF5A ; Changes_When_Casemapped # L& [26] FULLWIDTH LATIN SMALL LETTER
 1FF6..1FFC ; ID_Start # L& [7] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
 2071 ; ID_Start # Lm SUPERSCRIPT LATIN SMALL LETTER I
 207F ; ID_Start # Lm SUPERSCRIPT LATIN SMALL LETTER N
-2090..209F ; ID_Start # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; ID_Start # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 2102 ; ID_Start # L& DOUBLE-STRUCK CAPITAL C
 2107 ; ID_Start # L& EULER CONSTANT
 210A..2113 ; ID_Start # L& [10] SCRIPT SMALL G..SCRIPT SMALL L
@@ -7098,7 +7098,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 31350..33479 ; ID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; ID_Start # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 157735
+# Total code points: 157736

 # ================================================

@@ -7686,7 +7686,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 2054 ; ID_Continue # Pc INVERTED UNDERTIE
 2071 ; ID_Continue # Lm SUPERSCRIPT LATIN SMALL LETTER I
 207F ; ID_Continue # Lm SUPERSCRIPT LATIN SMALL LETTER N
-2090..209F ; ID_Continue # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; ID_Continue # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 20D0..20DC ; ID_Continue # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
 20E1 ; ID_Continue # Mn COMBINING LEFT RIGHT ARROW ABOVE
 20E5..20F0 ; ID_Continue # Mn [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE
@@ -8542,7 +8542,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
 3D000..3FC3F ; ID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
 E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 161081
+# Total code points: 161082

 # ================================================

@@ -8833,7 +8833,7 @@ E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR
 1FF6..1FFC ; XID_Start # L& [7] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
 2071 ; XID_Start # Lm SUPERSCRIPT LATIN SMALL LETTER I
 207F ; XID_Start # Lm SUPERSCRIPT LATIN SMALL LETTER N
-2090..209F ; XID_Start # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; XID_Start # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 2102 ; XID_Start # L& DOUBLE-STRUCK CAPITAL C
 2107 ; XID_Start # L& EULER CONSTANT
 210A..2113 ; XID_Start # L& [10] SCRIPT SMALL G..SCRIPT SMALL L
@@ -9341,7 +9341,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
 31350..33479 ; XID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; XID_Start # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 157712
+# Total code points: 157713

 # ================================================

@@ -9925,7 +9925,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
 2054 ; XID_Continue # Pc INVERTED UNDERTIE
 2071 ; XID_Continue # Lm SUPERSCRIPT LATIN SMALL LETTER I
 207F ; XID_Continue # Lm SUPERSCRIPT LATIN SMALL LETTER N
-2090..209F ; XID_Continue # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; XID_Continue # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 20D0..20DC ; XID_Continue # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
 20E1 ; XID_Continue # Mn COMBINING LEFT RIGHT ARROW ABOVE
 20E5..20F0 ; XID_Continue # Mn [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE
@@ -10786,7 +10786,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
 3D000..3FC3F ; XID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
 E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 161062
+# Total code points: 161063

 # ================================================

@@ -11878,7 +11878,7 @@ E0100..E01EF ; Grapheme_Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELE
 208A..208C ; Grapheme_Base # Sm [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
 208D ; Grapheme_Base # Ps SUBSCRIPT LEFT PARENTHESIS
 208E ; Grapheme_Base # Pe SUBSCRIPT RIGHT PARENTHESIS
-2090..209F ; Grapheme_Base # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; Grapheme_Base # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 20A0..20C3 ; Grapheme_Base # Sc [36] EURO-CURRENCY SIGN..UAE DIRHAM SIGN
 2100..2101 ; Grapheme_Base # So [2] ACCOUNT OF..ADDRESSED TO THE SUBJECT
 2102 ; Grapheme_Base # L& DOUBLE-STRUCK CAPITAL C
@@ -13086,7 +13086,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
 31350..33479 ; Grapheme_Base # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; Grapheme_Base # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 169341
+# Total code points: 169342

 # ================================================

unicodetools/data/ucd/dev/EastAsianWidth.txt

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 # EastAsianWidth-18.0.0.txt
-# Date: 2025-11-27, 17:33:35 GMT
+# Date: 2025-11-27, 17:49:44 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -973,7 +973,7 @@
 208A..208C ; N # Sm [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
 208D ; N # Ps SUBSCRIPT LEFT PARENTHESIS
 208E ; N # Pe SUBSCRIPT RIGHT PARENTHESIS
-2090..209F ; N # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; N # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 20A0..20A8 ; N # Sc [9] EURO-CURRENCY SIGN..RUPEE SIGN
 20A9 ; H # Sc WON SIGN
 20AA..20AB ; N # Sc [2] NEW SHEQEL SIGN..DONG SIGN
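
Most of the regenerated files in this commit change in the same way: the range `2090..209F` with a bracketed count of `[16]` becomes `208F..209F` with `[17]`. The following is a minimal Python sketch of how such a line could be checked for internal consistency. The helper names `parse_property_line` and `range_count_matches` are hypothetical and are not part of unicodetools; the sketch assumes the `start..end ; value # comment` layout seen in the hunks above and treats a missing bracketed count as a single code point.

```python
import re


def parse_property_line(line: str) -> tuple[int, int, str, int]:
    """Split a UCD property line into (start, end, value, declared_count).

    Hypothetical helper: assumes 'START[..END] ; VALUE # comment' and an
    optional '[N]' count somewhere in the comment.
    """
    data, _, comment = line.partition("#")
    range_part, _, value = data.partition(";")
    start_hex, _, end_hex = range_part.strip().partition("..")
    start = int(start_hex, 16)
    end = int(end_hex, 16) if end_hex else start
    count_match = re.search(r"\[(\d+)\]", comment)
    # Single code point lines carry no bracketed count.
    declared = int(count_match.group(1)) if count_match else 1
    return start, end, value.strip(), declared


def range_count_matches(line: str) -> bool:
    """Return True when the bracketed count equals the size of the range."""
    start, end, _, declared = parse_property_line(line)
    return end - start + 1 == declared


new_line = (
    "208F..209F ; N # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE"
    "..LATIN SUBSCRIPT SMALL LETTER Z"
)
print(range_count_matches(new_line))
```

The range grows by exactly one code point, which also matches the `Total code points` lines that increase by one in the other files.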

unicodetools/data/ucd/dev/LineBreak.txt

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,5 @@
 # LineBreak-18.0.0.txt
-# Date: 2025-11-27, 17:33:36 GMT
+# Date: 2025-11-27, 17:49:45 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -961,6 +961,7 @@
 208A..208C ; AL # Sm [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
 208D ; OP # Ps SUBSCRIPT LEFT PARENTHESIS
 208E ; CL # Pe SUBSCRIPT RIGHT PARENTHESIS
+208F ; BB # Lm MODIFIER LETTER HIGH AND LOW VERTICAL LINE
 2090..209F ; AL # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
 20A0..20A6 ; PR # Sc [7] EURO-CURRENCY SIGN..NAIRA SIGN
 20A7 ; PO # Sc PESETA SIGN

unicodetools/data/ucd/dev/PropList.txt

Lines changed: 3 additions & 2 deletions
@@ -1,5 +1,5 @@
 # PropList-18.0.0.txt
-# Date: 2025-11-27, 17:33:47 GMT
+# Date: 2025-11-27, 17:49:58 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1034,6 +1034,7 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
 1FDD..1FDF ; Diacritic # Sk [3] GREEK DASIA AND VARIA..GREEK DASIA AND PERISPOMENI
 1FED..1FEF ; Diacritic # Sk [3] GREEK DIALYTIKA AND VARIA..GREEK VARIA
 1FFD..1FFE ; Diacritic # Sk [2] GREEK OXIA..GREEK DASIA
+208F ; Diacritic # Lm MODIFIER LETTER HIGH AND LOW VERTICAL LINE
 2CEF..2CF1 ; Diacritic # Mn [3] COPTIC COMBINING NI ABOVE..COPTIC COMBINING SPIRITUS LENIS
 2E2F ; Diacritic # Lm VERTICAL TILDE
 302A..302D ; Diacritic # Mn [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK
@@ -1172,7 +1173,7 @@ FFE3 ; Diacritic # Sk FULLWIDTH MACRON
 1E944..1E946 ; Diacritic # Mn [3] ADLAM ALIF LENGTHENER..ADLAM GEMINATION MARK
 1E948..1E94A ; Diacritic # Mn [3] ADLAM CONSONANT MODIFIER..ADLAM NUKTA

-# Total code points: 1306
+# Total code points: 1307

 # ================================================

unicodetools/data/ucd/dev/Scripts.txt

Lines changed: 3 additions & 2 deletions
@@ -1,5 +1,5 @@
 # Scripts-18.0.0.txt
-# Date: 2025-11-27, 17:34:04 GMT
+# Date: 2025-11-27, 17:50:15 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -154,6 +154,7 @@
 208A..208C ; Common # Sm [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
 208D ; Common # Ps SUBSCRIPT LEFT PARENTHESIS
 208E ; Common # Pe SUBSCRIPT RIGHT PARENTHESIS
+208F ; Common # Lm MODIFIER LETTER HIGH AND LOW VERTICAL LINE
 20A0..20C3 ; Common # Sc [36] EURO-CURRENCY SIGN..UAE DIRHAM SIGN
 2100..2101 ; Common # So [2] ACCOUNT OF..ADDRESSED TO THE SUBJECT
 2102 ; Common # L& DOUBLE-STRUCK CAPITAL C
@@ -638,7 +639,7 @@ FFFC..FFFD ; Common # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHAR
 E0001 ; Common # Cf LANGUAGE TAG
 E0020..E007F ; Common # Cf [96] TAG SPACE..CANCEL TAG

-# Total code points: 9141
+# Total code points: 9142

 # ================================================

unicodetools/data/ucd/dev/UnicodeData.txt

Lines changed: 1 addition & 0 deletions
@@ -7541,6 +7541,7 @@
 208C;SUBSCRIPT EQUALS SIGN;Sm;0;ON;<sub> 003D;;;;N;;;;;
 208D;SUBSCRIPT LEFT PARENTHESIS;Ps;0;ON;<sub> 0028;;;;Y;SUBSCRIPT OPENING PARENTHESIS;;;;
 208E;SUBSCRIPT RIGHT PARENTHESIS;Pe;0;ON;<sub> 0029;;;;Y;SUBSCRIPT CLOSING PARENTHESIS;;;;
+208F;MODIFIER LETTER HIGH AND LOW VERTICAL LINE;Lm;0;ON;;;;;N;;;;;
 2090;LATIN SUBSCRIPT SMALL LETTER A;Lm;0;L;<sub> 0061;;;;N;;;;;
 2091;LATIN SUBSCRIPT SMALL LETTER E;Lm;0;L;<sub> 0065;;;;N;;;;;
 2092;LATIN SUBSCRIPT SMALL LETTER O;Lm;0;L;<sub> 006F;;;;N;;;;;
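
The single line added to UnicodeData.txt is the source from which the derived files in this commit are regenerated. As an illustration only, here is a small Python sketch that splits that line into a few of its semicolon-separated fields. The `UnicodeDataRecord` type and `parse_unicode_data_line` function are hypothetical names introduced for this example, and only five of the 15 fields are surfaced.

```python
from typing import NamedTuple


class UnicodeDataRecord(NamedTuple):
    """A subset of the 15 UnicodeData.txt fields (illustrative only)."""
    code_point: int
    name: str
    general_category: str
    bidi_class: str
    mirrored: bool


def parse_unicode_data_line(line: str) -> UnicodeDataRecord:
    """Parse one UnicodeData.txt line; raises ValueError on a bad field count."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    return UnicodeDataRecord(
        code_point=int(fields[0], 16),
        name=fields[1],
        general_category=fields[2],
        bidi_class=fields[4],
        mirrored=fields[9] == "Y",  # field 9 is Bidi_Mirrored (Y/N)
    )


record = parse_unicode_data_line(
    "208F;MODIFIER LETTER HIGH AND LOW VERTICAL LINE;Lm;0;ON;;;;;N;;;;;"
)
print(f"U+{record.code_point:04X} gc={record.general_category} bc={record.bidi_class}")
```

The parsed general category is `Lm`, consistent with the "Change proposed Sk to Lm" step in the commit message and with the `Lm` comments in the derived files.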

unicodetools/data/ucd/dev/VerticalOrientation.txt

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 # VerticalOrientation-18.0.0.txt
-# Date: 2025-11-27, 17:34:06 GMT
+# Date: 2025-11-27, 17:50:18 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -903,7 +903,7 @@
 208A..208C ; R # Sm [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
 208D ; R # Ps SUBSCRIPT LEFT PARENTHESIS
 208E ; R # Pe SUBSCRIPT RIGHT PARENTHESIS
-2090..209F ; R # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; R # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 20A0..20C3 ; R # Sc [36] EURO-CURRENCY SIGN..UAE DIRHAM SIGN
 20D0..20DC ; R # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
 20DD..20E0 ; U # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH

unicodetools/data/ucd/dev/auxiliary/SentenceBreakProperty.txt

Lines changed: 3 additions & 2 deletions
@@ -1,5 +1,5 @@
 # SentenceBreakProperty-18.0.0.txt
-# Date: 2025-11-27, 17:34:05 GMT
+# Date: 2025-11-27, 17:50:16 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2237,6 +2237,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1CEE..1CF3 ; OLetter # Lo [6] VEDIC SIGN HEXIFORM LONG ANUSVARA..VEDIC SIGN ROTATED ARDHAVISARGA
 1CF5..1CF6 ; OLetter # Lo [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA
 1CFA ; OLetter # Lo VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
+208F ; OLetter # Lm MODIFIER LETTER HIGH AND LOW VERTICAL LINE
 2135..2138 ; OLetter # Lo [4] ALEF SYMBOL..DALET SYMBOL
 2180..2182 ; OLetter # Nl [3] ROMAN NUMERAL ONE THOUSAND C D..ROMAN NUMERAL TEN THOUSAND
 2185..2188 ; OLetter # Nl [4] ROMAN NUMERAL SIX LATE FORM..ROMAN NUMERAL ONE HUNDRED THOUSAND
@@ -2637,7 +2638,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 31350..33479 ; OLetter # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; OLetter # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 153211
+# Total code points: 153212

 # ================================================

unicodetools/data/ucd/dev/auxiliary/WordBreakProperty.txt

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
 # WordBreakProperty-18.0.0.txt
-# Date: 2025-11-27, 17:34:07 GMT
+# Date: 2025-11-27, 17:50:18 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -945,7 +945,7 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK
 1FF6..1FFC ; ALetter # L& [7] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
 2071 ; ALetter # Lm SUPERSCRIPT LATIN SMALL LETTER I
 207F ; ALetter # Lm SUPERSCRIPT LATIN SMALL LETTER N
-2090..209F ; ALetter # Lm [16] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER Z
+208F..209F ; ALetter # Lm [17] MODIFIER LETTER HIGH AND LOW VERTICAL LINE..LATIN SUBSCRIPT SMALL LETTER Z
 2102 ; ALetter # L& DOUBLE-STRUCK CAPITAL C
 2107 ; ALetter # L& EULER CONSTANT
 210A..2113 ; ALetter # L& [10] SCRIPT SMALL G..SCRIPT SMALL L
@@ -1388,7 +1388,7 @@ FFDA..FFDC ; ALetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 1F150..1F169 ; ALetter # So [26] NEGATIVE CIRCLED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z
 1F170..1F189 ; ALetter # So [26] NEGATIVE SQUARED LATIN CAPITAL LETTER A..NEGATIVE SQUARED LATIN CAPITAL LETTER Z

-# Total code points: 34456
+# Total code points: 34457

 # ================================================
