
Commit daed3b5

InSC and InPC changes for 17.0β (#1130)
* UTC-181-A111 Propose text for comments on IndicSyllabicCategory.txt in the description of InSC=Consonant and InSC=Consonant_Placeholder to clarify the classification of the group of characters mentioned in L2/24-203, including a specific mention of U+0F68 TIBETAN LETTER A, for Unicode Version 17.0. [Ref. Section 4.6 of document L2/24-228]
* Regenerate UCD
* UTC-181-A110 Change the Indic_Syllabic_Category of U+11A50 SOYOMBO LETTER A and U+11A00 ZANABAZAR SQUARE LETTER A from Vowel_Independent to Consonant, and that of U+1900 LIMBU VOWEL-CARRIER LETTER from Consonant_Placeholder to Consonant, for Unicode Version 17.0. [Ref. Section 4.6 of document L2/24-228]
* UTC-183-A33 Change the Indic_Syllabic_Category for U+11A3A ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA to Consonant_With_Stacker (and remove it from IndicPositionalCategory.txt) and for U+11A86 SOYOMBO CLUSTER-INITIAL LETTER RA to Consonant_Preceding_Repha, as described in L2/25-119, for Unicode Version 17.0. [Ref. 3.2 in L2/25-091R]
* Regenerate UCD
1 parent 5237754 commit daed3b5

4 files changed, +30 -13 lines changed


unicodetools/data/ucd/dev/IndicPositionalCategory.txt

Lines changed: 0 additions & 1 deletion
@@ -591,7 +591,6 @@ ABE5 ; Top # Mn MEETEI MAYEK VOWEL SIGN ANAP
 11A01 ; Top # Mn ZANABAZAR SQUARE VOWEL SIGN I
 11A04..11A09 ; Top # Mn [6] ZANABAZAR SQUARE VOWEL SIGN E..ZANABAZAR SQUARE VOWEL SIGN REVERSED I
 11A35..11A38 ; Top # Mn [4] ZANABAZAR SQUARE SIGN CANDRABINDU..ZANABAZAR SQUARE SIGN ANUSVARA
-11A3A ; Top # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
 11A51 ; Top # Mn SOYOMBO VOWEL SIGN I
 11A54..11A56 ; Top # Mn [3] SOYOMBO VOWEL SIGN E..SOYOMBO VOWEL SIGN OE
 11A84..11A89 ; Top # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
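
For readers checking these data lines by script, the following is a minimal sketch of a parser for the `code point or range ; value # comment` layout visible in the hunk above. The names `Record` and `parse_line` are hypothetical and are not part of unicodetools; the regular expression only covers the line shapes shown in this diff.

```python
import re
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """One data line: an inclusive code point range and its property value."""
    first: int
    last: int
    value: str


# Hypothetical pattern: "XXXX ; Value" or "XXXX..YYYY ; Value".
_LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse_line(line: str) -> Optional[Record]:
    """Parse one line; return None for blank or comment-only lines."""
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data:
        return None
    match = _LINE.match(data)
    if match is None:
        raise ValueError(f"unrecognized data line: {line!r}")
    first = int(match.group(1), 16)
    # A single code point is treated as a range of length one.
    last = int(match.group(2), 16) if match.group(2) else first
    return Record(first, last, match.group(3))


removed = parse_line("11A3A ; Top # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA")
kept = parse_line("11A04..11A09 ; Top # Mn [6] ZANABAZAR SQUARE VOWEL SIGN E..ZANABAZAR SQUARE VOWEL SIGN REVERSED I")
```

Here `removed` corresponds to the one line this hunk deletes from IndicPositionalCategory.txt.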

unicodetools/data/ucd/dev/IndicSyllabicCategory.txt

Lines changed: 18 additions & 8 deletions
@@ -1,5 +1,5 @@
 # IndicSyllabicCategory-17.0.0.txt
-# Date: 2025-01-27, 18:09:16 GMT
+# Date: 2025-05-08, 22:20:16 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -473,8 +473,6 @@ ABD1 ; Vowel_Independent # Lo MEETEI MAYEK LETTER ATIYA
 11909 ; Vowel_Independent # Lo DIVES AKURU LETTER O
 119A0..119A7 ; Vowel_Independent # Lo [8] NANDINAGARI LETTER A..NANDINAGARI LETTER VOCALIC RR
 119AA..119AD ; Vowel_Independent # Lo [4] NANDINAGARI LETTER E..NANDINAGARI LETTER AU
-11A00 ; Vowel_Independent # Lo ZANABAZAR SQUARE LETTER A
-11A50 ; Vowel_Independent # Lo SOYOMBO LETTER A
 11C00..11C08 ; Vowel_Independent # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
 11C0A..11C0D ; Vowel_Independent # Lo [4] BHAIKSUKI LETTER E..BHAIKSUKI LETTER AU
 11D00..11D06 ; Vowel_Independent # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
@@ -791,6 +789,8 @@ A926..A92A ; Vowel # Mn [5] KAYAH LI VOWEL UE..KAYAH LI VOWEL O
 # Indic script layout (NBSP and dotted circle), as well as a few script-
 # specific vowel-holder characters which are not technically
 # consonants, but serve instead as bases for placement of vowel marks.
+# Vowel carriers that are null consonants instead have the
+# Indic_Syllabic_Category Consonant.

 # [Not derivable]

@@ -801,7 +801,6 @@ A926..A92A ; Vowel # Mn [5] KAYAH LI VOWEL UE..KAYAH LI VOWEL O
 0A72..0A73 ; Consonant_Placeholder # Lo [2] GURMUKHI IRI..GURMUKHI URA
 104B ; Consonant_Placeholder # Po MYANMAR SIGN SECTION
 104E ; Consonant_Placeholder # Po MYANMAR SYMBOL AFOREMENTIONED
-1900 ; Consonant_Placeholder # Lo LIMBU VOWEL-CARRIER LETTER
 1CFA ; Consonant_Placeholder # Lo VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
 2010..2014 ; Consonant_Placeholder # Pd [5] HYPHEN..EM DASH
 25CC ; Consonant_Placeholder # So DOTTED CIRCLE
@@ -814,7 +813,14 @@ AA74..AA76 ; Consonant_Placeholder # Lo [3] MYANMAR LOGOGRAM KHAMTI OAY..MY

 # Indic_Syllabic_Category=Consonant

-# Consonant (ordinary abugida consonants, with inherent vowels)
+# Consonant
+# This includes ordinary abugida consonants with inherent vowels.
+# In scripts that do not have distinct independent vowel characters, but instead
+# form independent vowels by adding dependent vowels to a vowel carrier which
+# otherwise represents the inherent vowel, that vowel carrier has the
+# Indic_Syllabic_Category Consonant, as a null consonant. Such vowel carriers
+# can often also be analyzed as glottal stops with inherent vowels.
+# An example is U+0F68 ཨ TIBETAN LETTER A.

 # [Not derivable]

@@ -893,7 +899,7 @@ AA74..AA76 ; Consonant_Placeholder # Lo [3] MYANMAR LOGOGRAM KHAMTI OAY..MY
 1763..176C ; Consonant # Lo [10] TAGBANWA LETTER KA..TAGBANWA LETTER YA
 176E..1770 ; Consonant # Lo [3] TAGBANWA LETTER LA..TAGBANWA LETTER SA
 1780..17A2 ; Consonant # Lo [35] KHMER LETTER KA..KHMER LETTER QA
-1901..191E ; Consonant # Lo [30] LIMBU LETTER KA..LIMBU LETTER TRA
+1900..191E ; Consonant # Lo [31] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER TRA
 1950..1962 ; Consonant # Lo [19] TAI LE LETTER KA..TAI LE LETTER NA
 1980..19AB ; Consonant # Lo [44] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW SUA
 1A00..1A16 ; Consonant # Lo [23] BUGINESE LETTER KA..BUGINESE LETTER HA
@@ -970,7 +976,9 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
 11915..11916 ; Consonant # Lo [2] DIVES AKURU LETTER NYA..DIVES AKURU LETTER TTA
 11918..1192F ; Consonant # Lo [24] DIVES AKURU LETTER DDA..DIVES AKURU LETTER ZA
 119AE..119D0 ; Consonant # Lo [35] NANDINAGARI LETTER KA..NANDINAGARI LETTER RRA
+11A00 ; Consonant # Lo ZANABAZAR SQUARE LETTER A
 11A0B..11A32 ; Consonant # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
+11A50 ; Consonant # Lo SOYOMBO LETTER A
 11A5C..11A83 ; Consonant # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
 11C0E..11C2E ; Consonant # Lo [33] BHAIKSUKI LETTER KA..BHAIKSUKI LETTER HA
 11C72..11C8F ; Consonant # Lo [30] MARCHEN LETTER KA..MARCHEN LETTER A
@@ -1016,6 +1024,7 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
 1CF5..1CF6 ; Consonant_With_Stacker # Lo [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA
 11003..11004 ; Consonant_With_Stacker # Lo [2] BRAHMI SIGN JIHVAMULIYA..BRAHMI SIGN UPADHMANIYA
 11460..11461 ; Consonant_With_Stacker # Lo [2] NEWA SIGN JIHVAMULIYA..NEWA SIGN UPADHMANIYA
+11A3A ; Consonant_With_Stacker # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA

 # ================================================

@@ -1027,8 +1036,8 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE

 111C2..111C3 ; Consonant_Prefixed # Lo [2] SHARADA SIGN JIHVAMULIYA..SHARADA SIGN UPADHMANIYA
 1193F ; Consonant_Prefixed # Lo DIVES AKURU PREFIXED NASAL SIGN
-11A3A ; Consonant_Prefixed # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
-11A84..11A89 ; Consonant_Prefixed # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
+11A84..11A85 ; Consonant_Prefixed # Lo [2] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO SIGN UPADHMANIYA
+11A87..11A89 ; Consonant_Prefixed # Lo [3] SOYOMBO CLUSTER-INITIAL LETTER LA..SOYOMBO CLUSTER-INITIAL LETTER SA

 # ================================================

@@ -1042,6 +1051,7 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
 0D4E ; Consonant_Preceding_Repha # Lo MALAYALAM LETTER DOT REPH
 113D1 ; Consonant_Preceding_Repha # Lo TULU-TIGALARI REPHA
 11941 ; Consonant_Preceding_Repha # Lo DIVES AKURU INITIAL RA
+11A86 ; Consonant_Preceding_Repha # Lo SOYOMBO CLUSTER-INITIAL LETTER RA
 11D46 ; Consonant_Preceding_Repha # Lo MASARAM GONDI REPHA
 11F02 ; Consonant_Preceding_Repha # Lo KAWI SIGN REPHA
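
The range edits in this file follow mechanically from moving single code points between values: U+1900 joins 1901..191E under Consonant, and taking U+11A86 out of Consonant_Prefixed splits 11A84..11A89 in two. The sketch below illustrates that bookkeeping; `coalesce` and `fmt` are hypothetical helpers written for this note, not the routine unicodetools actually uses to regenerate the UCD.

```python
from typing import Iterable, List, Tuple


def coalesce(code_points: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse a set of code points into sorted inclusive ranges."""
    ranges: List[Tuple[int, int]] = []
    for cp in sorted(set(code_points)):
        if ranges and cp == ranges[-1][1] + 1:
            # Adjacent to the previous range: extend it by one.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            ranges.append((cp, cp))
    return ranges


def fmt(first: int, last: int) -> str:
    """Format a range roughly as it appears in the data file, with its count."""
    if first == last:
        return f"{first:04X}"
    return f"{first:04X}..{last:04X} [{last - first + 1}]"


# U+1900 moves from Consonant_Placeholder to Consonant and merges with 1901..191E.
limbu = set(range(0x1901, 0x191E + 1)) | {0x1900}
limbu_lines = [fmt(*r) for r in coalesce(limbu)]

# U+11A86 leaves Consonant_Prefixed, which splits 11A84..11A89.
prefixed = set(range(0x11A84, 0x11A89 + 1)) - {0x11A86}
prefixed_lines = [fmt(*r) for r in coalesce(prefixed)]
```

The resulting counts agree with the `[31]`, `[2]`, and `[3]` annotations in the added lines above.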

unicodetools/data/ucd/dev/auxiliary/GraphemeBreakProperty.txt

Lines changed: 2 additions & 3 deletions
@@ -1,5 +1,5 @@
 # GraphemeBreakProperty-17.0.0.txt
-# Date: 2025-01-27, 18:09:16 GMT
+# Date: 2025-05-08, 22:20:13 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -30,12 +30,11 @@
 113D1 ; Prepend # Lo TULU-TIGALARI REPHA
 1193F ; Prepend # Lo DIVES AKURU PREFIXED NASAL SIGN
 11941 ; Prepend # Lo DIVES AKURU INITIAL RA
-11A3A ; Prepend # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
 11A84..11A89 ; Prepend # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
 11D46 ; Prepend # Lo MASARAM GONDI REPHA
 11F02 ; Prepend # Lo KAWI SIGN REPHA

-# Total code points: 28
+# Total code points: 27

 # ================================================

unicodetools/src/main/resources/org/unicode/text/UCD/MakeUnicodeFiles.txt

Lines changed: 10 additions & 1 deletion
@@ -1168,10 +1168,19 @@ Value: Consonant_Placeholder
 # Indic script layout (NBSP and dotted circle), as well as a few script-
 # specific vowel-holder characters which are not technically
 # consonants, but serve instead as bases for placement of vowel marks.
+# Vowel carriers that are null consonants instead have the
+# Indic_Syllabic_Category Consonant.

 # [Not derivable]
 Value: Consonant
-# Consonant (ordinary abugida consonants, with inherent vowels)
+# Consonant
+# This includes ordinary abugida consonants with inherent vowels.
+# In scripts that do not have distinct independent vowel characters, but instead
+# form independent vowels by adding dependent vowels to a vowel carrier which
+# otherwise represents the inherent vowel, that vowel carrier has the
+# Indic_Syllabic_Category Consonant, as a null consonant. Such vowel carriers
+# can often also be analyzed as glottal stops with inherent vowels.
+# An example is U+0F68 ཨ TIBETAN LETTER A.

 # [Not derivable]
 Value: Consonant_Dead
