Commit bf4acfc

Add five CJK confusables pairs from UTC 185 Action Items (#1236)
[185-A37] Action Item for Josh Hadley, Roozbeh Pournader, PAG: Add U+5BC9 and U+96BA as a new confusable pair, based on feedback [ID20250924032615] in document L2/25-220 and Section 09 of document L2/25-231, for a future version of the standard.

[185-A46] Action Item for Josh Hadley, Roozbeh Pournader, PAG: Add U+8EB1 and U+8EB2 as a new confusable pair, based on feedback [ID20251007094419] in document L2/25-220 and Section 13 of document L2/25-231, for a future version of the standard.

[185-A49] Action Item for Josh Hadley, Roozbeh Pournader, PAG: Add U+514C and U+5151 as a new confusable pair, based on feedback [ID20251007134830] in document L2/25-220 and Section 14 of document L2/25-231, for a future version of the standard.

[185-A68] Action Item for Josh Hadley, Roozbeh Pournader, PAG: Add U+980B and U+2EA07 as a new confusable pair, based on Recommendation M65.05 in document IRG N2826 and Section 25 of document L2/25-231, for a future version of the standard.

[185-A95] Action Item for Josh Hadley, Roozbeh Pournader, PAG: Add U+2EDB5 and U+32A8F as a new confusable pair, based on the kSpoofingVariant two additions specified in the table in Section 38 of document L2/25-231, for a future version of the standard.
1 parent 9fbb3dd commit bf4acfc

4 files changed: +49 -6 lines changed
unicodetools/data/security/dev/confusables.txt

Lines changed: 11 additions & 2 deletions
@@ -1,5 +1,5 @@
 # confusables.txt
-# Date: 2025-10-25, 07:52:31 GMT
+# Date: 2025-11-12, 00:37:27 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -7550,6 +7550,8 @@ FA0C ; 5140 ; MA # ( 兀 → 兀 ) CJK COMPATIBILITY IDEOGRAPH-FA0C → CJK UNIF
 
 FA74 ; 5145 ; MA # ( 充 → 充 ) CJK COMPATIBILITY IDEOGRAPH-FA74 → CJK UNIFIED IDEOGRAPH-5145 #
 
+5151 ; 514C ; MA # ( 兑 → 兌 ) CJK UNIFIED IDEOGRAPH-5151 → CJK UNIFIED IDEOGRAPH-514C #
+
 FA32 ; 514D ; MA # ( 免 → 免 ) CJK COMPATIBILITY IDEOGRAPH-FA32 → CJK UNIFIED IDEOGRAPH-514D #
 2F80E ; 514D ; MA # ( 免 → 免 ) CJK COMPATIBILITY IDEOGRAPH-2F80E → CJK UNIFIED IDEOGRAPH-514D #
 
@@ -7950,6 +7952,8 @@ FA04 ; 5B85 ; MA # ( 宅 → 宅 ) CJK COMPATIBILITY IDEOGRAPH-FA04 → CJK UNIF
 
 2F86D ; 5BC3 ; MA # ( 寃 → 寃 ) CJK COMPATIBILITY IDEOGRAPH-2F86D → CJK UNIFIED IDEOGRAPH-5BC3 #
 
+96BA ; 5BC9 ; MA # ( 隺 → 寉 ) CJK UNIFIED IDEOGRAPH-96BA → CJK UNIFIED IDEOGRAPH-5BC9 #
+
 2F86E ; 5BD8 ; MA # ( 寘 → 寘 ) CJK COMPATIBILITY IDEOGRAPH-2F86E → CJK UNIFIED IDEOGRAPH-5BD8 #
 
 F95F ; 5BE7 ; MA # ( 寧 → 寧 ) CJK COMPATIBILITY IDEOGRAPH-F95F → CJK UNIFIED IDEOGRAPH-5BE7 #
@@ -9341,6 +9345,8 @@ F9C2 ; 84FC ; MA # ( 蓼 → 蓼 ) CJK COMPATIBILITY IDEOGRAPH-F9C2 → CJK UNIF
 
 2F9AC ; 8564 ; MA # ( 蕤 → 蕤 ) CJK COMPATIBILITY IDEOGRAPH-2F9AC → CJK UNIFIED IDEOGRAPH-8564 #
 
+32A8F ; 2EDB5 ; MA # ( 𲪏 → 𮶵 ) CJK UNIFIED IDEOGRAPH-32A8F → CJK UNIFIED IDEOGRAPH-2EDB5 #
+
 2F9AD ; 26F2C ; MA # ( 𦼬 → 𦼬 ) CJK COMPATIBILITY IDEOGRAPH-2F9AD → CJK UNIFIED IDEOGRAPH-26F2C #
 
 F923 ; 85CD ; MA # ( 藍 → 藍 ) CJK COMPATIBILITY IDEOGRAPH-F923 → CJK UNIFIED IDEOGRAPH-85CD #
@@ -9581,6 +9587,8 @@ F937 ; 8DEF ; MA # ( 路 → 路 ) CJK COMPATIBILITY IDEOGRAPH-F937 → CJK UNIF
 
 2F9D ; 8EAB ; MA #* ( ⾝ → 身 ) KANGXI RADICAL BODY → CJK UNIFIED IDEOGRAPH-8EAB #
 
+8EB2 ; 8EB1 ; MA # ( 躲 → 躱 ) CJK UNIFIED IDEOGRAPH-8EB2 → CJK UNIFIED IDEOGRAPH-8EB1 #
+
 F902 ; 8ECA ; MA # ( 車 → 車 ) CJK COMPATIBILITY IDEOGRAPH-F902 → CJK UNIFIED IDEOGRAPH-8ECA #
 2F9E ; 8ECA ; MA #* ( ⾞ → 車 ) KANGXI RADICAL CART → CJK UNIFIED IDEOGRAPH-8ECA #
 
@@ -9810,6 +9818,7 @@ FACA ; 97FF ; MA # ( 響 → 響 ) CJK COMPATIBILITY IDEOGRAPH-FACA → CJK UNIF
 FACB ; 980B ; MA # ( 頋 → 頋 ) CJK COMPATIBILITY IDEOGRAPH-FACB → CJK UNIFIED IDEOGRAPH-980B #
 2F9FE ; 980B ; MA # ( 頋 → 頋 ) CJK COMPATIBILITY IDEOGRAPH-2F9FE → CJK UNIFIED IDEOGRAPH-980B #
 2F9FF ; 980B ; MA # ( 頋 → 頋 ) CJK COMPATIBILITY IDEOGRAPH-2F9FF → CJK UNIFIED IDEOGRAPH-980B #
+2EA07 ; 980B ; MA # ( 𮨇 → 頋 ) CJK UNIFIED IDEOGRAPH-2EA07 → CJK UNIFIED IDEOGRAPH-980B #
 
 F9B4 ; 9818 ; MA # ( 領 → 領 ) CJK COMPATIBILITY IDEOGRAPH-F9B4 → CJK UNIFIED IDEOGRAPH-9818 #
 
@@ -10014,5 +10023,5 @@ FACE ; 9F9C ; MA # ( 龜 → 龜 ) CJK COMPATIBILITY IDEOGRAPH-FACE → CJK UNIF
 
 2FD5 ; 9FA0 ; MA #* ( ⿕ → 龠 ) KANGXI RADICAL FLUTE → CJK UNIFIED IDEOGRAPH-9FA0 #
 
-# total: 6605
+# total: 6610
 
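
The entries added to confusables.txt follow the `source ; target ; type # comment` layout visible in the diff. As a rough illustration of how a consumer might read such lines, here is a minimal Python sketch. It is not the Unicode tools' own code: the names `hex_to_str`, `parse_confusables_line`, `build_map`, and `simple_skeleton` are hypothetical, the comments in the sample lines are abbreviated, and the per-character mapping is a simplification that skips the normalization steps of the full skeleton algorithm in UTS #39.

```python
# The five mappings added in this commit, transcribed from the diff
# (trailing comments shortened; the parser ignores them anyway).
ADDED_LINES = [
    "5151 ; 514C ; MA # CJK UNIFIED IDEOGRAPH-5151 -> CJK UNIFIED IDEOGRAPH-514C",
    "96BA ; 5BC9 ; MA # CJK UNIFIED IDEOGRAPH-96BA -> CJK UNIFIED IDEOGRAPH-5BC9",
    "32A8F ; 2EDB5 ; MA # CJK UNIFIED IDEOGRAPH-32A8F -> CJK UNIFIED IDEOGRAPH-2EDB5",
    "8EB2 ; 8EB1 ; MA # CJK UNIFIED IDEOGRAPH-8EB2 -> CJK UNIFIED IDEOGRAPH-8EB1",
    "2EA07 ; 980B ; MA # CJK UNIFIED IDEOGRAPH-2EA07 -> CJK UNIFIED IDEOGRAPH-980B",
]


def hex_to_str(field):
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def parse_confusables_line(line):
    """Return (source, target, type), or None for blank and comment-only lines."""
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    fields = [f.strip() for f in data.split(";")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    source, target, kind = fields
    return hex_to_str(source), hex_to_str(target), kind


def build_map(lines):
    """Collect source -> target mappings from an iterable of lines."""
    mapping = {}
    for line in lines:
        entry = parse_confusables_line(line)
        if entry is not None:
            source, target, _kind = entry
            mapping[source] = target
    return mapping


def simple_skeleton(text, mapping):
    """Simplified: map each character to its target, with no normalization."""
    return "".join(mapping.get(ch, ch) for ch in text)


CONFUSABLES = build_map(ADDED_LINES)
```

With only these five lines loaded, `simple_skeleton("\u5151", CONFUSABLES)` and `simple_skeleton("\u514C", CONFUSABLES)` produce the same string, which is the property a confusability check relies on.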

unicodetools/data/security/dev/confusablesSummary.txt

Lines changed: 20 additions & 3 deletions
@@ -1,5 +1,5 @@
 # confusablesSummary.txt
-# Date: 2025-10-25, 07:52:31 GMT
+# Date: 2025-11-12, 00:37:27 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -14197,6 +14197,10 @@
 (‎ 充 ‎) 5145 CJK UNIFIED IDEOGRAPH-5145
 ← (‎ 充 ‎) FA74 CJK COMPATIBILITY IDEOGRAPH-FA74
 
+# 兌 兑
+(‎ 兌 ‎) 514C CJK UNIFIED IDEOGRAPH-514C
+← (‎ 兑 ‎) 5151 CJK UNIFIED IDEOGRAPH-5151
+
 # 免 免 免
 (‎ 免 ‎) 514D CJK UNIFIED IDEOGRAPH-514D
 ← (‎ 免 ‎) FA32 CJK COMPATIBILITY IDEOGRAPH-FA32
@@ -14734,6 +14738,10 @@
 (‎ 寃 ‎) 5BC3 CJK UNIFIED IDEOGRAPH-5BC3
 ← (‎ 寃 ‎) 2F86D CJK COMPATIBILITY IDEOGRAPH-2F86D
 
+# 寉 隺
+(‎ 寉 ‎) 5BC9 CJK UNIFIED IDEOGRAPH-5BC9
+← (‎ 隺 ‎) 96BA CJK UNIFIED IDEOGRAPH-96BA
+
 # 寘 寘
 (‎ 寘 ‎) 5BD8 CJK UNIFIED IDEOGRAPH-5BD8
 ← (‎ 寘 ‎) 2F86E CJK COMPATIBILITY IDEOGRAPH-2F86E
@@ -16715,6 +16723,10 @@
 (‎ 躗 ‎) 8E97 CJK UNIFIED IDEOGRAPH-8E97
 ← (‎ 躛 ‎) 8E9B CJK UNIFIED IDEOGRAPH-8E9B
 
+# 躱 躲
+(‎ 躱 ‎) 8EB1 CJK UNIFIED IDEOGRAPH-8EB1
+← (‎ 躲 ‎) 8EB2 CJK UNIFIED IDEOGRAPH-8EB2
+
 # 軔 軔
 (‎ 軔 ‎) 8ED4 CJK UNIFIED IDEOGRAPH-8ED4
 ← (‎ 軔 ‎) 2F9DE CJK COMPATIBILITY IDEOGRAPH-2F9DE
@@ -16956,8 +16968,9 @@
 ← (‎ 響 ‎) FA69 CJK COMPATIBILITY IDEOGRAPH-FA69
 ← (‎ 響 ‎) FACA CJK COMPATIBILITY IDEOGRAPH-FACA
 
-# 頋 頋 頋 頋
+# 頋 𮨇 頋 頋 頋
 (‎ 頋 ‎) 980B CJK UNIFIED IDEOGRAPH-980B
+← (‎ 𮨇 ‎) 2EA07 CJK UNIFIED IDEOGRAPH-2EA07
 ← (‎ 頋 ‎) FACB CJK COMPATIBILITY IDEOGRAPH-FACB
 ← (‎ 頋 ‎) 2F9FE CJK COMPATIBILITY IDEOGRAPH-2F9FE
 ← (‎ 頋 ‎) 2F9FF CJK COMPATIBILITY IDEOGRAPH-2F9FF
@@ -17872,5 +17885,9 @@
 (‎ 𪘀 ‎) 2A600 CJK UNIFIED IDEOGRAPH-2A600
 ← (‎ 𪘀 ‎) 2FA1D CJK COMPATIBILITY IDEOGRAPH-2FA1D
 
-# total : 7659
+# 𮶵 𲪏
+(‎ 𮶵 ‎) 2EDB5 CJK UNIFIED IDEOGRAPH-2EDB5
+← (‎ 𲪏 ‎) 32A8F CJK UNIFIED IDEOGRAPH-32A8F
+
+# total : 7664
 
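
In the confusablesSummary.txt hunks shown here, each group opens with a `#` header listing its characters, followed by the target line and then one `←` line per character that maps to it. A rough Python sketch of reading such groups follows. `parse_summary_groups` is a hypothetical helper, and the regular expression assumes the code point is the first hexadecimal token after a closing parenthesis and that every entry is a single code point. Both assumptions hold for the lines in this diff but are not taken from any specification of the file.

```python
import re

# Matches ") 980B CJK UNIFIED IDEOGRAPH-980B": a closing parenthesis,
# whitespace, one hex code point, whitespace, then the character name.
ENTRY = re.compile(r"\)\s+([0-9A-F]{4,6})\s+(\S.*)$")

LEFT_ARROW = "\u2190"  # marks a character that maps to the group's target


def parse_summary_groups(lines):
    """Return {target code point: [code points that map to it]}."""
    groups = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment/header lines carry no entry
        match = ENTRY.search(line)
        if match is None:
            continue  # not an entry line in the assumed layout
        code_point = int(match.group(1), 16)
        if line.startswith(LEFT_ARROW):
            if current is None:
                raise ValueError(f"mapping line before any target: {raw!r}")
            groups[current].append(code_point)
        else:
            current = code_point
            groups[current] = []
    return groups


# The U+980B group after this commit, written with escapes so the
# compatibility ideographs and left-to-right marks survive copy and paste.
SAMPLE = [
    "# (group header)",
    "(\u200e \u980b \u200e) 980B CJK UNIFIED IDEOGRAPH-980B",
    "\u2190 (\u200e \U0002EA07 \u200e) 2EA07 CJK UNIFIED IDEOGRAPH-2EA07",
    "\u2190 (\u200e \ufacb \u200e) FACB CJK COMPATIBILITY IDEOGRAPH-FACB",
    "\u2190 (\u200e \U0002F9FE \u200e) 2F9FE CJK COMPATIBILITY IDEOGRAPH-2F9FE",
    "\u2190 (\u200e \U0002F9FF \u200e) 2F9FF CJK COMPATIBILITY IDEOGRAPH-2F9FF",
    "",
    "# total : 7664",
]

GROUPS = parse_summary_groups(SAMPLE)
```

For the sample above, `GROUPS` has a single key, U+980B, whose list contains the newly added U+2EA07 alongside the three existing compatibility ideographs.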

unicodetools/data/security/dev/data/source/confusables-source.txt

Lines changed: 7 additions & 0 deletions
@@ -5811,3 +5811,10 @@ A8CF ; 007C 007C # SAURASHTRA DOUBLE DANDA
 17C4 ; 17C1 17B6
 17C7 ; 0983
 11303 ; 0983
+
+# CJK confusables from UTC #185 Action Items
+5BC9 ; 96BA
+8EB1 ; 8EB2
+514C ; 5151
+980B ; 2EA07
+2EDB5 ; 32A8F
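
The five source pairs can be cross-checked against the generated confusables.txt entries in this commit. The sketch below is one way to do that in Python. All data is transcribed from the diff, and the helper name `unordered` is hypothetical. In these five cases the first code point of each source pair ends up as the mapping target, but that is only an observation about this diff, not a documented rule of the generator.

```python
# Pairs as listed in confusables-source.txt (first ; second).
SOURCE_PAIRS = [
    (0x5BC9, 0x96BA),
    (0x8EB1, 0x8EB2),
    (0x514C, 0x5151),
    (0x980B, 0x2EA07),
    (0x2EDB5, 0x32A8F),
]

# Mappings added to confusables.txt in the same commit (source -> target).
GENERATED = {
    0x5151: 0x514C,
    0x96BA: 0x5BC9,
    0x32A8F: 0x2EDB5,
    0x8EB2: 0x8EB1,
    0x2EA07: 0x980B,
}


def unordered(pairs):
    """Compare pairs without regard to which side is source or target."""
    return {frozenset(pair) for pair in pairs}


# Source pairs with no corresponding generated mapping (expected: none).
missing = unordered(SOURCE_PAIRS) - unordered(GENERATED.items())

# True if, for every source pair, the second code point maps to the first.
first_is_target = all(
    GENERATED.get(second) == first for first, second in SOURCE_PAIRS
)

# Totals from the "# total:" footer of confusables.txt before and after.
old_total, new_total = 6605, 6610
added = new_total - old_total
```

The footer totals of confusablesSummary.txt move by the same amount, from 7659 to 7664.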

unicodetools/data/security/dev/data/source/formatted-source.txt

Lines changed: 11 additions & 1 deletion
@@ -1,5 +1,5 @@
 # formatted-source.txt
-# Date: 2025-10-25, 07:52:30 GMT
+# Date: 2025-11-12, 00:37:25 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -4400,6 +4400,8 @@
 513F ; 16FF2 # ( 儿 ~ 𖿲 ) CJK UNIFIED IDEOGRAPH-513F ~ CHINESE SMALL SIMPLIFIED ER
 513F ; 3126 # ( 儿 ~ ㄦ ) CJK UNIFIED IDEOGRAPH-513F ~ BOPOMOFO LETTER ER
 
+514C ; 5151 # ( 兌 ~ 兑 ) CJK UNIFIED IDEOGRAPH-514C ~ CJK UNIFIED IDEOGRAPH-5151
+
 5553 ; 555F # ( 啓 ~ 啟 ) CJK UNIFIED IDEOGRAPH-5553 ~ CJK UNIFIED IDEOGRAPH-555F
 
 5861 ; 586B # ( 塡 ~ 填 ) CJK UNIFIED IDEOGRAPH-5861 ~ CJK UNIFIED IDEOGRAPH-586B
@@ -4408,6 +4410,8 @@
 
 5AAF ; 5B00 # ( 媯 ~ 嬀 ) CJK UNIFIED IDEOGRAPH-5AAF ~ CJK UNIFIED IDEOGRAPH-5B00
 
+5BC9 ; 96BA # ( 寉 ~ 隺 ) CJK UNIFIED IDEOGRAPH-5BC9 ~ CJK UNIFIED IDEOGRAPH-96BA
+
 5CC0 ; 2B73A # ( 峀 ~ 𫜺 ) CJK UNIFIED IDEOGRAPH-5CC0 ~ CJK UNIFIED IDEOGRAPH-2B73A
 
 5DFF ; 5E02 # ( 巿 ~ 市 ) CJK UNIFIED IDEOGRAPH-5DFF ~ CJK UNIFIED IDEOGRAPH-5E02
@@ -4462,12 +4466,16 @@
 
 8E97 ; 8E9B # ( 躗 ~ 躛 ) CJK UNIFIED IDEOGRAPH-8E97 ~ CJK UNIFIED IDEOGRAPH-8E9B
 
+8EB1 ; 8EB2 # ( 躱 ~ 躲 ) CJK UNIFIED IDEOGRAPH-8EB1 ~ CJK UNIFIED IDEOGRAPH-8EB2
+
 8EFF ; 8F27 # ( 軿 ~ 輧 ) CJK UNIFIED IDEOGRAPH-8EFF ~ CJK UNIFIED IDEOGRAPH-8F27
 
 8FB6 ; 2ECC # ( 辶 ~ ⻌ ) CJK UNIFIED IDEOGRAPH-8FB6 ~ CJK RADICAL SIMPLIFIED WALK
 
 93AD ; 93AE # ( 鎭 ~ 鎮 ) CJK UNIFIED IDEOGRAPH-93AD ~ CJK UNIFIED IDEOGRAPH-93AE
 
+980B ; 2EA07 # ( 頋 ~ 𮨇 ) CJK UNIFIED IDEOGRAPH-980B ~ CJK UNIFIED IDEOGRAPH-2EA07
+
 9E42 ; 9E43 # ( 鹂 ~ 鹃 ) CJK UNIFIED IDEOGRAPH-9E42 ~ CJK UNIFIED IDEOGRAPH-9E43
 
 A04A ; A49E # ( ꁊ ~ ꒞ ) YI SYLLABLE PUT ~ YI RADICAL PUT
@@ -4768,6 +4776,8 @@ A99D ; A9A3 # ( ꦝ ~ ꦣ ) JAVANESE LETTER DDA ~ JAVANESE LETTER DA MAHAPRANA
 
 2D161 ; 2F82D # ( 𭅡 ~ 卑 ) CJK UNIFIED IDEOGRAPH-2D161 ~ CJK COMPATIBILITY IDEOGRAPH-2F82D
 
+2EDB5 ; 32A8F # ( 𮶵 ~ 𲪏 ) CJK UNIFIED IDEOGRAPH-2EDB5 ~ CJK UNIFIED IDEOGRAPH-32A8F
+
 31E7C ; 2F96E # ( 𱹼 ~ 緇 ) CJK UNIFIED IDEOGRAPH-31E7C ~ CJK COMPATIBILITY IDEOGRAPH-2F96E
 
 FB54 ; FBE6 # ( ‎ﭔ‎ ~ ‎ﯦ‎ ) ARABIC LETTER BEEH INITIAL FORM ~ ARABIC LETTER E INITIAL FORM
