Commit 2ce120e

Confusable sequences proposed in L2/22-107: generated data
1 parent 3e21a1e commit 2ce120e

4 files changed: 71 additions, 7 deletions

unicodetools/data/security/dev/confusables.txt

Lines changed: 12 additions & 2 deletions
@@ -1,5 +1,5 @@
 # confusables.txt
-# Date: 2025-07-20, 16:05:37 GMT
+# Date: 2025-07-20, 16:09:22 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -5671,6 +5671,8 @@ FE19 ; 2D57 ; MA #* ( ︙ → ⵗ ) PRESENTATION FORM FOR VERTICAL HORIZONTAL EL
 
 0911 ; 0905 093E 0306 ; MA # ( ऑ → अा̆ ) DEVANAGARI LETTER CANDRA O → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, COMBINING BREVE # →अॉ→
 
+0974 ; 0905 093E 093A ; MA # ( ॴ → अाऺ ) DEVANAGARI LETTER OOE → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE # →अऻ→
+
 0912 ; 0905 093E 0946 ; MA # ( ऒ → अाॆ ) DEVANAGARI LETTER SHORT O → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN SHORT E # →अॊ→→आॆ→
 
 0914 ; 0905 093E 0948 ; MA # ( औ → अाै ) DEVANAGARI LETTER AU → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN AI # →अौ→→आै→
@@ -5693,6 +5695,8 @@ FE19 ; 2D57 ; MA #* ( ︙ → ⵗ ) PRESENTATION FORM FOR VERTICAL HORIZONTAL EL
 
 0949 ; 093E 0306 ; MA # ( ॉ → ा̆ ) DEVANAGARI VOWEL SIGN CANDRA O → DEVANAGARI VOWEL SIGN AA, COMBINING BREVE # →ाॅ→
 
+093B ; 093E 093A ; MA # ( ऻ → ाऺ ) DEVANAGARI VOWEL SIGN OOE → DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE #
+
 111CB ; 093A ; MA # ( 𑇋 → ऺ ) SHARADA VOWEL MODIFIER MARK → DEVANAGARI VOWEL SIGN OE #
 11B60 ; 093A ; MA # ( 𑭠 → ऺ ) SHARADA VOWEL SIGN OE → DEVANAGARI VOWEL SIGN OE #

@@ -5831,6 +5835,8 @@ FE19 ; 2D57 ; MA #* ( ︙ → ⵗ ) PRESENTATION FORM FOR VERTICAL HORIZONTAL EL
 
 0D23 ; 0BA3 ; MA # ( ണ → ண ) MALAYALAM LETTER NNA → TAMIL LETTER NNA #
 
+0D7A ; 0BA3 0D4D ; MA # ( ൺ → ண് ) MALAYALAM LETTER CHILLU NN → TAMIL LETTER NNA, MALAYALAM SIGN VIRAMA # →ണ്→
+
 0BFA ; 0BA8 0BC0 ; MA #* ( ௺ → நீ ) TAMIL NUMBER SIGN → TAMIL LETTER NA, TAMIL VOWEL SIGN II #
 
 0BF4 ; 0BAE 0BC0 ; MA #* ( ௴ → மீ ) TAMIL MONTH SIGN → TAMIL LETTER MA, TAMIL VOWEL SIGN II #
@@ -5935,10 +5941,14 @@ FE19 ; 2D57 ; MA #* ( ︙ → ⵗ ) PRESENTATION FORM FOR VERTICAL HORIZONTAL EL
 0D6A ; 0D30 0D4D ; MA # ( ൪ → ര് ) MALAYALAM DIGIT FOUR → MALAYALAM LETTER RA, MALAYALAM SIGN VIRAMA #
 0D7C ; 0D30 0D4D ; MA # ( ർ → ര് ) MALAYALAM LETTER CHILLU RR → MALAYALAM LETTER RA, MALAYALAM SIGN VIRAMA # →൪→
 
+0D7D ; 0D32 0D4D ; MA # ( ൽ → ല് ) MALAYALAM LETTER CHILLU L → MALAYALAM LETTER LA, MALAYALAM SIGN VIRAMA #
+
 0D6E ; 0D35 0D4D 0D30 ; MA # ( ൮ → വ്ര ) MALAYALAM DIGIT EIGHT → MALAYALAM LETTER VA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER RA #
 
 0D76 ; 0D39 0D4D 0D2E ; MA #* ( ൶ → ഹ്മ ) MALAYALAM FRACTION ONE SIXTEENTH → MALAYALAM LETTER HA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER MA #
 
+0D7E ; 0D33 0D4D ; MA # ( ൾ → ള് ) MALAYALAM LETTER CHILLU LL → MALAYALAM LETTER LLA, MALAYALAM SIGN VIRAMA #
+
 0D42 ; 0D41 ; MA # ( ൂ → ു ) MALAYALAM VOWEL SIGN UU → MALAYALAM VOWEL SIGN U #
 0D43 ; 0D41 ; MA # ( ൃ → ു ) MALAYALAM VOWEL SIGN VOCALIC R → MALAYALAM VOWEL SIGN U # →ൂ→

@@ -9814,5 +9824,5 @@ A7CF ; A7CE ; MA # ( ꟏ → ꟎ ) LATIN SMALL LETTER PHARYNGEAL VOICED FRICATIV
 
 6138 ; 2B73F ; MA # ( 愸 → 𫜿 ) CJK UNIFIED IDEOGRAPH-6138 → CJK UNIFIED IDEOGRAPH-2B73F #
 
-# total: 6442
+# total: 6447
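
The data lines in confusables.txt shown above follow a `source ; target ; type # comment` layout, where the first two fields are space-separated hexadecimal code points. The following is a minimal sketch of reading such lines, not the unicodetools generator itself; the function name `parse_confusable_line` and the `SAMPLE` excerpt (three of the entries added in this commit, with comments shortened) are illustrative assumptions.

```python
from typing import Optional, Tuple


def parse_confusable_line(line: str) -> Optional[Tuple[str, str, str]]:
    """Parse one confusables.txt line into (source, target, type).

    Returns None for blank lines and comment-only lines.
    Hypothetical helper for illustration; not part of unicodetools.
    """
    # Everything from the first '#' onward is a comment.
    data = line.lstrip("\ufeff").split("#", 1)[0].strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    source_hex, target_hex, kind = fields
    # Each of the first two fields is a space-separated list of hex code points.
    source = "".join(chr(int(cp, 16)) for cp in source_hex.split())
    target = "".join(chr(int(cp, 16)) for cp in target_hex.split())
    return source, target, kind


def as_code_points(text: str) -> str:
    """Format a string as 'U+XXXX U+XXXX ...' for display."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Excerpt assumed for illustration: three entries added in this commit.
SAMPLE = """\
0D7A ; 0BA3 0D4D ; MA # MALAYALAM LETTER CHILLU NN
0D7D ; 0D32 0D4D ; MA # MALAYALAM LETTER CHILLU L
0D7E ; 0D33 0D4D ; MA # MALAYALAM LETTER CHILLU LL
# total: 6447
"""

mappings = {}
for raw in SAMPLE.splitlines():
    parsed = parse_confusable_line(raw)
    if parsed is not None:
        source, target, _kind = parsed
        mappings[source] = target

for source, target in mappings.items():
    print(f"{as_code_points(source)} -> {as_code_points(target)}")
```

Splitting on the first `#` before splitting on `;` keeps the free-form comment, which itself may contain arrows and further `#` characters, out of the field parsing.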

unicodetools/data/security/dev/confusablesSummary.txt

Lines changed: 24 additions & 2 deletions
@@ -1,5 +1,5 @@
 # confusablesSummary.txt
-# Date: 2025-07-20, 16:05:37 GMT
+# Date: 2025-07-20, 16:09:22 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -8450,6 +8450,11 @@
 ← (‎ अॅ ‎) 0905 0945 DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN CANDRA E
 ← (‎ ॲ ‎) 0972 DEVANAGARI LETTER CANDRA A # →अॅ→
 
+# अाऺ अऻ ॴ
+(‎ अऻ ‎) 0905 093B DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN OOE
+← (‎ अाऺ ‎) 0905 093E 093A DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE
+← (‎ ॴ ‎) 0974 DEVANAGARI LETTER OOE
+
 # अा आ
 (‎ अा ‎) 0905 093E DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA
 ← (‎ आ ‎) 0906 DEVANAGARI LETTER AA
@@ -8502,6 +8507,10 @@
 ← (‎ 𑭠 ‎) 11B60 SHARADA VOWEL SIGN OE
 ← (‎ 𑇋 ‎) 111CB SHARADA VOWEL MODIFIER MARK
 
+# ाऺ ऻ
+(‎ ऻ ‎) 093B DEVANAGARI VOWEL SIGN OOE
+← (‎ ाऺ ‎) 093E 093A DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE
+
 # ऽ ઽ
 (‎ ऽ ‎) 093D DEVANAGARI SIGN AVAGRAHA
 ← (‎ ઽ ‎) 0ABD GUJARATI SIGN AVAGRAHA
@@ -8851,6 +8860,11 @@
 (‎ ண ‎) 0BA3 TAMIL LETTER NNA
 ← (‎ ണ ‎) 0D23 MALAYALAM LETTER NNA
 
+# ண் ണ് ൺ
+(‎ ண் ‎) 0BA3 0D4D TAMIL LETTER NNA, MALAYALAM SIGN VIRAMA
+← (‎ ണ് ‎) 0D23 0D4D MALAYALAM LETTER NNA, MALAYALAM SIGN VIRAMA
+← (‎ ൺ ‎) 0D7A MALAYALAM LETTER CHILLU NN # →ണ്→
+
 # நீ ௺
 (‎ நீ ‎) 0BA8 0BC0 TAMIL LETTER NA, TAMIL VOWEL SIGN II
 ← (‎ ௺ ‎) 0BFA TAMIL NUMBER SIGN
@@ -9076,6 +9090,14 @@
 ← (‎ ർ ‎) 0D7C MALAYALAM LETTER CHILLU RR # →൪→
 ← (‎ ൪ ‎) 0D6A MALAYALAM DIGIT FOUR
 
+# ല് ൽ
+(‎ ല് ‎) 0D32 0D4D MALAYALAM LETTER LA, MALAYALAM SIGN VIRAMA
+← (‎ ൽ ‎) 0D7D MALAYALAM LETTER CHILLU L
+
+# ള് ൾ
+(‎ ള് ‎) 0D33 0D4D MALAYALAM LETTER LLA, MALAYALAM SIGN VIRAMA
+← (‎ ൾ ‎) 0D7E MALAYALAM LETTER CHILLU LL
+
 # വ്ര വ് ൮
 (‎ വ് ‎) 0D35 0D4D MALAYALAM LETTER VA, MALAYALAM SIGN VIRAMA
 ← (‎ വ്ര ‎) 0D35 0D4D 0D30 MALAYALAM LETTER VA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER RA # →൮→
@@ -17522,5 +17544,5 @@
 (‎ 𪘀 ‎) 2A600 CJK UNIFIED IDEOGRAPH-2A600
 ← (‎ 𪘀 ‎) 2FA1D CJK COMPATIBILITY IDEOGRAPH-2FA1D
 
-# total : 7420
+# total : 7427

unicodetools/data/security/dev/data/confusablesSummaryIdentifier.txt

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# confusablesSummaryIdentifier.txt
2-
# Date: 2025-07-20, 16:05:37 GMT
2+
# Date: 2025-07-20, 16:09:22 GMT
33
# © 2025 Unicode®, Inc.
44
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
55
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -749,6 +749,11 @@
 ← (‎ अॅ ‎) 0905 0945 DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN CANDRA E
 ← (‎ ॲ ‎) 0972 DEVANAGARI LETTER CANDRA A # →अॅ→
 
+# अाऺ अऻ ॴ
+(‎ अऻ ‎) 0905 093B DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN OOE
+← (‎ अाऺ ‎) 0905 093E 093A DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE
+← (‎ ॴ ‎) 0974 DEVANAGARI LETTER OOE
+
 # अा आ
 (‎ अा ‎) 0905 093E DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA
 ← (‎ आ ‎) 0906 DEVANAGARI LETTER AA
@@ -794,6 +799,10 @@
 (‎ एे ‎) 090F 0947 DEVANAGARI LETTER E, DEVANAGARI VOWEL SIGN E
 ← (‎ ऐ ‎) 0910 DEVANAGARI LETTER AI
 
+# ाऺ ऻ
+(‎ ऻ ‎) 093B DEVANAGARI VOWEL SIGN OOE
+← (‎ ाऺ ‎) 093E 093A DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE
+
 # ा̆ ाॅ ॉ
 (‎ ा̆ ‎) 093E 0306 DEVANAGARI VOWEL SIGN AA, COMBINING BREVE
 ← (‎ ाॅ ‎) 093E 0945 DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN CANDRA E
@@ -921,6 +930,11 @@
 (‎ ண ‎) 0BA3 TAMIL LETTER NNA
 ← (‎ ണ ‎) 0D23 MALAYALAM LETTER NNA
 
+# ண் ണ് ൺ
+(‎ ண் ‎) 0BA3 0D4D TAMIL LETTER NNA, MALAYALAM SIGN VIRAMA
+← (‎ ണ് ‎) 0D23 0D4D MALAYALAM LETTER NNA, MALAYALAM SIGN VIRAMA
+← (‎ ൺ ‎) 0D7A MALAYALAM LETTER CHILLU NN # →ണ്→
+
 # ன ை
 (‎ ன ‎) 0BA9 TAMIL LETTER NNNA
 ← (‎ ை ‎) 0BC8 TAMIL VOWEL SIGN AI
@@ -1061,6 +1075,14 @@
 (‎ ര് ‎) 0D30 0D4D MALAYALAM LETTER RA, MALAYALAM SIGN VIRAMA
 ← (‎ ർ ‎) 0D7C MALAYALAM LETTER CHILLU RR # →൪→
 
+# ല് ൽ
+(‎ ല് ‎) 0D32 0D4D MALAYALAM LETTER LA, MALAYALAM SIGN VIRAMA
+← (‎ ൽ ‎) 0D7D MALAYALAM LETTER CHILLU L
+
+# ള് ൾ
+(‎ ള് ‎) 0D33 0D4D MALAYALAM LETTER LLA, MALAYALAM SIGN VIRAMA
+← (‎ ൾ ‎) 0D7E MALAYALAM LETTER CHILLU LL
+
 # ു ൂ ൃ
 (‎ ു ‎) 0D41 MALAYALAM VOWEL SIGN U
 ← (‎ ൂ ‎) 0D42 MALAYALAM VOWEL SIGN UU
@@ -1220,5 +1242,5 @@
 (‎ へ ‎) 3078 HIRAGANA LETTER HE
 ← (‎ ヘ ‎) 30D8 KATAKANA LETTER HE
 
-# total : 436
+# total : 443

unicodetools/data/security/dev/data/source/formatted-source.txt

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# formatted-source.txt
2-
# Date: 2025-07-20, 16:05:36 GMT
2+
# Date: 2025-07-20, 16:09:21 GMT
33
# © 2025 Unicode®, Inc.
44
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
55
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1869,6 +1869,8 @@
 0901 ; 0C81 # ( ँ ~ ಁ ) DEVANAGARI SIGN CANDRABINDU ~ KANNADA SIGN CANDRABINDU
 0901 ; 0D01 # ( ँ ~ ഁ ) DEVANAGARI SIGN CANDRABINDU ~ MALAYALAM SIGN CANDRABINDU
 
+0905 093B ; 0974 # ( अऻ ~ ॴ ) DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN OOE ~ DEVANAGARI LETTER OOE
+
 0905 093E ; 0906 # ( अा ~ आ ) DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA ~ DEVANAGARI LETTER AA
 
 0905 0945 ; 0972 # ( अॅ ~ ॲ ) DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN CANDRA E ~ DEVANAGARI LETTER CANDRA A
@@ -1902,6 +1904,8 @@
 
 093D ; 0ABD # ( ऽ ~ ઽ ) DEVANAGARI SIGN AVAGRAHA ~ GUJARATI SIGN AVAGRAHA
 
+093E 093A ; 093B # ( ाऺ ~ ऻ ) DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE ~ DEVANAGARI VOWEL SIGN OOE
+
 093E 0945 ; 0949 # ( ाॅ ~ ॉ ) DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN CANDRA E ~ DEVANAGARI VOWEL SIGN CANDRA O
 
 0941 ; 0AC1 # ( ु ~ ુ ) DEVANAGARI VOWEL SIGN U ~ GUJARATI VOWEL SIGN U
@@ -2211,6 +2215,8 @@
 
 0D20 ; 0D66 # ( ഠ ~ ൦ ) MALAYALAM LETTER TTHA ~ MALAYALAM DIGIT ZERO
 
+0D23 0D4D ; 0D7A # ( ണ് ~ ൺ ) MALAYALAM LETTER NNA, MALAYALAM SIGN VIRAMA ~ MALAYALAM LETTER CHILLU NN
+
 0D26 0D4D 0D30 ; 0D6B # ( ദ്ര ~ ൫ ) MALAYALAM LETTER DA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER RA ~ MALAYALAM DIGIT FIVE
 
 0D28 0D41 ; 0D0C # ( നു ~ ഌ ) MALAYALAM LETTER NA, MALAYALAM VOWEL SIGN U ~ MALAYALAM LETTER VOCALIC L
@@ -2226,6 +2232,10 @@
 
 0D30 0D4D ; 0D6A # ( ര് ~ ൪ ) MALAYALAM LETTER RA, MALAYALAM SIGN VIRAMA ~ MALAYALAM DIGIT FOUR
 
+0D32 0D4D ; 0D7D # ( ല് ~ ൽ ) MALAYALAM LETTER LA, MALAYALAM SIGN VIRAMA ~ MALAYALAM LETTER CHILLU L
+
+0D33 0D4D ; 0D7E # ( ള് ~ ൾ ) MALAYALAM LETTER LLA, MALAYALAM SIGN VIRAMA ~ MALAYALAM LETTER CHILLU LL
+
 0D35 0D4D ; 0D6E # ( വ് ~ ൮ ) MALAYALAM LETTER VA, MALAYALAM SIGN VIRAMA ~ MALAYALAM DIGIT EIGHT
 
 0D35 0D4D 0D30 ; 0D6E # ( വ്ര ~ ൮ ) MALAYALAM LETTER VA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER RA ~ MALAYALAM DIGIT EIGHT
