Commit ec66045

Merge pull request #30 from adraffy/unicode-17
Upgrade to Unicode 17
2 parents f897be8 + 4fb708a commit ec66045

79 files changed: +304076 -46287 lines changed

README.md

Lines changed: 4 additions & 4 deletions
@@ -4,13 +4,14 @@
 `npm i @adraffy/ens-normalize` [✓](https://www.npmjs.com/package/@adraffy/ens-normalize)
 
 * 🏛️ Follows [**ENSIP-15**: ENS Name Normalization Standard](https://docs.ens.domains/ensip/15)
-* Unicode: [`16.0.0`](./derive/data/16.0.0/) • CLDR: [`45`](./derive/data/CLDR-45/)
+* Unicode: [`17.0.0`](./derive/data/17.0.0/) • CLDR: [`47`](./derive/data/CLDR-47/)
 * Other implementations:
 * Python — [namehash/**ens-normalize-python**](https://github.com/namehash/ens-normalize-python)
 * Rust — [sevenzing/**ens-normalize-rs**](https://github.com/sevenzing/ens-normalize-rs)
 * Zig — [evmts/**z-ens-normalize**](https://github.com/evmts/z-ens-normalize)
 * C# — [adraffy/**ENSNormalize.cs**](https://github.com/adraffy/ENSNormalize.cs)
 * Java — [adraffy/**ENSNormalize.java**](https://github.com/adraffy/ENSNormalize.java)
+* Swift – [adraffy/**ENSNormalize.swift**](https://github.com/adraffy/ENSNormalize.swift)
 * Go — [adraffy/**go-ens-normalize**](https://github.com/adraffy/go-ens-normalize)
 * Swift – [adraffy/**ENSNormalize.swift**](https://github.com/adraffy/ENSNormalize.swift)
 * Prior implementation:
@@ -28,10 +29,9 @@
 * [Character Viewer](https://adraffy.github.io/ens-normalize.js/test/chars.html)
 * [Confused Explainer](https://adraffy.github.io/ens-normalize.js/test/confused.html)
 * Related Projects:
-* [Recent .eth Registrations](https://raffy.antistupid.com/eth/ens-regs.html) • [.eth Renews](https://raffy.antistupid.com/eth/ens-renews.html)
-* [.eth Expirations](https://raffy.antistupid.com/eth/ens-exp.html)
+* [Recent .eth Registrations](https://raffy.antistupid.com/eth/ens-regs.html) • [Renews](https://raffy.antistupid.com/eth/ens-renews.html) • [Expirations](https://raffy.antistupid.com/eth/ens-exp.html)
+* [Label Database](https://github.com/adraffy/ens-labels/) • [Labelhash⁻¹](https://adraffy.github.io/ens-labels/demo.html) • [Brute-force](https://raffy.antistupid.com/eth/ens-brute.html)
 * [Emoji Frequency Explorer](https://raffy.antistupid.com/eth/ens-emoji-freq.html)
-* [Label Database](https://github.com/adraffy/ens-labels/) • [Labelhash⁻¹](https://adraffy.github.io/ens-labels/demo.html)
 * [adraffy/**punycode.js**](https://github.com/adraffy/punycode.js/) • [Punycode Coder](https://adraffy.github.io/punycode.js/test/demo.html)
 * [adraffy/**keccak.js**](https://github.com/adraffy/keccak.js/) • [Keccak Hasher](https://adraffy.github.io/keccak.js/test/demo.html)
 * [adraffy/**emoji.js**](https://github.com/adraffy/emoji.js/) • [Emoji Parser](https://adraffy.github.io/emoji.js/test/demo.html)

derive/README.md

Lines changed: 24 additions & 6 deletions
@@ -1,6 +1,7 @@
 # Derive Data Files
 
 * [Unicode Standard](https://www.unicode.org/versions/latest/)
+* [Release Dates](https://www.unicode.org/history/publicationdates.html#Release_Dates)
 * [Unicode Technical Standard #46: IDNA](https://www.unicode.org/reports/tr46/)
 * [unicode-logic.js/`derive_idna_rules()`](./unicode-logic.js#L581) — [spec](https://unicode.org/reports/tr46/#Implementation_Notes)
 * [idna.js/`ens_idna_rules()`](./idna.js)
@@ -20,11 +21,11 @@
 * [Unicode data files](https://www.unicode.org/Public/)
 * Download Latest: `node download.js`
 * To download older version: `node download.js 12.1.0`
-* Already included: [Unicode 11-16](./data/)
+* Already included: [Unicode 11-17](./data/)
 * [CLDR data files](https://github.com/unicode-org/cldr)
 * Download Latest: `node parse-cldr.js`
 * To download older version: `node parse-cldr.js 42`
-* Already included: [CLDR 42-45](./data/)
+* Already included: [CLDR 42-47](./data/)
 * ⚠️ Versioned separately from Unicode!
 
 ## Instructions
@@ -56,10 +57,27 @@
 
 ## Upgrade Notes
 
+### 16.0.0 → 17.0.0
+
+* [**Release**](https://www.unicode.org/versions/Unicode17.0.0)
+* [Diff](./diffs/16.0.0-vs-17.0.0.txt) `node unicode-diff.js 16 17`
+* CLDR
+* **Unchanged**
+* UAX-31:
+* **New** 4 Scripts: Berf, Sidt, Tayo, Tols
+* **Changed** Bopo → Limited Use
+* UTS-46:
+* **New** [Miscellaneous Symbols Supplement](https://www.unicode.org/charts/PDF/Unicode-17.0/U170-1CEC0.pdf)
+* **Fixed** Missing Disallowed from Unicode 16
+* UTS-51:
+* **New** 164 Emoji `node derive/dump-emoji-new.js`
+* Prior Validation: `node test/validate.js 1.11.0`
+* Fails on new emoji
+
 ### 15.1.0 → 16.0.0
 
-* [Release](https://www.unicode.org/versions/Unicode16.0.0/#Character_Additions)
-* [Diff](./diffs/15.1.0-vs-16.0.0.txt) `node unicode.diff.js 15.1 16`
+* [**Release**](https://www.unicode.org/versions/Unicode16.0.0)
+* [Diff](./diffs/15.1.0-vs-16.0.0.txt) `node unicode-diff.js 15.1 16`
 * CLDR
 * `short-names.json` **Unchanged**
 * `regions.json` **New** `"CQ"`
@@ -84,8 +102,8 @@
 
 ### 15.0.0 → 15.1.0
 
-* [Release](https://www.unicode.org/versions/Unicode15.1.0)
-* [Diff](./diffs/15.0.0-vs-15.1.0.txt) `node unicode.diff.js 15 15.1`
+* [**Release**](https://www.unicode.org/versions/Unicode15.1.0)
+* [Diff](./diffs/15.0.0-vs-15.1.0.txt) `node unicode-diff.js 15 15.1`
 * CLDR
 * `short-names.json` **Unchanged**
 * UCD:

derive/data/17.0.0/CompositionExclusions.txt

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # CompositionExclusions-17.0.0.txt
-# Date: 2024-02-02
-# © 2024 Unicode®, Inc.
+# Date: 2025-08-01
+# © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
 #

derive/data/17.0.0/DerivedCoreProperties.txt

Lines changed: 13 additions & 8 deletions
@@ -1,5 +1,5 @@
 # DerivedCoreProperties-17.0.0.txt
-# Date: 2025-06-30, 06:20:18 GMT
+# Date: 2025-07-30, 23:55:08 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -3553,7 +3553,8 @@ E0100..E01EF ; Case_Ignorable # Mn [240] VARIATION SELECTOR-17..VARIATION SELEC
 
 # Derived Property: Changes_When_Lowercased (CWL)
 # Characters whose normalized forms are not stable under a toLowercase mapping.
-# For more information, see D139 in Section 3.13, "Default Case Algorithms".
+# For more information, see the definition of "isLowercase(X)"
+# in the "Conformance" / "Default Case Algorithms" section of the core specification.
 # Changes_When_Lowercased(X) is true when toLowercase(toNFD(X)) != toNFD(X)
 
 0041..005A ; Changes_When_Lowercased # L& [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
@@ -4181,7 +4182,8 @@ FF21..FF3A ; Changes_When_Lowercased # L& [26] FULLWIDTH LATIN CAPITAL LETTE
 
 # Derived Property: Changes_When_Uppercased (CWU)
 # Characters whose normalized forms are not stable under a toUppercase mapping.
-# For more information, see D140 in Section 3.13, "Default Case Algorithms".
+# For more information, see the definition of "isUppercase(X)"
+# in the "Conformance" / "Default Case Algorithms" section of the core specification.
 # Changes_When_Uppercased(X) is true when toUppercase(toNFD(X)) != toNFD(X)
 
 0061..007A ; Changes_When_Uppercased # L& [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
@@ -4825,7 +4827,8 @@ FF41..FF5A ; Changes_When_Uppercased # L& [26] FULLWIDTH LATIN SMALL LETTER
 
 # Derived Property: Changes_When_Titlecased (CWT)
 # Characters whose normalized forms are not stable under a toTitlecase mapping.
-# For more information, see D141 in Section 3.13, "Default Case Algorithms".
+# For more information, see the definition of "isTitlecase(X)"
+# in the "Conformance" / "Default Case Algorithms" section of the core specification.
 # Changes_When_Titlecased(X) is true when toTitlecase(toNFD(X)) != toNFD(X)
 
 0061..007A ; Changes_When_Titlecased # L& [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
@@ -5468,7 +5471,8 @@ FF41..FF5A ; Changes_When_Titlecased # L& [26] FULLWIDTH LATIN SMALL LETTER
 
 # Derived Property: Changes_When_Casefolded (CWCF)
 # Characters whose normalized forms are not stable under case folding.
-# For more information, see D142 in Section 3.13, "Default Case Algorithms".
+# For more information, see the definition of "isCasefolded(X)"
+# in the "Conformance" / "Default Case Algorithms" section of the core specification.
 # Changes_When_Casefolded(X) is true when toCasefold(toNFD(X)) != toNFD(X)
 
 0041..005A ; Changes_When_Casefolded # L& [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
@@ -6108,7 +6112,8 @@ FF21..FF3A ; Changes_When_Casefolded # L& [26] FULLWIDTH LATIN CAPITAL LETTE
 
 # Derived Property: Changes_When_Casemapped (CWCM)
 # Characters whose normalized forms are not stable under case mapping.
-# For more information, see D143 in Section 3.13, "Default Case Algorithms".
+# For more information, see the definition of "isCased(X)"
+# in the "Conformance" / "Default Case Algorithms" section of the core specification.
 # Changes_When_Casemapped(X) is true when CWL(X), or CWT(X), or CWU(X)
 
 0041..005A ; Changes_When_Casemapped # L& [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
@@ -12964,7 +12969,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
 1FA80..1FA8A ; Grapheme_Base # So [11] YO-YO..TROMBONE
 1FA8E..1FAC6 ; Grapheme_Base # So [57] TREASURE CHEST..FINGERPRINT
 1FAC8 ; Grapheme_Base # So HAIRY CREATURE
-1FACD..1FADD ; Grapheme_Base # So [17] ORCA..APPLE CORE
+1FACD..1FADC ; Grapheme_Base # So [16] ORCA..ROOT VEGETABLE
 1FADF..1FAEA ; Grapheme_Base # So [12] SPLATTER..DISTORTED FACE
 1FAEF..1FAF8 ; Grapheme_Base # So [10] FIGHT CLOUD..RIGHTWARDS PUSHING HAND
 1FB00..1FB92 ; Grapheme_Base # So [147] BLOCK SEXTANT-1..UPPER HALF INVERSE MEDIUM SHADE AND LOWER HALF BLOCK
@@ -12980,7 +12985,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
 30000..3134A ; Grapheme_Base # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
 31350..33479 ; Grapheme_Base # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 
-# Total code points: 157495
+# Total code points: 157494
 
 # ================================================
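
Reviewer note: the `Grapheme_Base` change above can be sanity-checked by reading the range lines directly. The following is a minimal sketch, not code from this repository: `parsePropertyLine` is a hypothetical helper, and it assumes the usual UCD layout of `first..last ; Property # comment`. It shows that shrinking `1FACD..1FADD` to `1FACD..1FADC` removes exactly one code point, which matches the total dropping from 157495 to 157494.

```javascript
// Hypothetical helper (not part of the derive scripts): parse one data line
// of a UCD property file such as DerivedCoreProperties.txt.
// Returns null for blank lines and comment-only lines.
function parsePropertyLine(line) {
  const data = line.split('#')[0].trim(); // drop the trailing comment
  if (!data) return null;
  const [range, property] = data.split(';').map((s) => s.trim());
  // A single code point has no "..", so the range end defaults to its start.
  const [first, last = first] = range.split('..');
  const start = parseInt(first, 16);
  const end = parseInt(last, 16);
  return { start, end, property, count: end - start + 1 };
}

// The removed and added lines from the diff above.
const before = parsePropertyLine('1FACD..1FADD ; Grapheme_Base # So [17] ORCA..APPLE CORE');
const after = parsePropertyLine('1FACD..1FADC ; Grapheme_Base # So [16] ORCA..ROOT VEGETABLE');

console.log(before.count, after.count); // 17 16
console.log(before.count - after.count === 157495 - 157494); // true
```

The bracketed counts in the file comments (`[17]`, `[16]`) are informational only, so recomputing them from the range endpoints is a cheap consistency check when reviewing a data upgrade like this one.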