\* Note: Each release, the number of items needed for Modern and Moderate increases. So locales without active contributors may drop down in coverage level.
54
+
\* Note: Two locales dropped in coverage (📉), from Moderate to Basic.
55
+
Each release, the number of items needed for Modern and Moderate increases.
56
+
So locales without active contributors may drop down in coverage level.
50
57
51
58
For a full listing, see [Coverage Levels](https://unicode.org/cldr/charts/dev/supplemental/locale_coverage.html)
52
59
@@ -62,28 +69,93 @@ See the [Modifications section](https://www.unicode.org/reports/tr35/proposed.ht
62
69
## Data Changes
63
70
64
71
### DTD Changes
65
-
66
-
-`territories` attribute of `languageData` in [`supplementalData.xml`](https://github.com/unicode-org/cldr/blob/main/common/supplemental/supplementalData.xml) removed. While it was a nice proxy to count the most important territories for each language, it was not clear and it was ripe for mis-understanding. ([CLDR-5708][])
67
-
68
-
For a full listing, see [Delta DTDs](https://unicode.org/cldr/charts/dev/supplemental/dtd_deltas.html).
72
+
**[TBD: Update from https://unicode.org/cldr/charts/48/supplemental/dtd_deltas.html, adding the meaning/impact of each]. Also consult the InfoHub vetter information.**
73
+
#### ldml
74
+
- `exemplarCharacters` added more `type` values:
75
+
- numbers-auxiliary: for number characters that are not 'core' to the language, but are sometimes used (like regular auxiliary)
76
+
- punctuation-auxiliary: for punctuation characters that are not 'core' to the language, but are sometimes used (like regular auxiliary)
77
+
- punctuation-person: for the limited set of punctuation characters used in person name fields, e.g., "Jean-Luc", "MD, Ph.D."
78
+
- `dateTimeFormat` added more `type` values:
79
+
- relative: TBD
80
+
- `gmtUnknownFormat` element was added, indicating that the timezone is unknown (as opposed to absent from the format)
- `era`: the range of `code` values now allows two letters before the first hyphen.
91
+
- `languageData`: the `territories` attribute in [`supplementalData.xml`](https://github.com/unicode-org/cldr/blob/main/common/supplemental/supplementalData.xml) was deprecated and the data using it was removed. The definition was unclear and prone to misunderstanding; the more detailed data is in `territoryInfo`. ([CLDR-5708][])
92
+
- `usesMetazone` adds two new attributes, `stdOffset` and `dstOffset`, so that implementations can use either "vanguard" or "rearguard" TZDB data sources.
93
+
- `numberingSystem`: Unicode 17 data was added.
94
+
#### ldmlBCP47
95
+
- `type` adds a new attribute, `region`
96
+
- `keyboard3@conformsTo` is updated to allow "48"
97
+
98
+
For a full listing, see [Delta DTDs].
99
+
100
+
### BCP47 Data Changes
101
+
- `nu-tols`: numbering system for Tolong Siki digits
102
+
- One additional zone: America/Coyhaique = tz-clcxq
103
+
- Seven region attributes for determining regions for timezones
104
+
- Three additional aliases
105
+
106
+
For a full listing, see [BCP47 Delta].
107
+
108
+
TBD, change these links to put the URLs at the bottom
69
109
70
110
### Supplemental Data Changes
71
111
112
+
#### Identifiers
113
+
- Added aliases/deprecations for languages (dek, mnk, nte)
114
+
- Updated to the latest language subtag registry, with various additions and deprecations
115
+
- Updated to the ISO currency data, with various additions and deprecations
116
+
- Added unit IDs part, part-per-1e6, part-per-1e9, cup-imperial, fluid-ounce-metric, with conversions
117
+
- Deprecated unit IDs permillion, portion, portion-per-1e9, 100-kilometer
118
+
119
+
#### Language Data
72
120
- [language_script.tsv](https://github.com/unicode-org/cldr/blob/main/tools/cldr-code/src/main/resources/org/unicode/cldr/util/data/language_script.tsv) updated to include only one "Primary" writing system for languages that used to have multiple options ([CLDR-18114][]). Notable changes are:
73
121
- Panjabi `pa` has its primary script set to Gurumukhi `Guru` because widespread usage is in the Gurumukhi script -- while most speakers are in Pakistan `PK`, written usage remains Gurumukhi.
74
122
- Azerbaijani `az` and Northern Kurdish `ku` are primarily written in Latin `Latn`.
75
123
- Chinese languages `zh`, `hak`, and `nan` are matched to Simplified Han writing `Hans` -- except Cantonese `yue`, which is known for a preference in Traditional Han writing `Hant`.
76
124
- Hassiniyya `mey` was missing significant data; it should be associated with the Arabic `Arab` writing system by default, not Latin `Latn`.
125
+
- 5 new language distance values are added (for fallback to zh)
126
+
- Substantial updates to Language Info: additional languages in countries; revised population values, writing percentages, literacy percentages, and official status values.
127
+
128
+
#### Likely Subtags
129
+
- Many additions: see [Likely Subtags]
77
130
- Errors in likely subtags addressed
78
131
- The default language for Belarus `BY` is now Russian `ru`, reflecting modern usage. ([CLDR-14479][])
79
132
- Literary Chinese `lzh` was written in Traditional Han writing `Hant`. ([CLDR-16715][])
80
133
- Likely subtags updated because of the primary script changes mentioned above.
81
134
- Northern Kurdish `ku` is now matched to Cyrillic writing in the CIS countries. ([CLDR-18114][])
82
135
- Hassiniyya `mey` updated to default to `mey_Arab_DZ` instead of `mey_Latn_SN` ([CLDR-18114][])
83
-
- See other likely subtags updated in [the Supplemental Data Delta page](https://www.unicode.org/cldr/charts/48/delta/supplemental-data.html#Likely)
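
As a rough illustration of what a likely-subtags change means for implementers, the sketch below compares an old and a new default for one tag and reports which fields moved. The two tables contain only the `mey` entry quoted above; the table layout and the helper names (`split_locale`, `changed_fields`) are hypothetical and are not part of CLDR or of any library.

```python
# Hypothetical excerpts: only the `mey` entry comes from the notes above.
OLD_LIKELY = {"mey": "mey_Latn_SN"}
NEW_LIKELY = {"mey": "mey_Arab_DZ"}


def split_locale(locale_id):
    """Split a fully specified language_Script_REGION identifier."""
    language, script, region = locale_id.split("_")
    return {"language": language, "script": script, "region": region}


def changed_fields(old, new):
    """Report, per tag, the fields whose default value changed."""
    changes = {}
    for tag, new_id in new.items():
        old_id = old.get(tag)
        if old_id is None or old_id == new_id:
            continue  # new or unchanged entries are not reported
        before, after = split_locale(old_id), split_locale(new_id)
        changes[tag] = {
            field: (before[field], after[field])
            for field in before
            if before[field] != after[field]
        }
    return changes


print(changed_fields(OLD_LIKELY, NEW_LIKELY))
```

A check like this can flag places where stored data or tests still assume the earlier default script or region.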
84
136
137
+
#### Calendars, Timezones, Dayperiods
138
+
- Many updates and corrections for Metazone data
139
+
- Many updates to calendars, including the removal of eras and adjustment to era start dates
140
+
- Day periods for kok, scn, hi_Latn
141
+
142
+
#### Plural Rules
143
+
- Additions for cv, ie, kok, sgs
144
+
145
+
#### Currencies
146
+
- Updates to the latest ISO currencies
85
147
86
-
For a full listing, see [¤¤BCP47 Delta](https://unicode.org/cldr/charts/dev/delta/bcp47.html) and [¤¤Supplemental Delta](https://unicode.org/cldr/charts/dev/delta/supplemental-data.html)
148
+
#### Weekdata
149
+
- IS changed to firstDay=sun
150
+
- ku_SY adding H and hB
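
To show how a `firstDay` change such as the one for IS might surface in calendar code, here is a minimal sketch using Python's standard `calendar` module. The override table holds only the IS entry from the list above; the `"mon"` default and the helper names are assumptions made for illustration, not a copy of the CLDR week data.

```python
import calendar

# Only the IS entry comes from the notes above; the table itself is hypothetical.
FIRST_DAY_OVERRIDES = {"IS": "sun"}
DEFAULT_FIRST_DAY = "mon"  # assumption for regions not listed in the table

DAY_CODES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]  # calendar.MONDAY == 0


def first_day_of_week(region):
    """Return the day code for the first day of the week in a region."""
    return FIRST_DAY_OVERRIDES.get(region.upper(), DEFAULT_FIRST_DAY)


def week_order(region):
    """Return the seven day codes in display order for a region."""
    first = DAY_CODES.index(first_day_of_week(region))
    cal = calendar.Calendar(firstweekday=first)
    return [DAY_CODES[day] for day in cal.iterweekdays()]


print(week_order("IS"))
```

Calendar grids and week-number logic that cache the first day per region are the places most likely to need a refresh after this kind of change.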
151
+
152
+
For a full listing, see [Supplemental Delta].
153
+
154
+
### Transforms
155
+
- Fixed problem in Gujarati → Latin with ૰
156
+
- Updated to latest Unicode 17 data for Han → Latin, with very many changes.
157
+
158
+
For a full listing, see [Transforms Delta].
87
159
88
160
### Locale Changes
89
161
@@ -93,8 +165,21 @@ For a full listing, see [¤¤BCP47 Delta](https://unicode.org/cldr/charts/dev/de
93
165
- `ku_Latn_IQ`: Kurdish (Kurmanji, Latin alphabet, Iraq)
For a full listing, see [Delta Data](https://unicode.org/cldr/charts/dev/delta/index.html)
168
+
- Languages that reached Basic in the last release have their names translated in this release
169
+
- Compound language names now have "core" and "extension" variants for use in menus (TBD, flesh this out)
170
+
- Many features selectable with locale options now have "core" names, for better presentation in menus (TBD, flesh this out)
171
+
- Calendar names, collation names, emoji options, currency formats, hour-cycle options, and so on.
172
+
- To match ISO, translations for Sark (CQ) were added.
173
+
- Recent or upcoming currency names are added (XCG, ZWG)
174
+
- There are now combination formats for relative times (TBD, flesh this out)
175
+
- Some additional flexible (aka available) date formats were added (TBD, flesh this out)
176
+
- Many locales had seldom-used short timezone abbreviations (such as EST) removed, or moved to sublocales that use them.
177
+
- The currency-number formats for alphaNextToNumber, noCurrency, and compact currency formats are now generated from other data for consistency. (TBD, flesh this out)
178
+
- The tooling made it easier to see whether a space was breaking or non-breaking, and whether it was a thin version of either. The usage is now more consistent in many locales.
179
+
- New emoji for Unicode 17 have names and search keywords added.
180
+
- Additional guidance on translations was added, leading to refined translations or transcreations.
181
+
182
+
For a full listing, see [Delta Data].
98
183
99
184
### Message Format Specification
100
185
@@ -134,7 +219,9 @@ For a full listing, see [Delta Data](https://unicode.org/cldr/charts/dev/delta/i
134
219
135
220
## Migration
136
221
137
-
- TBD
222
+
- Number patterns that did not have a specific numberSystem (such as latn or arab) had been deprecated for many releases, and were finally removed.
223
+
- **TBD: add many items!**
224
+
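
For the number-pattern removal above, a quick way to check custom or overlay LDML data is to look for number-format elements that lack a `numberSystem` attribute. The following is a minimal sketch, assuming the data is available as an LDML string; the sample document, the element list, and the function name `find_unqualified` are illustrative and not taken from the CLDR tooling.

```python
import xml.etree.ElementTree as ET

# Elements under <numbers> that are expected to carry a numberSystem attribute.
# This list is an assumption for the sketch; adjust it to the data being checked.
NUMBER_ELEMENTS = (
    "symbols",
    "decimalFormats",
    "scientificFormats",
    "percentFormats",
    "currencyFormats",
)

# Illustrative sample: one unqualified element and two qualified ones.
SAMPLE = """
<ldml>
  <numbers>
    <decimalFormats>
      <decimalFormatLength>
        <decimalFormat><pattern>#,##0.###</pattern></decimalFormat>
      </decimalFormatLength>
    </decimalFormats>
    <decimalFormats numberSystem="latn">
      <decimalFormatLength>
        <decimalFormat><pattern>#,##0.###</pattern></decimalFormat>
      </decimalFormatLength>
    </decimalFormats>
    <percentFormats numberSystem="latn"/>
  </numbers>
</ldml>
"""


def find_unqualified(ldml_text):
    """Return tags of number elements that have no numberSystem attribute."""
    root = ET.fromstring(ldml_text)
    numbers = root.find("numbers")
    if numbers is None:
        return []
    return [
        child.tag
        for child in numbers
        if child.tag in NUMBER_ELEMENTS and "numberSystem" not in child.attrib
    ]


print(find_unqualified(SAMPLE))
```

Any element reported here would need an explicit `numberSystem` (such as `latn`) before the data is used with this release.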
138
225
139
226
### V48 advance warnings
140
227
The following changes are planned for CLDR 48. Please plan accordingly to avoid disruption.
@@ -181,4 +268,10 @@ For web pages with different views of CLDR data, see [http://cldr.unicode.org/in