Commit 780d97c

CLDR-18940 BRS v48 Release page alpha updates (#5036)
1 parent e39ec8f commit 780d97c

1 file changed

+113
-20
lines changed

docs/site/downloads/cldr-48.md

Lines changed: 113 additions & 20 deletions
@@ -28,25 +28,32 @@ The most significant changes in this release are:
2828
For more details, see below.
2929

3030
### Locale Coverage Status
31+
The following shows the coverage levels per language in this version of CLDR.
32+
- The **With Script** column indicates which of the **Count** locales are language-script variants.
33+
- For example, zh_Hant and zh(_Hans) add two to the **Count**, and one to **With Script**.
34+
- The **Regional Variants** column indicates the number of other regional locales: none are in **Count**.
35+
- For example, there are 46 locales for French, such as fr, fr_CA, fr_BE, etc., so that adds 46 to the RV column for Modern.
3136

3237
#### Current Levels
3338

34-
Count | Level | Usage | Examples
35-
-- | -- | -- | --
36-
xx | Modern | Suitable for full UI internationalization |
37-
xx | Moderate | Suitable for “document content” internationalization, eg. in spreadsheet |
38-
xx | Basic | Suitable for locale selection, eg. choice of language on mobile phone |
39+
Count | With Script | Regional Variants | Level | Usage | Examples
40+
-- | -- | -- | -- | -- | --
41+
104 | 5 | 305 | Modern | Suitable for full UI internationalization | Afrikaans, shqip, አማርኛ, ‫العربية‬, հայերեն, অসমীয়া, azərbaycan
42+
13 | 0 | 1 | Moderate | Suitable for “document content” internationalization, eg. in spreadsheet | Akan, Cebuano, Māori, тоҷикӣ
43+
57 | 10 | 22 | Basic | Suitable for locale selection, eg. choice of language on mobile phone | भोजपुरी, बर’, डोगरी, eʋegbe, Gã, हरियाणवी
3944

4045
#### Changes
4146

4247
| ± | New Level | Locales |
4348
| -- | -- | -- |
44-
| 📈 | Modern | |
45-
| 📈 | Moderate | |
46-
| 📈 | Basic | |
47-
| 📉 | Basic* | |
49+
| 📈 | Modern | Quechua, Akan, Romansh, Chuvash, Kazakh (Arabic), Shan, Bashkir |
50+
| 📈 | Moderate | Esperanto, Anii |
51+
| 📈 | Basic | Sicilian, Tuvinian, Buriat, Piedmontese |
52+
| 📉 | Basic* | Baluchi (Latin), Kurdish |
4853

49-
\* Note: Each release, the number of items needed for Modern and Moderate increases. So locales without active contributors may drop down in coverage level.
54+
\* Note: Two locales dropped in coverage (📉), from Moderate to Basic.
55+
Each release, the number of items needed for Modern and Moderate increases.
56+
So locales without active contributors may drop down in coverage level.
5057

5158
For a full listing, see [Coverage Levels](https://unicode.org/cldr/charts/dev/supplemental/locale_coverage.html)
5259

@@ -62,28 +69,93 @@ See the [Modifications section](https://www.unicode.org/reports/tr35/proposed.ht
6269
## Data Changes
6370

6471
### DTD Changes
65-
66-
- `territories` attribute of `languageData` in [`supplementalData.xml`](https://github.com/unicode-org/cldr/blob/main/common/supplemental/supplementalData.xml) removed. While it was a nice proxy to count the most important territories for each language, it was not clear and it was ripe for mis-understanding. ([CLDR-5708][])
67-
68-
For a full listing, see [Delta DTDs](https://unicode.org/cldr/charts/dev/supplemental/dtd_deltas.html).
72+
**[TBD: Update from https://unicode.org/cldr/charts/48/supplemental/dtd_deltas.html, adding the meaning/impact of each]. Also consult the InfoHub vetter information.**
73+
#### ldml
74+
- `exemplarCharacters` added more `type` values:
75+
- numbers-auxiliary — for number characters that are not 'core' to the language, but sometimes used (like regular auxiliary)
76+
    - punctuation-auxiliary — for punctuation characters that are not 'core' to the language, but sometimes used (like regular auxiliary)
77+
- punctuation-person — for the limited set of punctuation characters used in person name fields: eg, "Jean-Luc", "MD, Ph.D."
78+
- `dateTimeFormat` added more `type` values:
79+
- relative — TBD
80+
- `gmtUnknownFormat` element was added — Indicating that the timezone is unknown (as opposed to absent from the format)
81+
- `language` added more `menu` values:
82+
- core — TBD
83+
- extension — TBD
84+
- `type` added more `scope` values:
85+
- core — TBD
86+
- `numbers` added `rationalFormats` sub-element:
87+
- TBD Add from sites page
88+
- `rbnf​/rulesetGrouping` added `rbnfRules` sub-element — TBD
89+
#### supplementalData
90+
- `era` — the range of `code` values now allows two letters before the first hyphen.
91+
- `languageData`: the `territories` attribute in [`supplementalData.xml`](https://github.com/unicode-org/cldr/blob/main/common/supplemental/supplementalData.xml) was deprecated and the data using it removed. The definition was unclear and prone to misunderstanding; the more detailed data is in `territoryInfo`. ([CLDR-5708][])
92+
- `usesMetazone` adds two new attributes `stdOffset` and `dstOffset` so that implementations can use either "vanguard" or "rearguard" TZDB data sources.
93+
- `numberingSystem` — Unicode 17 data was added.
94+
#### ldmlBCP47
95+
- `type` adds a new attribute `region`
96+
- `keyboard3@conformsTo` is updated to allow "48"
97+
98+
For a full listing, see [Delta DTDs].
99+
100+
### BCP47 Data Changes
101+
- `nu-tols` Numbering system for Tolong Siki digits
102+
- One additional zone: America/Coyhaique = tz-clcxq
103+
- Seven region attributes for determining regions for timezones
104+
- Three additional aliases
105+
106+
For a full listing, see [BCP47 Delta].
107+
108+
TBD, change these links to put the URLs at the bottom
69109

70110
### Supplemental Data Changes
71111

112+
#### Identifiers
113+
- Added aliases/deprecations for languages (dek, mnk, nte)
114+
- Updated to the latest language subtag registry, with various additions and deprecations
115+
- Updated to the ISO currency data, with various additions and deprecations
116+
- Added unit IDs part, part-per-1e6, part-per-1e9, cup-imperial, fluid-ounce-metric, with conversions
117+
- deprecated unit IDs permillion, portion, portion-per-1e9, 100-kilometer
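
The new `part`, `part-per-1e6`, and `part-per-1e9` unit IDs can be related through a common base. The sketch below is illustrative only: it assumes `part` is the base unit and derives the factors from the unit names themselves, rather than reading them from the CLDR unit conversion data. The table and function names are hypothetical.

```python
# Illustrative factors relative to "part"; assumed from the unit names,
# not copied from the CLDR conversion data.
TO_PART = {
    "part": 1.0,
    "part-per-1e6": 1e-6,
    "part-per-1e9": 1e-9,
}

def convert(value, source, target, factors=TO_PART):
    """Convert a value between two unit IDs that share the 'part' base."""
    if source not in factors or target not in factors:
        raise KeyError(f"unknown unit ID: {source!r} or {target!r}")
    # Go through the base unit: source -> part -> target.
    return value * factors[source] / factors[target]

if __name__ == "__main__":
    print(convert(250, "part-per-1e6", "part"))
    print(convert(2, "part-per-1e6", "part-per-1e9"))
```

Routing every conversion through one base unit keeps the table linear in the number of units, which is the same general shape as a base-unit conversion table.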
118+
119+
#### Language Data
72120
- [language_script.tsv](https://github.com/unicode-org/cldr/blob/main/tools/cldr-code/src/main/resources/org/unicode/cldr/util/data/language_script.tsv) updated to include only one "Primary" writing system for languages that used to have multiple options ([CLDR-18114][]). Notable changes are:
73121
    - Panjabi `pa` has Gurumukhi `Guru` as its primary script because widespread written usage is in the Gurumukhi script -- while most speakers are in Pakistan `PK`, written usage remains Gurumukhi.
74122
- Azerbaijani `az` and Northern Kurdish `ku` primarily are used in Latin `Latn`.
75123
- Chinese languages `zh`, `hak`, and `nan` are matched to Simplified Han writing `Hans` -- except Cantonese `yue`, which is known for a preference in Traditional Han writing `Hant`.
76124
    - Hassiniyya `mey` was missing significant data; it should be associated with the Arabic `Arab` writing system by default, not Latin `Latn`.
125+
- 5 new language distance values are added (for fallback to zh)
126+
- Substantial updates to Language Info: additional languages in countries; revised population values, writing percentages, literacy percentages, and official status values.
127+
128+
#### Likely Subtags
129+
- Many additions: see [Likely Subtags]
77130
- Errors in likely subtags addressed
78131
- The default language for Belarus `BY` is now Russian `ru`, reflecting modern usage. ([CLDR-14479][])
79132
- Literary Chinese `lzh` was written in Traditional Han writing `Hant`. ([CLDR-16715][])
80133
- Likely subtags updated because of prior mentioned primary script matches.
81134
- Northern Kurdish `ku` now matched to Cyrillic writing in the CIS countries. ([CLDR-18114][])
82135
- Hassiniyya `mey` updated to default to `mey_Arab_DZ` instead of `mey_Latn_SN` ([CLDR-18114][])
83-
- See other likely subtags updated in [the Supplemental Data Delta page](https://www.unicode.org/cldr/charts/48/delta/supplemental-data.html#Likely)
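
To see how such changes surface for implementers, here is a simplified sketch of an "add likely subtags" lookup. It is not the full algorithm from the LDML specification, and the table is a tiny hand-written sample: only `mey` → `mey_Arab_DZ` is stated in these notes, while the `und_BY` and `lzh` expansions are assumptions made for illustration. All names are hypothetical.

```python
# Tiny hand-written sample, NOT real CLDR data.
LIKELY = {
    "mey": "mey_Arab_DZ",    # stated in these release notes
    "und_BY": "ru_Cyrl_BY",  # assumed expansion of "default language for BY is ru"
    "lzh": "lzh_Hant_CN",    # script from these notes; region is an assumption
}

def parse(tag):
    """Split a tag such as 'ku_Arab_IQ' into (language, script, region)."""
    lang, script, region = "und", "", ""
    parts = tag.replace("-", "_").split("_")
    if parts and parts[0]:
        lang = parts[0].lower()
    for p in parts[1:]:
        if len(p) == 4 and p.isalpha():
            script = p.title()
        elif (len(p) == 2 and p.isalpha()) or (len(p) == 3 and p.isdigit()):
            region = p.upper()
    return lang, script, region

def add_likely_subtags(tag, table=LIKELY):
    """Fill in missing subtags; return None when no table entry matches."""
    lang, script, region = parse(tag)
    # Try the most specific key first, then progressively less specific ones.
    candidates = [
        "_".join(x for x in (lang, script, region) if x),
        "_".join(x for x in (lang, region) if x),
        "_".join(x for x in (lang, script) if x),
        lang,
    ]
    for key in candidates:
        if key in table:
            l2, s2, r2 = parse(table[key])
            # Subtags given explicitly in the input win over the table's values.
            return "_".join([l2 if lang == "und" else lang, script or s2, region or r2])
    return None

if __name__ == "__main__":
    for t in ("mey", "und_BY", "mey_Latn", "fr"):
        print(t, "->", add_likely_subtags(t))
```

In this sketch an explicit script in the input is preserved, so `mey_Latn` keeps `Latn` and only gains a region from the table entry.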
84136

137+
#### Calendars, Timezones, Dayperiods
138+
- Many updates and corrections for Metazone data
139+
- Many updates to calendars, including the removal of eras and adjustment to era start dates
140+
- Day periods for kok, scn, hi_Latn,
141+
142+
#### Plural Rules
143+
- additions for cv, ie, kok, sgs
144+
145+
#### Currencies
146+
- Updates to the latest ISO currencies
85147

86-
For a full listing, see [¤¤BCP47 Delta](https://unicode.org/cldr/charts/dev/delta/bcp47.html) and [¤¤Supplemental Delta](https://unicode.org/cldr/charts/dev/delta/supplemental-data.html)
148+
#### Weekdata
149+
- IS changed to firstDay=sun
150+
- ku_SY adding H and hB
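
As a rough illustration of what the `firstDay` change means for consumers, the sketch below looks up the first day of the week for a region and falls back to the world default `001`. The table is a hypothetical two-entry excerpt: the `IS` value comes from these notes, and everything else about the shape of the table is an assumption.

```python
# Hypothetical excerpt of first-day data; only the IS entry reflects these notes.
FIRST_DAY = {
    "001": "mon",  # world default
    "IS": "sun",   # changed in this release
}

def first_day(region, table=FIRST_DAY):
    """Return the first day of the week for a region, defaulting to '001'."""
    return table.get(region.upper(), table["001"])

if __name__ == "__main__":
    print(first_day("IS"))
    print(first_day("DE"))
```

Code that caches week data per region may need to refresh it so that Iceland picks up the new value.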
151+
152+
For a full listing, see [Supplemental Delta].
153+
154+
### Transforms
155+
- Fixed problem in Gujarati → Latin with ૰
156+
- Updated to latest Unicode 17 data for Han → Latin, with very many changes.
157+
158+
For a full listing, see [Transforms Delta].
87159

88160
### Locale Changes
89161

@@ -93,8 +165,21 @@ For a full listing, see [¤¤BCP47 Delta](https://unicode.org/cldr/charts/dev/de
93165
- `ku_Latn_IQ`: Kurdish (Kurmanji, Latin alphabet, Iraq)
94166
- `ku_Arab_IQ`: Kurdish (Kurmanji, Arabic writing, Iraq), default for Kurdish (Kurmanji, Arabic writing) `ku_Arab`
95167
- `ku_Arab_IR`: Kurdish (Kurmanji, Arabic writing, Iran)
96-
97-
For a full listing, see [Delta Data](https://unicode.org/cldr/charts/dev/delta/index.html)
168+
- Languages that reached Basic in the last release have their names translated in this release
169+
- Compound language names now have "core" and "extension" variants for use in menus (TBD, flesh this out)
170+
- Many features selectable with locale options now have "core" names, for better presentation in menus (TBD, flesh this out)
171+
- Calendar names, collation names, emoji options, currency formats, hour-cycle options, and so on.
172+
- To match ISO, translations for Sark (CQ) were added.
173+
- Recent or upcoming currency names are added (XCG, ZWG)
174+
- There are now combination formats for relative times (TBD, flesh this out)
175+
- Some additional flexible (aka available) date formats were added (TBD, flesh this out)
176+
- Many locales had seldom-used short timezone abbreviations (such as EST) removed, or moved to sublocales that use them.
177+
- The currency-number formats for alphaNextToNumber, noCurrency, and compact currency formats are now generated from other data for consistency. (TBD, flesh this out)
178+
- The tooling now makes it easier to see whether a space is breaking or non-breaking, or a thin version of either. The usage is now more consistent in many locales.
179+
- New emoji for Unicode 17 have names and search keywords added.
180+
- Additional guidance on translations was added, leading to refined translations or transcreations.
181+
182+
For a full listing, see [Delta Data].
98183

99184
### Message Format Specification
100185

@@ -134,7 +219,9 @@ For a full listing, see [Delta Data](https://unicode.org/cldr/charts/dev/delta/i
134219

135220
## Migration
136221

137-
- TBD
222+
- Number patterns that did not have a specific numberSystem (such as latn or arab) had been deprecated for many releases, and were finally removed.
223+
- **TBD — add many items!**
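
For anyone maintaining their own LDML-style data, a quick check for number-format containers that lack a `numberSystem` attribute might look like the sketch below. The sample XML is made up for illustration, the helper name is hypothetical, and the list of element names is an assumption about which containers are affected.

```python
import xml.etree.ElementTree as ET

# Made-up sample; not taken from any CLDR locale file.
SAMPLE = """<ldml>
  <numbers>
    <decimalFormats>
      <decimalFormatLength><decimalFormat><pattern>#,##0.###</pattern></decimalFormat></decimalFormatLength>
    </decimalFormats>
    <decimalFormats numberSystem="latn">
      <decimalFormatLength><decimalFormat><pattern>#,##0.###</pattern></decimalFormat></decimalFormatLength>
    </decimalFormats>
    <percentFormats numberSystem="arab">
      <percentFormatLength><percentFormat><pattern>#,##0%</pattern></percentFormat></percentFormatLength>
    </percentFormats>
  </numbers>
</ldml>"""

# Assumed set of format containers that carry a numberSystem attribute.
FORMAT_ELEMENTS = ("decimalFormats", "scientificFormats", "percentFormats", "currencyFormats")

def formats_without_number_system(xml_text):
    """List format containers under <numbers> that have no numberSystem attribute."""
    root = ET.fromstring(xml_text)
    numbers = root.find("numbers")
    if numbers is None:
        return []
    return [
        el.tag
        for el in numbers
        if el.tag in FORMAT_ELEMENTS and "numberSystem" not in el.attrib
    ]

if __name__ == "__main__":
    print(formats_without_number_system(SAMPLE))
```

Any element this reports would need an explicit `numberSystem` before it lines up with the data shipped in this release.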
224+
138225

139226
### V48 advance warnings
140227
The following changes are planned for CLDR 48. Please plan accordingly to avoid disruption.
@@ -181,4 +268,10 @@ For web pages with different views of CLDR data, see [http://cldr.unicode.org/in
181268
[CLDR-18219]: https://unicode-org.atlassian.net/browse/CLDR-18219
182269
[CLDR-18275]: https://unicode-org.atlassian.net/browse/CLDR-18275
183270
[CLDR-18311]: https://unicode-org.atlassian.net/browse/CLDR-18311
184-
[CLDR-11400]: https://unicode-org.atlassian.net/browse/CLDR-11400
271+
[CLDR-11400]: https://unicode-org.atlassian.net/browse/CLDR-11400
272+
[Delta DTDs]: https://unicode.org/cldr/charts/48/supplemental/dtd_deltas.html
273+
[BCP47 Delta]: https://unicode.org/cldr/charts/48/delta/bcp47.html
274+
[Supplemental Delta]: https://unicode.org/cldr/charts/48/delta/supplemental-data.html
275+
[Likely Subtags]: https://www.unicode.org/cldr/charts/48/delta/supplemental-data.html#Likely
276+
[Transforms Delta]: https://unicode.org/cldr/charts/48/delta/transforms.html
277+
[Delta Data]: https://unicode.org/cldr/charts/dev/delta/index.html
