### <a name="locale_display_name_algorithm" href="#locale_display_name_algorithm">Locale Display Name Algorithm</a>
170
+
The `language` element also has an `alt="menu"` option, which allows related languages to be sorted together.
171
171
172
-
A locale display name LDN is generated for a locale identifier L in the following way. First, convert the locale identifier to *canonical syntax* per **[Part 1, Canonical Unicode Locale Identifiers](tr35.md#Canonical_Unicode_Locale_Identifiers)**. That will put the subtags in a defined order, and replace aliases by their canonical counterparts. (That defined order is followed in the processing below.)
However, when `localePattern`s are used, the names start to get complicated. There is an additional `menu` attribute, with two values: `core` and `extension`. For example:
173
177
174
-
Then follow each of the following steps for the subtags in L, building a base name LDN and a list of qualifying strings LQS.
Where there is a match for a subtag, disregard that subtag from L and add the element value to LDN or LQS as described below. If there is no match for a subtag, use the fallback pattern with the subtag instead.
192
+
The core part can be used as the language name, with the extension going into the `localePattern`, such as in the following illustration of part of a menu:
177
193
178
-
Once LDN and LQS are built, return the following based on the length of LQS.
194
+
| Language |
195
+
| ---- |
196
+
| … |
197
+
| Kashmiri |
198
+
| Kurdish (Kurmanji, Latin) |
199
+
| Kurdish (Central, Arabic) |
200
+
| Kurdish (Southern, Arabic) |
201
+
| Kyrgyz |
202
+
| … |
179
203
180
-
<!-- HTML: no header -->
181
-
<table><tbody>
182
-
<tr><td>0</td><td>return LDN</td></tr>
183
-
<tr><td>1</td><td>use the <localePattern> to compose the result LDN from LDN and LQS[0], and return it.</td></tr>
184
-
<tr><td>>1</td><td>use the <localeSeparator> element value to join the elements of the list into LDN2, then use the <localePattern> to compose the result LDN from LDN and LDN2, and return it.</td></tr>
185
-
</tbody></table>
204
+
### <a name="locale_display_name_algorithm" href="#locale_display_name_algorithm">Locale Display Name Algorithm</a>
205
+
206
+
A locale display name LDN is generated for a locale identifier L in the following way.
207
+
1. Convert the locale identifier to *canonical syntax* per **[Part 1, Canonical Unicode Locale Identifiers](tr35.md#Canonical_Unicode_Locale_Identifiers)**.
208
+
That will put the subtags in a defined order, and replace aliases by their canonical counterparts. (That defined order is followed in the processing below.)
209
+
2. Build a base name LDN from the language, possibly also some other subtags, taking into account the parameters listed below.
210
+
* The language name uses the longest match, dropping all fields that match. For example:
211
+
* With L = "nl_Cyrl_BE", if there is a `<language type="nl_BE">`Flemish`</language>`, the language name is set to "Flemish", and the "BE" is ignored in step 4.
212
+
* With L = "ca_fonipa_valencia", if there is a `<language type="ca_valencia">`Valencian`</language>`, the language name is set to "Valencian", and the subtag "valencia" is ignored in step 4.
213
+
4. Build a list of qualifying strings LQS.
214
+
1. For each remaining language identifier subtag (script, region, or variant):
215
+
1. Where there is a match for a subtag, disregard that subtag from L and add the name of the subtag to LDN or LQS as described below.
216
+
2. If there is no match for a subtag, use the fallback pattern with the subtag instead.
217
+
2. For any remaining `-u` or `-t` key-value pairs, there are two options (based on the parameters; the first is the default):
218
+
1. `WholeKeyValue`: Add the formatted key-value, OR
219
+
2. `SeparateKeyValue`: Add a string created from the formatted key and the formatted value using `scope="core"`.
220
+
5. Once LDN and LQS are built, return the following based on the length of LQS.
221
+
222
+
| Length | Processing |
223
+
| :---- | :---- |
224
+
| 0 | return LDN |
225
+
| 1 | use the \<localePattern\> to compose the result LDN from LDN and LQS\[0\], and return it. |
226
+
|\>1 | use the \<localeSeparator\> element value to join the elements of the list into LDN2, then use the \<localePattern\> to compose the result LDN from LDN and LDN2, and return it. |
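
The final composition step in the table above can be sketched as follows. This is only an illustration, not a reference implementation: the pattern strings are assumed English values for `<localePattern>` and `<localeSeparator>`, and the helper names `apply_pattern` and `compose_display_name` are hypothetical.

```python
import re

# Assumed English values; real values come from the locale's data.
LOCALE_PATTERN = "{0} ({1})"    # <localePattern>
LOCALE_SEPARATOR = "{0}, {1}"   # <localeSeparator>


def apply_pattern(pattern, *args):
    """Substitute {0}, {1}, ... placeholders in a single pass."""
    return re.sub(r"\{(\d+)\}", lambda m: args[int(m.group(1))], pattern)


def compose_display_name(ldn, lqs):
    """Combine the base name LDN with the qualifying strings LQS."""
    if not lqs:
        # Length 0: return LDN unchanged.
        return ldn
    # Length 1 or more: join the qualifiers pairwise with the separator.
    joined = lqs[0]
    for qualifier in lqs[1:]:
        joined = apply_pattern(LOCALE_SEPARATOR, joined, qualifier)
    # Then wrap the base name and the joined qualifiers in the locale pattern.
    return apply_pattern(LOCALE_PATTERN, ldn, joined)
```

For example, `compose_display_name("Kurdish", ["Kurmanji", "Latin"])` produces the menu entry "Kurdish (Kurmanji, Latin)" shown earlier.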
186
227
187
-
The processing can be controlled via the following parameters.
228
+
The processing can be controlled via the following parameters (the names of the parameters are only illustrative):
188
229
189
230
* `CombineLanguage`: boolean
190
231
* Example: with `CombineLanguage = true`, the bold value below is picked.
191
-
* `<language type="nl">Dutch</language>`
232
+
* `<language type="nl">`Dutch`</language>`
192
233
* **`<language type="nl_BE">Flemish</language>`**
193
234
* `PreferAlt`: map from element to preferred alt value, picking the bold value below.
194
235
* Example: the `PreferAlt` contains `{"language"="short"}`:
In addition, the input locale identifier could be minimized (see [Part 1: Likely Subtags](tr35.md#Likely_Subtags)) before generating the LDN. Selective minimization is often the best choice. For example, in a menu list it is often clearer to show the region if there are any regional variants. Thus the user would just see \["Spanish"\] for es if that is the only supported Spanish, but where es-MX is also listed, the user would see \["Spanish (Spain)", "Spanish (Mexico)"\].
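
A minimal sketch of selective minimization for a menu, under stated assumptions: each locale is already expanded to a (language, region) pair (for example via likely subtags), the name tables are small hypothetical stand-ins for locale data, and the English `"{0} ({1})"` pattern is assumed. The function name `menu_names` is illustrative only.

```python
from collections import Counter

# Hypothetical stand-ins for the display name data of a locale.
LANGUAGE_NAMES = {"es": "Spanish", "fr": "French"}
REGION_NAMES = {"ES": "Spain", "MX": "Mexico", "FR": "France"}


def menu_names(locales):
    """Show the region only when a language has several regional variants.

    `locales` is a list of (language, region) pairs.
    """
    per_language = Counter(language for language, _ in locales)
    names = []
    for language, region in locales:
        base = LANGUAGE_NAMES[language]
        if per_language[language] > 1:
            # Several variants are listed: qualify each one with its region.
            names.append("{0} ({1})".format(base, REGION_NAMES[region]))
        else:
            # Only one variant: the minimized name is enough.
            names.append(base)
    return names
```

With only `("es", "ES")` supported the menu shows "Spanish"; adding `("es", "MX")` switches both entries to the region-qualified form.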
199
255
256
+
The key-type `scope="core"` is also useful in menus. For example, if a menu or pull-down is offering different choices of calendars, it is cleaner to use the key value for the name of the menu (e.g., "Calendar"), and use the `scope="core"` values for the choices. Thus:
docs/ldml/tr35-modifications.md
Lines changed: 6 additions & 3 deletions
@@ -57,7 +57,9 @@ The LDML specification is divided into the following parts:
57
57
58
58
**Changes in LDML Version 48 (Differences from Version 47)**
59
59
60
-
### Locale Identifiers
60
+
### Locale Identifiers and Names
61
+
* [Display Name Elements](tr35-general.md#display-name-elements) Described the usage of the `language` element `menu` values `core` and `extension`, and `alt="menu"`.
62
+
Also revamped the description of how to construct names for locale IDs, for clarity.
61
63
* [Special Script Codes](tr35.md#special-script-codes) Added the `Hntl` compound script. (This is also reflected in the `<scriptData>` elements in supplementalData.xml.)
62
64
* [Likely Subtags](tr35.md#likely-subtags) Changed the Canonicalize step to point to the section on canonicalization.
63
65
* [Unicode Locale Identifier](tr35.md#unicode-locale-identifier) Changed the `attribute` component in the EBNF to be `uattribute` for consistency with `ufield`, etc.
@@ -84,8 +86,9 @@ There is also now a mechanism for finding the region code from short timezone id
84
86
* [Plural rules syntax](tr35-numbers.md#plural-rules-syntax) Added substantial clarifications and new examples.
85
87
The order of execution is also clearly specified.
86
88
* [Compact Number Formats](tr35-numbers.md#compact-number-formats) Specified the mechanism for formatting compact numbers more precisely.
87
-
* [Rule-Based Number Formatting]() The rules are also now represented by a new XML structure with a “flat” format,
88
-
which is easier for clients to handle (the old format will be retained for one more release).
89
+
* [Rule-Based Number Formatting](tr35-numbers.md#) Added a full specification.
90
+
The rules have been converted to a “flat” format, which is easier for clients to handle (the old format will be retained for one more release).
91
+
* [Rational Numbers](tr35-numbers.md#rational-numbers) Added support for formatting fractions like 5½.
89
92
90
93
### Units of Measurement
91
94
* [Unit Syntax](tr35-general.md#unit-syntax) Simplified the EBNF `product_unit` and added an additional well-formedness constraint for mixed units.
docs/ldml/tr35-numbers.md
Lines changed: 56 additions & 0 deletions
@@ -852,6 +852,62 @@ To specify a rounding increment in a pattern, include the increment in the patte
852
852
853
853
Single quotes (**'**) enclose bits of the pattern that should be treated literally. Inside a quoted string, two single quotes ('') are replaced with a single one ('). For example: `'X '`#`' Q '` -> **X 1939 Q** (Literal strings `shaded`.)
The rational number patterns specify the formatting of rational fractions in different languages.
886
+
Rational fractions contain a numerator and denominator, such as ½, and may also have an integer part, such as 5½.
887
+
There are two different “combination patterns”, because fonts and rendering systems sometimes don’t properly support fractions (such as displaying 5 1/2),
888
+
so two patterns are provided: one with a space and one without.
889
+
The choice of which to use depends on the rendering system and font support available, as described below.
890
+
891
+
Here are the English values as an example, with a short description of the purpose of each field:
892
+
893
+
| Code | Default Value | Description |
894
+
| :---- | :---: | :---- |
895
+
|`rationalPattern`| {0}⁄{1} | The format for a rational fraction with arbitrary numerator and denominator; the English pattern uses the Unicode character ‘⁄’ U+2044 FRACTION SLASH which causes composition of fractions such as 22⁄7, when supported properly by rendering systems and fonts. |
896
+
|`integerAndRationalPattern`| {0} {1} | The format for combining an integer with a rational fraction that is composed using the `Rational` pattern; the English pattern uses U+202F NARROW NO-BREAK SPACE (NNBSP) to produce a _non-breaking thin space_. |
897
+
|`integerAndRationalPattern-superSub`| {0}{1} | The format for combining an integer with a rational fraction that is composed using the `Rational` pattern; the English pattern uses U+2060 WORD JOINER, a _zero-width no-break space_. |
898
+
|`rationalUsage`| sometimes | An indication of the extent to which rational fractions are used in the locale; either `never` or `sometimes`. |
899
+
900
+
The `integerAndRationalPattern-superSub` is used for an integer with fraction. However, some fonts and rendering systems don’t properly handle the fraction slash, and the user would see something like **51/2** (fifty-one halves) when **5½** is desired\!
901
+
Therefore, the `integerAndRationalPattern` is available also, which forces a visible space between the integer and fraction (**5 ½**).
902
+
(In some languages, there may always be a space: in that case the patterns for `integerAndRationalPattern` and `integerAndRationalPattern-superSub` will be identical.)
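
The choice between the two combination patterns can be sketched as below. This is an illustration only: the pattern strings follow the English values described in the table above, while the function name and the `fraction_slash_supported` flag are hypothetical (how an implementation detects rendering support is not specified here).

```python
# English values as described in the table above.
RATIONAL_PATTERN = "{0}\u2044{1}"               # U+2044 FRACTION SLASH
INTEGER_AND_RATIONAL = "{0}\u202f{1}"           # U+202F NARROW NO-BREAK SPACE
INTEGER_AND_RATIONAL_SUPERSUB = "{0}\u2060{1}"  # U+2060 WORD JOINER


def format_mixed_number(integer, numerator, denominator, fraction_slash_supported):
    """Format an integer with a fraction, e.g. 5 and 1/2."""
    fraction = RATIONAL_PATTERN.format(numerator, denominator)
    if fraction_slash_supported:
        # The renderer composes the fraction, so no visible space is needed.
        pattern = INTEGER_AND_RATIONAL_SUPERSUB
    else:
        # Force a visible (narrow, non-breaking) space to avoid "51/2".
        pattern = INTEGER_AND_RATIONAL
    return pattern.format(integer, fraction)
```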
903
+
904
+
In environments where the rendering system and font can't be trusted to handle U+2044 FRACTION SLASH properly, there are a few techniques available to have a better rendering than 22/7:
905
+
- Use markup such as HTML `<sup>` and `<sub>` for the numerator and denominator.
906
+
- Where markup is not available and the numbering system is `latn` (ASCII digits 0..9), there are two other choices:
907
+
- If the fraction happens to match one of the precomposed fractions available in Unicode, that character can be used (e.g., ½ ⅔ ⅗ ⅐ ⅝ ¾ …)
908
+
- The Latin superscript digits (¹ ² ³ …) and subscript digits (₁ ₂ ₃ …) can be used with the U+2044 FRACTION SLASH, such as ²²⁄₇.
909
+
- In both cases, some fonts don't have consistent support for these characters, and so the sizes and positioning may vary.
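
The two plain-text fallbacks in the list above can be sketched as follows, assuming the `latn` numbering system. The table of precomposed fractions contains only the examples listed above and is not exhaustive; the function name is hypothetical.

```python
SUPERSCRIPT_DIGITS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
FRACTION_SLASH = "\u2044"

# Only the examples from the text; Unicode has a few more precomposed fractions.
PRECOMPOSED = {
    (1, 2): "½", (2, 3): "⅔", (3, 5): "⅗",
    (1, 7): "⅐", (5, 8): "⅝", (3, 4): "¾",
}


def format_fraction_plain(numerator, denominator):
    """Format a fraction without markup, for `latn` digits only."""
    precomposed = PRECOMPOSED.get((numerator, denominator))
    if precomposed is not None:
        # Prefer a single precomposed character when one exists.
        return precomposed
    # Otherwise use superscript numerator + fraction slash + subscript denominator.
    return (
        str(numerator).translate(SUPERSCRIPT_DIGITS)
        + FRACTION_SLASH
        + str(denominator).translate(SUBSCRIPT_DIGITS)
    )
```

As noted above, the result may still vary in size and positioning depending on font support for these characters.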
docs/site/downloads/cldr-48.md
Lines changed: 14 additions & 1 deletion
@@ -91,6 +91,7 @@ See the [Modifications section](https://www.unicode.org/reports/tr35/dev/tr35-mo
91
91
#### General
92
92
- Languages that reached Basic in the last release have their names translated at Modern Coverage in this release.
93
93
- Compound language names now have "core" and "extension" variants for more uniform formats in menus and lists.
94
+
The description of how to format names for locale IDs has been extended and clarified.
94
95
- For example, that allows the Kurdish variants to have a uniform format where more than Kurmanji is displayed.
95
96
- Kashmiri
96
97
- Kurdish (Kurmanji, Latin)
@@ -286,15 +287,27 @@ The following files are new in the release:
286
287
287
288
- TBD
288
289
290
+
----
291
+
289
292
## Migration
290
293
291
294
- Number patterns that did not have a specific numberSystem (such as `latn` or `arab`) had been deprecated for many releases, and were finally removed.
292
295
- Additionally, the language and territory data in `languageData` and `territoryInfo` received significant updates to improve accuracy and maintainability [CLDR-18087]
293
296
- The likely language for Belarus changed to Russian [CLDR-14479]
297
+
- The unit identifiers for the following changed for consistency.
298
+
As with all such changes, aliases are available to permit parsing and formatting to work across versions.
299
+
- `permillion` changed to `part-per-1e6`; English values remain “parts per million”, “{0} part per million”, etc.
300
+
- `portion-per-1e9` changed to `part-per-1e9`; English values remain “parts per billion”, “{0} part per billion”, etc.
301
+
- `part` used for constructing arbitrary concentrations such as “parts per 100,000”; English values “parts”, “{0} part”, etc.
302
+
- English and/or root names of many exemplar cities and some metazones changed.
303
+
This was typically to move towards the official spelling in the country in question, such as retaining accents, or to add landscape terms such as “Island”.
304
+
For example: El Aaiun → El Aaiún; Casey → Casey Station; Hovd Time → Khovd Time.
305
+
- A few additional availableFormat and interval format patterns have been added, such as GyMEd and Hv, to fill some gaps.
306
+
- The metazone for Hawaii has changed.
294
307
- **TBD Additional items plus future guidance will be added before the spec-beta.**
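
For clients that store unit identifiers, the renames listed in the migration notes above could be handled with a small alias table. This is a sketch under assumptions: the table covers only the two renames named above, and the function name is hypothetical; real alias data ships with CLDR.

```python
# Old identifier -> new identifier, limited to the renames listed above.
UNIT_ALIASES = {
    "permillion": "part-per-1e6",
    "portion-per-1e9": "part-per-1e9",
}


def canonical_unit_id(unit_id):
    """Map a pre-v48 unit identifier to its v48 form; pass others through."""
    return UNIT_ALIASES.get(unit_id, unit_id)
```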
295
308
296
-
297
309
### V49 advance warnings
310
+
298
311
The following changes are planned for CLDR 49. Please plan accordingly to avoid disruption.
299
312
- [CLDR-18303][] H24 will be deprecated. If it is encountered, it will have H23 behavior. There is no known intentional usage of H24. If you have a current need for H24 instead of H23, please comment on [CLDR-18303][].
300
313
- The default week numbering changes to ISO instead of being based on the calendar week starting in CLDR 48 [CLDR-18275]. The calendar week will be more clearly targeted at matching usage in displayed month calendars.