Commit c4b7914

CLDR-18848 Specify the algorithm for compact number formatting (#5097)
1 parent 49b183d commit c4b7914

2 files changed

+89
-35
lines changed

docs/ldml/tr35-modifications.md

Lines changed: 3 additions & 2 deletions
@@ -79,8 +79,9 @@ There is also now a mechanism for finding the region code from short timezone id
7979
### Numbers
8080
* [Plural rules syntax](tr35-numbers.md#plural-rules-syntax) Added substantial clarifications and new examples.
8181
The order of execution is also clearly specified.
82-
* [Rule-Based Number Formatting]() Added a full specification.
83-
The rules have been converted to a “flat” format, which is easier for clients to handle (the old format will be retained for one more release).
82+
* [Compact Number Formats](tr35-numbers.md#compact-number-formats) Specified the mechanism for formatting compact numbers more precisely.
83+
* [Rule-Based Number Formatting]() The rules are also now represented by a new XML structure with a “flat” format,
84+
which is easier for clients to handle (the old format will be retained for one more release).
8485

8586
### Units of Measurement
8687
* [Unit Syntax](tr35-general.md#unit-syntax) Simplified the EBNF `product_unit` and added an additional well-formedness constraint for mixed units.

docs/ldml/tr35-numbers.md

Lines changed: 86 additions & 33 deletions
@@ -372,50 +372,102 @@ A pattern `type` attribute is used for _compact number formats_, such as the fol
372372

373373
```xml
374374
<decimalFormatLength type="long">
375-
<decimalFormat>
376-
<pattern type="1000" count="one">0 millier</pattern>
377-
<pattern type="1000" count="other">0 milliers</pattern>
378-
<pattern type="10000" count="one">00 mille</pattern>
379-
<pattern type="10000" count="other">00 mille</pattern>
380-
<pattern type="100000" count="one">000 mille</pattern>
381-
<pattern type="100000" count="other">000 mille</pattern>
382-
<pattern type="1000000" count="one">0 million</pattern>
383-
<pattern type="1000000" count="other">0 millions</pattern>
384-
385-
</decimalFormat>
375+
<decimalFormat>
376+
<pattern type="1000" count="one">0 thousand</pattern>
377+
<pattern type="1000" count="other">0 thousand</pattern>
378+
<pattern type="10000" count="one">00 thousand</pattern>
379+
<pattern type="10000" count="other">00 thousand</pattern>
380+
<pattern type="100000" count="one">000 thousand</pattern>
381+
<pattern type="100000" count="other">000 thousand</pattern>
382+
<pattern type="1000000" count="one">0 million</pattern>
383+
<pattern type="1000000" count="other">0 million</pattern>
384+
<pattern type="10000000" count="one">00 million</pattern>
385+
<pattern type="10000000" count="other">00 million</pattern>
386+
387+
</decimalFormat>
386388
</decimalFormatLength>
387389
<decimalFormatLength type="short">
388-
<decimalFormat>
389-
<pattern type="1000" count="one">0 K</pattern>
390-
<pattern type="1000" count="other">0 K</pattern>
391-
<pattern type="10000" count="one">00 K</pattern>
392-
<pattern type="10000" count="other">00 K</pattern>
393-
<pattern type="100000" count="one">000 K</pattern>
394-
<pattern type="100000" count="other">000 K</pattern>
395-
<pattern type="1000000" count="one">0 M</pattern>
396-
<pattern type="1000000" count="other">0 M</pattern>
397-
398-
</decimalFormat>
390+
<decimalFormat>
391+
<pattern type="1000" count="one">0K</pattern>
392+
<pattern type="1000" count="other">0K</pattern>
393+
<pattern type="10000" count="one">00K</pattern>
394+
<pattern type="10000" count="other">00K</pattern>
395+
<pattern type="100000" count="one">000K</pattern>
396+
<pattern type="100000" count="other">000K</pattern>
397+
<pattern type="1000000" count="one">0M</pattern>
398+
<pattern type="1000000" count="other">0M</pattern>
399+
<pattern type="10000000" count="one">00M</pattern>
400+
<pattern type="10000000" count="other">00M</pattern>
401+
402+
</decimalFormat>
399403
</decimalFormatLength>
400404
401405
<currencyFormatLength type="short">
402406
<currencyFormat type="standard">
403-
<pattern type="1000" count="one">0 K ¤</pattern>
404-
<pattern type="1000" count="other">0 K ¤</pattern>
405-
<pattern type="10000" count="one">00 K ¤</pattern>
406-
<pattern type="10000" count="other">00 K ¤</pattern>
407-
<pattern type="100000" count="one">000 K ¤</pattern>
408-
<pattern type="100000" count="other">000 K ¤</pattern>
409-
<pattern type="1000000" count="one">0 M ¤</pattern>
410-
<pattern type="1000000" count="other">0 M ¤</pattern>
411-
407+
<pattern type="1000" count="one">¤0K</pattern>
408+
<pattern type="1000" count="one" alt="alphaNextToNumber">¤ 0K</pattern>
409+
<pattern type="1000" count="other">¤0K</pattern>
410+
<pattern type="1000" count="other" alt="alphaNextToNumber">¤ 0K</pattern>
411+
<pattern type="10000" count="one">¤00K</pattern>
412+
<pattern type="10000" count="one" alt="alphaNextToNumber">¤ 00K</pattern>
413+
<pattern type="10000" count="other">¤00K</pattern>
414+
<pattern type="10000" count="other" alt="alphaNextToNumber">¤ 00K</pattern>
415+
<pattern type="100000" count="one">¤000K</pattern>
416+
<pattern type="100000" count="one" alt="alphaNextToNumber">¤ 000K</pattern>
417+
<pattern type="100000" count="other">¤000K</pattern>
418+
<pattern type="100000" count="other" alt="alphaNextToNumber">¤ 000K</pattern>
419+
<pattern type="1000000" count="one">¤0M</pattern>
420+
<pattern type="1000000" count="one" alt="alphaNextToNumber">¤ 0M</pattern>
421+
<pattern type="1000000" count="other">¤0M</pattern>
422+
<pattern type="1000000" count="other" alt="alphaNextToNumber">¤ 0M</pattern>
423+
<pattern type="10000000" count="one">¤00M</pattern>
424+
<pattern type="10000000" count="one" alt="alphaNextToNumber">¤ 00M</pattern>
425+
<pattern type="10000000" count="other">¤00M</pattern>
426+
<pattern type="10000000" count="other" alt="alphaNextToNumber">¤ 00M</pattern> …
412427
</currencyFormat>
413428
</currencyFormatLength>
414429
```
415430

416431
Formats can be supplied for numbers (as above) or for currencies or other units. They can also be used with ranges of numbers, resulting in formatting strings like “$10K” or “$3–7M”.
417432

418-
To format a number N, the greatest type less than or equal to N is used, with the appropriate plural category. N is divided by the type, after removing the number of zeros in the pattern, less 1. APIs supporting this format should provide control over the number of significant or fraction digits.
433+
To format a number N, use the following steps:
434+
435+
Notes:
436+
- A _letter grapheme cluster_ is a grapheme cluster that starts with a letter and then 0 or more combining marks.
437+
 For example, each of the following is a _letter grapheme cluster_: \<q>, \<q, _combining ring above_>, \<q, _combining ring above_, _acute accent_>.
438+
- All of the pattern elements with the same type must have the same number of zeros in the pattern element value.
439+
- The examples use N = 123456, the currency = CAD, and the currency symbol string = "$CA"
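The _letter grapheme cluster_ test used in step 4 can be approximated with Unicode general categories. The following is a minimal sketch, not part of the specification: the function names are hypothetical, and it assumes that checking for a Letter (L*) code point followed by zero or more Mark (M*) code points is close enough, rather than running full grapheme cluster segmentation.

```python
import unicodedata


def ends_with_letter_cluster(s: str) -> bool:
    """True if s ends with a letter followed by zero or more combining marks."""
    i = len(s)
    # Skip trailing combining marks (general categories Mn, Mc, Me).
    while i > 0 and unicodedata.category(s[i - 1]).startswith("M"):
        i -= 1
    # The code point before the marks must be a letter (Lu, Ll, Lt, Lm, Lo).
    return i > 0 and unicodedata.category(s[i - 1]).startswith("L")


def starts_with_letter(s: str) -> bool:
    """True if the first code point of s is a letter."""
    return bool(s) and unicodedata.category(s[0]).startswith("L")


print(ends_with_letter_cluster("$CA"))  # True: ends with the letter "A"
print(ends_with_letter_cluster("CA$"))  # False: ends with a currency sign
print(starts_with_letter("CA$"))        # True: starts with the letter "C"
```

With these helpers, a symbol such as "$CA" placed immediately to the left of the digits would qualify for the `alt=alphaNextToNumber` pattern, while "CA$" in the same position would not.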
440+
441+
 1. Let P be the pattern element with the greatest type less than or equal to N, and any count value.
442+
* P = `<pattern type="100000" count="**one**">¤000K</pattern>`
443+
 2. Let V be the element value of P.
444+
* V = "¤000K"
445+
3. If the element value of P is "0", then use the corresponding non-compact number formatting instead, and skip the rest of these steps — but adjust the precision as described below.
446+
* For example, instead of `currencyFormat` `<pattern type="10000" count="one">¤00K</pattern>`, use `<pattern>¤#,##0.00</pattern>`.
447+
4. If P is a currency format, look at the currency symbol string, and the position of the currency symbol ¤ in the pattern element value.
448+
If ¤ is immediately to the left of a 0 and the currency string ends with a _letter grapheme cluster_ (eg, "$CA"),
449+
 or immediately to the right of a 0 and the currency string starts with a letter (eg, "CA$"),
450+
then switch to the `alt=alphaNextToNumber` pattern, if there is one.
451+
 * P = `<pattern type="100000" count="**one**" alt="alphaNextToNumber">¤ 000K</pattern>` // with the currency symbol string "$CA"
452+
* V = "¤ 000K"
453+
5. Let Z be the number of 0 characters in V, minus 1.
454+
* Z = 2
455+
6. Let T be the numeric value of the `type` attribute value, after removing the final Z zeros.
456+
* "100000" removing "00" = "1000"
457+
* T = 1000
458+
7. Let N' be N / T
459+
 * N' = 123.456
460+
 8. Determine the plural category of N', based on the numeric precision settings (the min/max number of significant or fraction digits), and switch P and V to the pattern element with that count value if necessary.
461+
 * In this case, the plural category of 123.456 in English with any precision is "other", so P and V become:
462+
* P = `<pattern type="100000" count="**other**" alt="alphaNextToNumber">¤ 000K</pattern>`
463+
* V = "¤ 000K"
464+
* For the short compact formats, it doesn't make a difference for English, but may for other locales!
465+
9. Let V' be the same as V, but replacing that sequence of zeros by "{0}".
466+
* V' = "¤ {0}K"
467+
10. Let F be N' formatted according to V' and the numeric precision settings.
468+
* F = "$CA 123K" // where the precision is min = max = 3 significant digits
469+
* F = "$CA 123.4K" // where the precision is min = max = 1 fraction digit
470+
419471

420472
The default pattern for any type that is not supplied is the special value “0”, as in the following. The value “0” must be used when a child locale overrides a parent locale to drop the compact pattern for that type and use the default pattern.
421473

@@ -427,7 +479,8 @@ With the data above, N=12345 matches `<pattern type="10000" count="other">00 K</
427479

428480
Formatting 1200 in USD would result in “1.2 K $”, while 990 implicitly maps to the special value “0”, which maps to `<currencyFormat type="standard"><pattern>#,##0.00 ¤</pattern>`, and would result in simply “990 $”.
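The selection rule behind these two results (greatest type less than or equal to N, otherwise the special value "0") can be sketched as follows. The function name and the pattern table are assumptions made for illustration; the table ignores plural categories and only loosely follows the style of the data discussed above.

```python
from typing import Mapping


def select_pattern(n: float, patterns: Mapping[int, str]) -> str:
    """Return the pattern for the greatest type <= n, or "0" if none applies."""
    eligible = [t for t in patterns if t <= n]
    return patterns[max(eligible)] if eligible else "0"


# Illustrative pattern values keyed by type; plural categories are ignored.
patterns = {1000: "0 K ¤", 10000: "00 K ¤", 100000: "000 K ¤", 1000000: "0 M ¤"}
print(select_pattern(1200, patterns))  # 0 K ¤
print(select_pattern(990, patterns))   # 0
```

A returned "0" signals that the caller should fall back to the corresponding non-compact format.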
429481

430-
The short format is designed for UI environments where space is at a premium, and should ideally result in a formatted string no more than about 6 em wide (with no fractional digits).
482+
The short non-currency format is designed for UI environments where space is at a premium, and should ideally result in a formatted string no more than about 6 em wide (with no fractional digits).
483+
The short currency format will include currency symbols, and should ideally be no more than 8 em in width.
431484

432485
#### <a name="Currency_Formats" href="#Currency_Formats">Currency Formats</a>
433486
