Commit 419c050

CLDR-14305 Small cleanup for total order in unit normalization (unicode-org#5075)
1 parent 784f76d commit 419c050

1 file changed, +16 -8 lines changed

docs/ldml/tr35-info.md

Lines changed: 16 additions & 8 deletions
Original file line number · Diff line number · Diff line change
@@ -1138,19 +1138,22 @@ The unitType (as in “length-meter”) is not the same as the quantity. It is o
11381138

11391139
### <a name="Unit_Identifier_Normalization" href="#Unit_Identifier_Normalization">Unit Identifier Normalization</a>
11401140

1141-
There are many possible ways to construct complex units. For comparison of unit identifiers, an implementation can normalize in the following way:
1141+
There are many possible ways to construct complex units. For comparison of unit identifiers, and for formatting, an implementation can normalize in the following way:
11421142

11431143
1. Convert all but the first -per- to simple multiplication. The result then has the format of /numerator ( -per- denominator)?/
11441144
* foot-per-second-per-second ⇒ foot-per-second-second
11451145
2. Within each of the numerator and denominator:
11461146
3. Convert multiple instances of a unit into the appropriate power.
11471147
* foot-per-second-second ⇒ foot-per-square-second
11481148
* kilogram-meter-kilogram ⇒ meter-square-kilogram
1149-
4. For each single unit, disregarding prefixes and powers, get the order of the _simple_ unit among the `unitQuantity` elements in the [units.xml](https://github.com/unicode-org/cldr/blob/main/common/supplemental/units.xml). Sort the single units by that order, using a stable sort. If there are private-use single units, sort them after all the non-private use single units.
1150-
* meter-square-kilogram => square-kilogram-meter
1149+
4. For each single unit, disregarding prefixes and powers, get the order of the _simple_ unit among the `unitQuantity` elements in the [units.xml](https://github.com/unicode-org/cldr/blob/main/common/supplemental/units.xml).
1150+
Sort the single units by that order, using a stable sort.
1151+
If there are private-use single units, sort them after all the non-private use single units, in alphabetical order.
1152+
* meter-square-kilogram ⇒ square-kilogram-meter
11511153
* meter-square-gram ⇒ square-gram-meter
1152-
5. As an edge case, there could be two adjacent single units with the same _simple_ unit but different prefixes, such as _meter-kilometer_. In that case, sort the larger prefixes first, such as _kilometer-meter_ or _kibibyte-kilobyte_.
1153-
6. Within private-use single units, sort by the simple unit alphabetically.
1154+
5. As an edge case, there could be two adjacent single units with the same _simple_ unit but different prefixes such as _meter-kilometer_.
1155+
In that case, sort a sequence of those units by the larger prefixes first, so … megameter < … meter < … picometer < …
1156+
* meter-kilometer ⇒ kilometer-meter
11541157

11551158
The examples in #4 are due to the following ordering of the `unitQuantity` elements:
11561159

@@ -1161,6 +1164,8 @@ The examples in #4 are due to the following ordering of the `unitQuantity` eleme
11611164
4. …
11621165
```
11631166

1167+
Note that this uses an ordering of elements _within_ a unit identifier. It is different than an ordering _of_ separate units, such as within a table.
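As a rough illustration of normalization steps 1 to 5 as they read after this change, here is a minimal Python sketch. It is not the CLDR implementation. The rank table `ORDER_RANK`, the prefix table, and all function names are hypothetical; the ranks are chosen only so that the examples in the text come out as shown, and they do not reproduce the real `unitQuantity` order in units.xml. The sketch also assumes that any simple unit missing from the table is private-use, and it ignores simple units whose names contain a hyphen.

```python
# Hypothetical rank of simple units; NOT the real units.xml order.
ORDER_RANK = {"gram": 0, "meter": 1, "foot": 2, "second": 3}
# A few SI prefixes with their power-of-ten exponent (illustrative subset).
SI_PREFIX_EXPONENT = {"mega": 6, "kilo": 3, "milli": -3, "pico": -12}
POWER_WORDS = {"square": 2, "cubic": 3}
POWER_NAMES = {1: "", 2: "square-", 3: "cubic-"}


def parse_single_units(product):
    """Yield (prefix, simple_unit, power) for each single unit in a product."""
    power = 1
    for token in product.split("-"):
        if token in POWER_WORDS:
            power = POWER_WORDS[token]  # applies to the next token
            continue
        # A prefix is recognized only if the remainder is a known simple unit.
        prefix = next(
            (p for p in SI_PREFIX_EXPONENT
             if token.startswith(p) and token[len(p):] in ORDER_RANK),
            "",
        )
        yield prefix, token[len(prefix):], power
        power = 1


def normalize_product(product):
    """Steps 3 to 5: merge repeated units into powers, then sort."""
    powers = {}
    for prefix, simple, power in parse_single_units(product):
        key = (prefix, simple)
        powers[key] = powers.get(key, 0) + power

    def sort_key(key):
        prefix, simple = key
        larger_first = -SI_PREFIX_EXPONENT.get(prefix, 0)
        if simple in ORDER_RANK:
            return (0, ORDER_RANK[simple], larger_first)
        # Assumed private-use: after all known units, alphabetically.
        return (1, simple, larger_first)

    ordered = sorted(powers, key=sort_key)  # sorted() is stable
    return "-".join(
        POWER_NAMES.get(powers[k], f"pow{powers[k]}-") + k[0] + k[1]
        for k in ordered
    )


def normalize(identifier):
    """Step 1: keep only the first -per-, then normalize each side."""
    numerator, _, denominator = identifier.partition("-per-")
    parts = [normalize_product(numerator)]
    if denominator:
        parts.append(normalize_product(denominator.replace("-per-", "-")))
    return "-per-".join(parts)


print(normalize("foot-per-second-per-second"))
print(normalize("kilogram-meter-kilogram"))
print(normalize("meter-kilometer"))
```

The sort key is a tuple so that one stable sort covers all three rules: known units before private-use ones, then the table rank (or the name for private-use units), then larger prefixes first.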
1168+
11641169
## Mixed Units
11651170

11661171
Mixed units, or unit sequences, are units with the same base unit which are listed in sequence.
@@ -1180,12 +1185,15 @@ However, when all of the units would be omitted, then the highest unit is shown
11801185

11811186
Implementations may offer mechanisms to control the precision of the formatted mixed unit. Examples include, but are not limited to:
11821187
* An implementation could apply the precision of a number formatter to the final unit.
1183-
However, this mechanisim has a couple of disadvantages, such as matching precision across user preferences. For example, suppose the input amount is 1.5254 and the precision is 2 decimals.
1188+
However, this approach has a couple of disadvantages, such as matching precision across user preferences. For example, suppose the input amount is 1.5254 and the precision is 2 decimals.
11841189
* Locale A uses decimal degrees and gets 1.53°.
11851190
* Locale B uses degrees, minutes, seconds, and gets 1° 31′ 31.44″
11861191
* Locale B has an unnecessarily precise result: the equivalent of 1.52540 in precision.
1187-
* An implementation could allow a percentage precision;
1188-
thus 1612 meters with ±1% precision would be represented by **1 mile** rather than **1 mile 9 feet**.
1192+
* An implementation could match the decimal precision that would be used with just the first unit, such as the following:
1193+
* Two decimal digits with degrees is 1.53°, representing a range of 1.525° to 1.535°
1194+
* Only continue adding subunits (or fractions in the final unit) if the current amount is not within that range.
1195+
* 1° 31′ => 1.516666667, so it is not within that range, and we add another subunit
1196+
* 1° 31′ 31″ => 1.525277778, so it is within range, and we don't add any fractional units
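The precision-matching approach added above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, amounts are assumed to be non-negative, each subunit is truncated rather than rounded (which is what the 1° 31′ example implies), and the sketch stops at the last subunit instead of adding fractions to it.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def mixed_units_matching_precision(amount, decimals, factors=(60, 60)):
    """Return the integer parts of a mixed unit, adding subunits only
    until the value falls inside the range implied by rounding `amount`
    to `decimals` places in the first unit.

    `factors` holds the conversion factor from each unit to the next
    smaller one (degree -> arc-minute -> arc-second by default).
    """
    step = Decimal(1).scaleb(-decimals)            # e.g. 0.01 for 2 decimals
    rounded = amount.quantize(step, rounding=ROUND_HALF_EVEN)
    low, high = rounded - step / 2, rounded + step / 2

    whole = int(amount)                            # truncate toward zero
    parts = [whole]
    candidate = Decimal(whole)                     # value of the parts so far
    remainder = amount - whole
    scale = Decimal(1)

    for factor in factors:
        if low <= candidate <= high:
            break                                  # precise enough already
        remainder *= factor
        scale *= factor
        whole = int(remainder)
        parts.append(whole)
        candidate += Decimal(whole) / scale
        remainder -= whole
    return parts


print(mixed_units_matching_precision(Decimal("1.5254"), 2))
```

For the input 1.5254 with two decimals, the accepted range is 1.525 to 1.535, so the sketch continues past 1° 31′ and stops at 1° 31′ 31″, matching the worked example. `Decimal` is used instead of `float` so that the range boundaries are exact.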
11891197

11901198
The default behavior is to round the lowest unit to the nearest integer.
11911199
Thus 1.99959 degree-and-arc-minute-and-arc-second would be (before rounding) **1 degree 59 minutes 58.524 seconds**.