@@ -7,10 +7,12 @@ Status: **Accepted**
77 <dl>
88 <dt>Contributors</dt>
99 <dd>@eemeli</dd>
10+ <dd>@aphillips</dd>
1011 <dt>First proposed</dt>
1112 <dd>2023-09-06</dd>
1213 <dt>Pull Request</dt>
1314 <dd><a href="https://github.com/unicode-org/message-format-wg/pull/471">#471</a></dd>
15+ <dd><a href="https://github.com/unicode-org/message-format-wg/pull/621">#621</a></dd>
1416 </dl>
1517</details >
1618
@@ -45,6 +47,7 @@ but ordinal rules use `one` (_1st_, _21st_, etc.), `two` (_2nd_, _22nd_, etc.),
4547Additionally,
4648MF1 provides `ChoiceFormat` selection based on a complex rule set
4749(and which allows determining if a number falls into a specific range).
50+ This capability is not supported by the default functions of MF2.
4851
4952Both JS and ICU PluralRules implementations provide for determining the plural category
5053of a range based on its start and end values.
@@ -92,44 +95,303 @@ ICU MF1 messages using `plural` and `selectordinal` should be representable in M
9295
9396## Proposed Design
9497
95- Given that we already have a `:number`,
96- it makes sense to add a `<matchSignature>` to it with an option
98+ ### Number Selection
9799
98- ```xml
99- <option name="select" values="plural ordinal exact" default="plural" />
100- ```
100+ Number selection has three modes:
101+ - `exact` selection matches the operand to explicit numeric keys exactly
102+ - `plural` selection matches the operand to explicit numeric keys exactly
103+ or to plural rule categories if there is no explicit match
104+ - `ordinal` selection matches the operand to explicit numeric keys exactly
105+ or to ordinal rule categories if there is no explicit match
106+
107+
108+ ### Functions
109+
110+ The following functions use numeric selection:
111+
112+ The function `:number` is the default selector for numeric values.
113+
114+ The function `:integer` provides a reduced set of options for selecting
115+ and formatting numeric values as integers.
116+
117+ ### Operands
118+
119+ The _operand_ of a number function is either an implementation-defined type or
120+ a literal that matches the `number-literal` production in the [ABNF](/main/spec/message.abnf).
121+ All other values produce a _Selection Error_ when evaluated for selection
122+ or a _Formatting Error_ when attempting to format the value.
123+
124+ > For example, in Java, any subclass of `java.lang.Number` plus the primitive
125+ > types (`byte`, `short`, `int`, `long`, `float`, `double`, etc.)
126+ > might be considered as the "implementation-defined numeric types".
127+ > Implementations in other programming languages would define different types
128+ > or classes according to their local needs.
129+
130+ > [!NOTE]
131+ > String values passed as variables in the _formatting context_'s
132+ > _input mapping_ can be formatted as numeric values as long as their
133+ > contents match the `number-literal` production in the [ABNF](/main/spec/message.abnf).
134+ >
135+ > For example, if the value of the variable `num` were the string
136+ > `-1234.567`, it would behave identically to the local
137+ > variable in this example:
138+ > ```
139+ > .local $example = {|-1234.567| :number}
140+ > {{{$num :number} == {$example}}}
141+ > ```
101142
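The note above depends on deciding whether a string matches the `number-literal` production. A minimal sketch of that check follows, assuming the production has the JSON-number shape described later in this document (optional minus sign, integer part without leading zeros, optional fraction, optional exponent). The regular expression and the helper name `is_number_literal` are illustrative assumptions, not text from the ABNF.

```python
import re

# Assumption: a regex transcription of the `number-literal` production,
# following the JSON number format (RFC 8259, section 6).
NUMBER_LITERAL = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?[0-9]+)?")


def is_number_literal(text: str) -> bool:
    """Return True if the whole string matches the assumed production."""
    return NUMBER_LITERAL.fullmatch(text) is not None


# The string from the note above is accepted; zero-filled or signed-plus
# strings are not.
print(is_number_literal("-1234.567"))
print(is_number_literal("01"))
```

Under this sketch, a string variable such as `-1234.567` would be accepted as a numeric operand, while `01`, `+1`, or `1.` would not.
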
102- The default `plural` value is presumed to be the most common use case,
103- and it affords the least bad fallback when used incorrectly:
104- Using "plural" for "exact" still selects exactly matching cases,
105- whereas using "exact" for "plural" will not select LDML category matches.
106- This might not be noticeable in the source language,
143+ > [!NOTE]
144+ > Implementations are encouraged to provide support for compound types or data structures
145+ > that provide additional semantic meaning to the formatting of number-like values.
146+ > For example, in ICU4J, the type `com.ibm.icu.util.Measure` can be used to communicate
147+ > a value that includes a unit,
148+ > or the type `com.ibm.icu.util.CurrencyAmount` can be used to set the currency and related
149+ > options (such as the number of fraction digits).
150+
151+
152+ ### Options
153+
154+ The following options and their values are required in the default registry to be available on the
155+ function `:number`:
156+ - `select`
157+ - `plural` (default)
158+ - `ordinal`
159+ - `exact`
160+ - `compactDisplay` // this option only has meaning when combined with the option `notation=compact`
161+ - `short` (default)
162+ - `long`
163+ - `notation`
164+ - `standard` (default)
165+ - `scientific`
166+ - `engineering`
167+ - `compact`
168+ - `numberingSystem`
169+ - valid [Unicode Number System Identifier](https://cldr-smoke.unicode.org/spec/main/ldml/tr35.html#UnicodeNumberSystemIdentifier)
170+ (default is locale-specific)
171+ - `signDisplay`
172+ - `auto` (default)
173+ - `always`
174+ - `exceptZero`
175+ - `negative`
176+ - `never`
177+ - `style`
178+ - `decimal` (default)
179+ - `percent` (see [Percent Style](#percent-style) below)
180+ - `useGrouping`
181+ - `auto` (default)
182+ - `always`
183+ - `never`
184+ - `min2`
185+ - `minimumIntegerDigits`
186+ - (non-negative integer, default: `1`)
187+
188+ > [!NOTE]
189+ > The following options do not have default values because they are only to be used
190+ > as overrides for an existing locale-and-value dependent implementation-defined
191+ > default.
192+
193+ - `minimumFractionDigits`
194+ - (non-negative integer)
195+ - `maximumFractionDigits`
196+ - (non-negative integer)
197+ - `minimumSignificantDigits`
198+ - (non-negative integer)
199+ - `maximumSignificantDigits`
200+ - (non-negative integer)
201+
202+ The following options and their values are required in the default registry to be available on the
203+ function `:integer`:
204+ - `select`
205+ - `plural` (default)
206+ - `ordinal`
207+ - `exact`
208+ - `numberingSystem`
209+ - valid [Unicode Number System Identifier](https://cldr-smoke.unicode.org/spec/main/ldml/tr35.html#UnicodeNumberSystemIdentifier)
210+ (default is locale-specific)
211+ - `signDisplay`
212+ - `auto` (default)
213+ - `always`
214+ - `exceptZero`
215+ - `negative`
216+ - `never`
217+ - `style`
218+ - `decimal` (default)
219+ - `percent` (see [Percent Style](#percent-style) below)
220+ - `useGrouping`
221+ - `auto` (default)
222+ - `true`
223+ - `false`
224+ - `min2`
225+ - `always`
226+ - `minimumIntegerDigits`
227+ - (non-negative integer, default: `1`)
228+
229+ > [!NOTE]
230+ > The following option does not have a default value because it is only to be used
231+ > as an override for an existing locale-and-value dependent implementation-defined
232+ > default.
233+
234+ - `maximumSignificantDigits`
235+ - (non-negative integer)
236+
237+ > [!NOTE]
238+ > The following options or option values are being developed during the Technical Preview
239+ > period.
240+
241+ The following values for the option `style` are _not_ part of the default registry.
242+ Implementations SHOULD avoid creating options that conflict with these, but
243+ are encouraged to track development of these options during Tech Preview:
244+ - `currency`
245+ - `unit`
246+
247+ The following options are _not_ part of the default registry.
248+ Implementations SHOULD avoid creating options that conflict with these, but
249+ are encouraged to track development of these options during Tech Preview:
250+ - `currency`
251+ - valid [Unicode Currency Identifier](https://cldr-smoke.unicode.org/spec/main/ldml/tr35.html#UnicodeCurrencyIdentifier)
252+ (no default)
253+ - `currencyDisplay`
254+ - `symbol` (default)
255+ - `narrowSymbol`
256+ - `code`
257+ - `name`
258+ - `currencySign`
259+ - `accounting`
260+ - `standard` (default)
261+ - `unit`
262+ - (anything not empty)
263+ - `unitDisplay`
264+ - `long`
265+ - `short` (default)
266+ - `narrow`
267+
268+ ### Default Value of `select` Option
269+
270+ The value `plural` is the default for the option `select`
271+ because it is the most common use case for numeric selection.
272+ It can be used for exact value matches but also allows for the grammatical needs of other
273+ languages using CLDR's plural rules.
274+ Omitting plural categories might not be noticeable in the source language (particularly English),
107275but can cause problems in target locales that the original developer is not considering.
108276
109277> For example, a naive developer might use a special message for the value `1` without
110278> considering other locales' need for a `one` plural:
111279>
112280> ```
113281> .match {$var}
114- > [1] {{You have one last chance}}
115- > [one] {{You have {$var} chance remaining}} // needed by languages such as Polish or Russian
116- > [*] {{You have {$var} chances remaining}}
282+ > 1 {{You have one last chance}}
283+ > one {{You have {$var} chance remaining}} // needed by languages such as Polish or Russian
284+ > // such locales typically require other keywords
285+ > // such as two, few, many, and so forth
286+ > * {{You have {$var} chances remaining}}
117287> ```
118288
119- Additional options such as `minimumFractionDigits` and others already supported by `:number`
120- should also be supported.
121289
122- If PR [#532](https://github.com/unicode-org/message-format-wg/pull/532) is accepted,
123- also add the following `<alias>` definitions to `<function name="number">`:
290+ ### Percent Style
291+
292+ When implementing `style=percent`, the numeric value of the operand
293+ MUST be multiplied by 100 for the purposes of formatting, so that an operand of `0.5` formats similarly to `50%`.
294+
295+ ### Selection
296+
297+ When implementing [`MatchSelectorKeys`](spec/formatting.md#resolve-preferences),
298+ numeric selectors perform as described below.
299+
300+ - Let `return_value` be a new empty list of strings.
301+ - Let `operand` be the resolved value of the _operand_.
302+ If the `operand` is not a number type, emit a _Selection Error_
303+ and return `return_value`.
304+ - Let `keys` be a list of strings containing keys to match.
305+ (Hint: this list is an argument to `MatchSelectorKeys`)
306+ - For each string `key` in `keys`:
307+ - If the value of `key` matches the production `number-literal`:
308+ - If the parsed value of `key` is an [exact match](#determining-exact-literal-match)
309+ of the value of the `operand`, then `key` matches the selector.
310+ Add `key` to the front of the `return_value` list.
311+ - Else, if the value of `key` is a keyword:
312+ - Let `keyword` be a string which is the result of [rule selection](#rule-selection).
313+ - If `keyword` equals `key`, then `key` matches the selector.
314+ Append `key` to the end of the `return_value` list.
315+ - Else, `key` is invalid;
316+ emit a _Selection Error_.
317+ Do not add `key` to `return_value`.
318+ - Return `return_value`.
319+
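The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function name `match_numeric_keys`, the `errors` list used to "emit" a _Selection Error_, and the `rule_selection` callback are all hypothetical, the regular expression is an assumed transcription of `number-literal`, and exact matching is limited to integers.

```python
import re

# Assumption: regex transcription of the `number-literal` production.
NUMBER_LITERAL = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?[0-9]+)?")
PLURAL_KEYWORDS = {"zero", "one", "two", "few", "many", "other"}


def match_numeric_keys(operand, keys, rule_selection, errors):
    """Return the matching keys, exact matches first, then keyword matches.

    rule_selection: hypothetical callback mapping the operand to a keyword
                    (or "" when select=exact disables keyword selection).
    errors:         list that collects emitted Selection Errors.
    """
    return_value = []
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(operand, bool) or not isinstance(operand, (int, float)):
        errors.append("Selection Error: operand is not a number")
        return return_value
    keyword = rule_selection(operand)
    for key in keys:
        if NUMBER_LITERAL.fullmatch(key):
            # Integer-only exact match: compare the serialized operand to the key.
            if isinstance(operand, int) and str(operand) == key:
                return_value.insert(0, key)  # exact matches go to the front
        elif key in PLURAL_KEYWORDS:
            if keyword == key:
                return_value.append(key)  # keyword matches go to the end
        else:
            errors.append(f"Selection Error: invalid key {key!r}")
    return return_value


# English-like cardinal rule, used only for demonstration.
def english_like(n):
    return "one" if n == 1 else "other"


errors = []
print(match_numeric_keys(1, ["one", "1", "other"], english_like, errors))
```

With the operand `1`, both the literal key `1` and the keyword `one` match, and the exact match is ordered ahead of the keyword match.
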
320+ ### Plural/Ordinal Keywords
321+ The _plural/ordinal keywords_ are: `zero`, `one`, `two`, `few`, `many`, and
322+ `other`.
323+
324+ ### Rule Selection
325+
326+ If the option `select` is set to `exact`, rule-based selection is not used.
327+ Return the empty string.
328+
329+ > [!NOTE]
330+ > Since keys cannot be the empty string in a numeric selector, returning the
331+ > empty string disables keyword selection.
332+
333+ If the option `select` is set to `plural`, selection should be based on CLDR plural rule data
334+ of type `cardinal`. See [charts](https://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html)
335+ for examples.
336+
337+ If the option `select` is set to `ordinal`, selection should be based on CLDR plural rule data
338+ of type `ordinal`. See [charts](https://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html)
339+ for examples.
340+
341+ Apply the rules defined by CLDR to the resolved value of the operand and the function options,
342+ and return the resulting keyword.
343+ If no rules match, return `other`.
344+
345+ > **Example.**
346+ > In CLDR 44, the Czech (`cs`) plural rule set can be found
347+ > [here](https://www.unicode.org/cldr/charts/44/supplemental/language_plural_rules.html#cs).
348+ >
349+ > A message in Czech might be:
350+ > ```
351+ > .match {$numDays :number}
352+ > one {{{$numDays} den}}
353+ > few {{{$numDays} dny}}
354+ > many {{{$numDays} dne}}
355+ > * {{{$numDays} dní}}
356+ > ```
357+ > Using the rules found above, the results of various `operand` values might look like:
358+ > | Operand value | Keyword | Formatted Message |
359+ > |---|---|---|
360+ > | 1 | `one` | 1 den |
361+ > | 2 | `few` | 2 dny |
362+ > | 5 | `other` | 5 dní |
363+ > | 22 | `other` | 22 dní |
364+ > | 27 | `other` | 27 dní |
365+ > | 2.4 | `many` | 2,4 dne |
366+
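As a rough illustration of rule selection for the Czech example, the sketch below hand-codes the Czech cardinal rules as they appear in the CLDR charts (`one`: i = 1 and v = 0; `few`: i = 2..4 and v = 0; `many`: v is not 0; otherwise `other`). A real implementation would read these rules from CLDR data or a library instead. The function name `czech_cardinal_keyword` and the use of `decimal.Decimal` to track visible fraction digits are assumptions made for this example.

```python
from decimal import Decimal


def czech_cardinal_keyword(value: Decimal) -> str:
    """Return the plural keyword for a Czech cardinal number (hand-coded rules)."""
    if not value.is_finite():
        raise ValueError("operand must be a finite number")
    exponent = value.as_tuple().exponent
    v = -exponent if exponent < 0 else 0  # number of visible fraction digits
    i = int(abs(value))                   # integer digits of the value
    if v != 0:
        return "many"
    if i == 1:
        return "one"
    if 2 <= i <= 4:
        return "few"
    return "other"


for sample in ("1", "2", "5", "2.4"):
    print(sample, czech_cardinal_keyword(Decimal(sample)))
```

Passing the operand as a `Decimal` built from a string keeps trailing fraction digits visible, so `2.0` and `2` can select different keywords, which matters for rules that depend on `v`.
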
367+
368+
369+ ### Determining Exact Literal Match
370+
371+ > [!IMPORTANT]
372+ > The exact behavior of exact literal match is only defined for non-zero-filled
373+ > integer values.
374+ > Annotations that use fraction digits or significant digits might work in specific
375+ > implementation-defined ways.
376+ > Users should avoid depending on these types of keys in message selection.
377+
378+
379+ Number literals in the MessageFormat 2 syntax use the
380+ [format defined for a JSON number](https://www.rfc-editor.org/rfc/rfc8259#section-6).
381+ The resolved value of an `operand` exactly matches a numeric literal `key`
382+ if, when the `operand` is serialized using the format for a JSON number,
383+ the two strings are equal.
384+
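A minimal sketch of the comparison described above, restricted to integer operands because only integer matching is defined here. The helper name `exactly_matches` is hypothetical, and Python's `json.dumps` stands in for "serialized using the format for a JSON number".

```python
import json


def exactly_matches(operand: int, key: str) -> bool:
    """Compare the JSON serialization of an integer operand with a literal key."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(operand, bool) or not isinstance(operand, int):
        raise TypeError("this sketch only covers integer operands")
    return json.dumps(operand) == key


print(exactly_matches(1, "1"))
print(exactly_matches(1, "1.0"))
```

Because the comparison is on strings, keys such as `1.0` or `1e0` would not match the operand `1` in this sketch, even though they denote the same numeric value.
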
385+ > [!NOTE]
386+ > Implementations are not expected to implement this exactly as written,
387+ > as there are clearly optimizations that can be applied.
388+
389+ > [!NOTE]
390+ > Only integer matching is required in the Technical Preview.
391+ > Feedback describing use cases for fractional and significant digits-based
392+ > selection would be helpful.
393+ Otherwise, users should avoid using matching with fractional numbers or significant digits.
124394
125- ```xml
126- <alias name="plural" supports="match">
127- <setOption name="select" value="plural"/>
128- </alias>
129- <alias name="ordinal" supports="match">
130- <setOption name="select" value="ordinal"/>
131- </alias>
132- ```
133395
134396## Alternatives Considered
135397