From 6e8513a563e5c11dd82a7269a209d49b2bf14b17 Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Fri, 6 Dec 2024 14:13:19 -0800 Subject: [PATCH 01/12] In bidi default strategy, make steps consistent with each other --- spec/formatting.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/spec/formatting.md b/spec/formatting.md index 78a2a71347..1bbfc44791 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -942,9 +942,9 @@ The _Default Bidi Strategy_ is defined as follows: True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, or False otherwise. 1. If `dir` is `'LTR'`: - 1. If `msgdir` is `'LTR'` in the formatted output + 1. If `msgdir` is `'LTR'` and `isolate` is False, - let `fmt` be itself + let the formatted output be `fmt` itself. 1. Else, in the formatted output, prefix `fmt` with U+2066 LEFT-TO-RIGHT ISOLATE and postfix it with U+2069 POP DIRECTIONAL ISOLATE. From 39e9056592d0aa4bfe8da978ee646f0f6737e751 Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Fri, 6 Dec 2024 14:30:51 -0800 Subject: [PATCH 02/12] Don't replace entire formatted output --- spec/formatting.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/formatting.md b/spec/formatting.md index 1bbfc44791..064b259cfa 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -944,7 +944,7 @@ The _Default Bidi Strategy_ is defined as follows: 1. If `dir` is `'LTR'`: 1. If `msgdir` is `'LTR'` and `isolate` is False, - let the formatted output be `fmt` itself. + append `fmt` to the formatted output. 1. Else, in the formatted output, prefix `fmt` with U+2066 LEFT-TO-RIGHT ISOLATE and postfix it with U+2069 POP DIRECTIONAL ISOLATE. 
From c69013d5a3a1f02bf04ced194007074e0622d4c5 Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Mon, 9 Dec 2024 14:21:46 -0800 Subject: [PATCH 03/12] Reformulate the default bidi strategy as a function --- spec/formatting.md | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/spec/formatting.md b/spec/formatting.md index 064b259cfa..c09fa078df 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -928,7 +928,8 @@ Implementations MAY provide other _bidirectional isolation strategies_. Implementations MAY supply a _bidirectional isolation strategy_ that performs no processing. -The _Default Bidi Strategy_ is defined as follows: +The _Default Bidi Strategy_ is defined as a function `B` from expressions +to formatted strings, as follows. 1. Let `msgdir` be the directionality of the whole message, one of « `'LTR'`, `'RTL'`, `'unknown'` ». @@ -944,17 +945,17 @@ The _Default Bidi Strategy_ is defined as follows: 1. If `dir` is `'LTR'`: 1. If `msgdir` is `'LTR'` and `isolate` is False, - append `fmt` to the formatted output. - 1. Else, in the formatted output, - prefix `fmt` with U+2066 LEFT-TO-RIGHT ISOLATE - and postfix it with U+2069 POP DIRECTIONAL ISOLATE. + `B(exp)` is `fmt`. + 1. Else, `B(exp)` is + `fmt` prefixed with U+2066 LEFT-TO-RIGHT ISOLATE + and postfixed with U+2069 POP DIRECTIONAL ISOLATE. 1. Else, if `dir` is `'RTL'`: - 1. In the formatted output, - prefix `fmt` with U+2067 RIGHT-TO-LEFT ISOLATE - and postfix it with U+2069 POP DIRECTIONAL ISOLATE. + 1. `B(exp)` is + `fmt` prefixed with U+2067 RIGHT-TO-LEFT ISOLATE + and postfixed with U+2069 POP DIRECTIONAL ISOLATE. 1. Else: - 1. In the formatted output, - prefix `fmt` with U+2068 FIRST STRONG ISOLATE - and postfix it with U+2069 POP DIRECTIONAL ISOLATE. + 1. `B(exp)` is + `fmt` prefixed with U+2068 FIRST STRONG ISOLATE + and postfixed with U+2069 POP DIRECTIONAL ISOLATE. 
From 386fa2f5c2ab6bf76828738cd706eff003f0797e Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 13:33:49 -0800 Subject: [PATCH 04/12] Rewrite imperatively --- spec/formatting.md | 43 ++++++++++++++++++++++++------------------- 1 file changed, 24 insertions(+), 19 deletions(-) diff --git a/spec/formatting.md b/spec/formatting.md index c09fa078df..1b2a2e7c97 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -928,34 +928,39 @@ Implementations MAY provide other _bidirectional isolation strategies_. Implementations MAY supply a _bidirectional isolation strategy_ that performs no processing. -The _Default Bidi Strategy_ is defined as a function `B` from expressions -to formatted strings, as follows. +The _Default Bidi Strategy_ is defined as follows: +1. Let `out` be the empty string. 1. Let `msgdir` be the directionality of the whole message, one of « `'LTR'`, `'RTL'`, `'unknown'` ». These correspond to the message having left-to-right directionality, right-to-left directionality, and to the message's directionality not being known. -1. For each _expression_ `exp` in _pattern_: - 1. Let `fmt` be the formatted string representation of the _resolved value_ of `exp`. - 1. Let `dir` be the directionality of `fmt`, +1. For each part `part` in _pattern_: + 1. If `part` is a plain literal (text) part, append `part` to `out`. + 1. Else: + i. Assert `part` is a _placeholder_. + i. Let `exp` be `part`. + i. Let `fmt` be the formatted string representation of the _resolved value_ of `exp`. + ii. Let `dir` be the directionality of `fmt`, one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. - 1. Let the boolean value `isolate` be + iii. Let the boolean value `isolate` be True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, or False otherwise. - 1. If `dir` is `'LTR'`: + iv. If `dir` is `'LTR'`: 1. If `msgdir` is `'LTR'` and `isolate` is False, - `B(exp)` is `fmt`. - 1. 
Else, `B(exp)` is - `fmt` prefixed with U+2066 LEFT-TO-RIGHT ISOLATE - and postfixed with U+2069 POP DIRECTIONAL ISOLATE. - 1. Else, if `dir` is `'RTL'`: - 1. `B(exp)` is - `fmt` prefixed with U+2067 RIGHT-TO-LEFT ISOLATE - and postfixed with U+2069 POP DIRECTIONAL ISOLATE. - 1. Else: - 1. `B(exp)` is - `fmt` prefixed with U+2068 FIRST STRONG ISOLATE - and postfixed with U+2069 POP DIRECTIONAL ISOLATE. + append `fmt` to `out`. + 1. Else: + i. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. + i. Append `fmt` to `out`. + i. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + v. Else, if `dir` is `'RTL'`: + 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + vi. Else: + 1. Append U+2068 FIRST STRONG ISOLATE to `out`. + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. From 019a29f8cc3e9279d4c3e4ca75b10a488ef62822 Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 13:34:48 -0800 Subject: [PATCH 05/12] Add last line (emit output) --- spec/formatting.md | 1 + 1 file changed, 1 insertion(+) diff --git a/spec/formatting.md b/spec/formatting.md index 1b2a2e7c97..4e74c4ee91 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -962,5 +962,6 @@ The _Default Bidi Strategy_ is defined as follows: 1. Append U+2068 FIRST STRONG ISOLATE to `out`. 1. Append `fmt` to `out`. 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. +1. Emit `out` as the formatted output of the message. 
From f42453c7e4ea69fd8056009f17d08ae1d0c72a91 Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 13:36:34 -0800 Subject: [PATCH 06/12] Fix formatting --- spec/formatting.md | 48 +++++++++++++++++++++++----------------------- 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/spec/formatting.md b/spec/formatting.md index 4e74c4ee91..c7dc737ca6 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -938,30 +938,30 @@ The _Default Bidi Strategy_ is defined as follows: 1. For each part `part` in _pattern_: 1. If `part` is a plain literal (text) part, append `part` to `out`. 1. Else: - i. Assert `part` is a _placeholder_. - i. Let `exp` be `part`. - i. Let `fmt` be the formatted string representation of the _resolved value_ of `exp`. - ii. Let `dir` be the directionality of `fmt`, - one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. - iii. Let the boolean value `isolate` be - True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, - or False otherwise. - iv. If `dir` is `'LTR'`: - 1. If `msgdir` is `'LTR'` - and `isolate` is False, - append `fmt` to `out`. - 1. Else: - i. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. - i. Append `fmt` to `out`. - i. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - v. Else, if `dir` is `'RTL'`: - 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - vi. Else: - 1. Append U+2068 FIRST STRONG ISOLATE to `out`. - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + i. Assert `part` is a _placeholder_. + i. Let `exp` be `part`. + i. Let `fmt` be the formatted string representation of the _resolved value_ of `exp`. + i. Let `dir` be the directionality of `fmt`, + one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. + i. 
Let the boolean value `isolate` be + True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, + or False otherwise. + i. If `dir` is `'LTR'`: + 1. If `msgdir` is `'LTR'` + and `isolate` is False, + append `fmt` to `out`. + 1. Else: + i. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. + i. Append `fmt` to `out`. + i. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + i. Else, if `dir` is `'RTL'`: + 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + i. Else: + 1. Append U+2068 FIRST STRONG ISOLATE to `out`. + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. 1. Emit `out` as the formatted output of the message. From 4c918ac6af135d62f227121b10ac44f6597327cd Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 13:37:56 -0800 Subject: [PATCH 07/12] Fix formatting harder --- spec/formatting.html | 1048 ++++++++++++++++++++++++++++++++++++++++++ spec/formatting.md | 16 +- 2 files changed, 1056 insertions(+), 8 deletions(-) create mode 100644 spec/formatting.html diff --git a/spec/formatting.html b/spec/formatting.html new file mode 100644 index 0000000000..b80bdde905 --- /dev/null +++ b/spec/formatting.html @@ -0,0 +1,1048 @@ + + + + + + +formatting.html + + + + + + +

DRAFT MessageFormat 2.0 +Formatting

+

Introduction

+

This section defines the behavior of a MessageFormat 2.0 +implementation when formatting a message for display in a user +interface, or for some later processing.

+

To start, we presume that a message has either been parsed +from its syntax or created from a data model description. If the +resulting message is not well-formed, a Syntax +Error is emitted. If the resulting message is +well-formed but is not valid, a Data Model +Error is emitted.

+

The formatting of a message is defined by the following +operations:

+
    +
  • Pattern Selection determines +which of a message’s patterns is formatted. For a message with +no selectors, this is simple as there is only one +pattern. With selectors, this will depend on their +resolution.

  • +
  • Formatting takes the +resolved values of the text and placeholder +parts of the selected pattern, and produces the formatted +result for the message. Depending on the implementation, this +result could be a single concatenated string, an array of objects, an +attributed string, or some other locally appropriate data type.

  • +
  • Expression and Markup +Resolution determines the value of an +expression or markup, with reference to the current +formatting context. This can include multiple steps, such as +looking up the value of a variable and calling formatting functions. The +form of the resolved value is implementation defined and the +value might not be evaluated or formatted yet. However, it needs to be +“formattable”, i.e. it contains everything required by the eventual +formatting.

    +

    The resolution of text is rather straightforward, and is +detailed under literal resolution.

  • +
+

Implementations are not required to expose the expression +resolution and pattern selection operations to their +users, or even use them in their internal processing, as long as the +final formatting result is made available to users and the +observable behavior of the formatting matches that described +here.

+

Attributes MUST NOT have any effect on the formatted output +of a message, nor be made available to function +handlers.

+
+

[!IMPORTANT]

+

This specification does not require either eager or lazy +expression resolution of message parts; do not +construe any requirement in this document as requiring +either.

+

Implementations are not required to evaluate all parts of a +message when parsing, processing, or formatting. In particular, +an implementation MAY choose not to evaluate or resolve the value of a +given expression until it is actually used by a selection or +formatting process. However, when an expression is resolved, it +MUST behave as if all preceding declarations affecting +variables referenced by that expression have already +been evaluated in the order in which the relevant declarations +appear in the message. An implementation MUST ensure that every +expression in a message is evaluated at most once.

+
+
+

[!IMPORTANT]

+

Implementations with lazy evaluation MUST NOT use a call-by-name +evaluation strategy. Instead, they must evaluate expressions at most +once (“call-by-need”). This is to prevent expressions from +having different values when used in different parts of a given +message. Function handlers are not necessarily pure: +they can access external mutable state such as the current system clock +time. Thus, evaluating the same expression more than once could +yield different results. That behavior violates this specification.
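The "call-by-need" requirement can be illustrated with a small memoizing wrapper. This is only a sketch under assumptions: the helper name `once` and the counter-based stand-in for an impure function handler are hypothetical and are not defined by the specification.

```typescript
// Hypothetical helper (not part of the specification): wraps a resolver so
// that the wrapped computation runs at most once ("call-by-need").
function once<T>(resolve: () => T): () => T {
  let evaluated = false;
  let cached: T | undefined;
  return () => {
    if (!evaluated) {
      // Mark as evaluated first so that a throwing resolver is never re-run.
      evaluated = true;
      cached = resolve();
    }
    return cached as T;
  };
}

// Stand-in for an impure function handler, such as one reading the clock.
let handlerCalls = 0;
const resolveTime = once(() => {
  handlerCalls += 1;
  return `reading-${handlerCalls}`;
});

// The same expression used in two parts of a message yields a single value.
const firstUse = resolveTime();
const secondUse = resolveTime();
```

With a call-by-name strategy, `secondUse` would come from a second call to the handler and could differ from `firstUse`.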

+
+
+

[!IMPORTANT] Implementations and users SHOULD NOT create function +handlers that mutate external program state, particularly since +such a function handler can present a remote execution +hazard.

+
+

Formatting Context

+

A message’s formatting +context represents the data and procedures that are +required for the message’s expression resolution, +pattern selection and formatting.

+

At a minimum, it includes:

+
    +
  • Information on the current locale, +potentially including a fallback chain of locales. This will be passed +on to formatting functions.

  • +
  • Information on the base directionality of the message +and its text tokens. This will be used by strategies for +bidirectional isolation, and can be used to set the base direction of +the message upon display.

  • +
  • An input mapping of string +identifiers to values, defining variable values that are available +during variable resolution. This is often determined by a +user-provided argument of a formatting function call.

  • +
  • The function registry, providing the function +handlers of the functions referred to by message +functions.

  • +
  • Optionally, a fallback string to use for the message if +it is not valid.

  • +
+

Implementations MAY include additional fields in their formatting +context.

+

Resolved Values

+

A resolved value is the result +of resolving a text, literal, variable, +expression, or markup. The resolved value is +determined using the formatting context. The form of the +resolved value is implementation-defined.

+

In a declaration, the resolved value of an +expression is bound to a variable, which makes it +available for use in later expressions and markup +options.

+
+

For example, in

+
.input {$a :number minimumFractionDigits=3}
+.local $b = {$a :integer notation=compact}
+.match $a
+0 {{The value is zero.}}
+* {{In compact form, the value {$a} is rendered as {$b}.}}
+

the resolved value bound to $a is used as the +operand of the :integer function when +resolving the value of the variable $b, as a +selector in the .match statement, as well as for +formatting the placeholder {$a}.

+
+

In an input-declaration, the variable operand of +the variable-expression identifies not only the name of the +external input value, but also the variable to which the +resolved value of the variable-expression is +bound.

+

In a pattern, the resolved value of an +expression or markup is used in its +formatting.

+

The form that resolved values take is +implementation-dependent, and different implementations MAY choose to +perform different levels of resolution.

+
+

While this specification does not require it, a resolved +value could be implemented by requiring each function +handler to return a value matching the following interface:

+
interface MessageValue {
+  formatToString(): string
+  formatToX(): X // where X is an implementation-defined type
+  getValue(): unknown
+  resolvedOptions(): { [key: string]: MessageValue }
+  selectKeys(keys: string[]): string[]
+}
+

With this approach:

  • An expression could be used as a +placeholder if calling the formatToString() or +formatToX() method of its resolved value did not +emit an error.
  • A variable could be used as a +selector if calling the selectKeys(keys) method of +its resolved value did not emit an error.
  • Using a +variable, the resolved value of an expression +could be used as an operand or option value if calling +the getValue() method of its resolved value did +not emit an error. In this use case, the resolvedOptions() +method could also provide a set of option values that could be taken +into account by the called function.

+

Extensions of the base MessageValue interface could be +provided for different data types, such as numbers or strings, for which +the unknown return type of getValue() and the +generic MessageValue type used in +resolvedOptions() could be narrowed appropriately. An +implementation could also allow MessageValue values to be +passed in as input variables, or automatically wrap each variable as a +MessageValue to provide a uniform interface for custom +functions.

+
+

Expression and Markup +Resolution

+

Expressions are used in declarations and +patterns. Markup is only used in +patterns.

+

Depending on the presence or absence of a variable or +literal operand and a function, the resolved +value of the expression is determined as follows:

+

If the expression contains a function, its +resolved value is defined by function resolution.

+

Else, if the expression consists of a variable, its +resolved value is defined by variable resolution. An +implementation MAY perform additional processing when resolving the +value of an expression that consists only of a +variable.

+
+

For example, it could apply function resolution using a +function and a set of options chosen based on the +value or type of the variable. So, given a message +like this:

+
Today is {$date}
+

If the value passed in the variable were a date object, such +as a JavaScript Date or a Java java.util.Date +or java.time.Temporal, the implementation could interpret +the placeholder {$date} as if the pattern included +the function :datetime with some set of default +options.

+
+

Else, the expression consists of a literal. Its +resolved value is defined by literal resolution.

+
+

[!NOTE] This means that a literal value with no +function is always treated as a string. To represent values +that are not strings as a literal, a function needs to +be provided:

+
.local $aNumber = {1234 :number}
+.local $aDate = {|2023-08-30| :datetime}
+.local $aFoo = {|some foo| :foo}
+{{You have {42 :number}}}
+
+

Literal Resolution

+

The resolved value of a text or a literal +contains the character sequence of the text or literal +after any character escape has been converted to the escaped +character.

+

When a literal is used as an operand or on the +right-hand side of an option, the formatting function MUST +treat its resolved value the same whether its value was +originally a quoted literal or an unquoted +literal.

+
+

For example, the option foo=42 and the +option foo=|42| are treated as identical.

+
+
+

For example, in a JavaScript formatter, the resolved value +of a text or a literal could have the following +implementation:

+
class MessageLiteral implements MessageValue {
+  // formatToX() is omitted: its return type X is implementation-defined.
+  formatToString: () => string;
+  getValue: () => string;
+  constructor(value: string) {
+    this.formatToString = () => value;
+    this.getValue = () => value;
+  }
+  resolvedOptions = () => ({});
+  selectKeys(_keys: string[]): string[] {
+    throw Error("Selection on unannotated literals is not supported");
+  }
+}
+
+

Variable Resolution

+

To resolve the value of a variable, its name is +used to identify either a local variable or an input variable. If a +declaration exists for the variable, its resolved +value is used. Otherwise, the variable is an implicit +reference to an input value, and its value is looked up from the +formatting context input mapping.

+

The resolution of a variable fails if no value is identified +for its name. If this happens, an Unresolved Variable +error is emitted and a fallback value is used as the +resolved value of the variable.

+

If the resolved value identified for the variable +name is a fallback value, a fallback value is +used as the resolved value of the variable.

+

The fallback value representation of a variable has +a string representation consisting of the U+0024 DOLLAR SIGN +$ followed by the name of the +variable.
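The lookup order described above could be sketched as follows. The type `ResolvedValue`, the function `resolveVariable`, and the error label `"unresolved-variable"` are illustrative assumptions, not names defined by the specification.

```typescript
// Hypothetical shape of a resolved value; the real form is implementation-defined.
type ResolvedValue =
  | { kind: "value"; value: unknown }
  | { kind: "fallback"; source: string };

function resolveVariable(
  name: string,
  declarations: Map<string, ResolvedValue>, // values bound by declarations
  inputs: Map<string, unknown>, // the input mapping of the formatting context
  errors: string[],
): ResolvedValue {
  const declared = declarations.get(name);
  if (declared !== undefined) {
    // A fallback bound to the name propagates as a fallback for this variable.
    return declared.kind === "fallback"
      ? { kind: "fallback", source: `$${name}` }
      : declared;
  }
  if (inputs.has(name)) {
    // No declaration: the variable is an implicit reference to an input value.
    return { kind: "value", value: inputs.get(name) };
  }
  // Nothing identified for the name: emit an error and use "$" + name.
  errors.push("unresolved-variable");
  return { kind: "fallback", source: `$${name}` };
}

const variableErrors: string[] = [];
const inputMapping = new Map<string, unknown>([["count", 3]]);
const found = resolveVariable("count", new Map(), inputMapping, variableErrors);
const missing = resolveVariable("user", new Map(), inputMapping, variableErrors);
```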

+

Function Resolution

+

To resolve an expression with a function, the +following steps are taken:

+
    +
  1. If the expression includes an operand, resolve +its value. If this is a fallback value, return a fallback +value as the resolved value of the +expression.

  2. +
  3. Resolve the identifier of the function and find +the appropriate function handler to call. If the implementation +cannot find the function handler, or if the identifier +includes a namespace that the implementation does not support, +emit an Unknown Function error and return a fallback +value as the resolved value of the +expression.

    +

    Implementations are not required to implement namespaces or +installable function registries.

  4. +
  5. Perform option resolution.

  6. +
  7. Determine the function context for calling the +function handler.

    +

    The function context contains +the context necessary for the function handler to resolve the +expression. This includes:

    +
      +
    • The current locale, potentially including a fallback chain +of locales.
    • +
    • The base directionality of the expression. By default, this +is undefined or empty.
    • +
    +

    If the resolved mapping of options includes any +u: options supported by the implementation, +process them as specified. Such u: options MAY be removed +from the resolved mapping of options.

  8. +
  9. Call the function handler with the following +arguments:

    +
      +
    • The function context.
    • +
    • The resolved mapping of options.
    • +
    • If the expression includes an operand, its +resolved value.
    • +
    +

    The form that resolved operand and option values +take is implementation-defined.

    +

    An implementation MAY pass additional arguments to the function +handler, as long as reasonable precautions are taken to keep the +function interface simple and minimal, and avoid introducing potential +security vulnerabilities.

  10. +
  11. If the call succeeds, resolve the value of the +expression as the result of that function call.

    +

    If the call fails or does not return a valid value, emit the +appropriate Message Function Error for the failure.

    +

    Implementations MAY provide a mechanism for the function +handler to provide additional detail about internal failures. +Specifically, if the cause of the failure was that the datatype, value, +or format of the operand did not match that expected by the +function, the function SHOULD cause a Bad +Operand error to be emitted.

    +

    In all failure cases, return a fallback value as the +resolved value of the expression.

  12. +
+
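The function resolution steps above can be sketched roughly as below. All names (`Value`, `FunctionContext`, `Handler`, `resolveFunctionExpression`, and the error labels) are hypothetical. The sketch assumes the caller has already performed option resolution, built the function context, and computed the fallback string for the expression.

```typescript
type Value =
  | { kind: "value"; value: unknown }
  | { kind: "fallback"; source: string };
type FunctionContext = { locale: string; dir?: "ltr" | "rtl" | "auto" };
type Handler = (
  ctx: FunctionContext,
  options: Map<string, unknown>,
  operand?: unknown,
) => unknown;

function resolveFunctionExpression(
  functionName: string,
  operand: Value | undefined,
  fallbackSource: string, // assumed to be precomputed by the caller
  options: Map<string, unknown>,
  registry: Map<string, Handler>,
  ctx: FunctionContext,
  errors: string[],
): Value {
  const fallback: Value = { kind: "fallback", source: fallbackSource };
  // A fallback operand makes the whole expression a fallback.
  if (operand !== undefined && operand.kind === "fallback") return fallback;
  // Find the handler; an unknown function is an error, not a crash.
  const handler = registry.get(functionName);
  if (handler === undefined) {
    errors.push("unknown-function");
    return fallback;
  }
  const operandValue = operand?.kind === "value" ? operand.value : undefined;
  try {
    // Call the handler with the context, the options, and the operand.
    const result = handler(ctx, options, operandValue);
    if (result === undefined) throw new Error("handler returned no valid value");
    return { kind: "value", value: result };
  } catch {
    // Every failure case ends in a fallback value.
    errors.push("message-function-error");
    return fallback;
  }
}

const registry = new Map<string, Handler>([
  ["upper", (_ctx, _opts, operand) => String(operand).toUpperCase()],
  ["broken", () => { throw new Error("handler failure"); }],
]);
const fnErrors: string[] = [];
const fnCtx: FunctionContext = { locale: "en" };
const upper = resolveFunctionExpression(
  "upper", { kind: "value", value: "abc" }, "|abc|", new Map(), registry, fnCtx, fnErrors);
const unknownFn = resolveFunctionExpression(
  "nope", undefined, ":nope", new Map(), registry, fnCtx, fnErrors);
const failed = resolveFunctionExpression(
  "broken", undefined, ":broken", new Map(), registry, fnCtx, fnErrors);
```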

Function Handler

+

A function handler is an +implementation-defined process such as a function or method which +accepts a set of arguments and returns a resolved value. A +function handler is required to resolve a +function.

+

An implementation MAY define its own functions and their handlers. An +implementation MAY allow custom functions to be defined by users.

+

Implementations that provide a means for defining custom functions +MUST provide a means for function handlers to return +resolved values that contain enough information to be used as +operands or option values in subsequent +expressions.

+

The resolved value returned by a function handler +MAY be different from the value of the operand of the +function. It MAY be an implementation-specified type. It is not +required to be the same type as the operand.

+

A function handler MAY include resolved options in its +resolved value. The resolved options MAY be different from the +options of the function.

+

A function handler SHOULD emit a Bad Operand error +for operands whose resolved value or type is not +supported.

+

Function handler access to the formatting context +MUST be minimal and read-only, and execution time SHOULD be limited.

+

Implementation-defined functions SHOULD use an +implementation-defined namespace.

+

Option Resolution

+

Option resolution is the process +of computing the options for a given expression. +Option resolution results in a mapping of string +identifiers to values. The order of options +MUST NOT be significant.

+
+

For example, the following message treats both +placeholders identically:

+
{$x :function option1=foo option2=bar} {$x :function option2=bar option1=foo}
+
+

Option resolution proceeds as follows:

+
    +
  1. Let res be a new empty mapping.
  2. +
  3. For each option: +
      +
    1. Let id be the string value of the identifier +of the option.
    2. +
    3. Let rv be the resolved value of the +option value.
    4. +
    5. If rv is a fallback value: +
        +
      1. If supported, emit a Bad Option error.
      2. +
    6. +
    7. Else: +
        +
      1. Set res[id] to be rv.
      2. +
    8. +
  4. +
  5. Return res.
  6. +
+

The result of option resolution MUST be a (possibly empty) +mapping of string identifiers to values; that is, errors MAY be emitted, +but such errors MUST NOT be fatal.
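A minimal sketch of the option resolution steps, assuming a hypothetical `OptionValue` shape and a plain string list for emitted errors (neither is prescribed by the specification):

```typescript
type OptionValue =
  | { kind: "value"; value: unknown }
  | { kind: "fallback"; source: string };

function resolveOptions(
  options: Array<[string, OptionValue]>, // [identifier, resolved option value]
  errors: string[],
): Map<string, OptionValue> {
  const res = new Map<string, OptionValue>();
  for (const [id, rv] of options) {
    if (rv.kind === "fallback") {
      // Non-fatal: the option is left out and resolution continues.
      errors.push("bad-option");
    } else {
      res.set(id, rv);
    }
  }
  return res;
}

const foo: OptionValue = { kind: "value", value: "foo" };
const bar: OptionValue = { kind: "value", value: "bar" };
const bad: OptionValue = { kind: "fallback", source: "$missing" };
const optionErrors: string[] = [];
const firstOrder = resolveOptions([["option1", foo], ["option2", bar]], optionErrors);
const secondOrder = resolveOptions(
  [["option2", bar], ["option1", foo], ["option3", bad]], optionErrors);
```

Both calls yield the same `option1` and `option2` entries regardless of source order, and `option3` is dropped after a single error is recorded.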

+
+

[!NOTE] The resolved value of a function +operand can also include resolved option values. These are not +included in the option resolution result, and need to be +processed separately by a function handler.

+
+

Markup Resolution

+

Unlike functions, the resolution of markup is not +customizable.

+

The resolved value of markup includes the following +fields:

+
    +
  • The type of the markup: open, standalone, or close
  • +
  • The identifier of the markup
  • +
  • The resolved options values after option +resolution.
  • +
+

If the resolved mapping of options includes any +u: options supported by the implementation, +process them as specified. Such u: options MAY be removed +from the resolved mapping of options.

+

The resolution of markup MUST always succeed.

+

Fallback Resolution

+

A fallback value is the +resolved value for an expression or variable +when that expression or variable fails to resolve. It +contains a string representation that is used for its formatting, and no +option values.

+

The resolved value of text, literal, and +markup MUST NOT be a fallback value.

+

A variable fails to resolve when no value is identified for +its name. The string representation of its fallback +value is U+0024 DOLLAR SIGN $ followed by the +name of the variable.

+

An expression fails to resolve when:

+
    +
  • A variable used as its operand resolves to a +fallback value. Note that an expression does not +necessarily fail to resolve if an option resolves with a +fallback value.
  • +
  • No function handler is found for a function +identifier.
  • +
  • Calling a function handler fails or does not return a valid +value.
  • +
+

The string representation of the fallback value of an +expression depends on its contents:

+
    +
  • expression with a literal operand +(either quoted or unquoted): U+007C VERTICAL LINE | +followed by the value of the literal with escaping applied to +U+005C REVERSE SOLIDUS \ and U+007C VERTICAL LINE +|, and then by U+007C VERTICAL LINE |.

    +
    +

    Examples: In a context where :func fails to resolve, +{42 :func} resolves to a fallback value with a +string representation |42| and {|C:\\| :func} +resolves to a fallback value with a string representation +|C:\\|.

    +
  • +
  • expression with variable operand: the +fallback value representation of that variable, U+0024 +DOLLAR SIGN $ followed by the name of the +variable

    +
    +

    Examples: In a context where $var fails to resolve, +{$var} and {$var :number} both resolve to a +fallback value with a string representation $var +(even if :number fails to resolve).

    +

    In a context where :func fails to resolve, the +placeholder in +.local $var = {|val| :func} {{{$var}}} resolves to a +fallback value with a string representation +$var.

    +

    In a context where either :now or :pretty +fails to resolve, the placeholder in

    +
    .local $time = {:now format=iso8601}
    +{{{$time :pretty}}}
    +

    resolves to a fallback value with a string representation +$time.

    +
  • +
  • function expression with no operand: +U+003A COLON : followed by the function +identifier

    +
    +

    Examples: In a context where :func fails to resolve, +{:func} resolves to a fallback value with a string +representation :func. In a context where +:ns:func fails to resolve, {:ns:func} resolves +to a fallback value with a string representation +:ns:func.

    +
  • +
  • Otherwise: the U+FFFD REPLACEMENT CHARACTER

    +

    This is not currently used by any expression, but may apply in future +revisions.

  • +
+

Options and attributes are not included in the +fallback value.
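The rules above for the string representation of an expression's fallback value could be sketched as a single function. The `Operand` type and the name `fallbackSource` are assumptions made for this illustration only.

```typescript
type Operand =
  | { type: "literal"; value: string } // value after unescaping
  | { type: "variable"; name: string };

function fallbackSource(operand?: Operand, functionName?: string): string {
  if (operand?.type === "literal") {
    // Escape backslash and vertical line, then wrap in vertical lines.
    const escaped = operand.value.replace(/[\\|]/g, "\\$&");
    return `|${escaped}|`;
  }
  if (operand?.type === "variable") {
    return `$${operand.name}`;
  }
  if (functionName !== undefined) {
    return `:${functionName}`;
  }
  return "\uFFFD"; // U+FFFD REPLACEMENT CHARACTER
}

const literalFallback = fallbackSource({ type: "literal", value: "42" }, "func");
const escapedFallback = fallbackSource({ type: "literal", value: "C:\\" }, "func");
const variableFallback = fallbackSource({ type: "variable", name: "var" }, "number");
const functionFallback = fallbackSource(undefined, "ns:func");
```

Options and attributes never reach this function, which mirrors their exclusion from the fallback value.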

+

Pattern selection is not supported for fallback +values.

+
+

For example, in a JavaScript formatter the fallback value +could have the following implementation, where source is +one of the above-defined strings:

+
class MessageFallback implements MessageValue {
+  // formatToX() is omitted: its return type X is implementation-defined.
+  formatToString: () => string;
+  getValue: () => undefined;
+  constructor(source: string) {
+    this.formatToString = () => `{${source}}`;
+    this.getValue = () => undefined;
+  }
+  resolvedOptions = () => ({});
+  selectKeys(_keys: string[]): string[] {
+    throw Error("Selection on fallback values is not supported");
+  }
+}
+
+

Pattern Selection

+

If the message being formatted is not well-formed +and valid, the result of pattern selection is a +pattern consisting of a single fallback value using +the message’s fallback string defined in the formatting +context or if this is not available or empty, the U+FFFD +REPLACEMENT CHARACTER .

+

If the message being formatted does not contain a +matcher, the result of pattern selection is its +pattern value.

+

When a message contains a matcher with one or more +selectors, the implementation needs to determine which +variant will be used to provide the pattern for the +formatting operation. This is done by ordering and filtering the +available variant statements according to their key +values and selecting the first one.

+
+

[!NOTE] At least one variant is required to have all of its +keys consist of the fallback value *. Some +selectors might be implemented in a way that the key value +* cannot be selected in a valid message. +In other cases, this key value might be unreachable only in certain +locales. This could result in the need in some locales to create one or +more variants that do not make sense grammatically for that +language.

For example, in the pl (Polish) locale, this +message cannot reach the * variant:

.input {$num :integer}
+.match $num
+0    {{ }}
+one  {{ }}
+few  {{ }}
+many {{ }}
+*    {{Only used by fractions in Polish.}}

+

In the Tech Preview, feedback from users and implementers is desired +about whether to relax the requirement that such a “fallback +variant” appear in every message, versus the potential for a +message to fail at runtime because no matching variant +is available.

+
+

The number of keys in each variant MUST equal the +number of selectors.

+

Each key corresponds to a selector by its position +in the variant.

+
+

For example, in this message:

+
.input {$one :number}
+.input {$two :number}
+.input {$three :number}
+.match $one $two $three
+1 2 3 {{ ... }}
+

The first key 1 corresponds to the first +selector ($one), the second key +2 to the second selector ($two), and +the third key 3 to the third selector +($three).

+
+

To determine which variant best matches a given set of +inputs, each selector is used in turn to order and filter the +list of variants.

+

Each variant with a key that does not match its +corresponding selector is omitted from the list of +variants. The remaining variants are sorted according +to the selector’s key-ordering preference. Earlier +selectors in the matcher’s list of selectors +have a higher priority than later ones.

+

When all of the selectors have been processed, the +earliest-sorted variant in the remaining list of +variants is selected.

+

This selection method is defined in more detail below. An +implementation MAY use any pattern selection method, as long as its +observable behavior matches the results of the method defined here.

+

Resolve Selectors

+

First, resolve the values of each selector:

+
    +
  1. Let res be a new empty list of resolved values +that support selection.
  2. +
  3. For each selector sel, in source order, +
      +
    1. Let rv be the resolved value of +sel.
    2. +
    3. If selection is supported for rv: +
        +
      1. Append rv as the last element of the list +res.
      2. +
    4. +
    5. Else: +
        +
      1. Let nomatch be a resolved value for which +selection always fails.
      2. +
      3. Append nomatch as the last element of the list +res.
      4. +
      5. Emit a Bad Selector error.
      6. +
    6. +
  4. +
+

The form of the resolved values is determined by each +implementation, along with the manner of determining their support for +selection.
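
As a non-normative illustration, the steps above could be sketched in TypeScript as follows. The `ResolvedValue` shape, the `nomatch` constant, and the `emitError` callback are assumptions made for this sketch only; the specification leaves the form of resolved values and the error-reporting mechanism to each implementation.

```typescript
// Hypothetical shape of a resolved value: selection is "supported"
// when the value carries a selectKeys method.
interface ResolvedValue {
  selectKeys?: (keys: string[]) => string[];
}

// A resolved value for which selection always fails (it matches no keys).
const nomatch: ResolvedValue = { selectKeys: () => [] };

function resolveSelectors(
  selectors: ResolvedValue[],        // resolved values of the selectors, in source order
  emitError: (name: string) => void  // hypothetical error sink
): ResolvedValue[] {
  const res: ResolvedValue[] = [];
  for (const rv of selectors) {
    if (typeof rv.selectKeys === "function") {
      res.push(rv);
    } else {
      // Keep the list aligned with the selectors, then report the problem.
      res.push(nomatch);
      emitError("Bad Selector");
    }
  }
  return res;
}
```

Appending `nomatch` rather than skipping the selector keeps `res` the same length as the list of selectors, so later steps can index keys and resolved values by the same position.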

+

Resolve Preferences

+

Next, using res, resolve the preferential order for all +message keys:

+
    +
  1. Let pref be a new empty list of lists of strings.
  2. +
  3. For each index i in res: +
      +
    1. Let keys be a new empty list of strings.
    2. +
    3. For each variant var of the message: +
        +
      1. Let key be the var key at position +i.
      2. +
      3. If key is not the catch-all key '*': +
          +
        1. Assert that key is a literal.
        2. +
        3. Let ks be the resolved value of +key in Unicode Normalization Form C.
        4. +
        5. Append ks as the last element of the list +keys.
        6. +
      4. +
    4. +
    5. Let rv be the resolved value at index +i of res.
    6. +
    7. Let matches be the result of calling the method +MatchSelectorKeys(rv, keys)
    8. +
    9. Append matches as the last element of the list +pref.
    10. +
  4. +
+

The method MatchSelectorKeys is determined by the implementation. It +takes as arguments a resolved selector value rv +and a list of string keys keys, and returns a list of +string keys in preferential order. The returned list MUST contain only +unique elements of the input list keys. The returned list +MAY be empty. The most-preferred key is first, with each successive key +appearing in order by decreasing preference.

+

The resolved value of each key MUST be in Unicode +Normalization Form C (“NFC”), even if the literal for the +key is not.

+

If calling MatchSelectorKeys encounters any error, a Bad +Selector error is emitted and an empty list is returned.
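
A minimal, non-normative TypeScript sketch of this step is shown here. It assumes that resolved selector values are plain strings, that keys are already in NFC, and that the catch-all key is represented by the string `"*"`; the `matchSelectorKeys` function and the `emitError` callback are hypothetical names introduced for the sketch.

```typescript
// Hypothetical MatchSelectorKeys for string-like selectors:
// a key matches only if it is equal to the selector value.
function matchSelectorKeys(rv: string, keys: string[]): string[] {
  // Returning at most one element keeps the result free of duplicates.
  return keys.indexOf(rv) !== -1 ? [rv] : [];
}

function resolvePreferences(
  res: string[],                                    // resolved selector values
  variantKeys: string[][],                          // keys of each variant, in source order
  match: (rv: string, keys: string[]) => string[],  // the MatchSelectorKeys method
  emitError: (name: string) => void                 // hypothetical error sink
): string[][] {
  const pref: string[][] = [];
  for (let i = 0; i < res.length; i++) {
    const keys: string[] = [];
    for (const vk of variantKeys) {
      const key = vk[i];
      if (key !== "*") keys.push(key); // assumed to be NFC-normalized already
    }
    let matches: string[];
    try {
      matches = match(res[i], keys);
    } catch (_err) {
      // Any failure in MatchSelectorKeys yields an empty list.
      emitError("Bad Selector");
      matches = [];
    }
    pref.push(matches);
  }
  return pref;
}
```

With selector values `"foo"` and `"bar"` and variants keyed `bar bar`, `foo foo`, and `* *`, this sketch produces the preference lists « « 'foo' », « 'bar' » ».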

+

Filter Variants

+

Then, using the preferential key orders pref, filter the +list of variants to the ones that match with some +preference:

+
    +
  1. Let vars be a new empty list of variants.
  2. +
  3. For each variant var of the message: +
      +
    1. For each index i in pref: +
        +
      1. Let key be the var key at position +i.
      2. +
      3. If key is the catch-all key '*': +
          +
        1. Continue the inner loop on pref.
        2. +
      4. +
      5. Assert that key is a literal.
      6. +
      7. Let ks be the resolved value of +key.
      8. +
      9. Let matches be the list of strings at index +i of pref.
      10. +
      11. If matches includes ks: +
          +
        1. Continue the inner loop on pref.
        2. +
      12. +
      13. Else: +
          +
        1. Continue the outer loop on message variants.
        2. +
      14. +
    2. +
    3. Append var as the last element of the list +vars.
    4. +
  4. +
+
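
The filtering steps above could be sketched in TypeScript as follows. This is a non-normative illustration: the `Variant` shape is an assumption of the sketch, keys are taken to be already-resolved strings, and the catch-all key is represented by `"*"`.

```typescript
// Hypothetical variant representation for this sketch.
interface Variant {
  keys: string[];   // one key per selector; "*" is the catch-all key
  pattern: string;  // stands in for the variant's pattern
}

function filterVariants(variants: Variant[], pref: string[][]): Variant[] {
  const vars: Variant[] = [];
  outer: for (const v of variants) {
    for (let i = 0; i < pref.length; i++) {
      const key = v.keys[i];
      if (key === "*") continue;                        // catch-all always matches
      if (pref[i].indexOf(key) === -1) continue outer;  // key not preferred: drop the variant
    }
    vars.push(v);
  }
  return vars;
}
```

The labeled `continue outer` mirrors the "Continue the outer loop on message variants" step: a single non-matching key is enough to discard the variant.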

Sort Variants

+

Finally, sort the list of variants vars and select the +pattern:

+
    +
  1. Let sortable be a new empty list of (integer, +variant) tuples.
  2. +
  3. For each variant var of vars: +
      +
    1. Let tuple be a new tuple (-1, var).
    2. +
    3. Append tuple as the last element of the list +sortable.
    4. +
  4. +
  5. Let len be the integer count of items in +pref.
  6. +
  7. Let i be len - 1.
  8. +
  9. While i >= 0: +
      +
    1. Let matches be the list of strings at index +i of pref.
    2. +
    3. Let minpref be the integer count of items in +matches.
    4. +
    5. For each tuple tuple of sortable: +
        +
      1. Let matchpref be an integer with the value +minpref.
      2. +
      3. Let key be the tuple variant key +at position i.
      4. +
      5. If key is not the catch-all key '*': +
          +
        1. Assert that key is a literal.
        2. +
        3. Let ks be the resolved value of +key.
        4. +
        5. Let matchpref be the integer position of +ks in matches.
        6. +
      6. +
      7. Set the tuple integer value as +matchpref.
      8. +
    6. +
    7. Set sortable to be the result of calling the method +SortVariants(sortable).
    8. +
    9. Set i to be i - 1.
    10. +
  10. +
  11. Let var be the variant element of the first +element of sortable.
  12. +
  13. Select the pattern of var.
  14. +
+

SortVariants is a method whose single argument is a list +of (integer, variant) tuples. It returns a list of (integer, +variant) tuples. Any implementation of +SortVariants is acceptable as long as it satisfies the +following requirements:

+
    +
  1. Let sortable be an arbitrary list of (integer, +variant) tuples.
  2. +
  3. Let sorted be SortVariants(sortable).
  4. +
  5. sorted is the result of sorting sortable +using the following comparator: +
      +
    1. (i1, v1) <= (i2, v2) if and only if +i1 <= i2.
    2. +
  6. +
  7. The sort is stable (pairs of tuples from sortable that +are equal in their first element have the same relative order in +sorted).
  8. +
+
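
A non-normative TypeScript sketch of the sorting steps and of one acceptable `SortVariants` is shown here. The `Variant` and `Scored` shapes are assumptions of this sketch, and the input list is assumed to be non-empty, as it is for any valid message that includes a catch-all variant.

```typescript
// Hypothetical variant representation for this sketch.
interface Variant {
  keys: string[];   // one key per selector; "*" is the catch-all key
  pattern: string;  // stands in for the variant's pattern
}
type Scored = { score: number; variant: Variant };

// One possible SortVariants: ascending by score, with the original
// position as a tie-breaker so that the sort is explicitly stable.
function sortVariants(sortable: Scored[]): Scored[] {
  return sortable
    .map((tuple, index) => ({ tuple, index }))
    .sort((a, b) => a.tuple.score - b.tuple.score || a.index - b.index)
    .map((entry) => entry.tuple);
}

function selectPattern(vars: Variant[], pref: string[][]): string {
  let sortable: Scored[] = vars.map((variant) => ({ score: -1, variant }));
  // Process selectors from last to first so that earlier selectors
  // have the final say, i.e. the higher priority.
  for (let i = pref.length - 1; i >= 0; i--) {
    const matches = pref[i];
    const minpref = matches.length; // score given to the catch-all key
    for (const tuple of sortable) {
      const key = tuple.variant.keys[i];
      tuple.score = key === "*" ? minpref : matches.indexOf(key);
    }
    sortable = sortVariants(sortable);
  }
  return sortable[0].variant.pattern;
}
```

Because each pass is stable, the ordering produced for a later selector is preserved among variants that tie on an earlier selector, which is what gives earlier selectors priority.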

Examples

+

This section is non-normative.

+

Example 1

+

Presuming a minimal implementation that supports only the +:string function, which matches keys by +string comparison, and a formatting context in which the variable +reference $foo resolves to the string 'foo' +and the variable reference $bar resolves to the string +'bar', pattern selection proceeds as follows for this +message:

+
.input {$foo :string}
+.input {$bar :string}
+.match $foo $bar
+bar bar {{All bar}}
+foo foo {{All foo}}
+* * {{Otherwise}}
+
    +
  1. For the first selector:
    The value of the selector is resolved +to be 'foo'.
    The available keys « 'bar', +'foo' » are compared to 'foo',
    resulting +in a list « 'foo' » of matching keys.

  2. +
  3. For the second selector:
    The value of the selector is +resolved to be 'bar'.
    The available keys « +'bar', 'foo' » are compared to +'bar',
    resulting in a list « 'bar' » of +matching keys.

  4. +
  5. Creating the list vars of variants matching all +keys:
    The first variant bar bar is discarded as its +first key does not match the first selector.
    The second variant +foo foo is discarded as its second key does not match the +second selector.
    The catch-all keys of the third variant +* * always match, and this is added to +vars,
    resulting in a list « * * » of +variants.

  6. +
  7. As the list vars only has one entry, it does not +need to be sorted.
    The pattern Otherwise of the third +variant is selected.

  8. +
+

Example 2

+

Alternatively, with the same implementation and formatting context as +in Example 1, pattern selection would proceed as follows for this +message:

+
.input {$foo :string}
+.input {$bar :string}
+.match $foo $bar
+* bar {{Any and bar}}
+foo * {{Foo and any}}
+foo bar {{Foo and bar}}
+* * {{Otherwise}}
+
    +
  1. For the first selector:
    The value of the selector is resolved +to be 'foo'.
    The available keys « 'foo' » +are compared to 'foo',
    resulting in a list « +'foo' » of matching keys.

  2. +
  3. For the second selector:
    The value of the selector is +resolved to be 'bar'.
    The available keys « +'bar' » are compared to 'bar',
    resulting +in a list « 'bar' » of matching keys.

  4. +
  5. Creating the list vars of variants matching all +keys:
    The keys of all variants either match each selector exactly, +or via the catch-all key,
    resulting in a list « * bar, +foo *, foo bar, * * » of +variants.

  6. +
  7. Sorting the variants:
    The list sortable is first +set with the variants in their source order and scores determined by the +second selector:
    « ( 0, * bar ), ( 1, +foo * ), ( 0, foo bar ), ( 1, * * +) »
    This is then sorted as:
    « ( 0, * bar ), ( 0, +foo bar ), ( 1, foo * ), ( 1, * * +) ».
    To sort according to the first selector, the scores are updated +to:
    « ( 1, * bar ), ( 0, foo bar ), ( 0, +foo * ), ( 1, * * ) ».
    This is then sorted +as:
    « ( 0, foo bar ), ( 0, foo * ), ( 1, +* bar ), ( 1, * * ) ».

  8. +
  9. The pattern Foo and bar of the most preferred +foo bar variant is selected.

  10. +
+

Example 3

+

A more-complex example is the matching found in selection APIs such +as ICU’s PluralFormat. Suppose that this API is represented +here by the function :number. This :number +function can match a given numeric value to a specific number +literal and also to a plural category +(zero, one, two, +few, many, other) according to +locale rules defined in CLDR.

+

Given a variable reference $count whose value resolves +to the number 1 and an en (English) locale, +the pattern selection proceeds as follows for this message:

+
.input {$count :number}
+.match $count
+one {{Category match for {$count}}}
+1   {{Exact match for {$count}}}
+*   {{Other match for {$count}}}
+
    +
  1. For the selector:
    The value of the selector is resolved to an +implementation-defined value that is capable of performing English +plural category selection on the value 1.
    The available +keys « 'one', '1' » are passed to the +implementation’s MatchSelectorKeys method,
    resulting in a list « +'1', 'one' » of matching keys.

  2. +
  3. Creating the list vars of variants matching all +keys:
    The keys of all variants are included in the list of matching +keys, or use the catch-all key,
    resulting in a list « +one, 1, * » of variants.

  4. +
  5. Sorting the variants:
    The list sortable is first +set with the variants in their source order and scores determined by the +selector key order:
    « ( 1, one ), ( 0, 1 +), ( 2, * ) »
    This is then sorted as:
    « ( 0, +1 ), ( 1, one ), ( 2, * ) +»

  6. +
  7. The pattern Exact match for {$count} of the most +preferred 1 variant is selected.

  8. +
+
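
The key ordering used in this example could come from a MatchSelectorKeys method along the lines of the following non-normative TypeScript sketch. The function names are hypothetical, the plural rule is a deliberately simplified stand-in for the English CLDR rules, and exact matching is reduced to a numeric comparison; a real implementation would rely on locale data.

```typescript
// Simplified English cardinal rule, assumed for illustration only:
// the integer 1 is "one", everything else is "other".
function englishCategory(n: number): string {
  return n === 1 ? "one" : "other";
}

function matchNumberKeys(value: number, keys: string[]): string[] {
  const matches: string[] = [];
  // Exact numeric keys are preferred over plural category keys.
  for (const key of keys) {
    const isNumeric = /^-?\d+(\.\d+)?$/.test(key);
    if (isNumeric && Number(key) === value && matches.indexOf(key) === -1) {
      matches.push(key);
    }
  }
  // The plural category key, if present, comes after any exact match.
  const category = englishCategory(value);
  if (keys.indexOf(category) !== -1) {
    matches.push(category);
  }
  return matches;
}
```

For the value 1 and the available keys « 'one', '1' », this sketch returns « '1', 'one' », which is the order assumed in the steps above.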

Formatting

+

After pattern selection, each text and +placeholder part of the selected pattern is resolved +and formatted.

+

Resolved values cannot always be formatted by a given +implementation. When such an error occurs during formatting, an +appropriate Message Function Error is emitted and a +fallback value is used for the placeholder with the +error.

+

Implementations MAY represent the result of formatting using +the most appropriate data type or structure. Some examples of these +include:

+
    +
  • A single string concatenated from the parts of the resolved +pattern.
  • +
  • A string with associated attributes for portions of its text.
  • +
  • A flat sequence of objects corresponding to each resolved +value.
  • +
  • A hierarchical structure of objects that group spans of resolved +values, such as sequences delimited by markup-open and +markup-close placeholders.
  • +
+

Implementations SHOULD provide formatting result types that +match user needs, including situations that require further processing +of formatted messages. Implementations SHOULD encourage users to +consider a formatted localised string as an opaque data structure, +suitable only for presentation.

+

When formatting to a string, the default representation of all +markup MUST be an empty string. Implementations MAY offer +functionality for customizing this, such as by emitting XML-ish tags for +each markup.

+

Examples

+

This section is non-normative.

+
    +
  1. An implementation might choose to return an interstitial object +so that the caller can “decorate” portions of the formatted value. In +ICU4J, the NumberFormatter class returns a +FormattedNumber object, so a pattern such as +This is my number {42 :number} might return the character +sequence This is my number followed by a +FormattedNumber object representing the value +42 in the current locale.

  2. +
  3. A formatter in a web browser could format a message as a DOM +fragment rather than as a representation of its HTML source.

  4. +
+

Formatting Fallback Values

+

If the resolved pattern includes any fallback +values and the formatting result is a concatenated string or a +sequence of strings, the string representation of each fallback +value MUST be the concatenation of a U+007B LEFT CURLY BRACKET +{, the fallback value as a string, and a U+007D +RIGHT CURLY BRACKET }.

+
+

For example, a message that is not well-formed +would format to a string as {�}, unless a fallback string +is defined in the formatting context, in which case that string +would be used instead.
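
As a non-normative TypeScript sketch, the string form of a fallback value could be produced as shown here. The function names are hypothetical, and the sketch assumes the reading in which a fallback string supplied by the formatting context takes the place of U+FFFD inside the curly brackets.

```typescript
const REPLACEMENT_CHARACTER = "\uFFFD";

// Wrap the fallback value's source in U+007B and U+007D.
function formatFallbackValue(source: string): string {
  return "{" + source + "}";
}

// Fallback for a message that is not well-formed or not valid:
// use the context's fallback string unless it is missing or empty.
function formatInvalidMessage(fallbackString?: string): string {
  const source = fallbackString ? fallbackString : REPLACEMENT_CHARACTER;
  return formatFallbackValue(source);
}
```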

+
+

Handling Bidirectional Text

+

Messages contain text. Any text can be bidirectional +text. That is, the text can consist of a mixture of +left-to-right and right-to-left spans of text. The display of +bidirectional text is defined by the Unicode Bidirectional +Algorithm [UAX9].

+

The directionality of the formatted message as a whole is +provided by the formatting context.

+
+

[!NOTE] Keep in mind the difference between the formatted output of a +message, which is the topic of this section, and the syntax of a +message prior to formatting. The processing of a +message depends on the logical sequence of Unicode code points, +not on the presentation of the message. Affordances to allow +users appropriate control over the appearance of the message’s +syntax have been provided.

+
+

When a message is formatted, placeholders are +replaced with their formatted representation. Applying the Unicode +Bidirectional Algorithm to the text of a formatted message +(including its formatted parts) can result in unexpected or undesirable +spillover +effects. Applying bidi +isolation to each affected formatted value helps avoid this +spillover in a formatted message.

+

Note that both the message and, separately, each +placeholder need to have direction metadata for this to work. +If an implementation supports formatting to something other than a +string (such as a sequence of parts), the directionality of each +formatted placeholder needs to be available to the caller.

+

If a formatted expression itself contains spans with +differing directionality, its formatter SHOULD perform any necessary +processing, such as inserting controls or isolating such parts to ensure +that the formatted value displays correctly in a plain text context.

+
+

For example, an implementation could provide a :currency +formatting function which inserts strongly directional characters, such +as U+200F RIGHT-TO-LEFT MARK (RLM), U+200E LEFT-TO-RIGHT MARK (LRM), or +U+061C ARABIC LETTER MARK (ALM), to coerce proper display of the sign +and currency symbol next to a formatted number. An example of this is +formatting the value -1234.56 as the currency +AED in the ar-AE locale. The formatted value +appears like this:

+
‎-1,234.56 د.إ.‏
+

The code point sequence for this string, as produced by the ICU4J +NumberFormat function, includes U+200F +U+200E at the start and U+200F at the end of +the string. If it did not do this, the same string would appear like +this instead:

+
+ + +
+
+

A bidirectional isolation +strategy is functionality in the formatter’s +processing of a message that produces bidirectional output text +that is ready for display.

+

The Default Bidi Strategy is a +bidirectional isolation strategy that uses isolating Unicode +control characters around each placeholder’s formatted value. It is +primarily intended for use in plain-text strings, where markup or other +mechanisms are not available. Implementations MUST provide the +Default Bidi Strategy as one of the bidirectional isolation +strategies.

+

Implementations MAY provide other bidirectional isolation +strategies.

+

Implementations MAY supply a bidirectional isolation +strategy that performs no processing.

+

The Default Bidi Strategy is defined as follows:

+
    +
  1. Let out be the empty string.
  2. +
  3. Let msgdir be the directionality of the whole message, +one of « 'LTR', 'RTL', 'unknown' +». These correspond to the message having left-to-right directionality, +right-to-left directionality, and to the message’s directionality not +being known.
  4. +
  5. For each part part in pattern: +
      +
    1. If part is a plain literal (text) part, append +part to out.
    2. +
    3. Else: +
        +
      1. Assert part is a placeholder.
      2. +
      3. Let exp be part.
      4. +
      5. Let fmt be the formatted string representation of the +resolved value of exp.
      6. +
      7. Let dir be the directionality of fmt, one +of « 'LTR', 'RTL', 'unknown' », +with the same meanings as for msgdir.
      8. +
      9. Let the boolean value isolate be True if the +u:dir option of the resolved value of +exp has a value other than 'inherit', or False +otherwise.
      10. +
      11. If dir is 'LTR': +
          +
        1. If msgdir is 'LTR' and +isolate is False, append fmt to +out.
        2. +
        3. Else: +
            +
          1. Append U+2066 LEFT-TO-RIGHT ISOLATE to out.
          2. +
          3. Append fmt to out.
          4. +
          5. Append U+2069 POP DIRECTIONAL ISOLATE to out.
          6. +
        4. +
      12. +
      13. Else, if dir is 'RTL': +
          +
        1. Append U+2067 RIGHT-TO-LEFT ISOLATE to out.
        2. +
        3. Append fmt to out.
        4. +
        5. Append U+2069 POP DIRECTIONAL ISOLATE to out.
        6. +
      14. +
      15. Else: +
          +
        1. Append U+2068 FIRST STRONG ISOLATE to out.
        2. +
        3. Append fmt to out.
        4. +
        5. Append U+2069 POP DIRECTIONAL ISOLATE to out.
        6. +
      16. +
    4. +
  6. +
  7. Emit out as the formatted output of the message.
  8. +
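
The steps above could be sketched in TypeScript as follows. This is a non-normative illustration: the `Part` shape is an assumption of the sketch, with `fmt`, `dir`, and `isolate` taken to be computed already for each placeholder as described in the corresponding steps.

```typescript
type Dir = "LTR" | "RTL" | "unknown";

// Hypothetical pattern part: literal text, or a placeholder that has
// already been formatted and annotated with its directionality.
type Part = string | { fmt: string; dir: Dir; isolate: boolean };

const LRI = "\u2066"; // LEFT-TO-RIGHT ISOLATE
const RLI = "\u2067"; // RIGHT-TO-LEFT ISOLATE
const FSI = "\u2068"; // FIRST STRONG ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

function defaultBidiStrategy(msgdir: Dir, pattern: Part[]): string {
  let out = "";
  for (const part of pattern) {
    if (typeof part === "string") {
      out += part; // literal text is appended unchanged
      continue;
    }
    const { fmt, dir, isolate } = part;
    if (dir === "LTR") {
      // Only an LTR value in an LTR message, with no explicit u:dir, is left bare.
      out += msgdir === "LTR" && !isolate ? fmt : LRI + fmt + PDI;
    } else if (dir === "RTL") {
      out += RLI + fmt + PDI;
    } else {
      out += FSI + fmt + PDI;
    }
  }
  return out;
}
```

Note that the only case in which no isolating controls are added is an LTR placeholder in an LTR message; an RTL placeholder is isolated even when the message itself is RTL.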
+ + + diff --git a/spec/formatting.md b/spec/formatting.md index c7dc737ca6..ef583ed22b 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -938,15 +938,15 @@ The _Default Bidi Strategy_ is defined as follows: 1. For each part `part` in _pattern_: 1. If `part` is a plain literal (text) part, append `part` to `out`. 1. Else: - i. Assert `part` is a _placeholder_. - i. Let `exp` be `part`. - i. Let `fmt` be the formatted string representation of the _resolved value_ of `exp`. - i. Let `dir` be the directionality of `fmt`, + 1. Assert `part` is a _placeholder_. + 1. Let `exp` be `part`. + 1. Let `fmt` be the formatted string representation of the _resolved value_ of `exp`. + 1. Let `dir` be the directionality of `fmt`, one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. - i. Let the boolean value `isolate` be + 1. Let the boolean value `isolate` be True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, or False otherwise. - i. If `dir` is `'LTR'`: + 1. If `dir` is `'LTR'`: 1. If `msgdir` is `'LTR'` and `isolate` is False, append `fmt` to `out`. @@ -954,11 +954,11 @@ The _Default Bidi Strategy_ is defined as follows: i. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. i. Append `fmt` to `out`. i. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - i. Else, if `dir` is `'RTL'`: + 1. Else, if `dir` is `'RTL'`: 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` 1. Append `fmt` to `out`. 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - i. Else: + 1. Else: 1. Append U+2068 FIRST STRONG ISOLATE to `out`. 1. Append `fmt` to `out`. 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. 
From 7c5bf61999da4fbea81f06cc1380fe128426cedd Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 13:39:01 -0800 Subject: [PATCH 08/12] Fix formatting again --- spec/formatting.html | 2 +- spec/formatting.md | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/spec/formatting.html b/spec/formatting.html index b80bdde905..e0da11e440 100644 --- a/spec/formatting.html +++ b/spec/formatting.html @@ -1021,7 +1021,7 @@

Handling Bidirectional Text

isolate is False, append fmt to out.
  • Else: -
      +
      1. Append U+2066 LEFT-TO-RIGHT ISOLATE to out.
      2. Append fmt to out.
      3. Append U+2069 POP DIRECTIONAL ISOLATE to out.
      4. diff --git a/spec/formatting.md b/spec/formatting.md index ef583ed22b..d1d2a320ee 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -951,9 +951,9 @@ The _Default Bidi Strategy_ is defined as follows: and `isolate` is False, append `fmt` to `out`. 1. Else: - i. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. - i. Append `fmt` to `out`. - i. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + 1. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. 1. Else, if `dir` is `'RTL'`: 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` 1. Append `fmt` to `out`. From 53aa0fba417b6ed7d3433cfb96548b41f9fabb3b Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 15:53:00 -0800 Subject: [PATCH 09/12] Remove erroneously-added file --- spec/formatting.html | 1048 ------------------------------------------ 1 file changed, 1048 deletions(-) delete mode 100644 spec/formatting.html diff --git a/spec/formatting.html b/spec/formatting.html deleted file mode 100644 index e0da11e440..0000000000 --- a/spec/formatting.html +++ /dev/null @@ -1,1048 +0,0 @@ - - - - - - -formatting.html - - - - - - -

        DRAFT MessageFormat 2.0 -Formatting

        -

        Introduction

        -

        This section defines the behavior of a MessageFormat 2.0 -implementation when formatting a message for display in a user -interface, or for some later processing.

        -

        To start, we presume that a message has either been parsed -from its syntax or created from a data model description. If the -resulting message is not well-formed, a Syntax -Error is emitted. If the resulting message is -well-formed but is not valid, a Data Model -Error is emitted.

        -

        The formatting of a message is defined by the following -operations:

        -
          -
        • Pattern Selection determines -which of a message’s patterns is formatted. For a message with -no selectors, this is simple as there is only one -pattern. With selectors, this will depend on their -resolution.

        • -
        • Formatting takes the -resolved values of the text and placeholder -parts of the selected pattern, and produces the formatted -result for the message. Depending on the implementation, this -result could be a single concatenated string, an array of objects, an -attributed string, or some other locally appropriate data type.

        • -
        • Expression and Markup -Resolution determines the value of an -expression or markup, with reference to the current -formatting context. This can include multiple steps, such as -looking up the value of a variable and calling formatting functions. The -form of the resolved value is implementation defined and the -value might not be evaluated or formatted yet. However, it needs to be -“formattable”, i.e. it contains everything required by the eventual -formatting.

          -

          The resolution of text is rather straightforward, and is -detailed under literal resolution.

        • -
        -

        Implementations are not required to expose the expression -resolution and pattern selection operations to their -users, or even use them in their internal processing, as long as the -final formatting result is made available to users and the -observable behavior of the formatting matches that described -here.

        -

        Attributes MUST NOT have any effect on the formatted output -of a message, nor be made available to function -handlers.

        -
        -

        [!IMPORTANT]

        -

        This specification does not require either eager or lazy -expression resolution of message parts; do not -construe any requirement in this document as requiring -either.

        -

        Implementations are not required to evaluate all parts of a -message when parsing, processing, or formatting. In particular, -an implementation MAY choose not to evaluate or resolve the value of a -given expression until it is actually used by a selection or -formatting process. However, when an expression is resolved, it -MUST behave as if all preceding declarations affecting -variables referenced by that expression have already -been evaluated in the order in which the relevant declarations -appear in the message. An implementation MUST ensure that every -expression in a message is evaluated at most once.

        -
        -
        -

        [!IMPORTANT]

        -

        Implementations with lazy evaluation MUST NOT use a call-by-name -evaluation strategy. Instead, they must evaluate expressions at most -once (“call-by-need”). This is to prevent expressions from -having different values when used in different parts of a given -message. Function handlers are not necessarily pure: -they can access external mutable state such as the current system clock -time. Thus, evaluating the same expression more than once could -yield different results. That behavior violates this specification.

        -
        -
        -

        [!IMPORTANT] Implementations and users SHOULD NOT create function -handlers that mutate external program state, particularly since -such a function handler can present a remote execution -hazard.

        -
        -

        Formatting Context

        -

        A message’s formatting -context represents the data and procedures that are -required for the message’s expression resolution, -pattern selection and formatting.

        -

        At a minimum, it includes:

        -
          -
        • Information on the current locale, -potentially including a fallback chain of locales. This will be passed -on to formatting functions.

        • -
        • Information on the base directionality of the message -and its text tokens. This will be used by strategies for -bidirectional isolation, and can be used to set the base direction of -the message upon display.

        • -
        • An input mapping of string -identifiers to values, defining variable values that are available -during variable resolution. This is often determined by a -user-provided argument of a formatting function call.

        • -
        • The function registry, providing the function -handlers of the functions referred to by message -functions.

        • -
        • Optionally, a fallback string to use for the message if -it is not valid.

        • -
        -

        Implementations MAY include additional fields in their formatting -context.

        -

        Resolved Values

        -

        A resolved value is the result -of resolving a text, literal, variable, -expression, or markup. The resolved value is -determined using the formatting context. The form of the -resolved value is implementation-defined.

        -

        In a declaration, the resolved value of an -expression is bound to a variable, which makes it -available for use in later expressions and markup -options.

        -
        -

        For example, in

        -
        .input {$a :number minimumFractionDigits=3}
        -.local $b = {$a :integer notation=compact}
        -.match $a
        -0 {{The value is zero.}}
        -* {{In compact form, the value {$a} is rendered as {$b}.}}
        -

        the resolved value bound to $a is used as the -operand of the :integer function when -resolving the value of the variable $b, as a -selector in the .match statement, as well as for -formatting the placeholder {$a}.

        -
        -

        In an input-declaration, the variable operand of -the variable-expression identifies not only the name of the -external input value, but also the variable to which the -resolved value of the variable-expression is -bound.

        -

        In a pattern, the resolved value of an -expression or markup is used in its -formatting.

        -

        The form that resolved values take is -implementation-dependent, and different implementations MAY choose to -perform different levels of resolution.

        -
        -

        While this specification does not require it, a resolved -value could be implemented by requiring each function -handler to return a value matching the following interface:

        -
        interface MessageValue {
        -  formatToString(): string
        -  formatToX(): X // where X is an implementation-defined type
        -  getValue(): unknown
        -  resolvedOptions(): { [key: string]: MessageValue }
        -  selectKeys(keys: string[]): string[]
        -}
        -

        With this approach:

        • An expression could be used as a -placeholder if calling the formatToString() or -formatToX() method of its resolved value did not -emit an error.
        • A variable could be used as a -selector if calling the selectKeys(keys) method of -its resolved value did not emit an error.
        • Using a -variable, the resolved value of an expression -could be used as an operand or option value if calling -the getValue() method of its resolved value did -not emit an error. In this use case, the resolvedOptions() -method could also provide a set of option values that could be taken -into account by the called function.

        -

        Extensions of the base MessageValue interface could be -provided for different data types, such as numbers or strings, for which -the unknown return type of getValue() and the -generic MessageValue type used in -resolvedOptions() could be narrowed appropriately. An -implementation could also allow MessageValue values to be -passed in as input variables, or automatically wrap each variable as a -MessageValue to provide a uniform interface for custom -functions.

        -
        -

        Expression and Markup -Resolution

        -

        Expressions are used in declarations and -patterns. Markup is only used in -patterns.

        -

        Depending on the presence or absence of a variable or -literal operand and a function, the resolved -value of the expression is determined as follows:

        -

        If the expression contains a function, its -resolved value is defined by function resolution.

        -

        Else, if the expression consists of a variable, its -resolved value is defined by variable resolution. An -implementation MAY perform additional processing when resolving the -value of an expression that consists only of a -variable.

        -
        -

        For example, it could apply function resolution using a -function and a set of options chosen based on the -value or type of the variable. So, given a message -like this:

        -
        Today is {$date}
        -

        If the value passed in the variable were a date object, such -as a JavaScript Date or a Java java.util.Date -or java.time.Temporal, the implementation could interpret -the placeholder {$date} as if the pattern included -the function :datetime with some set of default -options.

        -
        -

        Else, the expression consists of a literal. Its -resolved value is defined by literal resolution.

        -
        -

        [!NOTE] This means that a literal value with no -function is always treated as a string. To represent values -that are not strings as a literal, a function needs to -be provided:

        -
        .local $aNumber = {1234 :number}
        -.local $aDate = {|2023-08-30| :datetime}
        -.local $aFoo = {|some foo| :foo}
        -{{You have {42 :number}}}
        -
        -

        Literal Resolution

        -

        The resolved value of a text or a literal -contains the character sequence of the text or literal -after any character escape has been converted to the escaped -character.

        -

        When a literal is used as an operand or on the -right-hand side of an option, the formatting function MUST -treat its resolved value the same whether its value was -originally a quoted literal or an unquoted -literal.

        -
        -

        For example, the option foo=42 and the -option foo=|42| are treated as identical.

        -
        -
        -

        For example, in a JavaScript formatter, the resolved value -of a text or a literal could have the following -implementation:

        -
        class MessageLiteral implements MessageValue {
        -  constructor(value: string) {
        -    this.formatToString = () => value;
        -    this.getValue = () => value;
        -  }
        -  resolvedOptions: () => ({});
        -  selectKeys(_keys: string[]) {
        -    throw Error("Selection on unannotated literals is not supported");
        -  }
        -}
        -
        -

        Variable Resolution

        -

        To resolve the value of a variable, its name is -used to identify either a local variable or an input variable. If a -declaration exists for the variable, its resolved -value is used. Otherwise, the variable is an implicit -reference to an input value, and its value is looked up from the -formatting context input mapping.

        -

        The resolution of a variable fails if no value is identified -for its name. If this happens, an Unresolved Variable -error is emitted and a fallback value is used as the -resolved value of the variable.

        -

        If the resolved value identified for the variable -name is a fallback value, a fallback value is -used as the resolved value of the variable.

        -

        The fallback value representation of a variable has -a string representation consisting of the U+0024 DOLLAR SIGN -$ followed by the name of the -variable.
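The lookup order described above can be sketched as follows. This is a minimal illustration, not the specification's API: the `Resolved` union, the `ResolveContext` shape, and the `resolveVariable` name are all assumptions made for this example, and errors are simply collected in an array.

```typescript
// Hypothetical shapes; the form of resolved values is implementation-defined.
type Resolved =
  | { kind: "value"; value: unknown }
  | { kind: "fallback"; source: string };

interface ResolveContext {
  locals: Map<string, Resolved>; // resolved values of declarations seen so far
  input: Map<string, unknown>; // input mapping of the formatting context
  errors: string[]; // emitted (non-fatal) errors
}

function resolveVariable(name: string, ctx: ResolveContext): Resolved {
  // The fallback representation is "$" followed by the variable name.
  const fallback: Resolved = { kind: "fallback", source: "$" + name };

  const local = ctx.locals.get(name);
  if (local !== undefined) {
    // A declaration exists. If it resolved to a fallback value,
    // the variable itself resolves to a fallback value.
    return local.kind === "fallback" ? fallback : local;
  }
  if (ctx.input.has(name)) {
    // Implicit reference to an input value.
    return { kind: "value", value: ctx.input.get(name) };
  }
  // No value identified for the name.
  ctx.errors.push("Unresolved Variable: $" + name);
  return fallback;
}
```

Collecting errors in a list rather than throwing keeps resolution non-fatal, so formatting can continue with the fallback value.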

        -

        Function Resolution

        -

        To resolve an expression with a function, the -following steps are taken:

        -
          -
        1. If the expression includes an operand, resolve -its value. If this is a fallback value, return a fallback -value as the resolved value of the -expression.

        2. -
        3. Resolve the identifier of the function and find -the appropriate function handler to call. If the implementation -cannot find the function handler, or if the identifier -includes a namespace that the implementation does not support, -emit an Unknown Function error and return a fallback -value as the resolved value of the -expression.

          -

          Implementations are not required to implement namespaces or -installable function registries.

        4. -
        5. Perform option resolution.

        6. -
        7. Determine the function context for calling the -function handler.

          -

          The function context contains -the context necessary for the function handler to resolve the -expression. This includes:

          -
            -
          • The current locale, potentially including a fallback chain -of locales.
          • -
          • The base directionality of the expression. By default, this -is undefined or empty.
          • -
          -

          If the resolved mapping of options includes any -u: options supported by the implementation, -process them as specified. Such u: options MAY be removed -from the resolved mapping of options.

        8. -
        9. Call the function handler with the following -arguments:

          -
            -
          • The function context.
          • -
          • The resolved mapping of options.
          • -
          • If the expression includes an operand, its -resolved value.
          • -
          -

          The form that resolved operand and option values -take is implementation-defined.

          -

          An implementation MAY pass additional arguments to the function -handler, as long as reasonable precautions are taken to keep the -function interface simple and minimal, and avoid introducing potential -security vulnerabilities.

        10. -
        11. If the call succeeds, resolve the value of the -expression as the result of that function call.

          -

          If the call fails or does not return a valid value, emit the -appropriate Message Function Error for the failure.

          -

          Implementations MAY provide a mechanism for the function -handler to provide additional detail about internal failures. -Specifically, if the cause of the failure was that the datatype, value, -or format of the operand did not match that expected by the -function, the function SHOULD cause a Bad -Operand error to be emitted.

          -

          In all failure cases, return a fallback value as the -resolved value of the expression.

        12. -
        -
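The steps above can be sketched roughly as follows. Everything named here (`Resolved`, `FunctionContext`, `FunctionHandler`, `resolveFunction`, and the error strings) is a hypothetical shape chosen for illustration. The sketch assumes that option resolution and construction of the function context have already been done by the caller, and it treats a thrown exception as the only way a handler call can fail.

```typescript
// Hypothetical shapes; the specification leaves these implementation-defined.
type Resolved =
  | { kind: "value"; value: unknown; options: Map<string, unknown> }
  | { kind: "fallback"; source: string };

interface FunctionContext {
  locale: string;
  dir?: "ltr" | "rtl" | "auto"; // base directionality, undefined by default
}

type FunctionHandler = (
  ctx: FunctionContext,
  options: Map<string, unknown>,
  operand?: Resolved,
) => Resolved;

function resolveFunction(
  name: string,
  fallbackSource: string, // for example "$var", "|literal|", or ":name"
  registry: Map<string, FunctionHandler>,
  ctx: FunctionContext,
  options: Map<string, unknown>,
  errors: string[],
  operand?: Resolved,
): Resolved {
  const fallback: Resolved = { kind: "fallback", source: fallbackSource };

  // A fallback operand makes the whole expression a fallback value.
  if (operand !== undefined && operand.kind === "fallback") return fallback;

  // Look up the function handler for the identifier.
  const handler = registry.get(name);
  if (handler === undefined) {
    errors.push("Unknown Function: :" + name);
    return fallback;
  }

  // Call the handler; any failure yields a fallback value.
  try {
    return handler(ctx, options, operand);
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    errors.push("Message Function Error: " + detail);
    return fallback;
  }
}
```

A fuller implementation would also validate the handler's return value and distinguish a Bad Operand error from other failures.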

        Function Handler

        -

        A function handler is an -implementation-defined process such as a function or method which -accepts a set of arguments and returns a resolved value. A -function handler is required to resolve a -function.

        -

        An implementation MAY define its own functions and their handlers. An -implementation MAY allow custom functions to be defined by users.

        -

        Implementations that provide a means for defining custom functions -MUST provide a means for function handlers to return -resolved values that contain enough information to be used as -operands or option values in subsequent -expressions.

        -

        The resolved value returned by a function handler -MAY be different from the value of the operand of the -function. It MAY be an implementation specified type. It is not -required to be the same type as the operand.

        -

        A function handler MAY include resolved options in its -resolved value. The resolved options MAY be different from the -options of the function.

        -

        A function handler SHOULD emit a Bad Operand error -for operands whose resolved value or type is not -supported.

        -

        Function handler access to the formatting context -MUST be minimal and read-only, and execution time SHOULD be limited.

        -

        Implementation-defined functions SHOULD use an -implementation-defined namespace.

        -

        Option Resolution

        -

        Option resolution is the process -of computing the options for a given expression. -Option resolution results in a mapping of string -identifiers to values. The order of options -MUST NOT be significant.

        -
        -

        For example, the following message treats both placeholders identically:

        -
        {$x :function option1=foo option2=bar} {$x :function option2=bar option1=foo}
        -
        -

        Option resolution proceeds as follows:

        -
          -
        1. Let res be a new empty mapping.
        2. -
        3. For each option: -
            -
          1. Let id be the string value of the identifier -of the option.
          2. -
          3. Let rv be the resolved value of the -option value.
          4. -
          5. If rv is a fallback value: -
              -
            1. If supported, emit a Bad Option error.
            2. -
          6. -
          7. Else: -
              -
            1. Set res[id] to be rv.
            2. -
          8. -
        4. -
        5. Return res.
        6. -
        -

        The result of option resolution MUST be a (possibly empty) mapping of string identifiers to values; that is, errors MAY be emitted, but such errors MUST NOT be fatal.
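As an illustration of the algorithm above, here is a minimal sketch. The `Resolved` union and the `resolveOptions` name are assumptions made for this example. Each option is given as an identifier paired with the already-resolved value of its right-hand side.

```typescript
// Hypothetical shape of a resolved option value.
type Resolved =
  | { kind: "value"; value: unknown }
  | { kind: "fallback"; source: string };

function resolveOptions(
  options: Array<[string, Resolved]>,
  errors: string[],
): Map<string, unknown> {
  const res = new Map<string, unknown>();
  for (const [id, rv] of options) {
    if (rv.kind === "fallback") {
      // The option is dropped; the error is reported but is not fatal.
      errors.push("Bad Option: " + id);
    } else {
      res.set(id, rv.value);
    }
  }
  return res;
}
```

Because the result is a mapping keyed by identifier, callers that look options up by name see the same result regardless of the order in which the options were written.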

        -
        -

        [!NOTE] The resolved value of a function -operand can also include resolved option values. These are not -included in the option resolution result, and need to be -processed separately by a function handler.

        -
        -

        Markup Resolution

        -

        Unlike functions, the resolution of markup is not -customizable.

        -

        The resolved value of markup includes the following -fields:

        -
          -
        • The type of the markup: open, standalone, or close
        • -
        • The identifier of the markup
        • -
        • The resolved options values after option -resolution.
        • -
        -

        If the resolved mapping of options includes any -u: options supported by the implementation, -process them as specified. Such u: options MAY be removed -from the resolved mapping of options.

        -

        The resolution of markup MUST always succeed.

        -

        Fallback Resolution

        -

        A fallback value is the -resolved value for an expression or variable -when that expression or variable fails to resolve. It -contains a string representation that is used for its formatting, and no -option values.

        -

        The resolved value of text, literal, and -markup MUST NOT be a fallback value.

        -

        A variable fails to resolve when no value is identified for -its name. The string representation of its fallback -value is U+0024 DOLLAR SIGN $ followed by the -name of the variable.

        -

        An expression fails to resolve when:

        -
          -
        • A variable used as its operand resolves to a -fallback value. Note that an expression does not -necessarily fail to resolve if an option resolves with a -fallback value.
        • -
        • No function handler is found for a function -identifier.
        • -
        • Calling a function handler fails or does not return a valid -value.
        • -
        -

        The string representation of the fallback value of an -expression depends on its contents:

        -
          -
        • expression with a literal operand -(either quoted or unquoted): U+007C VERTICAL LINE | -followed by the value of the literal with escaping applied to -U+005C REVERSE SOLIDUS \ and U+007C VERTICAL LINE -|, and then by U+007C VERTICAL LINE |.

          -
          -

          Examples: In a context where :func fails to resolve, -{42 :func} resolves to a fallback value with a -string representation |42| and {|C:\\| :func} -resolves to a fallback value with a string representation -|C:\\|.

          -
        • -
        • expression with variable operand: the -fallback value representation of that variable, U+0024 -DOLLAR SIGN $ followed by the name of the -variable

          -
          -

          Examples: In a context where $var fails to resolve, -{$var} and {$var :number} both resolve to a -fallback value with a string representation $var -(even if :number fails to resolve).

          -

          In a context where :func fails to resolve, the -placeholder in -.local $var = {|val| :func} {{{$var}}} resolves to a -fallback value with a string representation -$var.

          -

          In a context where either :now or :pretty -fails to resolve, the placeholder in

          -
          .local $time = {:now format=iso8601}
          -{{{$time :pretty}}}
          -

          resolves to a fallback value with a string representation -$time.

          -
        • -
        • function expression with no operand: -U+003A COLON : followed by the function -identifier

          -
          -

          Examples: In a context where :func fails to resolve, -{:func} resolves to a fallback value with a string -representation :func. In a context where -:ns:func fails to resolve, {:ns:func} resolves -to a fallback value with a string representation -:ns:func.

          -
        • -
        • Otherwise: the U+FFFD REPLACEMENT CHARACTER

          -

          This is not currently used by any expression, but may apply in future -revisions.

        • -
        -

        Options and attributes are not included in the -fallback value.

        -

        Pattern selection is not supported for fallback -values.

        -
        -

        For example, in a JavaScript formatter the fallback value -could have the following implementation, where source is -one of the above-defined strings:

        -
        class MessageFallback implements MessageValue {
        -  constructor(source: string) {
        -    this.formatToString = () => `{${source}}`;
        -    this.getValue = () => undefined;
        -  }
        -  resolvedOptions: () => ({});
        -  selectKeys(_keys: string[]) {
        -    throw Error("Selection on fallback values is not supported");
        -  }
        -}
        -
        -
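The `source` strings listed above could be produced by helpers along these lines. The function names are hypothetical and the sketch covers only the three cases that the text defines: literal operand, variable operand, and function with no operand.

```typescript
// Literal operand: escape "\" and "|", then wrap the result in "|...|".
function literalFallbackSource(literal: string): string {
  const escaped = literal.replace(/[\\|]/g, (ch) => "\\" + ch);
  return "|" + escaped + "|";
}

// Variable operand: "$" followed by the variable name.
function variableFallbackSource(name: string): string {
  return "$" + name;
}

// Function with no operand: ":" followed by the function identifier.
function functionFallbackSource(identifier: string): string {
  return ":" + identifier;
}
```

Here `literalFallbackSource` takes the literal's value after its escapes have been processed, so the value `C:\` yields the representation `|C:\\|`.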

        Pattern Selection

        -

        If the message being formatted is not well-formed and valid, the result of pattern selection is a pattern consisting of a single fallback value using the message’s fallback string defined in the formatting context or, if this is not available or empty, the U+FFFD REPLACEMENT CHARACTER �.

        -

        If the message being formatted does not contain a -matcher, the result of pattern selection is its -pattern value.

        -

        When a message contains a matcher with one or more -selectors, the implementation needs to determine which -variant will be used to provide the pattern for the -formatting operation. This is done by ordering and filtering the -available variant statements according to their key -values and selecting the first one.

        -
        -

        [!NOTE] At least one variant is required to have all of its keys consist of the fallback value *. Some selectors might be implemented in a way that the key value * cannot be selected in a valid message. In other cases, this key value might be unreachable only in certain locales. This could result in the need in some locales to create one or more variants that do not make sense grammatically for that language.

        For example, in the pl (Polish) locale, this message cannot reach the * variant:

        .input {$num :integer}
        .match $num
        0 {{ }}
        one {{ }}
        few {{ }}
        many {{ }}
        * {{Only used by fractions in Polish.}}

        -

        In the Tech Preview, feedback from users and implementers is desired -about whether to relax the requirement that such a “fallback -variant” appear in every message, versus the potential for a -message to fail at runtime because no matching variant -is available.

        -
        -

        The number of keys in each variant MUST equal the -number of selectors.

        -

        Each key corresponds to a selector by its position -in the variant.

        -
        -

        For example, in this message:

        -
        .input {$one :number}
        -.input {$two :number}
        -.input {$three :number}
        -.match $one $two $three
        -1 2 3 {{ ... }}
        -

        The first key 1 corresponds to the first -selector ($one), the second key -2 to the second selector ($two), and -the third key 3 to the third selector -($three).

        -
        -

        To determine which variant best matches a given set of -inputs, each selector is used in turn to order and filter the -list of variants.

        -

        Each variant with a key that does not match its -corresponding selector is omitted from the list of -variants. The remaining variants are sorted according -to the selector’s key-ordering preference. Earlier -selectors in the matcher’s list of selectors -have a higher priority than later ones.

        -

        When all of the selectors have been processed, the -earliest-sorted variant in the remaining list of -variants is selected.

        -

        This selection method is defined in more detail below. An -implementation MAY use any pattern selection method, as long as its -observable behavior matches the results of the method defined here.

        -

        Resolve Selectors

        -

        First, resolve the values of each selector:

        -
          -
        1. Let res be a new empty list of resolved values -that support selection.
        2. -
        3. For each selector sel, in source order, -
            -
          1. Let rv be the resolved value of -sel.
          2. -
          3. If selection is supported for rv: -
              -
            1. Append rv as the last element of the list -res.
            2. -
          4. -
          5. Else: -
              -
            1. Let nomatch be a resolved value for which -selection always fails.
            2. -
            3. Append nomatch as the last element of the list -res.
            4. -
            5. Emit a Bad Selector error.
            6. -
          6. -
        4. -
        -

        The form of the resolved values is determined by each -implementation, along with the manner of determining their support for -selection.
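One possible shape for this step is sketched below. It assumes, purely for illustration, that a resolved value supports selection when it exposes a `selectKeys` method, as in the JavaScript examples elsewhere in this document. The `Selectable` interface, `NOMATCH`, and `resolveSelectors` are hypothetical names.

```typescript
interface Selectable {
  selectKeys(keys: string[]): string[];
}

// A resolved value for which selection always fails.
const NOMATCH: Selectable = { selectKeys: () => [] };

// Assumption: support for selection is detected by the presence of selectKeys.
function supportsSelection(rv: unknown): rv is Selectable {
  return (
    typeof rv === "object" &&
    rv !== null &&
    typeof (rv as { selectKeys?: unknown }).selectKeys === "function"
  );
}

function resolveSelectors(resolved: unknown[], errors: string[]): Selectable[] {
  const res: Selectable[] = [];
  for (const rv of resolved) {
    if (supportsSelection(rv)) {
      res.push(rv);
    } else {
      // Keep the list aligned with the selectors, but never match a key.
      res.push(NOMATCH);
      errors.push("Bad Selector");
    }
  }
  return res;
}
```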

        -

        Resolve Preferences

        -

        Next, using res, resolve the preferential order for all -message keys:

        -
          -
        1. Let pref be a new empty list of lists of strings.
        2. -
        3. For each index i in res: -
            -
          1. Let keys be a new empty list of strings.
          2. -
          3. For each variant var of the message: -
              -
            1. Let key be the var key at position -i.
            2. -
            3. If key is not the catch-all key '*': -
                -
              1. Assert that key is a literal.
              2. -
              3. Let ks be the resolved value of -key in Unicode Normalization Form C.
              4. -
              5. Append ks as the last element of the list -keys.
              6. -
            4. -
          4. -
          5. Let rv be the resolved value at index -i of res.
          6. -
          7. Let matches be the result of calling the method -MatchSelectorKeys(rv, keys)
          8. -
          9. Append matches as the last element of the list -pref.
          10. -
        4. -
        -

        The method MatchSelectorKeys is determined by the implementation. It -takes as arguments a resolved selector value rv -and a list of string keys keys, and returns a list of -string keys in preferential order. The returned list MUST contain only -unique elements of the input list keys. The returned list -MAY be empty. The most-preferred key is first, with each successive key -appearing in order by decreasing preference.

        -

        The resolved value of each key MUST be in Unicode -Normalization Form C (“NFC”), even if the literal for the -key is not.

        -

        If calling MatchSelectorKeys encounters any error, a Bad -Selector error is emitted and an empty list is returned.
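The preference step can be sketched as below, under some simplifying assumptions: `MatchSelectorKeys(rv, keys)` is modelled as a `selectKeys` method on the resolved selector value, variants are plain objects with a `keys` array, and the catch-all key is encoded as the string `"*"` (a real implementation would need to distinguish it from a quoted literal `|*|`). All names are hypothetical.

```typescript
interface Selectable {
  selectKeys(keys: string[]): string[];
}
type Variant = { keys: string[]; pattern: string };
const CATCH_ALL = "*";

function resolvePreferences(
  res: Selectable[],
  variants: Variant[],
  errors: string[],
): string[][] {
  const pref: string[][] = [];
  res.forEach((rv, i) => {
    // Collect the non-catch-all keys at position i, in NFC.
    const keys: string[] = [];
    for (const variant of variants) {
      const key = variant.keys[i];
      if (key !== CATCH_ALL) keys.push(key.normalize("NFC"));
    }
    // Ask the selector for its preferred keys; errors yield an empty list.
    let matches: string[];
    try {
      matches = rv.selectKeys(keys);
    } catch {
      errors.push("Bad Selector");
      matches = [];
    }
    pref.push(matches);
  });
  return pref;
}
```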

        -

        Filter Variants

        -

        Then, using the preferential key orders pref, filter the -list of variants to the ones that match with some -preference:

        -
          -
        1. Let vars be a new empty list of variants.
        2. -
        3. For each variant var of the message: -
            -
          1. For each index i in pref: -
              -
            1. Let key be the var key at position -i.
            2. -
            3. If key is the catch-all key '*': -
                -
              1. Continue the inner loop on pref.
              2. -
            4. -
            5. Assert that key is a literal.
            6. -
            7. Let ks be the resolved value of -key.
            8. -
            9. Let matches be the list of strings at index -i of pref.
            10. -
            11. If matches includes ks: -
                -
              1. Continue the inner loop on pref.
              2. -
            12. -
            13. Else: -
                -
              1. Continue the outer loop on message variants.
              2. -
            14. -
          2. -
          3. Append var as the last element of the list -vars.
          4. -
        4. -
        -
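The filtering loop above amounts to keeping each variant whose every key is either the catch-all or appears in the matching list at the same position. A compact sketch, using a hypothetical `Variant` shape and encoding the catch-all key as the string `"*"` for simplicity:

```typescript
type Variant = { keys: string[]; pattern: string };
const CATCH_ALL = "*";

function filterVariants(variants: Variant[], pref: string[][]): Variant[] {
  return variants.filter((variant) =>
    pref.every((matches, i) => {
      const key = variant.keys[i];
      // The catch-all always matches; other keys must be in the list.
      return key === CATCH_ALL || matches.indexOf(key.normalize("NFC")) !== -1;
    }),
  );
}
```

`Array.prototype.filter` preserves source order, which matters because the later sort is stable and uses source order to break ties.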

        Sort Variants

        -

        Finally, sort the list of variants vars and select the -pattern:

        -
          -
        1. Let sortable be a new empty list of (integer, -variant) tuples.
        2. -
        3. For each variant var of vars: -
            -
          1. Let tuple be a new tuple (-1, var).
          2. -
          3. Append tuple as the last element of the list -sortable.
          4. -
        4. -
        5. Let len be the integer count of items in -pref.
        6. -
        7. Let i be len - 1.
        8. -
        9. While i >= 0: -
            -
          1. Let matches be the list of strings at index -i of pref.
          2. -
          3. Let minpref be the integer count of items in -matches.
          4. -
          5. For each tuple tuple of sortable: -
              -
            1. Let matchpref be an integer with the value -minpref.
            2. -
            3. Let key be the tuple variant key -at position i.
            4. -
            5. If key is not the catch-all key '*': -
                -
              1. Assert that key is a literal.
              2. -
              3. Let ks be the resolved value of -key.
              4. -
              5. Let matchpref be the integer position of -ks in matches.
              6. -
            6. -
            7. Set the tuple integer value as -matchpref.
            8. -
          6. -
          7. Set sortable to be the result of calling the method -SortVariants(sortable).
          8. -
          9. Set i to be i - 1.
          10. -
        10. -
        11. Let var be the variant element of the first -element of sortable.
        12. -
        13. Select the pattern of var.
        14. -
        -

        SortVariants is a method whose single argument is a list -of (integer, variant) tuples. It returns a list of (integer, -variant) tuples. Any implementation of -SortVariants is acceptable as long as it satisfies the -following requirements:

        -
          -
        1. Let sortable be an arbitrary list of (integer, -variant) tuples.
        2. -
        3. Let sorted be SortVariants(sortable).
        4. -
        5. sorted is the result of sorting sortable -using the following comparator: -
            -
          1. (i1, v1) <= (i2, v2) if and only if -i1 <= i2.
          2. -
        6. -
        7. The sort is stable (pairs of tuples from sortable that -are equal in their first element have the same relative order in -sorted).
        8. -
        -
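The sorting steps above can be sketched as follows. The `Variant` and `Scored` shapes and the function names are assumptions made for this example, the catch-all key is encoded as the string `"*"`, and the sketch relies on `Array.prototype.sort` being stable, which ECMAScript guarantees from ES2019 onward.

```typescript
type Variant = { keys: string[]; pattern: string };
type Scored = [number, Variant];
const CATCH_ALL = "*";

// Stable sort by the integer element of each tuple.
function sortVariants(sortable: Scored[]): Scored[] {
  return sortable.slice().sort((a, b) => a[0] - b[0]);
}

function selectPattern(vars: Variant[], pref: string[][]): string {
  let sortable: Scored[] = vars.map((v): Scored => [-1, v]);
  // Process selectors from last to first so that earlier selectors
  // end up with the highest priority.
  for (let i = pref.length - 1; i >= 0; i--) {
    const matches = pref[i];
    const minpref = matches.length; // score for the catch-all key
    for (const tuple of sortable) {
      const key = tuple[1].keys[i];
      tuple[0] =
        key === CATCH_ALL ? minpref : matches.indexOf(key.normalize("NFC"));
    }
    sortable = sortVariants(sortable);
  }
  if (sortable.length === 0) throw new Error("no variant matched");
  return sortable[0][1].pattern;
}
```

The explicit empty-list check is a safeguard for this sketch only. In a valid message the variant whose keys are all catch-all always survives filtering.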

        Examples

        -

        This section is non-normative.

        -

        Example 1

        -

        Presuming a minimal implementation that supports only the :string function, which matches keys by string comparison, and a formatting context in which the variable reference $foo resolves to the string 'foo' and the variable reference $bar resolves to the string 'bar', pattern selection proceeds as follows for this message:

        -
        .input {$foo :string}
        -.input {$bar :string}
        -.match $foo $bar
        -bar bar {{All bar}}
        -foo foo {{All foo}}
        -* * {{Otherwise}}
        -
          -
        1. For the first selector:
          The value of the selector is resolved -to be 'foo'.
          The available keys « 'bar', -'foo' » are compared to 'foo',
          resulting -in a list « 'foo' » of matching keys.

        2. -
        3. For the second selector:
          The value of the selector is -resolved to be 'bar'.
          The available keys « -'bar', 'foo' » are compared to -'bar',
          resulting in a list « 'bar' » of -matching keys.

        4. -
        5. Creating the list vars of variants matching all -keys:
          The first variant bar bar is discarded as its -first key does not match the first selector.
          The second variant -foo foo is discarded as its second key does not match the -second selector.
          The catch-all keys of the third variant -* * always match, and this is added to -vars,
          resulting in a list « * * » of -variants.

        6. -
        7. As the list vars only has one entry, it does not -need to be sorted.
          The pattern Otherwise of the third -variant is selected.

        8. -
        -

        Example 2

        -

        Alternatively, with the same implementation and formatting context as -in Example 1, pattern selection would proceed as follows for this -message:

        -
        .input {$foo :string}
        -.input {$bar :string}
        -.match $foo $bar
        -* bar {{Any and bar}}
        -foo * {{Foo and any}}
        -foo bar {{Foo and bar}}
        -* * {{Otherwise}}
        -
          -
        1. For the first selector:
          The value of the selector is resolved -to be 'foo'.
          The available keys « 'foo' » -are compared to 'foo',
          resulting in a list « -'foo' » of matching keys.

        2. -
        3. For the second selector:
          The value of the selector is -resolved to be 'bar'.
          The available keys « -'bar' » are compared to 'bar',
          resulting -in a list « 'bar' » of matching keys.

        4. -
        5. Creating the list vars of variants matching all -keys:
          The keys of all variants either match each selector exactly, -or via the catch-all key,
          resulting in a list « * bar, -foo *, foo bar, * * » of -variants.

        6. -
        7. Sorting the variants:
          The list sortable is first -set with the variants in their source order and scores determined by the -second selector:
          « ( 0, * bar ), ( 1, -foo * ), ( 0, foo bar ), ( 1, * * -) »
          This is then sorted as:
          « ( 0, * bar ), ( 0, -foo bar ), ( 1, foo * ), ( 1, * * -) ».
          To sort according to the first selector, the scores are updated -to:
          « ( 1, * bar ), ( 0, foo bar ), ( 0, -foo * ), ( 1, * * ) ».
          This is then sorted -as:
          « ( 0, foo bar ), ( 0, foo * ), ( 1, -* bar ), ( 1, * * ) ».

        8. -
        9. The pattern Foo and bar of the most preferred -foo bar variant is selected.

        10. -
        -

        Example 3

        -

        A more-complex example is the matching found in selection APIs such -as ICU’s PluralFormat. Suppose that this API is represented -here by the function :number. This :number -function can match a given numeric value to a specific number -literal and also to a plural category -(zero, one, two, -few, many, other) according to -locale rules defined in CLDR.

        -

        Given a variable reference $count whose value resolves -to the number 1 and an en (English) locale, -the pattern selection proceeds as follows for this message:

        -
        .input {$count :number}
        -.match $count
        -one {{Category match for {$count}}}
        -1   {{Exact match for {$count}}}
        -*   {{Other match for {$count}}}
        -
          -
        1. For the selector:
          The value of the selector is resolved to an -implementation-defined value that is capable of performing English -plural category selection on the value 1.
          The available -keys « 'one', '1' » are passed to the -implementation’s MatchSelectorKeys method,
          resulting in a list « -'1', 'one' » of matching keys.

        2. -
        3. Creating the list vars of variants matching all -keys:
          The keys of all variants are included in the list of matching -keys, or use the catch-all key,
          resulting in a list « -one, 1, * » of variants.

        4. -
        5. Sorting the variants:
          The list sortable is first -set with the variants in their source order and scores determined by the -selector key order:
          « ( 1, one ), ( 0, 1 -), ( 2, * ) »
          This is then sorted as:
          « ( 0, -1 ), ( 1, one ), ( 2, * ) -»

        6. -
        7. The pattern Exact match for {$count} of the most -preferred 1 variant is selected.

        8. -
        -

        Formatting

        -

        After pattern selection, each text and -placeholder part of the selected pattern is resolved -and formatted.

        -

        Resolved values cannot always be formatted by a given -implementation. When such an error occurs during formatting, an -appropriate Message Function Error is emitted and a -fallback value is used for the placeholder with the -error.

        -

        Implementations MAY represent the result of formatting using -the most appropriate data type or structure. Some examples of these -include:

        -
          -
        • A single string concatenated from the parts of the resolved -pattern.
        • -
        • A string with associated attributes for portions of its text.
        • -
        • A flat sequence of objects corresponding to each resolved -value.
        • -
        • A hierarchical structure of objects that group spans of resolved -values, such as sequences delimited by markup-open and -markup-close placeholders.
        • -
        -

        Implementations SHOULD provide formatting result types that -match user needs, including situations that require further processing -of formatted messages. Implementations SHOULD encourage users to -consider a formatted localised string as an opaque data structure, -suitable only for presentation.

        -

        When formatting to a string, the default representation of all -markup MUST be an empty string. Implementations MAY offer -functionality for customizing this, such as by emitting XML-ish tags for -each markup.

        -

        Examples

        -

        This section is non-normative.

        -
          -
        1. An implementation might choose to return an interstitial object -so that the caller can “decorate” portions of the formatted value. In -ICU4J, the NumberFormatter class returns a -FormattedNumber object, so a pattern such as -This is my number {42 :number} might return the character -sequence This is my number followed by a -FormattedNumber object representing the value -42 in the current locale.

        2. -
        3. A formatter in a web browser could format a message as a DOM -fragment rather than as a representation of its HTML source.

        4. -
        -

        Formatting Fallback Values

        -

        If the resolved pattern includes any fallback -values and the formatting result is a concatenated string or a -sequence of strings, the string representation of each fallback -value MUST be the concatenation of a U+007B LEFT CURLY BRACKET -{, the fallback value as a string, and a U+007D -RIGHT CURLY BRACKET }.

        -
        -

        For example, a message that is not well-formed -would format to a string as {�}, unless a fallback string -is defined in the formatting context, in which case that string -would be used instead.

        -
        -

        Handling Bidirectional Text

        -

        Messages contain text. Any text can be bidirectional text. That is, the text can consist of a mixture of left-to-right and right-to-left spans of text. The display of bidirectional text is defined by the Unicode Bidirectional Algorithm [UAX9].

        -

        The directionality of the formatted message as a whole is -provided by the formatting context.

        [!NOTE] Keep in mind the difference between the formatted output of a message, which is the topic of this section, and the syntax of a message prior to formatting. The processing of a message depends on the logical sequence of Unicode code points, not on the presentation of the message. Affordances to allow users appropriate control over the appearance of the message’s syntax have been provided.

        When a message is formatted, placeholders are replaced with their formatted representation. Applying the Unicode Bidirectional Algorithm to the text of a formatted message (including its formatted parts) can result in unexpected or undesirable spillover effects. Applying bidi isolation to each affected formatted value helps avoid this spillover in a formatted message.

        Note that both the message and, separately, each placeholder need to have direction metadata for this to work. If an implementation supports formatting to something other than a string (such as a sequence of parts), the directionality of each formatted placeholder needs to be available to the caller.

        If a formatted expression itself contains spans with differing directionality, its formatter SHOULD perform any necessary processing, such as inserting controls or isolating such parts to ensure that the formatted value displays correctly in a plain text context.

        For example, an implementation could provide a `:currency` formatting function which inserts strongly directional characters, such as U+200F RIGHT-TO-LEFT MARK (RLM), U+200E LEFT-TO-RIGHT MARK (LRM), or U+061C ARABIC LETTER MARK (ALM), to coerce proper display of the sign and currency symbol next to a formatted number. An example of this is formatting the value `-1234.56` as the currency `AED` in the `ar-AE` locale. The formatted value appears like this:

        ‎-1,234.56 د.إ.‏

        The code point sequence for this string, as produced by the ICU4J `NumberFormat` function, includes U+200F U+200E at the start and U+200F at the end of the string. If it did not do this, the same string would appear like this instead:

        [Image: the same formatted value rendered without the directional marks]
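
        The mark sequence described above can be inspected programmatically. The Python sketch below wraps a value in U+200F U+200E ... U+200F and then lists the invisible format characters it contains. The helper names `wrap_with_marks` and `format_controls` are hypothetical, and the visible content of the string is an assumption reconstructed from the example; this is not a reproduction of ICU4J's behavior.

```python
import unicodedata

RLM = "\u200F"  # RIGHT-TO-LEFT MARK
LRM = "\u200E"  # LEFT-TO-RIGHT MARK


def wrap_with_marks(value: str) -> str:
    """Prefix RLM + LRM and suffix RLM, mirroring the sequence in the text."""
    return RLM + LRM + value + RLM


def format_controls(text: str):
    """Return (index, 'U+XXXX', name) for each format (Cf) character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# Assumed visible content: the number followed by the dirham abbreviation,
# written with escapes so the source itself has no bidirectional text.
visible = "-1,234.56 \u062F.\u0625."
marked = wrap_with_marks(visible)

for entry in format_controls(marked):
    print(entry)
```

        Listing the `Cf` characters this way is a convenient check when a formatted value looks right in one context and wrong in another.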

        A bidirectional isolation strategy is functionality in the formatter’s processing of a message that produces bidirectional output text that is ready for display.

        The Default Bidi Strategy is a bidirectional isolation strategy that uses isolating Unicode control characters around each placeholder’s formatted value. It is primarily intended for use in plain-text strings, where markup or other mechanisms are not available. Implementations MUST provide the Default Bidi Strategy as one of the bidirectional isolation strategies.

        Implementations MAY provide other bidirectional isolation strategies.

        Implementations MAY supply a bidirectional isolation strategy that performs no processing.

        The Default Bidi Strategy is defined as follows:

        1. Let `out` be the empty string.
        2. Let `msgdir` be the directionality of the whole message, one of « `'LTR'`, `'RTL'`, `'unknown'` ». These correspond to the message having left-to-right directionality, right-to-left directionality, and to the message’s directionality not being known.
        3. For each part `part` in pattern:
           1. If `part` is a plain literal (text) part, append `part` to `out`.
           2. Else:
              1. Assert `part` is a placeholder.
              2. Let `exp` be `part`.
              3. Let `fmt` be the formatted string representation of the resolved value of `exp`.
              4. Let `dir` be the directionality of `fmt`, one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`.
              5. Let the boolean value `isolate` be True if the `u:dir` option of the resolved value of `exp` has a value other than `'inherit'`, or False otherwise.
              6. If `dir` is `'LTR'`:
                 1. If `msgdir` is `'LTR'` and `isolate` is False, append `fmt` to `out`.
                 2. Else:
                    1. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`.
                    2. Append `fmt` to `out`.
                    3. Append U+2069 POP DIRECTIONAL ISOLATE to `out`.
              7. Else, if `dir` is `'RTL'`:
                 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out`.
                 2. Append `fmt` to `out`.
                 3. Append U+2069 POP DIRECTIONAL ISOLATE to `out`.
              8. Else:
                 1. Append U+2068 FIRST STRONG ISOLATE to `out`.
                 2. Append `fmt` to `out`.
                 3. Append U+2069 POP DIRECTIONAL ISOLATE to `out`.
        4. Emit `out` as the formatted output of the message.
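
        The steps of the Default Bidi Strategy listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the `Placeholder` class and its fields are hypothetical, the directionality of `fmt` is supplied by the caller rather than computed, and an absent `u:dir` option is assumed to behave like `'inherit'`.

```python
from dataclasses import dataclass
from typing import List, Union

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


@dataclass
class Placeholder:
    # Hypothetical stand-in for a placeholder's resolved value.
    fmt: str                    # formatted string representation
    direction: str = "unknown"  # `dir`: 'LTR', 'RTL', or 'unknown'
    u_dir: str = "inherit"      # value of the u:dir option (assumed default)


Part = Union[str, Placeholder]


def default_bidi_strategy(pattern: List[Part], msgdir: str = "unknown") -> str:
    """Format a pattern to a string, isolating placeholders as needed."""
    out = []
    for part in pattern:
        if isinstance(part, str):
            # Plain literal text is appended unchanged.
            out.append(part)
            continue
        isolate = part.u_dir != "inherit"
        if part.direction == "LTR":
            if msgdir == "LTR" and not isolate:
                out.append(part.fmt)
            else:
                out.append(LRI + part.fmt + PDI)
        elif part.direction == "RTL":
            out.append(RLI + part.fmt + PDI)
        else:
            out.append(FSI + part.fmt + PDI)
    return "".join(out)


# An LTR placeholder in an LTR message needs no isolation.
plain = default_bidi_strategy(["Hello, ", Placeholder("World", "LTR")], "LTR")
# An RTL placeholder is always wrapped in RLI ... PDI.
wrapped = default_bidi_strategy(["Hello, ", Placeholder("\u05e9\u05dc\u05d5\u05dd", "RTL")], "LTR")
print(plain)
print([f"U+{ord(c):04X}" for c in wrapped if c in (LRI, RLI, FSI, PDI)])
```

        Only the LTR-in-LTR case without an explicit `u:dir` emits the value bare; every other combination is wrapped in an isolate pair.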
        - - - From daa95adc012a45c40e2429192de63f3b7030c205 Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 15:54:10 -0800 Subject: [PATCH 10/12] Update spec/formatting.md Co-authored-by: Addison Phillips --- spec/formatting.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/spec/formatting.md b/spec/formatting.md index d1d2a320ee..755f8611ce 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -939,8 +939,10 @@ The _Default Bidi Strategy_ is defined as follows: 1. If `part` is a plain literal (text) part, append `part` to `out`. 1. Else: 1. Assert `part` is a _placeholder_. - 1. Let `exp` be `part`. - 1. Let `fmt` be the formatted string representation of the _resolved value_ of `exp`. + 1. If `part` is _markup_, append the _resolved value_ of `part` to `out`. + Note that this is normally the empty string. + 1. Else: + 1. Let `fmt` be the formatted string representation of the _resolved value_ of `part`. 1. Let `dir` be the directionality of `fmt`, one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. 1. Let the boolean value `isolate` be From 0b2f7a3d4ae0d76a603ad0a1fc911958a262656b Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Wed, 11 Dec 2024 15:57:33 -0800 Subject: [PATCH 11/12] Fix indentation --- spec/formatting.md | 40 ++++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/spec/formatting.md b/spec/formatting.md index 755f8611ce..3ba6c9ba15 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -943,27 +943,27 @@ The _Default Bidi Strategy_ is defined as follows: Note that this is normally the empty string. 1. Else: 1. Let `fmt` be the formatted string representation of the _resolved value_ of `part`. - 1. Let `dir` be the directionality of `fmt`, - one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. - 1. 
Let the boolean value `isolate` be - True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, + 1. Let `dir` be the directionality of `fmt`, + one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. + 1. Let the boolean value `isolate` be + True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, or False otherwise. - 1. If `dir` is `'LTR'`: - 1. If `msgdir` is `'LTR'` - and `isolate` is False, - append `fmt` to `out`. - 1. Else: - 1. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - 1. Else, if `dir` is `'RTL'`: - 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - 1. Else: - 1. Append U+2068 FIRST STRONG ISOLATE to `out`. - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + 1. If `dir` is `'LTR'`: + 1. If `msgdir` is `'LTR'` + and `isolate` is False, + append `fmt` to `out`. + 1. Else: + 1. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + 1. Else, if `dir` is `'RTL'`: + 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + 1. Else: + 1. Append U+2068 FIRST STRONG ISOLATE to `out`. + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. 1. Emit `out` as the formatted output of the message. 
From e9afb9d592c8983028977f3e3570f1a97222e1f6 Mon Sep 17 00:00:00 2001 From: Tim Chevalier Date: Fri, 13 Dec 2024 15:17:02 -0800 Subject: [PATCH 12/12] Apply suggestions from Eemeli --- spec/formatting.md | 49 +++++++++++++++++++++++----------------------- 1 file changed, 24 insertions(+), 25 deletions(-) diff --git a/spec/formatting.md b/spec/formatting.md index 3ba6c9ba15..fbab6e9c51 100644 --- a/spec/formatting.md +++ b/spec/formatting.md @@ -937,33 +937,32 @@ The _Default Bidi Strategy_ is defined as follows: right-to-left directionality, and to the message's directionality not being known. 1. For each part `part` in _pattern_: 1. If `part` is a plain literal (text) part, append `part` to `out`. + 1. Else if `part` is a _markup_ _placeholder_: + 1. Let `fmt` be the formatted string representation of the _resolved value_ of `part`. + Note that this is normally the empty string. + 1. Append `fmt` to `out`. 1. Else: - 1. Assert `part` is a _placeholder_. - 1. If `part` is _markup_, append the _resolved value_ of `part` to `out`. - Note that this is normally the empty string. - 1. Else: - 1. Let `fmt` be the formatted string representation of the _resolved value_ of `part`. - 1. Let `dir` be the directionality of `fmt`, - one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. - 1. Let the boolean value `isolate` be - True if the `u:dir` _option_ of the _resolved value_ of `exp` has a value other than `'inherit'`, + 1. Let `fmt` be the formatted string representation of the _resolved value_ of `part`. + 1. Let `dir` be the directionality of `fmt`, + one of « `'LTR'`, `'RTL'`, `'unknown'` », with the same meanings as for `msgdir`. + 1. Let the boolean value `isolate` be + True if the `u:dir` _option_ of the _resolved value_ of `part` has a value other than `'inherit'`, or False otherwise. - 1. If `dir` is `'LTR'`: - 1. If `msgdir` is `'LTR'` - and `isolate` is False, - append `fmt` to `out`. - 1. Else: - 1. 
Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - 1. Else, if `dir` is `'RTL'`: - 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. - 1. Else: - 1. Append U+2068 FIRST STRONG ISOLATE to `out`. - 1. Append `fmt` to `out`. - 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + 1. If `dir` is `'LTR'`: + 1. If `msgdir` is `'LTR'` and `isolate` is False: + 1. Append `fmt` to `out`. + 1. Else: + 1. Append U+2066 LEFT-TO-RIGHT ISOLATE to `out`. + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + 1. Else if `dir` is `'RTL'`: + 1. Append U+2067 RIGHT-TO-LEFT ISOLATE to `out.` + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. + 1. Else: + 1. Append U+2068 FIRST STRONG ISOLATE to `out`. + 1. Append `fmt` to `out`. + 1. Append U+2069 POP DIRECTIONAL ISOLATE to `out`. 1. Emit `out` as the formatted output of the message.