@@ -7,10 +7,12 @@ Status: **Proposed**
77 <dl>
88 <dt>Contributors</dt>
99 <dd>@aphillips</dd>
10+ <dd>@eemeli</dd>
1011 <dt>First proposed</dt>
1112 <dd>2024-03-27</dd>
1213 <dt>Pull Requests</dt>
13- <dd>#754</dd>
14+ <dd><a href="https://github.com/unicode-org/message-format-wg/pull/754">#754</a></dd>
15+ <dd><a href="https://github.com/unicode-org/message-format-wg/pull/781">#781</a></dd>
1416 </dl>
1517</details>
1618
@@ -346,6 +348,66 @@ the results or debug what is wrong with their messages.
346348By contrast, if users insert too many or the wrong controls using the recommended design,
347349the _message_ would still be functional and would emit no undesired characters.
348350
351+
352+ ### Loose isolation
353+
354+ Apply bidi isolates more loosely than in the proposed solution.
355+ The main differences from the proposed solution are:
356+ 1. The open/close isolate characters are not syntactically required to be paired.
357+ This avoids introducing parse errors for missing or required invisible characters,
358+ which would lead to bad user experiences.
359+ 2. Rather than patching the `name` rule with an optional trailing LRM/RLM/ALM,
360+ allow for its proper isolation.
361+
362+ Quoted patterns, quoted literals, and names may be isolated by LRI/RLI/FSI...PDI.
363+ For names and quoted literals, the isolate characters are outside the body of the token,
364+ but for quoted patterns, the isolates sit between the two braces of the opening `{{` and of the closing `}}`.
365+ This avoids adding a lookahead requirement for detecting a `complex-message` start,
366+ and differentiates a `quoted-pattern` from a `quoted` `key` in a `variant`.
367+
368+ Expressions and markup may be isolated by LRI...PDI immediately within the `{` and `}`.
369+
370+ An LRI is allowed immediately after a newline outside patterns and within expressions.
371+ This is intended to allow left-to-right representation for "code"
372+ even if it contains a newline followed by content
373+ that could otherwise prompt the paragraph direction to be detected as right-to-left.
374+
375+ ```abnf
376+ name = [open-isolate] name-start *name-char [close-isolate]
377+ quoted = [open-isolate] "|" *(quoted-char / quoted-escape) "|" [close-isolate]
378+ quoted-pattern = "{" [open-isolate] "{" pattern "}" [close-isolate] "}"
379+
380+ literal-expression = "{" [LRI] [s] literal [s annotation] *(s attribute) [s] [close-isolate] "}"
381+ variable-expression = "{" [LRI] [s] variable [s annotation] *(s attribute) [s] [close-isolate] "}"
382+ annotation-expression = "{" [LRI] [s] annotation *(s attribute) [s] [close-isolate] "}"
383+
384+ markup = "{" [LRI] [s] "#" identifier *(s option) *(s attribute) [s] ["/"] [close-isolate] "}"
385+ / "{" [LRI] [s] "/" identifier *(s option) *(s attribute) [s] [close-isolate] "}"
386+
387+ s = 1*( SP / HTAB / CR / LF [LRI] / %x3000 )
388+ LRI = %x2066
389+ open-isolate = %x2066-2068
390+ close-isolate = %x2069
391+ ```
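
As a rough illustration of the `s` rule above, the following Python sketch matches whitespace in which an LRI is accepted only directly after a line feed. The names `S_RE` and `match_s` are hypothetical and not part of the specification; this is one possible reading of the rule, not a reference implementation.

```python
import re

# LRI (U+2066) as defined in the ABNF above.
LRI = "\u2066"

# s = 1*( SP / HTAB / CR / LF [LRI] / %x3000 )
# Assumption: a regular expression is an adequate stand-in for the ABNF here.
S_RE = re.compile(rf"(?:[ \t\r\u3000]|\n{LRI}?)+")


def match_s(source: str, pos: int = 0):
    """Return the index just past the whitespace run at pos, or None."""
    m = S_RE.match(source, pos)
    return m.end() if m else None
```

Under this reading, an LRI that follows a space rather than a line feed is not consumed as part of `s`.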
392+
393+ Isolating rather than marking `name` helps ensure
394+ that its directionality does not spill over to adjoining syntax.
395+ For example, this allows for the proper rendering of the expression
396+ ```
397+ {:אחת:שתיים}
398+ ```
399+ where "אחת" is the `namespace` of the `identifier`.
400+ Without `name` isolation, this would render as
401+ ```
402+ {:אחת:שתיים}
403+ ```
404+
405+ In the syntax, it's much simpler to include the changes to `name` in that rule,
406+ rather than patching every place where `name` is used.
407+ Either way, the parsed value of the name should not include the open/close isolates,
408+ just as they're not included in the parsed values of quoted literals or quoted patterns.
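
A minimal sketch of how a parser might accept loosely isolated names while keeping the isolates out of the parsed value. It assumes a simplified character set for `name-start` and `name-char` (the real rules are defined elsewhere in the syntax), and the names `NAME_RE` and `parse_name` are hypothetical.

```python
import re

# open-isolate = %x2066-2068, close-isolate = %x2069 (from the ABNF above).
OPEN_ISOLATE = "\u2066-\u2068"
CLOSE_ISOLATE = "\u2069"

# name = [open-isolate] name-start *name-char [close-isolate]
# Assumption: name-start is approximated as a letter or "_", and name-char
# as a word character, "." or "-". This is NOT the full rule.
NAME_RE = re.compile(
    rf"[{OPEN_ISOLATE}]?"
    r"(?P<body>[^\W\d][\w.\-]*)"
    rf"{CLOSE_ISOLATE}?"
)


def parse_name(source: str, pos: int = 0):
    """Return (parsed name without isolates, end index), or None."""
    m = NAME_RE.match(source, pos)
    if m is None:
        return None
    # Both isolates are optional and independent, so unpaired ones still parse.
    return m.group("body"), m.end()
```

Because the open and close isolates are matched independently, a name with only one of them still parses, which is the behavior the loose pairing is meant to allow.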
409+
410+
349411### Deeper Syntax Changes
350412We could alter the syntax to make it more "bidi robust",
351413such as by using strongly directional characters instead of neutral ones.