Merged
Changes from 1 commit
74 changes: 28 additions & 46 deletions exploration/bidi-usability.md
@@ -7,10 +7,12 @@ Status: **Proposed**
<dl>
<dt>Contributors</dt>
<dd>@aphillips</dd>
<dd>@eemeli</dd>
<dt>First proposed</dt>
<dd>2024-03-27</dd>
<dt>Pull Requests</dt>
<dd>#754</dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/754">#754</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/781">#781</a></dd>
</dl>
</details>

@@ -247,7 +249,7 @@ This is not the disadvantage to right-to-left languages that it might first appe
correct order and without spillover effects

Permit isolating bidi controls to be used on the **outside** of the following:
- unquoted literals
- names
- quoted literals
- quoted patterns

@@ -256,13 +258,15 @@ the user to set the base direction of a _literal_ or _pattern_ according to its
actual contents.
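
As a rough illustration of what "on the outside" means for a parser, the sketch below peels an optional isolate pair off a token before the token is interpreted. The helper name `unwrap_isolates` and the direction labels are hypothetical and not part of the proposal; the code points are the Unicode isolate controls U+2066 (LRI), U+2067 (RLI), U+2068 (FSI), and U+2069 (PDI).

```python
from typing import Optional, Tuple

# Opening isolate controls mapped to an illustrative direction label.
OPEN_ISOLATES = {"\u2066": "ltr", "\u2067": "rtl", "\u2068": "auto"}
CLOSE_ISOLATE = "\u2069"  # PDI closes any of the three


def unwrap_isolates(token: str) -> Tuple[Optional[str], str]:
    """Return (direction, inner) if token is wrapped in a matching
    open/close isolate pair; otherwise return (None, token) unchanged."""
    if len(token) >= 2 and token[0] in OPEN_ISOLATES and token[-1] == CLOSE_ISOLATE:
        return OPEN_ISOLATES[token[0]], token[1:-1]
    return None, token
```

Under this reading, an unpaired opening isolate is left in place so that a stricter parser can reject it; whether to reject or tolerate it is a design choice the sketch does not make.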

This would change the ABNF as follows:
(Notice that this change includes a production `bidi` described further down
in this document)
```abnf
literal = ( open-isolate (quoted / (unquoted [bidi])) close-isolate)
/ (quoted / (unquoted [bidi]))
quoted-pattern = ( open-isolate "{{" pattern "}}" close-isolate)
/ ("{{" pattern "}}")
name = (open-isolate name-body close-isolate) / name-body
; Review comment (Member):
; This is a problem because name is used to build a variety of other
; constructs (variable, reserved-keyword, identifier, etc.). This change puts
; the isolates inside these constructs, e.g. $\u2066name\u2069 rather than on
; the outside.
;
; This will make it harder for implementations, since they can't take the
; parsed token and compare it immediately. They have to stop to remove
; isolates. My original design avoided this problem by making the isolates
; not parse into names/identifiers/tokens.

name-body = name-start *name-char

quoted = (open-isolate quoted-body close-isolate) / quoted-body
quoted-body = "|" *(quoted-char / quoted-escape) "|"

quoted-pattern = (open-isolate "{{" pattern "}}" close-isolate)
/ "{{" pattern "}}"

open-isolate = %x2066-2068
close-isolate = %x2069
@@ -283,50 +287,28 @@ or _markup_ must be laid out left-to-right.
_Literal_ values can be right-to-left isolated within that or use strongly
directional marks to ensure correct display.

An LTR isolate is also allowed immediately after a newline outside patterns and within expressions.
This is intended to allow left-to-right representation for "code"
even if it contains a newline followed by content
that could otherwise prompt the paragraph direction to be detected as right-to-left.
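
One way to picture the whitespace rule is as a regular expression over the `s` production proposed in this change (`SP`, `HTAB`, `CR`, `LF` optionally followed by `LRI`, or U+3000). This is a minimal sketch, and the name `WHITESPACE` is an assumption made for illustration.

```python
import re

# One or more of: space, tab, carriage return, ideographic space (U+3000),
# or a line feed optionally followed by an LRI (U+2066).
WHITESPACE = re.compile("(?:[ \t\r\u3000]|\n\u2066?)+")


def whitespace_length(text: str, pos: int = 0) -> int:
    """Return how many characters of whitespace start at pos (0 if none)."""
    match = WHITESPACE.match(text, pos)
    return match.end() - pos if match else 0
```

Note that the LRI is consumed only when it directly follows a line feed, so an LRI after a plain space is not treated as whitespace by this pattern.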

This would change the ABNF as follows (assuming the above changes are also incorporated):
```abnf
expression = "{" LRI (literal-expression / variable-expression / annotation-expression) close-isolate "}"
/ "{" (literal-expression / variable-expression / annotation-expression) "}"
literal-expression = [s] literal [s annotation] *(s attribute) [s]
variable-expression = [s] variable [s annotation] *(s attribute) [s]
annotation-expression = [s] annotation *(s attribute) [s]
markup = "{" [s] "#" identifier *(s option) *(s attribute) [s] ["/"] "}" ; open and standalone
/ "{" [s] "/" identifier *(s option) *(s attribute) [s] "}" ; close
/ "{" LRI [s] "#" identifier *(s option) *(s attribute) [s] ["/"] close-isolate "}" ; open and standalone
/ "{" LRI [s] "/" identifier *(s option) *(s attribute) [s] close-isolate "}" ; close
LRI = %x2066
```

Permit the use of LRM, RLM, or ALM strongly directional marks immediately following any of the items that
**end** with the `name` production in the ABNF.
This includes _identifiers_ found in the names of
_functions_
and _options_,
plus the names of _variables_,
as well as the contents of _unquoted_ literals.
literal-expression = "{" [LRI] [s] literal [s annotation] *(s attribute) [s] [close-isolate] "}"
variable-expression = "{" [LRI] [s] variable [s annotation] *(s attribute) [s] [close-isolate] "}"
annotation-expression = "{" [LRI] [s] annotation *(s attribute) [s] [close-isolate] "}"

> [!NOTE]
> Notice that _unquoted_ literals can also be surrounded by bidi isolates
> using the previous syntax modification just above.

> [!NOTE]
> Notice that `reserved-annotation` is not in the ABNF changes because it already
> permits the marks in question.
> Any syntax derived from `reserved-annotation`
> (i.e. when unreserving a new statement in a future addition)
> would need to handle bidi explicitly using the model already established here.
markup = "{" [LRI] [s] "#" identifier *(s option) *(s attribute) [s] ["/"] [close-isolate] "}"
/ "{" [LRI] [s] "/" identifier *(s option) *(s attribute) [s] [close-isolate] "}"

```abnf
variable-expression = "{" [s] variable [bidi] [s annotation] *(s attribute) [s] "}"
function = ":" identifier [bidi] *(s option)
option = identifier [bidi] [s] "=" [s] (literal / (variable [bidi]))
attribute = "@" identifier [bidi] [[s] "=" [s] (literal / (variable [bidi]))]
markup = "{" [s] "#" identifier [bidi] *(s option) *(s attribute) [s] ["/"] "}" ; open and standalone
/ "{" [s] "/" identifier [bidi] *(s option) *(s attribute) [s] "}" ; close
identifier = [(namespace [bidi] ":")] name
bidi = [ %x200E-200F / %x061C ]
s = 1*( SP / HTAB / CR / LF [LRI] / %x3000 )
LRI = %x2066
```

When an `LRI` is used at the start of an expression or markup or after a newline,
it SHOULD be paired with a corresponding `close-isolate` at its end,
unless subsequent whitespace includes a newline before that.
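
The pairing recommendation could be checked by a linter along the following lines. This is a sketch under a simplified reading of the rule: the function name `lri_is_paired` is hypothetical, `body` is assumed to be the text between the `{` and `}` of a single expression or markup, and any later newline is taken to satisfy the exception, since the Unicode Bidirectional Algorithm ends an unclosed isolate at the paragraph boundary.

```python
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE (close-isolate)


def lri_is_paired(body: str) -> bool:
    """Check the SHOULD-level pairing rule for one expression or markup body.

    Returns True when there is no leading LRI, when the body ends with the
    closing isolate, or when a newline appears after the LRI (simplified
    reading of the "unless subsequent whitespace includes a newline" clause).
    """
    if not body.startswith(LRI):
        return True  # nothing to pair
    if body.endswith(PDI):
        return True  # explicitly closed
    return "\n" in body[1:]  # assumed: a newline ends the isolate implicitly
```

Because the rule is a SHOULD, a tool built on this would more naturally emit a warning than a syntax error.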

## Alternatives Considered

_What other solutions are available?_