Commit 1b12f58

Merge branch 'main' into test-func
2 parents 00a981a + ed6b1da commit 1b12f58

27 files changed, +2865 -684 lines changed

docs/tech-preview-blog-post.md

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
# Blog Post for Technical Preview

Today, Unicode announced the Technical Preview of MessageFormat 2,
a new standard for creating and managing user interface strings.
These messages can dynamically include data values formatted
(using information in the Common Locale Data Repository [CLDR])
according to the needs of the language and culture of the end user.
Such messages can be adjusted to meet the linguistic needs of each
language and are designed to be translated easily and efficiently.

Previously, software developers had to choose between many different
APIs and templating languages to build user interface strings.
These solutions did not always provide for the features of different
human languages. Support was limited to specific platforms,
and these formats were not widely supported by translation tools,
making translation and adaptation to specific cultures costly
and time-consuming.
Most significantly, message formatting was limited to a small
number of built-in formats.

One of the challenges in adapting software to work for
users with different languages and cultures is the need for **_dynamic messages_**.
Whenever a user interface needs to present data as part of a larger message,
that data needs to be formatted.
In many languages, including English, the message itself needs to be altered
to make it grammatically correct.

For example, a message in English might read:

> Your item had **1,023** views on **April 8, 2024**.

The equivalent message in French might read:

> Votre article a eu **1 023** vues le **8 avril 2024**.

Or Japanese:

> あなたのアイテムは **2024 年 4 月 8 日**に **1,023** 回閲覧されました。
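
The locale-specific number and date formats in the examples above can be reproduced with the standard ECMAScript `Intl` API. The following is a minimal sketch and not part of MessageFormat 2 itself: the helper name `formatViewData` is made up for illustration, and the exact strings produced for each locale depend on the CLDR data bundled with the JavaScript runtime.

```javascript
// Hypothetical helper: formats a view count and a date for one locale
// using the built-in Intl formatters (which draw on CLDR data).
function formatViewData(locale, count, date) {
  const number = new Intl.NumberFormat(locale).format(count);
  const day = new Intl.DateTimeFormat(locale, {
    dateStyle: "long",
    timeZone: "UTC", // fixed zone so the date does not shift between machines
  }).format(date);
  return { number, day };
}

// Months are zero-based in Date.UTC: 3 means April.
const viewDate = new Date(Date.UTC(2024, 3, 8));

for (const locale of ["en-US", "fr-FR", "ja-JP"]) {
  const { number, day } = formatViewData(locale, 1023, viewDate);
  console.log(`${locale}: ${number} / ${day}`);
}
```

Note that this only formats the data values. Placing them into a grammatically correct sentence is the part that a message format has to handle.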

But even in English, there are grammatical variations required:

> Your item had _no views_...
>
> Your item had 1 _view_...
>
> Your item had 1,043 _views_...

Once messages have been created, they need to be translated into the various
languages and adapted for the various cultures around the world.
Previously, there was no widely adopted standard,
and existing formats provided only rudimentary support for managing
the variations needed by other languages.
Thus, it could be difficult for translators to do their work effectively.

For example, the same message shown above needs a different set of variations
in order to support Polish:

> Twój przedmiot nie _ma_ żadnych _wyświetleń_.
>
> Twój przedmiot _miał_ 1 _wyświetlenie_.
>
> Twój przedmiot _miał_ 2 _wyświetlenia_.
>
> Twój przedmiot _ma_ 5 _wyświetleń_.
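
The difference between the English and Polish variations comes from the plural categories that CLDR defines for each language. As a rough illustration (not taken from the MessageFormat 2 specification), the standard `Intl.PluralRules` API reports which category a number falls into; the helper name `pluralCategory` is an assumption made for this sketch.

```javascript
// Ask the runtime's CLDR plural rules which category a number belongs to.
function pluralCategory(locale, n) {
  return new Intl.PluralRules(locale).select(n);
}

// English distinguishes only "one" and "other"; Polish also uses
// "few" and "many", which is why it needs more message variants.
for (const n of [1, 2, 5]) {
  console.log(`${n}: en=${pluralCategory("en", n)} pl=${pluralCategory("pl", n)}`);
}
```

The "no views" wording is not a plural category in either language. It is a special case for the exact value 0, which a message has to request explicitly.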


MessageFormat 2 makes it easy to write messages like this
without developers needing to know about such language variation.
In fact, developers don't need to learn about any of the language
and formatting variations needed by languages other than their own,
nor do they need to write code that manipulates formatting.

MessageFormat 2 messages can be simple strings:
```
Hello, world!
```

A message can also include _placeholders_ that are replaced by user-provided values:
```
Hello {$user}!
```
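
To make the idea of a placeholder concrete, here is a deliberately simplified sketch of the substitution step. It is a toy and not a MessageFormat 2 implementation: the function name `fillPlaceholders` is hypothetical, it recognizes only bare `{$name}` placeholders, and it ignores functions, options, escapes, and the rest of the syntax.

```javascript
// Toy substitution for messages that contain only bare {$name} placeholders.
// Unknown names are left untouched (a choice made for this sketch only).
function fillPlaceholders(message, values) {
  return message.replace(/\{\$([A-Za-z_][\w-]*)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match
  );
}

console.log(fillPlaceholders("Hello {$user}!", { user: "Alice" }));
// prints "Hello Alice!"
```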

The user-provided values can be transformed or formatted using functions:
```
Today is {$date :date}
Today is {$date :datetime weekday=long}.
```
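
One plausible way for a JavaScript implementation to honor an option such as `weekday=long` is to forward it to `Intl.DateTimeFormat`. The sketch below assumes exactly that mapping, which this post does not specify; the helper name `formatDateTime` is invented for illustration.

```javascript
// Assumption: function options (for example weekday=long) are passed
// through unchanged as Intl.DateTimeFormat options.
function formatDateTime(locale, value, options) {
  return new Intl.DateTimeFormat(locale, {
    timeZone: "UTC", // fixed zone keeps the example reproducible
    ...options,
  }).format(value);
}

const today = new Date(Date.UTC(2024, 3, 8)); // April 8, 2024
console.log("Today is " + formatDateTime("en-US", today, { weekday: "long" }) + ".");
// prints "Today is Monday."
```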

Messages can use a function (called a _selector_) to choose between
different versions of a message.
These allow messages to be tailored to the grammatical (or other) requirements of
a given language:
```
.match {$count :integer}
0 {{You have no views.}}
one {{You have {$count} view.}}
* {{You have {$count} views.}}
```
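
The selection in the `.match` example can be approximated in a few lines of JavaScript. This is a toy model under stated assumptions, not the algorithm from the specification: it assumes an exact numeric key is preferred over a plural category, which in turn is preferred over the catch-all `*`, and the names `variants` and `selectVariant` are hypothetical.

```javascript
// Toy model of the .match example: each variant is a [key, pattern] pair.
const variants = [
  ["0", "You have no views."],
  ["one", "You have {$count} view."],
  ["*", "You have {$count} views."],
];

// Assumed preference order: exact numeric key, then plural category, then "*".
function selectVariant(locale, count, variantList) {
  const category = new Intl.PluralRules(locale).select(count);
  const find = (key) => variantList.find(([k]) => k === key);
  const chosen = find(String(count)) ?? find(category) ?? find("*");
  const formatted = new Intl.NumberFormat(locale).format(count);
  return chosen[1].replace("{$count}", formatted);
}

for (const count of [0, 1, 1043]) {
  console.log(selectVariant("en-US", count, variants));
}
```

A translator working in Polish would supply additional keys such as `few` and `many` in the same message, without any change to the calling code.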

Unlike the previous version of MessageFormat, MessageFormat 2 is designed for
extension by implementers and even end users.
This means that new functionality can be added to messages without modifying
either existing messages or, in some cases, even the core library containing the
MessageFormat 2 code.

MessageFormat 2 provides a rich and extensible set of functionality
to permit the creation of natural-sounding, grammatically correct
messages, while enabling rapid, accurate translation
and extension using new and improved internationalization functionality
in any computing system.

The Technical Preview is available for comment.
The stable version of this specification is expected to be part of the
Fall 2024 release of CLDR (v46).
Implementations are available in ICU4J (Java) and ICU4C (C/C++)
as well as JavaScript.
Feedback about implementation experience, syntax, functionality,
or other parts of the specification is welcome!
See the end of this article for details on participation and how to comment on this work.

MessageFormat 2 consists of multiple parts:
a syntax, including a formal grammar, for writing messages;
a data model for representing messages (including those ported from other APIs);
a registry of required functions;
a function description mechanism for use by implementations and tools;
and a test suite.

docs/tools/linkify.js

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
// Work in progress: tooling to linkify the HTML produced from
// the MessageFormat 2 markdown.
// This has been tested on the tr35-messageformat.html file
// but not implemented in LDML45.
function linkify() {
    // Collect the ids of all defined terms (also assigns ids to <dfn> elements).
    const terms = findTerms();
    const missing = new Set();
    // Every emphasized phrase is a candidate reference to a defined term.
    const links = document.querySelectorAll("em");
    links.forEach((item) => {
        const target = generateId(item.textContent);
        if (terms.has(target)) {
            // Wrap the innermost element so nested markup stays in place.
            const el = item.lastElementChild ?? item;
            el.innerHTML = `<a href="#${target}">${item.textContent}</a>`;
        } else {
            missing.add(target);
        }
    });
    // Report missing terms
    // (leave out sort if you want it in file order).
    Array.from(missing).sort().forEach((item) => {
        console.log(item);
    });
}

// Assign an id to every <dfn> element and return the set of ids found.
function findTerms() {
    const terms = new Set();
    document.querySelectorAll("dfn").forEach((item) => {
        const term = generateId(item.textContent);
        // Guard against duplicates.
        if (terms.has(term)) {
            console.log("Duplicate term: " + term);
        }
        terms.add(term);
        item.setAttribute("id", term);
    });
    return terms;
}

// Turn a term into an id: lowercase, hyphenated, and naively singularized.
function generateId(term) {
    const id = term.toLowerCase().replaceAll(" ", "-");
    if (id.endsWith("rategies")) {
        // Found in the bidi isolation strategies: "...ies" becomes "...y".
        return id.slice(0, -3) + "y";
    } else if (id.endsWith("s") && id !== "status") {
        // Regular English plurals: drop the trailing "s".
        return id.slice(0, -1);
    }
    return id;
}
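
To see what the id heuristic produces, the sketch below repeats `generateId` from the file above (copied only so the example is self-contained) and runs it on a few sample terms. The sample terms are chosen for illustration and are not claimed to appear in the specification.

```javascript
// Copy of generateId from linkify.js, repeated so this example runs on its own.
function generateId(term) {
    const id = term.toLowerCase().replaceAll(" ", "-");
    if (id.endsWith("rategies")) {
        return id.slice(0, -3) + "y";
    } else if (id.endsWith("s") && id !== "status") {
        return id.slice(0, -1);
    }
    return id;
}

console.log(generateId("Bidi Isolation Strategies")); // "bidi-isolation-strategy"
console.log(generateId("Variants"));                  // "variant"
console.log(generateId("status"));                    // "status"
console.log(generateId("class"));                     // "clas"
```

The last case shows a limit of the plural heuristic: any term ending in "s" other than "status" loses its final letter. Because both the `dfn` and `em` text pass through the same function, the generated links still match, but the ids themselves may look truncated.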
