
Commit a2b2556

ICU-23130 Read new RBNF rule data structure from CLDR-8909
1 parent 0b253b8 commit a2b2556

23 files changed: +280 -354 lines changed


docs/processes/cldr-icu.md

Lines changed: 16 additions & 28 deletions
@@ -166,9 +166,10 @@ export CLDR_DATA_DIR=$HOME/cldr-staging/production
 
 1c. ICU variables
 ```sh
-export ICU4C_DIR=$HOME/icu-myfork/icu4c
-export ICU4J_ROOT=$HOME/icu-myfork/icu4j
-export TOOLS_ROOT=$HOME/icu-myfork/tools
+export ICU_DIR=$HOME/icu-myfork
+export ICU4C_DIR=$ICU_DIR/icu4c
+export ICU4J_ROOT=$ICU_DIR/icu4j
+export TOOLS_ROOT=$ICU_DIR/tools
 ```
 
 1d. Directory for logs/notes (create if does not exist)
@@ -196,13 +197,13 @@ make clean
 make check 2>&1 | tee $NOTES/icu4c-oldData-makeCheck.txt
 ```
 
-2b. Now with ICU4J, build and test without new data first, to verify that
-there are no pre-existing errors (or at least to have the pre-existing errors
-as a base for comparison):
+2b. Build, test, and install ICU4J without new data first. This is to verify that
+there are no pre-existing errors, or at least to have the pre-existing errors
+as a base for comparison:
 ```sh
 cd $ICU4J_ROOT
 mvn clean
-mvn verify 2>&1 | tee $NOTES/icu4j-oldData-mvnCheck.txt
+mvn install 2>&1 | tee $NOTES/icu4j-oldData-mvnCheck.txt
 ```
 
 ## 3 Make pre-adjustments
@@ -272,30 +273,17 @@ already present in the ICU4C sources. This process uses the `LdmlConverter` in
 `$ICU_DIR/tools/cldr/cldr-to-icu/`; see `$ICU_DIR/tools/cldr/cldr-to-icu/README.md`.
 
 * This process will take several minutes, during most of which there will be no log
-output (so do not assume nothing is happening). Keep a log so you can investigate
+output (so do not assume that nothing is happening). Keep a log so you can investigate
 anything that looks suspicious.
-* The conversion tool
-will automatically run its own "clean" step to delete files it cannot determine to
-be ones that it would generate, except for pasts listed in `<retain>` elements such as
-`coll/de__PHONEBOOK.txt`, `coll/de_.txt`, etc.
+* The conversion tool will automatically run its own "clean" step to delete files it
+cannot determine to be ones that it would generate, except for pasts listed in
+`<retain>` elements such as `coll/de__PHONEBOOK.txt`, `coll/de_.txt`, etc.
 * Before running the tool to regenerate the data, make any necessary changes to the
 `config.xml` file, such as adding new locales etc.
-* **Temporary note 2025-04-07:** There are some steps mentioned in `$ICU_DIR/tools/cldr/cldr-to-icu/README.md`
-that were not mentioned in these instructions but seem to be necessary for the next step to
-work properly, these are:
-* Build ICU4J:
-```
-cd "$ICU_DIR"
-mvn clean install -f icu4j -DskipTests -DskipITs
-```
-* Build the conversion tool:
-```
-cd "$ICU_DIR/tools/cldr/cldr-to-icu/"
-mvn clean package -DskipTests -DskipITs
-```
-
-```sh
-cd $ICU_DIR/tools/cldr/cldr-to-icu
+
+```sh
+cd $TOOLS_ROOT/cldr/cldr-to-icu
+mvn clean package -DskipTests -DskipITs
 java -jar target/cldr-to-icu-1.0-SNAPSHOT-jar-with-dependencies.jar --cldrDataDir="$CLDR_TMP_DIR/production" | tee $NOTES/cldr-newData-builddataLog.txt
 ```
 
icu4c/source/data/dtd/cldr/common/dtd/ldml.dtd

Lines changed: 10 additions & 1 deletion
@@ -3136,12 +3136,21 @@ CLDR data files are interpreted according to the LDML specification (http://unic
 
 <!ELEMENT rbnf ( alias | ( rulesetGrouping*, special* ) ) >
 
-<!ELEMENT rulesetGrouping ( alias | ( ruleset*, special* ) ) >
+<!ELEMENT rulesetGrouping ( alias | ( rbnfRules?, ruleset*, special* ) ) >
 <!ATTLIST rulesetGrouping type NMTOKEN #REQUIRED >
 <!--@MATCH:literal/NumberingSystemRules, OrdinalRules, SpelloutRules-->
 <!ATTLIST rulesetGrouping draft (approved | contributed | provisional | unconfirmed | true | false) #IMPLIED >
 <!--@METADATA-->
 
+<!ELEMENT rbnfRules ( #PCDATA )>
+
+<!ATTLIST rbnfRules alt NMTOKENS #IMPLIED >
+<!--@MATCH:literal/variant-->
+
+<!ATTLIST rbnfRules draft (approved | contributed | provisional | unconfirmed | true | false) #IMPLIED >
+<!--@METADATA-->
+<!--@DEPRECATED:true, false-->
+
 <!ELEMENT ruleset ( alias | ( rbnfrule*, special* ) ) >
 <!--@ORDERED-->
 <!ATTLIST ruleset type NMTOKEN #REQUIRED >
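
Reviewer note: the updated content model makes `rbnfRules` an optional first child of `rulesetGrouping` that holds plain character data. A minimal Python sketch of how a consumer might read such an element follows. It assumes the text uses the layout visible in the data files of this commit (a `%name:` or `%%name:` header line, followed by one rule per line). The names `split_rulesets` and `read_grouping` are hypothetical, and this is not how `LdmlConverter` is implemented.

```python
import re
import xml.etree.ElementTree as ET

# A ruleset header is a whole line such as "%with-words:" or "%%min:".
# Assumption: names contain only word characters and hyphens, as in this diff.
RULESET_HEADER = re.compile(r"^(%%?)([\w-]+):$")


def split_rulesets(rules_text):
    """Group the lines of an rbnfRules body by ruleset name.

    Returns a dict mapping name -> {"private": bool, "rules": [str, ...]}.
    """
    rulesets = {}
    current = None
    for raw_line in rules_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        header = RULESET_HEADER.match(line)
        if header:
            current = header.group(2)
            rulesets[current] = {"private": header.group(1) == "%%", "rules": []}
        elif current is not None:
            rulesets[current]["rules"].append(line)
    return rulesets


def read_grouping(xml_text):
    """Parse one <rulesetGrouping> element and split its <rbnfRules> text."""
    grouping = ET.fromstring(xml_text)
    element = grouping.find("rbnfRules")
    if element is None or not element.text:
        return {}
    return split_rulesets(element.text)


SAMPLE = """<rulesetGrouping type="DurationRules"><rbnfRules><![CDATA[
%in-numerals:
0: =0= sec.;
60: =%%min-sec=;
%%min-sec:
0: :=00=;
60/60: <0<>>;
]]></rbnfRules></rulesetGrouping>"""

if __name__ == "__main__":
    for name, info in read_grouping(SAMPLE).items():
        print(name, info["private"], info["rules"])
```

Because the element is declared as `#PCDATA` and the data files wrap it in CDATA, characters such as `<`, `>`, and `&` arrive in `element.text` unescaped, which is why the sketch needs no entity handling.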

icu4c/source/data/xml/rbnf/be.xml

Lines changed: 4 additions & 4 deletions
@@ -12,10 +12,10 @@
 </identity>
 <rbnf>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp;[last primary ignorable ] ←← ' ' ←← ',' ←← '-' ←← '­';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+&[last primary ignorable ] << ' ' << ',' << '-' << '­';
+]]></rbnfRules>
 </rulesetGrouping>
-
 </rbnf>
 </ldml>

icu4c/source/data/xml/rbnf/bg.xml

Lines changed: 4 additions & 4 deletions
@@ -12,10 +12,10 @@
 </identity>
 <rbnf>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp;[last primary ignorable ] ←← ' ' ←← ',' ←← '-' ←← '­';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+&[last primary ignorable ] << ' ' << ',' << '-' << '­';
+]]></rbnfRules>
 </rulesetGrouping>
-
 </rbnf>
 </ldml>

icu4c/source/data/xml/rbnf/ca.xml

Lines changed: 4 additions & 4 deletions
@@ -12,10 +12,10 @@
 </identity>
 <rbnf>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp;[last primary ignorable ] ←← ' ' ←← ',' ←← '-' ←← '­';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+&[last primary ignorable ] << ' ' << ',' << '-' << '­';
+]]></rbnfRules>
 </rulesetGrouping>
-
 </rbnf>
 </ldml>

icu4c/source/data/xml/rbnf/cy.xml

Lines changed: 4 additions & 4 deletions
@@ -12,10 +12,10 @@
 </identity>
 <rbnf>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp; ' ' , ',' ;</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+& ' ' , ',' ;
+]]></rbnfRules>
 </rulesetGrouping>
-
 </rbnf>
 </ldml>

icu4c/source/data/xml/rbnf/da.xml

Lines changed: 4 additions & 4 deletions
@@ -12,10 +12,10 @@
 </identity>
 <rbnf>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp;[last primary ignorable ] ←← ' ' ←← ',' ←← '-' ←← '­';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+&[last primary ignorable ] << ' ' << ',' << '-' << '­';
+]]></rbnfRules>
 </rulesetGrouping>
-
 </rbnf>
 </ldml>

icu4c/source/data/xml/rbnf/de.xml

Lines changed: 4 additions & 5 deletions
@@ -7,16 +7,15 @@
 <!DOCTYPE ldml SYSTEM "../../dtd/cldr/common/dtd/ldml.dtd">
 <ldml>
 <identity>
-
 <version number="$Revision$"/>
 <language type="de"/>
 </identity>
 <rbnf>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp;ue=ü&amp;ae=ä&amp;oe=ö&amp;[last primary ignorable ] ←← ' ' ←← ',' ←← '-' ←← '­';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+&ue=ü&ae=ä&oe=ö&[last primary ignorable ] << ' ' << ',' << '-' << '­';
+]]></rbnfRules>
 </rulesetGrouping>
-
 </rbnf>
 </ldml>

icu4c/source/data/xml/rbnf/en.xml

Lines changed: 29 additions & 35 deletions
@@ -7,48 +7,42 @@
 <!DOCTYPE ldml SYSTEM "../../dtd/cldr/common/dtd/ldml.dtd">
 <ldml>
 <identity>
-
 <version number="$Revision$"/>
 <language type="en"/>
 </identity>
 <rbnf>
 <rulesetGrouping type="DurationRules">
-<ruleset type="with-words">
-<rbnfrule value="0">0 seconds; 1 second; =0= seconds;</rbnfrule>
-<rbnfrule value="60" radix="60">←%%min←[, →→];</rbnfrule>
-<rbnfrule value="3600" radix="60">←%%hr←[, →→→];</rbnfrule>
-</ruleset>
-<ruleset type="min" access="private">
-<rbnfrule value="0">0 minutes; 1 minute; =0= minutes;</rbnfrule>
-</ruleset>
-<ruleset type="hr" access="private">
-<rbnfrule value="0">0 hours; 1 hour; =0= hours;</rbnfrule>
-</ruleset>
-<ruleset type="in-numerals">
-<rbnfrule value="0">=0= sec.;</rbnfrule>
-<rbnfrule value="60">=%%min-sec=;</rbnfrule>
-<rbnfrule value="3600">=%%hr-min-sec=;</rbnfrule>
-</ruleset>
-<ruleset type="min-sec" access="private">
-<rbnfrule value="0">:=00=;</rbnfrule>
-<rbnfrule value="60" radix="60">←0←→→;</rbnfrule>
-</ruleset>
-<ruleset type="hr-min-sec" access="private">
-<rbnfrule value="0">:=00=;</rbnfrule>
-<rbnfrule value="60" radix="60">←00←→→;</rbnfrule>
-<rbnfrule value="3600" radix="60">←#,##0←:→→→;</rbnfrule>
-</ruleset>
-<ruleset type="duration">
-<rbnfrule value="0">=%in-numerals=;</rbnfrule>
-</ruleset>
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp; ':' = '.' = ' ' = '-';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%with-words:
+0: 0 seconds; 1 second; =0= seconds;
+60/60: <%%min<[, >>];
+3600/60: <%%hr<[, >>>];
+%%min:
+0: 0 minutes; 1 minute; =0= minutes;
+%%hr:
+0: 0 hours; 1 hour; =0= hours;
+%in-numerals:
+0: =0= sec.;
+60: =%%min-sec=;
+3600: =%%hr-min-sec=;
+%%min-sec:
+0: :=00=;
+60/60: <0<>>;
+%%hr-min-sec:
+0: :=00=;
+60/60: <00<>>;
+3600/60: <#,##0<:>>>;
+%duration:
+0: =%in-numerals=;
+%%lenient-parse:
+& ':' = '.' = ' ' = '-';
+]]></rbnfRules>
 </rulesetGrouping>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp;[last primary ignorable ] ←← ' ' ←← ',' ←← '-' ←← '­';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+&[last primary ignorable ] << ' ' << ',' << '-' << '­';
+]]></rbnfRules>
 </rulesetGrouping>
 </rbnf>
 </ldml>
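
Reviewer note: the `en.xml` change shows the mapping from the old element form to the new text form fairly completely. `access="private"` becomes a `%%` prefix, `value` and `radix` become a `value/radix:` descriptor, the arrows `←` and `→` become `<` and `>`, and the `lenient-parse` rule loses its descriptor. The Python sketch below reproduces that mapping as inferred from this diff only. The function names are hypothetical, the special case for `lenient-parse` is an assumption based on the lines shown here, and the actual CLDR migration may handle cases that do not appear in these files.

```python
import xml.etree.ElementTree as ET

# Old-style arrows and their ASCII replacements, as seen in this diff.
ARROWS = str.maketrans({"←": "<", "→": ">"})


def ruleset_to_lines(ruleset):
    """Convert one old-style <ruleset> element into new-style text lines."""
    name = ruleset.get("type")
    prefix = "%%" if ruleset.get("access") == "private" else "%"
    lines = [f"{prefix}{name}:"]
    for rule in ruleset.findall("rbnfrule"):
        body = (rule.text or "").translate(ARROWS)
        if name == "lenient-parse":
            # Assumption from the diff: these rules are emitted without a descriptor.
            lines.append(body)
            continue
        descriptor = rule.get("value")
        radix = rule.get("radix")
        if radix:
            descriptor = f"{descriptor}/{radix}"
        lines.append(f"{descriptor}: {body}")
    return lines


def grouping_to_text(grouping_xml):
    """Convert an old-style <rulesetGrouping> into the body of <rbnfRules>."""
    grouping = ET.fromstring(grouping_xml)
    lines = []
    for ruleset in grouping.findall("ruleset"):
        lines.extend(ruleset_to_lines(ruleset))
    return "\n".join(lines)


OLD = """<rulesetGrouping type="DurationRules">
<ruleset type="with-words">
<rbnfrule value="0">0 seconds; 1 second; =0= seconds;</rbnfrule>
<rbnfrule value="60" radix="60">←%%min←[, →→];</rbnfrule>
</ruleset>
<ruleset type="min" access="private">
<rbnfrule value="0">0 minutes; 1 minute; =0= minutes;</rbnfrule>
</ruleset>
<ruleset type="lenient-parse" access="private">
<rbnfrule value="0">&amp; ':' = '.' = ' ' = '-';</rbnfrule>
</ruleset>
</rulesetGrouping>"""

if __name__ == "__main__":
    print(grouping_to_text(OLD))
```

The sketch ignores indentation and alignment, which this page view does not preserve, so its output should be compared with the added lines by content rather than byte for byte.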

icu4c/source/data/xml/rbnf/fo.xml

Lines changed: 4 additions & 4 deletions
@@ -12,10 +12,10 @@
 </identity>
 <rbnf>
 <rulesetGrouping type="SpelloutRules">
-<ruleset type="lenient-parse" access="private">
-<rbnfrule value="0">&amp;[last primary ignorable ] ←← ' ' ←← ',' ←← '-' ←← '­';</rbnfrule>
-</ruleset>
+<rbnfRules><![CDATA[
+%%lenient-parse:
+&[last primary ignorable ] << ' ' << ',' << '-' << '­';
+]]></rbnfRules>
 </rulesetGrouping>
-
 </rbnf>
 </ldml>
