btangmu
Member

@btangmu btangmu commented Aug 18, 2025

-Revise ldml.dtd to allow numberSystem attribute for pattern element

-Add numberSystem (latn) for pattern/decimal/group elements where needed in locales ca, el, en_150, en_AT, eu, gl, it, tr

-Re-enable the check for numberSystem in CheckNumbers.java

-Use parts.containsAttribute instead of parts.getAttribute(2, ...) since the element index may not be 2
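To illustrate the last point, here is a minimal sketch of why a fixed element index can miss the attribute. It does not use the real CLDR `XPathParts` class; `parse_path`, `get_attribute`, and `contains_attribute` are hypothetical stand-ins, and both example paths are illustrative rather than copied from the data files.

```python
import re

def parse_path(path):
    """Split a simplified XPath into (element, {attribute: value}) pairs.

    Assumes attribute values contain no "/" characters, which holds for
    the illustrative paths used with this sketch.
    """
    parts = []
    for step in path.strip("/").split("/"):
        name = step.split("[", 1)[0]
        attrs = dict(re.findall(r'\[@(\w+)="([^"]*)"\]', step))
        parts.append((name, attrs))
    return parts

def get_attribute(parts, index, name):
    """Look for an attribute on one specific element only (fixed index)."""
    if index >= len(parts):
        return None
    return parts[index][1].get(name)

def contains_attribute(parts, name):
    """Look for an attribute on any element of the path."""
    return any(name in attrs for _, attrs in parts)

# Illustrative paths: the attribute sits on element 2 in the first path,
# but on the last element in the second.
SYMBOLS_PATH = '//ldml/numbers/symbols[@numberSystem="latn"]/decimal'
CURRENCY_PATH = '//ldml/numbers/currencies/currency[@type="ESP"]/decimal[@numberSystem="latn"]'
```

With these stand-ins, the fixed-index lookup finds `numberSystem` only in the first path, while the whole-path check finds it in both.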

CLDR-18722

  • This PR completes the ticket.

ALLOW_MANY_COMMITS=true

@btangmu btangmu self-assigned this Aug 18, 2025
@btangmu btangmu requested a review from macchiati August 18, 2025 19:39
@btangmu
Member Author

btangmu commented Aug 18, 2025

Locally, after making these changes, I ran console check as follows:

java -DCLDR_DIR=$(pwd) -jar tools/cldr-code/target/cldr-code.jar check -S common,seed -1 -e -z FINAL_TESTING -c comprehensive

Note that -c comprehensive was required; otherwise, the paths that previously had numberSystem errors would be skipped. There were 62 errors, but none of them involved numberSystem.
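For anyone repeating this, a rough sketch of scripting the same run and filtering its output follows. The command-line flags are exactly the ones quoted above; the helper names are hypothetical, and the sketch assumes only that a relevant message would contain the literal text `numberSystem` (the actual output format of the check tool is not assumed).

```python
import subprocess

def build_check_command(cldr_dir):
    """Return the console check command quoted above as an argument list."""
    return [
        "java", f"-DCLDR_DIR={cldr_dir}",
        "-jar", "tools/cldr-code/target/cldr-code.jar",
        "check", "-S", "common,seed", "-1", "-e",
        "-z", "FINAL_TESTING",
        "-c", "comprehensive",  # without this, the affected paths are skipped
    ]

def lines_mentioning(text, needle="numberSystem"):
    """Return the output lines that contain the given literal text."""
    return [line for line in text.splitlines() if needle in line]

def run_check(cldr_dir):
    """Run the check from the CLDR root and return lines mentioning numberSystem."""
    result = subprocess.run(
        build_check_command(cldr_dir),
        cwd=cldr_dir, capture_output=True, text=True,
    )
    return lines_mentioning(result.stdout + result.stderr)
```

Calling `run_check` with the path of a built CLDR checkout should return an empty list if no reported message mentions numberSystem.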

-Remove numberSystem from a path where it should not have been added (redundant)
-Revise ldml.dtd to handle numberSystem attribute for decimal/group element like elsewhere, not deprecated
<displayName draft="contributed">Peseta española</displayName>
<displayName count="one" draft="contributed">peseta</displayName>
<displayName count="other" draft="contributed">pesetas</displayName>
<symbol>₧</symbol>
<decimal>,</decimal>
<group>❰NBTSP❱</group>
Member

There is something wrong here; unrelated to this ticket. The [❰❱] characters should never occur in values for our data files. It looks like vetters might be pasting them into values. I'll file a separate ticket about this.
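A quick way to look for other affected values might be a scan like the one below. This is only a sketch: `values_with_markers` is a hypothetical helper, it takes the XML as a string, and it checks element text only (not attribute values).

```python
import xml.etree.ElementTree as ET

# The two bracket characters that should never appear in data values.
MARKERS = ("❰", "❱")

def values_with_markers(xml_text):
    """Return (tag, text) for every element whose text contains a marker."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        text = elem.text or ""
        if any(marker in text for marker in MARKERS):
            hits.append((elem.tag, text))
    return hits
```

Applied to the fragment quoted above (wrapped in a single parent element), this would flag only the `group` value.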

@@ -2051,7 +2051,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<!ATTLIST decimal references CDATA #IMPLIED >
Member

I'm wondering why this file exists; I've noticed that it causes errors when it drifts away from the regular ldml.dtd. If we keep it, we should have a test that it is identical (and maybe we do, and that's what alerted you to it?)

Member

In eclipse I see the following error with that file:

Description Resource Path Location Type
The element 'gmtUnknownFormat' has not been declared. ldml.dtd /cldr-code/src/test/resources/org/unicode/cldr/unittest/data/common/dtd line 1723 DTD Problem
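One possible cause of an error like this is an attribute list that names an element the file never declares. The sketch below is a rough regex-based check for that one case only; it is an assumption that this is what Eclipse is reporting, `undeclared_attlist_elements` is a hypothetical helper, and it ignores DTD comments and parameter entities.

```python
import re

def undeclared_attlist_elements(dtd_text):
    """Return element names that have an ATTLIST but no ELEMENT declaration."""
    declared = set(re.findall(r"<!ELEMENT\s+([\w:.-]+)", dtd_text))
    with_attlist = set(re.findall(r"<!ATTLIST\s+([\w:.-]+)", dtd_text))
    return sorted(with_attlist - declared)
```

Running it on the text of the test copy of ldml.dtd would show whether `gmtUnknownFormat` falls into this category.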

Member Author

I'm confused by that too. There are 3 ldml.dtd files:

  1. common/dtd/ldml.dtd
  2. tools/cldr-code/src/test/resources/org/unicode/cldr/unittest/data/common/dtd/ldml.dtd
  3. tools/cldr-code/target/test-classes/org/unicode/cldr/unittest/data/common/dtd/ldml.dtd

I was accidentally editing the wrong one before I noticed that...
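The idea raised earlier in this thread, a test that the copy stays identical to the regular file, could be sketched roughly as follows. Whether the test copy is actually meant to match common/dtd/ldml.dtd is an open question here, so treat this as a drift report rather than a definitive test; `dtd_drift` is a hypothetical helper.

```python
import difflib
from pathlib import Path

def dtd_drift(canonical, copy, max_lines=20):
    """Return the first lines of a unified diff between two DTD files.

    An empty list means the two files have identical lines.
    """
    a = Path(canonical).read_text(encoding="utf-8").splitlines()
    b = Path(copy).read_text(encoding="utf-8").splitlines()
    diff = difflib.unified_diff(
        a, b, fromfile=str(canonical), tofile=str(copy), lineterm=""
    )
    return list(diff)[:max_lines]
```

From the CLDR root, it would be called with files 1 and 2 from the list above; the copy under target is build output and presumably follows file 2.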

Member Author

In fact I did that again even after realizing there are 3 files; this PR still fails maybe for that reason. I'll make another (4th) commit shortly.

@srl295 do you have insight about how the ldml.dtd files relate to each other?

Member

@macchiati macchiati left a comment

Looking good; a couple of questions

@@ -259,12 +259,8 @@ public CheckCLDR handleCheck(
}
Member

I take it that NumericType still gets these items, so they aren't skipped by line 258.

Member

@macchiati macchiati left a comment

I was thinking about this, and it is really past when we can make changes that could affect ICU. So we might want to put this on hold until we branch. Let's discuss in infra.

@macchiati
Member

Changing to Draft so we don't mistakenly merge into 48 main.

@macchiati macchiati marked this pull request as draft August 20, 2025 17:05
-A previous commit mistakenly did this with the wrong ldml.dtd; there are 2 files with that name, not counting the one in target