Skip to content

Conversation

@btangmu
Copy link
Member

@btangmu btangmu commented Nov 10, 2025

-Move sets and ExemplarType from CheckExemplars into new org.unicode.cldr.util.ExemplarSets

-Replace CLDRFile.ExemplarType with ExemplarSets.ExemplarType

-Related refactoring; move some code into subroutines; change public to private where possible

-Some automated refactoring by IntelliJ, from accepting suggestions linked to compiler warnings

-Change Japn to Jpan and add reference tickets for TODO comments

CLDR-19098

  • This PR completes the ticket.

ALLOW_MANY_COMMITS=true

-Move sets and ExemplarType from CheckExemplars into new org.unicode.cldr.util.ExemplarSets

-Replace CLDRFile.ExemplarType with ExemplarSets.ExemplarType

-Related refactoring; move some code into subroutines; change public to private where possible

-Some automated refactoring by IntelliJ, from accepting suggestions linked to compiler warnings
@btangmu
Copy link
Member Author

btangmu commented Nov 10, 2025

Please note: this includes the changes that were in the earlier PR #5185

@btangmu
Copy link
Member Author

btangmu commented Nov 10, 2025

@macchiati these changes are based on your suggestions in the first merged PR #5184

@btangmu btangmu marked this pull request as ready for review November 10, 2025 17:06
@btangmu btangmu requested review from macchiati and srl295 November 10, 2025 17:07
srl295
srl295 previously approved these changes Nov 11, 2025
Copy link
Member

@srl295 srl295 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great improvements. Comments.

new UnicodeSet(
"[[[:Nd:][:script=common:][:script=inherited:]-[:Default_Ignorable_Code_Point:]-[:C:] - [_]] [\u05BE \u05F3 \u066A-\u066C]"
+ "[[؉][་ །༌][ཱ]‎‎{য়}য়]"
+ // TODO Fix this Hack
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ticket for the todo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 24 to 30
static {
ExemplarSets.Japn.set(UScript.HAN);
ExemplarSets.Japn.set(UScript.HIRAGANA);
ExemplarSets.Japn.set(UScript.KATAKANA);
ExemplarSets.Kore.set(UScript.HAN);
ExemplarSets.Kore.set(UScript.HANGUL);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Japn is a typo, should be Jpan
  • all of this should be dynamic, using the script data as below:
        <scriptData>
                <scriptVariant type="compound" id='Hanb' base='Hani Bopo'/> <!-- Han with Bopomofo (alias for Han + Bopomofo) -->
                <scriptVariant type="compound" id='Jpan' base='Hani Hira Kana'/> <!-- Japanese (alias for Han + Hiragana + Katakana) -->
                <scriptVariant type="compound" id='Hrkt' base='Hira Kana'/> <!-- Japanese syllabaries (alias for Hiragana + Katakana) -->
                <scriptVariant type="compound" id='Kore' base='Hang Hani'/> <!-- Korean (alias for Hangul + Han) -->
                <scriptVariant type="compound" id='Hntl' base='Hant Latn'/> <!-- Han (Traditional variant) with Latin (alias for Hant + Latn) -->

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll study this -- do you mean that data exists and the Java code should be based on that data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need ticket for adding API to SupplementalDataInfo for accessing that date, and replacing hardcoded lists here and elsewhere. See CLDR-18950

@btangmu
Copy link
Member Author

btangmu commented Nov 11, 2025

The 2nd commit changes Japn to Jpan and adds references in TODO comments.

@btangmu btangmu merged commit 79c9a73 into unicode-org:main Nov 12, 2025
14 checks passed
@btangmu btangmu deleted the t19098_d branch November 12, 2025 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants