@btangmu
Member

@btangmu btangmu commented Dec 3, 2025

- Clarify that the warning is in response to a vote
- Include the exception cause (PH for path ... Caused by...)
- Include the USV (U+...)
- Fix message that said minor category (subgroup) but meant major category (group)
- Change "Probably mismatch" to "Possible mismatch" (in this case it was not the cause)

CLDR-19142

  • This PR completes the ticket.

ALLOW_MANY_COMMITS=true

@btangmu btangmu self-assigned this Dec 3, 2025
@btangmu btangmu requested review from macchiati and srl295 December 3, 2025 18:33
@btangmu
Member Author

btangmu commented Dec 3, 2025

This PR changes the warning from

[WARNING ] PH for path //ldml/annotations/annotation[@cp="🫝"][@type="tts"]java.lang.IllegalArgumentException: Probably mismatch in Page/Section enum, or too few capturing groups in regex for //ldml/annotations/annotation[@cp="🫝"][@type="tts"]

to

[INFO] [WARNING ] PH for path //ldml/annotations/annotation[@cp="🫝"][@type="tts"]java.lang.IllegalArgumentException: Possible mismatch in Page/Section enum, or too few capturing groups in regex for //ldml/annotations/annotation[@cp="🫝"][@type="tts"]
[INFO] Caused by: org.unicode.cldr.util.InternalCldrException: No minor category (aka subgroup) found for 🫝 (U+1FADD). Update emoji-test.txt to latest, and adjust PathHeader.. functionMap.put("minor", ...
[INFO] [WARNING ] Ignoring invalid vote for path //ldml/annotations/annotation[@cp="🫝"][@type="tts"]

The cause (No minor category...) and the context (vote) are added, and the cause includes the USV for the emoji (U+1FADD). This info should make such errors easier to diagnose in the future.
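The two additions described above (a "Caused by" line and the USV of the offending character) can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class name `VoteWarningSketch` and the helpers `usv` and `describe` are hypothetical, `IllegalStateException` stands in for CLDR's `InternalCldrException`, and the message text is abbreviated.

```java
import java.util.stream.Collectors;

// Hypothetical illustration only; not the implementation merged in this PR.
class VoteWarningSketch {
    /** Formats each code point of s as a Unicode scalar value, e.g. "U+1FADD". */
    static String usv(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    /** Renders an exception followed by one "Caused by:" line per nested cause. */
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder(t.toString());
        for (Throwable c = t.getCause(); c != null; c = c.getCause()) {
            sb.append("\n").append("Caused by: ").append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Build the string for U+1FADD from its code point.
        String emoji = new String(Character.toChars(0x1FADD));
        // IllegalStateException is an assumed stand-in for InternalCldrException.
        Throwable cause = new IllegalStateException(
                "No category found for " + emoji + " (" + usv(emoji) + ")");
        // Chaining the cause keeps the underlying reason available to the logger.
        Throwable outer = new IllegalArgumentException(
                "Possible mismatch in Page/Section enum", cause);
        System.out.println(describe(outer));
    }
}
```

Passing the cause to the outer exception's constructor, rather than folding it into the message string, lets whatever logs the warning decide how much of the chain to print.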

The question remains what more, if anything, should be done. We could delete the votes from the database. That would de-noise the logs and prevent bogus votes from harming future data if U+1FADD is later assigned to a completely different emoji. On the other hand, if "apple core" does get added, the data could still be useful.

@macchiati
Member

My suggestion is to remove it from main. If we need that data in the future, we can recover it from the release (or one of the tags).

@btangmu
Member Author

btangmu commented Dec 4, 2025

> remove from main

The XML data was already removed in #4895 and it could be recovered from git.

I'm wondering whether we should remove the votes from the database. I don't have a strong opinion one way or the other. If we don't remove the votes, there will be noise in the logs (not a big problem), and when/if U+1FADD apple core is approved, the votes may get imported and effectively bring back the annotations without needing to recover them from git. However, if U+1FADD gets assigned some other meaning, like "orange peel", then we'll have confusion between "apple core" and "orange peel" unless the votes have been removed...

@macchiati
Member

macchiati commented Dec 4, 2025 via email

Excellent catch. Yes, please remove from the database

@btangmu
Member Author

btangmu commented Dec 5, 2025

> Excellent catch. Yes, please remove from the database

OK, I've just done so, on the production server, as follows:

mysql> select count(*) from cldr_vote_value_48 where xpath=704266 or xpath=704271;
+----------+
| count(*) |
+----------+
|      440 |
+----------+
1 row in set (0.71 sec)

mysql> delete from cldr_vote_value_48 where xpath=704266 or xpath=704271;
Query OK, 440 rows affected (3.51 sec)

mysql> select count(*) from cldr_vote_value_48 where xpath=704266 or xpath=704271;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (1.80 sec)

@btangmu btangmu merged commit 19265d1 into unicode-org:main Dec 5, 2025
15 checks passed
@btangmu btangmu deleted the t19142_a branch December 5, 2025 21:26
@srl295
Member

srl295 commented Dec 6, 2025

Concur
