Emoji 18.0 alpha data #1279

Merged: eggrobin merged 9 commits into main from ned/emoji_18_alpha on Feb 3, 2026

nedley commented Jan 31, 2026

[185-C31] Consensus: Accept nine (9) new emoji characters with the following character names, based on Section 1 of document L2/25-230, for Unicode Version 18.0:

    1F6D9 LIGHTHOUSE
    1FA8B METEOR
    1FA8C ERASER
    1FA8D NET WITH HANDLE
    1FACC MONARCH BUTTERFLY
    1FADD PICKLE
    1FAEB FACE WITH SQUINTING EYES
    1FAF9 LEFTWARDS THUMB SIGN
    1FAFA RIGHTWARDS THUMB SIGN

[186-C2] Consensus: Change the character approved for Unicode Version 18.0 at U+1FAEB from FACE WITH SQUINTING EYES to CRACKING FACE, based on Section 1 of document L2/26-008 and L2/26-048.

I tried to keep track of the steps I took, but as always I started getting a little woozy at some point, so some of the later steps may be slightly out of order. The steps themselves are below, but basically what it amounts to is that I keep re-running GenerateEmoji.java until it stops throwing errors and the data files themselves stop changing, meaning they have been completely bootstrapped.

  1. Set "emoji-beta" as described in Emoji.java
  2. Update candidateData.txt with new emoji with Status=Draft Candidate
  3. Update docRegistry.txt as appropriate
  4. Run GenerateEmoji.java, adding new emoji to data files as E0.0
  5. Run GenerateEmoji.java again, complains about ordering
  6. Add new emoji to emojiOrdering.txt based on candidateData.txt
  7. Run CandidateData.java, add result to proposalData.txt
  8. Run GenerateEmoji.java one last time, updates new emoji with correct version
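
The "re-run until nothing changes" part of the steps above could be sketched roughly as follows. This is only an illustration in Python, not part of unicodetools: `snapshot` and `run_until_stable` are hypothetical names, and `step` is a stand-in for however a single GenerateEmoji.java run is launched, which this thread does not show. The sketch assumes `step` raises when the run fails, so a human can fix the data (for example emojiOrdering.txt) before trying again.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict


def snapshot(data_dir: Path) -> Dict[str, str]:
    """Map every file under data_dir (by relative path) to a SHA-256 of its bytes."""
    return {
        str(p.relative_to(data_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def run_until_stable(step: Callable[[], None], data_dir: Path, max_runs: int = 10) -> int:
    """Call step() until one run leaves data_dir untouched; return how many runs it took.

    step is a hypothetical stand-in for one generator run. Any exception it
    raises propagates, so errors still need to be fixed by hand.
    """
    before = snapshot(data_dir)
    for run in range(1, max_runs + 1):
        step()
        after = snapshot(data_dir)
        if after == before:
            # Fixed point: this run changed nothing, so the data is bootstrapped.
            return run
        before = after
    raise RuntimeError(f"data files still changing after {max_runs} runs")
```

Comparing content hashes rather than modification times means a run that rewrites a file with identical bytes still counts as "no change".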

nedley requested a review from eggrobin on January 31, 2026 00:22
nedley marked this pull request as draft on January 31, 2026 01:50
nedley force-pushed the ned/emoji_18_alpha branch from 954f0d6 to a1199e6 on January 31, 2026 01:52

nedley commented Jan 31, 2026

@eggrobin Apparently this is my first attempt at adding characters rather than simply reviewing. Do I need to manually edit the UCD files or is there a more streamlined process?

nedley commented Jan 31, 2026

I tried adding the new characters to dev/UnicodeData.txt and running MakeUnicodeFiles.java with cleanAndCopy; that seems to have almost done what I want, but it also made some unexpected changes to DerivedAge.txt…

eggrobin commented Jan 31, 2026

> Do I need to manually edit the UCD files or is there a more streamlined process?

The process is described here (in this case I expect it would basically mean: add to UnicodeData, add to Scripts, and, as you found, regenerate UCD with MakeUnicodeFiles).

eggrobin commented

> it also made some unexpected changes to DerivedAge.txt…

What changes? (Remember to clear the stupid BIN cache which never works, see #1125. I will go right ahead and fix this issue, this is utterly ridiculous.)

nedley commented Feb 2, 2026

> What changes?

Some characters ended up with the wrong age, e.g. the ARCHAIC SHRII characters had 16.0

> the stupid BIN cache which never works

That was my problem! Will push up something shortly.
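
A quick way to spot this kind of stale-cache damage is to compare the ages in the regenerated DerivedAge.txt against the previously committed copy. The sketch below is a rough illustration in Python, assuming the usual UCD line layout (`XXXX..YYYY ; value # comment`); `parse_ages` and `changed_ages` are hypothetical helper names, not unicodetools functions.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Matches "XXXX ; age" or "XXXX..YYYY ; age"; comment and blank lines do not match.
AGE_LINE = re.compile(
    r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*([^#\s]+)"
)


def parse_ages(lines: Iterable[str]) -> Dict[int, str]:
    """Return a code point -> age mapping from DerivedAge.txt-style lines."""
    ages: Dict[int, str] = {}
    for line in lines:
        m = AGE_LINE.match(line)
        if not m:
            continue
        first = int(m.group(1), 16)
        last = int(m.group(2) or m.group(1), 16)
        for cp in range(first, last + 1):
            ages[cp] = m.group(3)
    return ages


def changed_ages(old: Iterable[str], new: Iterable[str]) -> List[Tuple[str, str, str]]:
    """List (code point, old age, new age) for code points present in both with differing ages."""
    old_ages, new_ages = parse_ages(old), parse_ages(new)
    return [
        (f"{cp:04X}", old_ages[cp], new_ages[cp])
        for cp in sorted(old_ages.keys() & new_ages.keys())
        if old_ages[cp] != new_ages[cp]
    ]
```

An age should never change once a character has one, so any entry in the result points at a problem in the regeneration rather than at a legitimate update.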

nedley marked this pull request as ready for review on February 2, 2026 18:40
nedley force-pushed the ned/emoji_18_alpha branch 2 times, most recently from baf7bca to b502bd6, on February 2, 2026 19:26

nedley commented Feb 2, 2026

Where did the Derived files go? Whoops…

nedley force-pushed the ned/emoji_18_alpha branch from b502bd6 to 29f7fea on February 2, 2026 19:29

nedley commented Feb 2, 2026

So, rather than fixing all of the dumb merge conflicts in extracted/ I deleted the files, assuming they would be entirely rebuilt. Let that be a lesson, I suppose.

nedley force-pushed the ned/emoji_18_alpha branch from 29f7fea to a1b326c on February 2, 2026 20:02

nedley commented Feb 2, 2026

For some reason the extracted/ files still don’t have the new characters. Help me @eggrobin, you’re my only hope.
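
One way to confirm whether a regenerated extracted/ file has picked up the new characters is to look up what value it assigns to each of the nine code points listed at the top of this PR. The following is a hedged Python sketch, assuming the usual UCD `range ; value # comment` layout; `values_for` is a hypothetical helper name. Note that some extracted files (DerivedGeneralCategory.txt, for instance) also list unassigned code points, so the thing to check is the value, not mere presence.

```python
import re
from typing import Dict, Iterable, Optional

# The nine code points accepted for Unicode 18.0 in this PR.
NEW_EMOJI = [0x1F6D9, 0x1FA8B, 0x1FA8C, 0x1FA8D, 0x1FACC,
             0x1FADD, 0x1FAEB, 0x1FAF9, 0x1FAFA]

# Matches "XXXX ; value" or "XXXX..YYYY ; value"; comment lines do not match.
RANGE_LINE = re.compile(
    r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*([^#]*)"
)


def values_for(lines: Iterable[str], code_points: Iterable[int]) -> Dict[int, Optional[str]]:
    """Return the value each code point gets in a UCD-style file, or None if it is not listed.

    If several lines cover the same code point, the last one wins.
    """
    found: Dict[int, Optional[str]] = {cp: None for cp in code_points}
    for line in lines:
        m = RANGE_LINE.match(line)
        if not m:
            continue
        first = int(m.group(1), 16)
        last = int(m.group(2) or m.group(1), 16)
        for cp in found:
            if first <= cp <= last:
                found[cp] = m.group(3).strip()
    return found
```

For a general-category file, a result of `Cn` (or `None`) for any of the nine code points would suggest the file was generated from stale data.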

eggrobin commented Feb 2, 2026

I ran the following commands (Windows, in-source; paths and syntactic details will probably differ for you).

    mvn compile exec:java '-Dexec.mainClass="org.unicode.text.UCD.MakeUnicodeFiles"'  '-Dexec.args="-c"' -am -pl unicodetools  "-DCLDR_DIR=..\cldr\"  "-DUNICODETOOLS_GEN_DIR=Generated"  "-DUNICODETOOLS_REPO_DIR=."
    git commit -am "Regenerate UCD"
    mvn compile exec:java '-Dexec.mainClass="org.unicode.tools.GenerateLinkData"' -am -pl unicodetools  "-DCLDR_DIR=..\cldr\"  "-DUNICODETOOLS_GEN_DIR=Generated"  "-DUNICODETOOLS_REPO_DIR=."
    git add *LinkTerm.txt
    git commit -m "And regenerate LinkTerm too"

I need to make MakeUnicodeFiles regenerate link data so I don’t need to have those last three lines; this is getting annoying.
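
Since the quoting in the commands above is shell-specific, one way to sidestep it is to build the Maven argument list in a script and pass it without going through a shell. The sketch below is only an illustration in Python: `maven_command` and `regenerate` are hypothetical names, the default `cldr_dir` of `../cldr/` is an assumed POSIX-style equivalent of the path used above, and the git steps are left out.

```python
import subprocess
from typing import List


def maven_command(main_class: str, args: str = "", cldr_dir: str = "../cldr/",
                  gen_dir: str = "Generated", repo_dir: str = ".") -> List[str]:
    """Build the argv for one tool run, using the same properties as the commands above."""
    cmd = ["mvn", "compile", "exec:java", f"-Dexec.mainClass={main_class}"]
    if args:
        cmd.append(f"-Dexec.args={args}")
    cmd += [
        "-am", "-pl", "unicodetools",
        f"-DCLDR_DIR={cldr_dir}",
        f"-DUNICODETOOLS_GEN_DIR={gen_dir}",
        f"-DUNICODETOOLS_REPO_DIR={repo_dir}",
    ]
    return cmd


def regenerate() -> None:
    """Run both generators in order, stopping at the first failure."""
    for main_class, args in [
        ("org.unicode.text.UCD.MakeUnicodeFiles", "-c"),
        ("org.unicode.tools.GenerateLinkData", ""),
    ]:
        # An argv list needs no shell quoting. On Windows the launcher is
        # typically mvn.cmd, so this call may need adjusting there.
        subprocess.run(maven_command(main_class, args), check=True)
```

Nothing runs until `regenerate()` is called from the repository root; `check=True` stops the sequence if the first Maven run fails, so LinkTerm data is not regenerated from a broken UCD.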

P.-S.: GitHub actions seems to be in a bad mood and isn’t giving us runners to run the tests.

nedley commented Feb 2, 2026

> P.-S.: GitHub actions seems to be in a bad mood and isn’t giving us runners to run the tests.

They’re probably just upset with all the dumb stuff I made them check.

eggrobin commented Feb 2, 2026

GitHub Actions stopped sulking and complained about a couple of things. Hopefully fixed now.

eggrobin previously approved these changes on Feb 3, 2026
eggrobin merged commit 803d843 into main on Feb 3, 2026
16 checks passed
eggrobin deleted the ned/emoji_18_alpha branch on February 3, 2026 12:06