Upgrade to Unicode 16.0 and add support for almost all gemoji aliases #214
Open
shane-tw wants to merge 2 commits into
d0545a0 to cb64ec6
rafaeljusto approved these changes on Mar 24, 2025
67c2d17 to e6167a2
Author: Made various changes to improve reviewability and reduce breaking changes in this PR.
8c9b327 to 6750954
- Add 330 new emojis from Unicode 10.0–16.0
- Add 742 new gemoji aliases (2711 total, up from 1969)
- Gender-explicit aliases corrected to ZWJ variants: 👰♀️, 💂♂️, 🤵♂️, 👳♂️, 🙆♀️ now map to their gender-specific forms
- Fix erroneous leading ♾ on 🏴☠️ / :jolly_roger:
- emojis.json ordered with existing entries in their original positions and new entries appended, using the original \uXXXX surrogate-pair encoding
- FE0F handled transparently at the trie level: the trie accepts emoji sequences with or without FE0F; EmojiParser stores the actual emojiEndIndex from getEmojiEndPos() so positions are correct when FE0F is in the input
- Bump version to 5.1.2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
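The FE0F handling described in the commit message above can be pictured with a small trie sketch. This is a hypothetical illustration, not the code from this PR: the class name Fe0fTrieSketch and its add and endIndex methods are invented for the example. Only the idea is taken from the description: store sequences without FE0F, skip FE0F while matching, and report the end index measured on the actual input.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of an FE0F-tolerant emoji trie; not the emoji-java implementation. */
public class Fe0fTrieSketch {
    private static final char VS16 = '\uFE0F'; // variation selector-16

    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String alias; // non-null when a complete emoji ends at this node
    }

    private final Node root = new Node();

    /** Stores the sequence with every FE0F removed, so FE0F becomes optional on lookup. */
    public void add(String emoji, String alias) {
        Node node = root;
        for (char c : emoji.toCharArray()) {
            if (c == VS16) {
                continue;
            }
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.alias = alias;
    }

    /**
     * Returns the exclusive end index of the longest emoji starting at {@code start},
     * measured on the input as given (FE0F included), or -1 if none matches.
     */
    public int endIndex(String text, int start) {
        Node node = root;
        int best = -1;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == VS16) {
                if (node == root) {
                    break; // a match never starts with FE0F
                }
                if (node.alias != null) {
                    best = i + 1; // trailing FE0F belongs to the emoji already matched
                }
                continue;
            }
            node = node.children.get(c);
            if (node == null) {
                break;
            }
            if (node.alias != null) {
                best = i + 1;
            }
        }
        return best;
    }
}
```

Because the end index comes from the input rather than from the stored sequence, a caller that replaces or removes the matched range covers the FE0F as well when it is present.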
14e587d to 1832975
218 existing entries had FE0F appended to their emojiChar field (e.g. © → ©️, ® → ®️, ☺ → ☺️, ✂ → ✂️, etc.) without a corresponding change to the emoji field used at runtime. One entry (satellite) also had FE0F added to its emoji field (🛰 → 🛰️).

Both changes are incorrect for pre-existing entries: the emoji field (the getUnicode() return value) must match emojiChar or the EMOJIS.md docs become misleading. Newly added emojis that include FE0F are unaffected.

Regenerate EMOJIS.md from the corrected emojis.json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Upgrades the emoji database to Unicode 16.0 and adds support for almost all gemoji aliases.
What's included
- getUnicode() return values and emojiChar values are unchanged from v5.1.1.
- :pirate_flag: / :jolly_roger:

Alias mapping changes
Six existing aliases now point to different emoji. These are intentional corrections:
- :bride_with_veil:
- :guardsman:
- :man_in_tuxedo:
- :man_with_turban:
- :ok_woman:
- :email: (:envelope: now maps to ✉)

Note: getUnicode() for the gendered entries returns the ZWJ sequence without FE0F (e.g. 💂♂); FE0F is included in emojiChar for correct emoji rendering.

All other existing alias targets are preserved.
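To make the difference between the two forms concrete, here is a minimal Java sketch that prints the code points of a ZWJ sequence with and without the trailing FE0F. The class and method names are hypothetical and do not come from this PR. The guardsman sequence used in the usage note (U+1F482 U+200D U+2642 U+FE0F) is the standard "man guard" sequence.

```java
import java.util.stream.Collectors;

/** Hypothetical helper for inspecting emoji sequences; not part of emoji-java. */
public class ZwjSequenceSketch {

    /** Formats each code point of the string as U+XXXX, separated by spaces. */
    public static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    /** Removes every variation selector-16 (U+FE0F) from the string. */
    public static String stripVs16(String s) {
        return s.replace("\uFE0F", "");
    }
}
```

For example, `codePoints("\uD83D\uDC82\u200D\u2642\uFE0F")` yields `U+1F482 U+200D U+2642 U+FE0F`, and applying `stripVs16` first drops the final `U+FE0F`, which corresponds to the shorter form the note above describes.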
Differences from gemoji
This repo intentionally diverges from gemoji on 5 aliases for backward compatibility. The gemoji-preferred emoji is still reachable via an alternative alias:
- :beetle: (alternative alias: :stag_beetle:)
- :jar: (alternative alias: :mason_jar:)
- :ng: (alternative alias: :squared_ng:)
- :om: (alternative alias: :om_symbol:)
- :satellite: (alternative alias: :satellite_antenna:)

This repo also carries ~800 aliases that gemoji doesn't have, including 238 two-letter country-code flag aliases (:ac:, :ad:, …) and ~560 others that predate this PR.
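As background on the two-letter flag aliases: a flag emoji is a pair of regional indicator symbols, one per letter of the country code. The Java sketch below derives the flag from an alias under that rule. It is an illustration only: the class and method names are hypothetical, and the repo itself looks these aliases up in emojis.json rather than computing them.

```java
import java.util.Locale;

/** Hypothetical illustration of how a two-letter alias relates to a flag emoji. */
public class FlagAliasSketch {
    private static final int REGIONAL_INDICATOR_A = 0x1F1E6; // REGIONAL INDICATOR SYMBOL LETTER A

    /** Converts an alias such as ":ac:" into its pair of regional indicator symbols. */
    public static String flagFor(String alias) {
        String code = alias.replace(":", "").toUpperCase(Locale.ROOT);
        if (code.length() != 2) {
            throw new IllegalArgumentException("expected a two-letter code: " + alias);
        }
        StringBuilder flag = new StringBuilder();
        for (char c : code.toCharArray()) {
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("not an ASCII letter: " + c);
            }
            flag.appendCodePoint(REGIONAL_INDICATOR_A + (c - 'A'));
        }
        return flag.toString();
    }
}
```

The sketch does not check that the code is an assigned region, so it will produce a regional indicator pair for any two letters, whether or not a flag exists for them.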