fix: normalize slugs to NFC and remove emoji variation selector #2597
+4
−3
Summary
Background
While generating slugs, we noticed two issues affecting consistency:
Invisible FE0F (Variation Selector-16)
Certain symbols can appear either as text or as emoji, depending on whether they are followed by U+FE0F:
  ✏ (U+270F)
  ✏️ (U+270F U+FE0F)
If FE0F is not explicitly removed, the slug may contain invisible characters, leading to inconsistent URLs.

NFKD decomposition side effects
normalize("NFKD") decomposes precomposed characters like é (U+00E9) into e + U+0301.

Changes
Remove U+FE0F explicitly
Ensures emoji and text variants of the same symbol produce identical slugs.
Example:
  "✏ note" → "note"
  "✏️ note" → "note"
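
To make the FE0F problem concrete, here is a minimal sketch. `stripVariationSelector16` is a hypothetical helper name used only for illustration, not the function touched by this PR. It only removes U+FE0F and leaves the pencil itself in place; dropping the symbol is assumed to happen elsewhere in the slug pipeline.

```javascript
// Hypothetical helper (not the PR's actual code): removes Variation Selector-16.
function stripVariationSelector16(input) {
  return input.replace(/\uFE0F/g, "");
}

const textPencil = "\u270F note";        // text presentation: U+270F
const emojiPencil = "\u270F\uFE0F note"; // emoji presentation: U+270F U+FE0F

// The two strings render almost identically but are not equal.
console.log(textPencil === emojiPencil); // false

// After stripping U+FE0F they compare equal.
console.log(
  stripVariationSelector16(textPencil) === stripVariationSelector16(emojiPencil)
); // true
```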
Switch normalization from NFKD → NFC
Precomposed characters (e.g. é) remain stable instead of being split into combining sequences.

Examples
| Input | Before | After |
| --- | --- | --- |
| café | café (e + ◌́) | café (U+00E9) |
| ✏ note | note | note |
| ✏️ note (with FE0F) | note (contained FE0F) | note |
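
The rows above can be reproduced with a rough sketch along these lines. `slugifySketch` and its separator rule are assumptions made for illustration; the project's real slugify function may handle separators, case, and punctuation differently. Combining marks are deliberately kept by the separator rule here, which is why U+FE0F (itself a mark) has to be removed explicitly.

```javascript
// Illustrative sketch only; not the implementation changed in this PR.
function slugifySketch(input) {
  return input
    // NFC keeps precomposed characters such as é (U+00E9) as one code point.
    .normalize("NFC")
    // Drop the invisible Variation Selector-16.
    .replace(/\uFE0F/g, "")
    // Assumption: runs of anything other than letters, digits, or marks
    // collapse into a single "-" separator.
    .replace(/[^\p{L}\p{N}\p{M}]+/gu, "-")
    // Trim leading and trailing separators.
    .replace(/^-+|-+$/g, "")
    .toLowerCase();
}

// Decomposed input (e + U+0301) is recomposed, so the slug has 4 code units.
console.log(slugifySketch("cafe\u0301").length); // 4

// Text and emoji variants of the pencil yield the same slug.
console.log(slugifySketch("\u270F note"));       // note
console.log(slugifySketch("\u270F\uFE0F note")); // note
```

For comparison, `"caf\u00E9".normalize("NFKD")` has length 5, because the é is split into `e` followed by U+0301.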
Related issue, if any:
What kind of change does this PR introduce?
For any code change,
Does this PR introduce a breaking change?
Tested in the following browsers: