
build(deps): bump unicode-width from =0.1.12 to 0.2.2 #15231

Open
RoloEdits wants to merge 1 commit into helix-editor:master from RoloEdits:uniwidth


@RoloEdits
Contributor

@RoloEdits RoloEdits commented Feb 2, 2026

This is (hopefully) a corrected version of #14643, which was reverted in #15150.

What is different this time is that we upgrade to 0.2.2, which includes the newline width = 1 change. Although 0.1.14 does not have that change, it has a larger one that made all control characters have a width of 1, and that is what led to the previous PR being reverted. It seems that \t was the culprit. Since this needed special handling anyway, I decided to handle both it and the newline change, as the "fix" is the same for both.

I started by consolidating the raw unicode-width calls and wrapping them in a function that would hold our fixes. With that abstraction in place, I then propagated the new width function wherever it was needed.

I say "fix" because it's not really that the new widths are wrong; terminals just expect them to be 0. This could also be fixed elsewhere, for example in the rendering, by handling the control characters there instead. But these changes piggyback off of what already existed in the most straightforward way, so this approach was chosen.

The main change is to count the control characters that now have a width of 1 and then subtract that count from the width that unicode-width gives us. There is some special handling that \r\n has in unicode-width, where the sequence itself has a width of 1, so this required some extra documentation at the call site explaining how matching on just \n can work here.
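As a rough illustration of the count-and-subtract idea, here is a minimal sketch. It is not the code in this PR: `raw_width_stand_in` is a hypothetical, ASCII-only stand-in for the unicode-width 0.2.x string width (assuming control characters report 1 and the `\r\n` sequence reports 1 as a whole), and `display_width` is a hypothetical name for the wrapper.

```rust
/// Hypothetical ASCII-only stand-in for the string width reported by
/// unicode-width 0.2.x. Assumption: every control character counts as 1,
/// and the sequence "\r\n" counts as 1 in total.
fn raw_width_stand_in(s: &str) -> usize {
    let mut width = 0;
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\r' && chars.peek() == Some(&'\n') {
            chars.next(); // consume the '\n' so the pair is counted once
        }
        width += 1;
    }
    width
}

/// Hypothetical wrapper: count the control characters we want at width 0
/// and subtract that count from the raw width.
fn display_width(s: &str) -> usize {
    // Only '\n' is matched, not '\r': since "\r\n" has a raw width of 1,
    // subtracting 1 for the '\n' already brings the pair down to 0.
    let zero_width = s.chars().filter(|c| matches!(c, '\t' | '\n')).count();
    raw_width_stand_in(s).saturating_sub(zero_width)
}
```

Under these assumptions, `"ab\n"` and `"ab\r\n"` both come out at 2, which is why matching only `\n` is enough for the `\r\n` case.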

There is a char::is_control, but I opted for a manual matches!, as this could be faster if the full set of options is realistically only a handful of known characters. Time will tell if this is enough.
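To make the trade-off concrete, the sketch below contrasts a manual `matches!` set with `char::is_control`. The set `'\t' | '\n'` and both function names are assumptions for illustration, not the exact set used in this PR.

```rust
/// Hypothetical manual set: only the control characters we expect to see.
fn is_known_control(c: char) -> bool {
    matches!(c, '\t' | '\n')
}

/// Lists the control characters in `s` that `char::is_control` reports
/// but the manual set above does not cover.
fn uncovered_controls(s: &str) -> Vec<char> {
    s.chars()
        .filter(|&c| c.is_control() && !is_known_control(c))
        .collect()
}
```

A helper like `uncovered_controls` could be handy while testing, to spot input that contains control characters outside the known set.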

Beyond the implementation, it would be really nice if this could be tested out for a bit before merging. This is a tricky area that could have some edge cases. For now I have confirmed that the previous gopls issue is resolved and that the rest of the UI seems to render fine.

Closes: #14642

@RoloEdits
Contributor Author

ref: ratatui/ratatui#2188

@RoloEdits
Contributor Author

So the rabbit hole was even deeper. I came across some odd behavior while working on #12369, and while integrating these changes together I found that Nerd Fonts glyphs seemed to only be reporting a width of 1, even for wide glyphs. It turns out (at least as far as I understand) that unicode-width follows the Unicode recommendation to treat unknown code points (those in the Private Use Area) as ambiguous, and also to treat these code points as narrow, i.e. 1 width.

This is a problem, as most Nerd Fonts glyphs we would use are wide (2 width). So some extra handling was added to make sure this is covered, but it's not foolproof; not all Nerd Fonts glyphs are wide. The current design is to assume that the vast majority of Nerd Fonts glyphs are wide, and then to exclude narrow cases as they are found. I'm not sure how robust this will end up being, but the more I learn about Unicode and terminals, the more I see how crazy it is that there aren't more issues.
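The "assume wide, exclude narrow cases" design can be sketched roughly like this. The names are hypothetical, and the two code points in `NARROW_EXCEPTIONS` are purely illustrative placeholders rather than a vetted list; only the Private Use Area ranges come from the Unicode standard.

```rust
/// True for code points in the three Unicode Private Use Area ranges.
fn is_private_use(c: char) -> bool {
    matches!(
        c,
        '\u{E000}'..='\u{F8FF}'
            | '\u{F0000}'..='\u{FFFFD}'
            | '\u{100000}'..='\u{10FFFD}'
    )
}

/// Illustrative placeholder list of glyphs treated as narrow (an assumption).
const NARROW_EXCEPTIONS: &[char] = &['\u{E0B0}', '\u{E0B2}'];

/// Hypothetical override: `Some(width)` for Private Use Area code points,
/// `None` to fall back to whatever unicode-width reports.
fn private_use_width(c: char) -> Option<usize> {
    if !is_private_use(c) {
        return None;
    }
    if NARROW_EXCEPTIONS.contains(&c) {
        Some(1)
    } else {
        Some(2)
    }
}
```

Returning `None` outside the Private Use Area keeps the override narrow in scope, so ordinary text still goes through the normal width path.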

I tried to document the reasoning, but let me know if more is needed or if anything should be clarified. Happy to take direction from anyone who knows better.

@RoloEdits
Contributor Author

RoloEdits commented Apr 14, 2026

Hmm, these changes are leading to artifacting:

[image: screenshot of the artifacting]

I saw this before, and I think it's related to the diff implementation in helix-tui not properly zeroing out the empty second cell of a wide glyph. It writes to one cell and skips the other, but when that spot later becomes occupied and the diff then encounters another narrow cell, a ghost is left behind.
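For reference, one common way a cell-buffer diff avoids this kind of ghost is to track "invalidated" cells: when a wide cell is replaced, the trailing cell it used to cover is redrawn even if its content did not change. The sketch below is a simplified illustration of that technique with hypothetical types, not helix-tui's actual code.

```rust
/// Hypothetical, simplified terminal cell: a symbol and its display width.
#[derive(Clone, Debug, PartialEq)]
struct Cell {
    symbol: &'static str,
    width: usize,
}

/// Returns the cells of `next` that must be redrawn, with their indices.
fn diff<'a>(prev: &[Cell], next: &'a [Cell]) -> Vec<(usize, &'a Cell)> {
    let mut updates = Vec::new();
    let mut invalidated = 0usize; // cells still covered by a wider old/new symbol
    let mut to_skip = 0usize; // cells hidden behind the wide symbol just drawn
    for (i, (cur, old)) in next.iter().zip(prev.iter()).enumerate() {
        if (cur != old || invalidated > 0) && to_skip == 0 {
            updates.push((i, cur));
        }
        // The trailing cell of a wide symbol is drawn as part of that symbol.
        to_skip = cur.width.saturating_sub(1);
        // Everything the old or new symbol covered may need a redraw.
        let affected = cur.width.max(old.width);
        invalidated = affected.max(invalidated).saturating_sub(1);
    }
    updates
}
```

In this sketch, replacing a 2-width glyph with a 1-width one marks the following cell as invalidated, so it is rewritten even though its stored content is unchanged.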

You can find the cheat sheet in text form here: https://github.com/8bitmcu/NerdFont-Cheat-Sheet

@RoloEdits
Contributor Author

Another issue is if users have a monospace font version. This would force the icons to fit within 1 cell at all times. So we cannot just go off of a visual check of the nerd fonts to see which should be which.

I think it might be fine to just let the width be 1, as the current handling should be fine in most situations. It can also be accounted for better if needed under the assumption they are all 1 width.

The other issue I found was that even things like are ambiguous, and using unicode-width with its width_cjk function would now report this as wide, which is not what we want.

@RoloEdits RoloEdits force-pushed the uniwidth branch 4 times, most recently from cd75a61 to dedb590 on April 15, 2026 21:04
@RoloEdits
Contributor Author

Some more context: ryanoasis/nerd-fonts#1103



Development

Successfully merging this pull request may close these issues.

Resync with latest unicode-width versions.

2 participants