feat: switch from iconv-lite to exodus/bytes #24

ChALkeR · 2025-12-16T20:27:47Z

An experiment

Fixes: #22
Fixes: #23

Fixes BOM handling for utf-8
Fixes replacement support for utf-16
Fixes utf-8 mistakes on non-Node.js platforms
(there, iconv-lite's Buffer usage maps to https://npmjs.com/buffer, which has discrepancies)
Fixes windows-1252 missing chars (and other legacy single-byte)
Adds support for all missing encodings
Closes all other known incompatibilities with the encoding spec
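
For reference, the utf-8 BOM and malformed-input behaviour that the Encoding Standard expects can be observed with the platform's built-in TextDecoder. This is only a minimal sketch: it uses the global TextDecoder available in Node.js and browsers, and does not call this package or @exodus/bytes.

```javascript
// Spec behaviour for utf-8 decoding, shown with the built-in TextDecoder.
// Nothing here calls whatwg-encoding, iconv-lite, or @exodus/bytes.
const withBOM = new Uint8Array([0xef, 0xbb, 0xbf, 0x68, 0x69]); // BOM + "hi"

// By default the BOM is consumed and not included in the output.
const stripped = new TextDecoder("utf-8").decode(withBOM);

// With ignoreBOM: true the BOM is kept as U+FEFF.
const kept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(withBOM);

// Malformed bytes become U+FFFD unless fatal mode is requested.
const replaced = new TextDecoder("utf-8").decode(new Uint8Array([0x61, 0xff, 0x62]));

let fatalError = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array([0xff]));
} catch (err) {
  fatalError = err; // the Encoding Standard requires a TypeError here
}

console.log(JSON.stringify(stripped)); // "hi"
console.log(kept.length); // 3
console.log(replaced === "a\uFFFDb"); // true
console.log(fatalError.name); // TypeError
```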

This is also faster and ~2x smaller than iconv-lite.

I kept the API compatible (including label names)

This does bring in some differences, though, which might be a blocker for now, or at least a reason for a semver-major:

The dependency is ESM. That raises the minimum Node.js version requirement to ^20.19.0 || >=22.13.0 (from 18, currently supported by this module). Node.js 18 is EOL per the Node.js release schedule.
This also adds support for the replacement encoding, as it is expected to be supported in the hook for standards.
Anything using this module to make a TextDecoder polyfill could get support for replacement unexpectedly.
Users should check that they are not relying on replacement being unsupported, and handle it manually if needed.
An alternative would be to specifically block it from being used via this API.
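
A caller-side guard against replacement could look roughly like the sketch below. The tiny label table and the labelToNameStub and createDecoder helpers are hypothetical and only stand in for the real lookup. The RangeError mirrors what the Encoding Standard requires of the TextDecoder constructor when a label resolves to replacement.

```javascript
// Hypothetical, heavily truncated label table; the real one is much larger.
const LABELS = new Map([
  ["utf-8", "UTF-8"],
  ["utf8", "UTF-8"],
  ["iso-2022-kr", "replacement"], // legacy label that the spec maps to replacement
  ["replacement", "replacement"],
]);

// Hypothetical stand-in for a label lookup: returns the encoding name or null.
function labelToNameStub(label) {
  return LABELS.get(String(label).trim().toLowerCase()) ?? null;
}

// Polyfill-style constructor helper that refuses unknown labels and replacement.
function createDecoder(label) {
  const name = labelToNameStub(label);
  if (name === null || name === "replacement") {
    throw new RangeError(`Unsupported encoding label: ${label}`);
  }
  return new TextDecoder(name);
}

console.log(createDecoder("UTF8").encoding); // utf-8
```

Code that builds a TextDecoder polyfill on top of the lookup would need a check of this kind wherever it turns a label into a decoder.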

I suggest waiting until a 1.0.0 release of @exodus/bytes before landing (hence a draft), but I wanted to file this as a place to get comments.

domenic · 2025-12-17T02:45:37Z

Thanks for working on this! However, since it'd be semver-major anyway, I think it's best to just deprecate this package, and use exodus/bytes directly in the jsdom ecosystem.

ChALkeR · 2025-12-17T03:03:58Z

@domenic Yeah, that's ok
This is also a demo of the API differences / completeness

ChALkeR · 2025-12-17T09:33:37Z

Update:

engines are now compatible with jsdom
normalizeEncoding no longer throws; it now returns null on an invalid encoding, like labelToName here
That removes the need for try-catch
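
The effect on calling code can be sketched as follows. Both lookup functions are hypothetical stand-ins written for this example (they are not the real normalizeEncoding or labelToName). The point is only the difference between a throwing and a null-returning lookup.

```javascript
const KNOWN = new Set(["utf-8", "utf-16le", "windows-1252"]); // illustrative subset

// Hypothetical throwing variant of a lookup.
function lookupOrThrow(label) {
  const key = String(label).toLowerCase();
  if (!KNOWN.has(key)) throw new RangeError(`Unknown encoding: ${label}`);
  return key;
}

// Hypothetical null-returning variant of the same lookup.
function lookupOrNull(label) {
  const key = String(label).toLowerCase();
  return KNOWN.has(key) ? key : null;
}

// With a throwing lookup, a labelToName-style wrapper needs try-catch.
function labelToNameViaThrow(label) {
  try {
    return lookupOrThrow(label);
  } catch {
    return null;
  }
}

// With a null-returning lookup, the wrapper is a plain pass-through.
const labelToNameViaNull = (label) => lookupOrNull(label);

console.log(labelToNameViaThrow("bogus"), labelToNameViaNull("bogus")); // null null
console.log(labelToNameViaThrow("UTF-8"), labelToNameViaNull("UTF-8")); // utf-8 utf-8
```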

ChALkeR · 2025-12-17T09:45:24Z

@domenic I checked usage in jsdom. It also happens through https://npmjs.com/html-encoding-sniffer, which also needs labelToName and expects cased names.

It would be easier to keep lowercase -> cased mapping in a single place, at least while migrating

And then perhaps switch to all-lowercase names directly?
I don't think that cased encoding identifiers are a part of the spec and don't want to maintain them in @exodus/bytes for simplicity and bundle size
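
Keeping the lowercase -> cased mapping in one place could be as small as a single lookup table. A minimal sketch, assuming lowercase identifiers as input: the cased names are the ones written in the Encoding Standard, only a few entries are shown, and toCasedName is a hypothetical helper name.

```javascript
// Lowercase identifier -> cased name as written in the Encoding Standard.
// Only a handful of entries are shown; a real table would cover every encoding.
const CASED_NAMES = new Map([
  ["utf-8", "UTF-8"],
  ["utf-16le", "UTF-16LE"],
  ["windows-1252", "windows-1252"],
  ["shift_jis", "Shift_JIS"],
  ["euc-jp", "EUC-JP"],
  ["big5", "Big5"],
  ["koi8-r", "KOI8-R"],
]);

// Hypothetical helper: returns the cased name, or null for unknown input.
function toCasedName(lowercaseName) {
  return CASED_NAMES.get(lowercaseName) ?? null;
}

console.log(toCasedName("shift_jis")); // Shift_JIS
console.log(toCasedName("unknown")); // null
```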

While semver-major, this is still a drop-in replacement
It will also be a semver-major for https://npmjs.com/html-encoding-sniffer, due to engines
It won't be a semver-major for jsdom

Also there might be other usage in the ecosystem that would benefit from a switch to a fixed implementation, and it would be easier to do that without having to switch APIs

domenic · 2025-12-17T11:44:02Z

> I don't think that cased encoding identifiers are a part of the spec

They are? The encoding name concept is defined here https://encoding.spec.whatwg.org/#name and https://encoding.spec.whatwg.org/#names-and-labels is pretty clear that the names are cased, e.g. when it says

> For each encoding, ASCII-lowercasing its name yields one of its labels.

But anyway, I'm happy to move to lowercased names throughout the jsdom ecosystem. Although I prefer jsdom's style as it matches the standard better, it doesn't affect any important user-facing behavior, and the benefit of removing an abstraction layer is high.

So, we can do a semver-major rev of html-encoding-sniffer to return the lowercased names instead of the canonical names, and to bump the engines requirements. The jsdom ecosystem treats semver major bumps as cheap so I'm not really worried about migration costs.
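
The relationship quoted from the spec can be checked against the built-in TextDecoder, whose encoding getter returns the encoding's name ASCII-lowercased. A small sketch: the cased names come from the Encoding Standard, and only utf-8 and utf-16le are used so that it does not depend on which legacy encodings a given runtime ships.

```javascript
// Cased names as written in the Encoding Standard.
const names = ["UTF-8", "UTF-16LE"];

// ASCII-lowercasing a name yields one of its labels, so the lowercased
// string is accepted by TextDecoder and reported back unchanged.
const roundTrips = names.map((name) => {
  const label = name.toLowerCase();
  return new TextDecoder(label).encoding === label;
});

// Other labels for the same encoding resolve to the same lowercased name.
const viaAlias = new TextDecoder("utf8").encoding;

console.log(roundTrips); // [ true, true ]
console.log(viaAlias); // utf-8
```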

ChALkeR · 2025-12-17T16:16:50Z

> For each encoding, ASCII-lowercasing its name yields one of its labels.

Hm, true, that implies that name is a string!

Other than that note, those are not exposed anywhere though, and could be treated as enums.
The table doesn't list them as strings but as identifiers.

Also:

> If these protocols and formats need to expose the encoding’s name or label, they must expose it as "utf-8".

ChALkeR · 2025-12-18T15:30:32Z

@domenic I'm adding some more tests to cover all known browser discrepancies (some of them are already fixed in Chrome and WebKit) and then planning to release a v1.0.0 of @exodus/bytes, after which breaking API changes (e.g. in exported helper methods) would be semver-major

Are there any changes you want me to land before then?
The helper methods usage / compat is demonstrated in this PR

domenic · 2025-12-19T00:59:06Z

Overall it seems great! I guess maybe adding some documentation for those exports would be helpful, especially around your custom concept of "canonicalized encoding label" that your library is based around (i.e., what it uses instead of the spec's encoding names). But that's not a breaking change. I'm excited to use this to fix such long-standing issues in the jsdom ecosystem!

ChALkeR · 2025-12-20T22:24:19Z

@domenic I just published v1.0.0

Added docs on hooks: https://github.com/exodusoss/bytes#exodusbytesencodingjs
Also added more tests and fixed an instance of Error -> TypeError.
Otherwise, no significant changes.

domenic · 2025-12-28T03:53:53Z

Closing all issues and PRs as this package is now deprecated.

ChALkeR closed this Dec 17, 2025

ChALkeR force-pushed the chalker/exodus-bytes branch from ea2a8cb to 00690a4 on December 17, 2025 09:31

ChALkeR reopened this Dec 17, 2025

feat: switch from iconv-lite to exodus/bytes

15cd210

ChALkeR force-pushed the chalker/exodus-bytes branch from 00690a4 to 15cd210 on December 17, 2025 09:43

This was referenced Dec 21, 2025

Switch to lowercase and exodus/bytes jsdom/html-encoding-sniffer#18

Merged

Switch from iconv-lite to exodus/bytes for decoding jsdom/jsdom#4004

Merged

domenic closed this Dec 28, 2025
