Character Set Decoding/Encoding Metabug #219

@wingman-jr-addon

This is a challenging issue. In short, Firefox's web filtering API has no built-in helpers that report the input encoding or suggest an output encoding. So, for example, the addon has to do roughly what the browser itself does to detect that the inbound bytes are Windows-1252, then decode and re-encode them in a sensible way. The actual algorithm is described here: https://html.spec.whatwg.org/multipage/parsing.html#the-input-byte-stream It is complicated, and not something you'd generally expect to deal with unless you're implementing a browser or similarly specialized software yourself. (Also of interest: https://encoding.spec.whatwg.org/#utf-8-decoder)
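
To give a feel for what "doing roughly the same thing the browser does" involves, here is a minimal sketch of the first few steps of encoding sniffing: a byte order mark wins, then the `charset` parameter of the `Content-Type` header, then a fallback. The function names are hypothetical and this is not Wingman Jr.'s actual code. The `windows-1252` fallback is an assumption (browsers pick a locale-dependent default), and the `<meta>` prescan and encoding label normalization from the spec are left out entirely.

```typescript
// Simplified sketch of encoding sniffing; all names here are hypothetical.

/** Return the encoding implied by a byte order mark, or null if there is none. */
function sniffBom(bytes: Uint8Array): string | null {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "utf-16be";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "utf-16le";
  }
  return null;
}

/** Extract the charset parameter from a Content-Type header value, if present. */
function charsetFromContentType(header: string | null): string | null {
  if (header === null) {
    return null;
  }
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(header);
  return match && match[1] ? match[1].toLowerCase() : null;
}

/** Guess the encoding: BOM first, then the header, then an assumed fallback. */
function guessEncoding(firstBytes: Uint8Array, contentType: string | null): string {
  return sniffBom(firstBytes) ?? charsetFromContentType(contentType) ?? "windows-1252";
}
```

The returned string is only a raw label. A real implementation would still need to map labels such as `iso-8859-1` to the encoding the spec assigns them, and to inspect the document body when neither a BOM nor a header is available.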

The design of Wingman Jr. has evolved significantly over many steps, from a naive initial state with no knowledge of encodings to the more recent approach of checking the raw bytes against an "is UTF-8 likely" pattern. Additionally, many encodings have been bundled in to work around the limited TextEncoder API, which only encodes to UTF-8.
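
As an illustration of what an "is UTF-8 likely" byte check can look like, the sketch below tests whether a chunk of bytes is well-formed UTF-8. This is an assumption about the general technique, not the addon's actual implementation, and `isLikelyUtf8` is a hypothetical name. It relies on the fact that text in a single-byte encoding such as Windows-1252 with accented characters almost never forms valid UTF-8 sequences. A multi-byte sequence cut off at the end of the chunk is tolerated, since filtered responses arrive in pieces.

```typescript
/**
 * Heuristic sketch: returns true if `bytes` contains no invalid UTF-8.
 * Pure ASCII also returns true, since it decodes identically either way.
 */
function isLikelyUtf8(bytes: Uint8Array): boolean {
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i];
    if (lead < 0x80) {
      i += 1; // ASCII byte
      continue;
    }
    let trailing: number; // number of continuation bytes expected
    let lo = 0x80; // allowed range for the FIRST continuation byte
    let hi = 0xbf;
    if (lead >= 0xc2 && lead <= 0xdf) {
      trailing = 1;
    } else if (lead >= 0xe0 && lead <= 0xef) {
      trailing = 2;
      if (lead === 0xe0) lo = 0xa0; // reject overlong forms
      if (lead === 0xed) hi = 0x9f; // reject UTF-16 surrogates
    } else if (lead >= 0xf0 && lead <= 0xf4) {
      trailing = 3;
      if (lead === 0xf0) lo = 0x90; // reject overlong forms
      if (lead === 0xf4) hi = 0x8f; // reject code points above U+10FFFF
    } else {
      return false; // stray continuation byte or invalid lead byte
    }
    for (let k = 1; k <= trailing; k++) {
      if (i + k >= bytes.length) {
        return true; // sequence truncated at the end of the chunk: tolerated
      }
      const c = bytes[i + k];
      const min = k === 1 ? lo : 0x80;
      const max = k === 1 ? hi : 0xbf;
      if (c < min || c > max) {
        return false;
      }
    }
    i += trailing + 1;
  }
  return true;
}
```

For example, the two bytes `C3 BC` ("ü" in UTF-8) pass the check, while the single byte `FC` ("ü" in Windows-1252) fails it. A check like this can only rule UTF-8 out. It says nothing about which legacy encoding the bytes are in, so it has to be combined with other signals such as headers or `<meta>` tags.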

Related Issues:
#70
#182
#186
#194
#199
#201
#206
#217

Test pages:
https://www.w3.org/2006/11/mwbp-tests/index.xhtml (Test 4 fails but the others pass; vanilla Firefox and Chrome fail it too)

https://www.sem-deutschland.de/blog/typen-klassen-attribute/
https://www.finanztip.de/kaufrecht/lieferverzug-schadensersatz/
https://www.diskpart.com/de/help/cmd.html

https://www.deepl.com/de/translator
https://www.fakt-software.com/index_de.html
https://aikasacolle.itch.io/mizuchi
https://abbyhoward.itch.io/scarlet-hollow
https://www.sparen-wie-schwaben.de/
https://www.windows-faq.de/
https://karrierebibel.de/
https://www.nokia.com/phones/de_at/support/api/pdf/nokia-5310-user-guide
https://winfuture.de/news,123262.html
https://uniconverter.wondershare.de/ogg/aac-vs-ogg.html
https://buerohaus-ahner.bueroshops.de/artikeldetails/standard/SCA226002/abfallbehaelter-metall-20-liter-wei-wandmontage-moeglich.html
https://rutracker.org
