HTML entities are incorrectly encoded #89

@clauswilke

I believe all the HTML entities defined here: https://github.com/r-lib/marquee/blob/main/src/named_entities.h are incorrect. They all have an extra 0 in the hex representation.

Examples:

&nbsp; defined here:

{"&nbsp;", "\u000A0"},

It should be \u00A0 but it is \u000A0. Because \u takes four hex digits, \u000A0 is read as \u000A (a line feed) followed by a literal 0. We can see that marquee returns the latter (with the extra zero).

cat(marquee::marquee_parse("-&nbsp;-")$text[2]) # marquee result
#> -
#> 0-
cat("-\u000A0-") # same code printed directly
#> -
#> 0-
cat("-\u00A0-") # correct code for non-breaking space
#> - -

Another example, &gt; defined here:

{"&gt;", "\u0003E"},

cat(marquee::marquee_parse("-&gt;-")$text[2]) # marquee result
#> -�E-
cat("-\u0003E-") # same code printed directly
#> -�E-
cat("-\u003E-") # correct code for > sign
#> ->-

It looks to me like the extra 0 is there consistently throughout the entire lookup table, but I haven't checked systematically. Happy to prepare a PR that removes these zeros, as my first contribution to marquee.
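
For the systematic check, something along these lines could work as a starting point. It is only a sketch: `has_extra_zero` and `strip_extra_zero` are hypothetical helpers, not part of marquee, and the pattern assumes every faulty escape is `\u`, then a zero, then exactly four more hex digits (as in the two examples above). Code points above U+FFFF, or a correct escape that happens to be followed by a literal hex digit, would not fit this assumption and would need a manual look.

```cpp
#include <regex>
#include <string>

// Assumed shape of a faulty escape: "\u" + "0" + exactly four hex digits,
// e.g. \u000A0 where \u00A0 was intended. The negative lookahead keeps the
// pattern from matching inside a longer run of hex digits.
static const std::regex kFaultyEscape(
    R"(\\u0([0-9A-Fa-f]{4})(?![0-9A-Fa-f]))");

// Returns true if the line contains at least one escape of the assumed
// faulty shape.
bool has_extra_zero(const std::string& line) {
    return std::regex_search(line, kFaultyEscape);
}

// Returns the line with the leading zero dropped from every escape of the
// assumed faulty shape; all other text is left untouched.
std::string strip_extra_zero(const std::string& line) {
    return std::regex_replace(line, kFaultyEscape, "\\u$1");
}
```

Running each line of the header through `has_extra_zero` would show whether the extra zero really is present throughout, and `strip_extra_zero` would produce the corrected line, so the result can be reviewed as an ordinary diff before it goes into a PR.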
