Use C.UTF-8, not en_US.UTF-8 for regression testing #172

@paravoid

The new code in 2.0.3 that makes the test suite exit with an error when a test fails has exposed a long-standing issue in Debian's regression testing (that's good news!). Specifically, tests that use wide characters (like hrulers in footnotes) are failing, and have always been failing, in a pristine build environment.

The root cause is that the upstream Makefile sets REGRESS_ENV = LC_ALL=en_US.UTF-8. However, en_US.UTF-8 is not present in a pristine minimal chroot environment - it is only part of the "locales-all" package, or generated on the fly for en_US users.

Debian has supported "C.UTF-8" as a locale for purposes like the ones you're using this for, and I can confirm that the regression testing works if I set LC_ALL to C.UTF-8.

C.UTF-8 was adopted sporadically across multiple other Linux distributions, but became standard with glibc 2.35, released almost 4 years ago. Other systems have supported it as well; I believe this includes both OpenBSD and FreeBSD. A quick web search suggests that it may not be supported on macOS, but I don't have any systems to test this on.

It'd be great if the Makefile were to switch from en_US.UTF-8 to C.UTF-8. If there are compatibility reasons to avoid this, then could we make it possible to override the default from the environment? Right now "LC_ALL=C.UTF-8 make regress" fails, because the Makefile always overrides the value.
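
To illustrate the override idea, here is a minimal sketch of what the assignment could look like. It assumes the regress target prefixes its test commands with $(REGRESS_ENV); I haven't checked how the variable is actually used beyond the assignment quoted above, so treat this as a suggestion rather than a tested patch.

```make
# Sketch only: default to C.UTF-8, but keep any REGRESS_ENV value that
# is already set in the caller's environment. "?=" assigns only when the
# variable is not yet defined, and is understood by GNU and BSD make.
REGRESS_ENV ?= LC_ALL=C.UTF-8
```

With something like this, a system without C.UTF-8 could still run `REGRESS_ENV="LC_ALL=en_US.UTF-8" make regress`, while a plain `make regress` would pick up C.UTF-8.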
