Use C.UTF-8, not en_US.UTF-8 for regression testing #172

The new code in 2.0.3 that makes the test suite exit with an error when a test fails has exposed a long-standing issue in Debian's regression testing (that's good news!). Specifically, tests that use wide characters (like hrulers in footnotes) fail, and have always failed, in a pristine build environment.

The root cause is that the upstream Makefile sets `REGRESS_ENV     = LC_ALL=en_US.UTF-8`. However, `en_US.UTF-8` is not present in a pristine minimal chroot environment - it is only part of the "locales-all" package, or generated on the fly for en_US users.
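To see which of the two locales a given build environment actually provides, something like the following sketch can help. The `has_locale` helper is a name made up for this example. It reads a locale list on stdin and compares names loosely, on the assumption that `locale -a` may print `en_US.utf8` rather than `en_US.UTF-8` (as glibc typically does).

```shell
# has_locale NAME: succeed if NAME appears in the locale list on stdin.
# Hypothetical helper for illustration. Names are compared
# case-insensitively with hyphens removed, so "C.UTF-8" matches "C.utf8".
has_locale() {
    want=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF "$want"
}

# Report which candidate locales this system provides.
for loc in en_US.UTF-8 C.UTF-8; do
    if locale -a 2>/dev/null | has_locale "$loc"; then
        echo "$loc: available"
    else
        echo "$loc: missing"
    fi
done
```

In a minimal chroot without "locales-all", I would expect this to report `en_US.UTF-8` as missing.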

Debian supports `C.UTF-8` as a locale for exactly this kind of purpose, and I can confirm that the regression tests pass if I set `LC_ALL` to `C.UTF-8`.

C.UTF-8 was adopted sporadically by several other Linux distributions, and became standard with [glibc 2.35](https://mail.gnu.org/archive/html/info-gnu/2022-02/msg00002.html), released in February 2022. Other systems support it as well; I believe this includes both OpenBSD and [FreeBSD](https://github.com/freebsd/freebsd-src/commit/09ef995baf45333d45ab214daf8c03e1a25f8fcc). A quick web search suggests that it may not be supported on macOS, but I don't have any systems to test this on.

It'd be great if the Makefile were to switch from en_US.UTF-8 to C.UTF-8. If there are compatibility reasons to avoid this, then could we make it possible to override the default from the environment? Right now "LC_ALL=C.UTF-8 make regress" fails, because the Makefile always overrides the value.
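For reference, here is a small sketch of the behaviour described above, using a throwaway Makefile rather than the real one. The recipe only echoes `REGRESS_ENV` instead of running tests, which is an assumption about how the variable is consumed. It also shows that a variable assignment passed as a `make` argument overrides the Makefile's own assignment, which may serve as a workaround in the meantime.

```shell
# Build a throwaway Makefile that mimics the upstream assignment.
# The recipe just echoes the variable (illustrative stand-in only).
demo_dir=$(mktemp -d)
printf 'REGRESS_ENV = LC_ALL=en_US.UTF-8\nregress:\n\t@echo $(REGRESS_ENV)\n' \
    > "$demo_dir/Makefile"

# Setting LC_ALL in the environment does not change REGRESS_ENV ...
from_env=$(cd "$demo_dir" && LC_ALL=C.UTF-8 make -s regress)
# ... but an assignment given as a make argument overrides the Makefile.
from_arg=$(cd "$demo_dir" && make -s regress REGRESS_ENV=LC_ALL=C.UTF-8)

echo "environment: $from_env"
echo "argument:    $from_arg"
rm -rf "$demo_dir"
```

The first invocation should still report `LC_ALL=en_US.UTF-8`, while the second should report `LC_ALL=C.UTF-8`. Whether passing `REGRESS_ENV` this way is enough for the real test suite depends on how the upstream Makefile uses the variable, which I have not verified.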
