Linux available locales affecting jdk-23+ reproducibility across OS distributions #4284

@andrew-m-leonard

jdk-23 changed its default locale to C.UTF-8, with en_US.UTF-8 as an acceptable fallback if C.UTF-8 is not available.

The problem with en_US.UTF-8 is that it is not language-neutral: properties such as collation sequence differ across OS distributions. For example, sorting a resource property file containing Japanese characters gives a different order under en_US.UTF-8 on Ubuntu than on CentOS. This means that if the original build is produced on CentOS using en_US.UTF-8, it is only 100% reproducible on CentOS/RHEL.
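One way to check whether two build nodes agree on collation is to sort the same strings under a given locale on each node and compare a digest of the result. The following is a minimal sketch, not part of the Temurin build scripts. It assumes the difference comes from the C library's locale data, which Python's `locale.strxfrm` exposes. The function name `collation_fingerprint` and the sample keys are hypothetical.

```python
import hashlib
import locale

# Hypothetical sample keys; a real check would use the keys of an
# actual resource property file from the build.
SAMPLE_KEYS = ["zebra", "Apple", "apple", "日本語", "にほんご", "ニホンゴ", "_under", "10", "9"]


def collation_fingerprint(locale_name, keys=SAMPLE_KEYS):
    """Sort keys using locale_name's collation rules.

    Returns (sorted_keys, sha256_hex), or None if the locale
    cannot be selected on this machine.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_COLLATE, locale_name)
        except locale.Error:
            return None
        # strxfrm maps each string to a key that compares in collation order
        ordered = sorted(keys, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore the caller's locale
    digest = hashlib.sha256("\n".join(ordered).encode("utf-8")).hexdigest()
    return ordered, digest


if __name__ == "__main__":
    for name in ("C.UTF-8", "en_US.UTF-8", "C"):
        result = collation_fingerprint(name)
        if result is None:
            print(f"{name}: not available")
        else:
            print(f"{name}: {result[1]}")
```

Running the script on each distribution and comparing the digest printed for the same locale name shows whether the two nodes sort these strings identically. Matching digests for the sample keys do not prove that every string collates the same way.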

Ideally, C.UTF-8 would be available on all our Linux nodes, so that we can use the preferred jdk-23+ locale.
The issue is that C.UTF-8 is not available on CentOS 7.
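To see which locale a given node would end up with, the preference order described in this issue can be probed directly. This is a rough sketch under stated assumptions: it mirrors the order C.UTF-8, then en_US.UTF-8, then plain C, and it is not the actual jdk configure logic. The helper names `is_available` and `pick_build_locale` are hypothetical.

```python
import locale

# Preference order as described in this issue. "C" is included as the
# last resort, matching what the CentOS 7 aarch64 image currently falls back to.
PREFERRED = ("C.UTF-8", "en_US.UTF-8", "C")


def is_available(name):
    """Return True if the C library accepts this locale name."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the caller's locale


def pick_build_locale(candidates=PREFERRED, available=is_available):
    """Return the first candidate locale that is available, or None."""
    for name in candidates:
        if available(name):
            return name
    return None


if __name__ == "__main__":
    print(pick_build_locale())
```

The `available` parameter exists so the selection order can be exercised without depending on which locales the current machine has installed.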

Ways forward for 100% reproducible builds:

  • Restrict jdk-23+ reproducible builds to CentOS and RHEL, so that they match the en_US.UTF-8 locale and collation of the original CentOS 7 build. (Note: the CentOS 7 aarch64 image does not currently have en_US.UTF-8, so it is actually defaulting to C, with a configure warning.)
  • Move Temurin jdk-23+ builds to CentOS 8 or 9, so that we can provide the C.UTF-8 locale on all Linux architectures.
