containertool: Use same gzip headers on Linux and macOS #37
Motivation
Packaging the same binary using the same version of containertool produces different application image layers on macOS and Linux:
The application layer hashes are different, even though they contain the same binary. The image configuration metadata blob hashes also differ, but they contain timestamps so this will continue to happen even after this PR is merged. A future change could make these timestamps default to the epoch, allowing identical metadata blobs to be created on Linux and macOS as well.
The image layer is a gzipped TAR archive containing the executable. Saving the intermediate steps shows that the TAR archives are identical and the gzipped streams are different, but only by one byte:
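A comparison of this kind can be sketched as follows. `differing_offsets` is a hypothetical helper, not part of containertool, and the two sample headers are built by hand for illustration using the OS codes 0x03 and 0x13 discussed in this description; they are not captured from real image layers.

```python
def differing_offsets(a: bytes, b: bytes) -> list[int]:
    """Return the zero-based offsets at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Hand-built 10-byte gzip headers: magic, CM=8, FLG=0, MTIME=0, XFL=0, OS.
# Only the final (OS) byte differs between the two samples.
header_unix = bytes([0x1F, 0x8B, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0x03])
header_darwin = bytes([0x1F, 0x8B, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0x13])

print(differing_offsets(header_unix, header_darwin))  # [9]
```

Offset 9 is zero-based, so it corresponds to the 10th byte of the stream.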
The difference is in the 10th byte of the gzip header: the OS field. RFC 1952 defines a list of known operating systems:
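As a rough illustration, the sketch below reads the OS field from a gzip stream and maps it to the names given in RFC 1952. The function names `gzip_os_field` and `describe_os` are hypothetical and are not part of containertool or Zlib.

```python
# OS codes as listed in RFC 1952 (the OS field is the header byte at offset 9).
RFC1952_OS_CODES = {
    0: "FAT filesystem (MS-DOS, OS/2, NT/Win32)",
    1: "Amiga",
    2: "VMS (or OpenVMS)",
    3: "Unix",
    4: "VM/CMS",
    5: "Atari TOS",
    6: "HPFS filesystem (OS/2, NT)",
    7: "Macintosh",
    8: "Z-System",
    9: "CP/M",
    10: "TOPS-20",
    11: "NTFS filesystem (NT)",
    12: "QDOS",
    13: "Acorn RISCOS",
    255: "unknown",
}


def gzip_os_field(stream: bytes) -> int:
    """Return the OS byte of a gzip stream after checking the magic number."""
    if len(stream) < 10 or stream[0:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return stream[9]


def describe_os(code: int) -> str:
    """Map an OS code to its RFC 1952 name, if the RFC lists it."""
    return RFC1952_OS_CODES.get(code, "not listed in RFC 1952")
```

Codes added later by Zlib, such as the Darwin value mentioned in this description, fall through to the "not listed in RFC 1952" branch.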
0x03 is the OS code for Unix; however, the RFC was written in 1996, so Macintosh refers to the classic Mac OS. Zlib uses an updated operating system list (madler/zlib@ce12c5c) which defines 19 / 0x13 as the OS code for Darwin.
Interestingly, using gzip to compress a file directly produces identical results on macOS and Linux (-n is needed to prevent gzip from including the current timestamp on macOS):
Modifications
By default, Zlib uses the value of OS_CODE set at compile time. This commit uses deflateSetHeader() to override the default gzip header, forcing the OS code to be 0x03 (Unix) on both Linux and macOS.
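The effect of pinning the header can be sketched in Python. This is a stand-in for the technique, not the code in this commit: Python's zlib module does not expose deflateSetHeader(), so the sketch writes the 10-byte header by hand and appends a raw deflate stream and the gzip trailer. The function name `gzip_with_fixed_header` and the compression level are assumptions chosen for illustration.

```python
import struct
import zlib

GZIP_OS_UNIX = 0x03  # OS code for Unix in RFC 1952


def gzip_with_fixed_header(data: bytes, os_code: int = GZIP_OS_UNIX) -> bytes:
    """Build a gzip member whose header does not depend on the build platform."""
    # Header: ID1, ID2, CM=8 (deflate), FLG=0, MTIME=0, XFL=0, OS.
    header = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, 0, 0, 0, os_code)
    # wbits=-15 asks zlib for a raw deflate stream with no header or trailer.
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
    body = compressor.compress(data) + compressor.flush()
    # Trailer: CRC-32 of the input and its length modulo 2**32, little-endian.
    trailer = struct.pack("<II", zlib.crc32(data) & 0xFFFFFFFF, len(data) & 0xFFFFFFFF)
    return header + body + trailer


layer = gzip_with_fixed_header(b"example payload")
print(hex(layer[9]))  # 0x3
```

Because the timestamp and OS byte are fixed, the header bytes are the same wherever the function runs. The compressed body can still vary between Zlib versions.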
Result
After this change, image layers containing the same binary will use identical gzip headers and should have the same hash whether they
are built on Linux or macOS. It is still possible that different
versions of Zlib might produce different compressed data, causing
the overall hashes to change.
Test Plan
Tested manually on macOS and Linux, verifying that image layers containing identical binaries have identical hashes.
Added a test for containertool's gzip function.