Commit 23fcf8b
rscharfe authored and gitster committed
archive-tar: use OS_CODE 3 (Unix) for internal gzip

gzip(1) encodes the OS it runs on in the 10th byte of its output. It
uses the following OS_CODE values according to its tailor.h [1]:

         0 - MS-DOS
         3 - UNIX
         5 - Atari ST
         6 - OS/2
        10 - TOPS-20
        11 - Windows NT

The gzip.exe that comes with Git for Windows uses OS_CODE 3 for some
reason, so this value is used on practically all supported platforms
when generating tgz archives using gzip(1).

Zlib uses a bigger set of values according to its zutil.h [2], aligned
with section 4.4.2 of the ZIP specification, APPNOTE.txt [3]:

         0 - MS-DOS
         1 - Amiga
         3 - UNIX
         4 - VM/CMS
         5 - Atari ST
         6 - OS/2
         7 - Macintosh
         8 - Z-System
        10 - Windows NT
        11 - MVS (OS/390 - Z/OS)
        13 - Acorn Risc
        16 - BeOS
        18 - OS/400
        19 - OS X (Darwin)

Thus the internal gzip implementation in archive-tar.c sets different
OS_CODE header values on the major platforms Windows and macOS. Git for
Windows uses its own zlib-based variant since v2.20.1 by default and
thus embeds OS_CODE 10 in tgz archives.

The tar archive for a commit is generated consistently on all systems
(by the same Git version). The OS_CODE in the gzip header does not
influence extraction. Avoid leaking OS information and make tgz
archives consistent and reproducible (with the same Git and libz
versions) by using OS_CODE 3 everywhere.

At least on macOS 12.4 this produces the same output as gzip(1) for the
examples I tried:

   # before
   $ git -c tar.tgz.command='git archive gzip' archive --format=tgz v2.36.0 | shasum
   3abbffb40b7c63cf9b7d91afc682f11682f80759  -

   # with this patch
   $ git -c tar.tgz.command='git archive gzip' archive --format=tgz v2.36.0 | shasum
   dc6dc6ba9636d522799085d0d77ab6a110bcc141  -

   $ git archive --format=tar v2.36.0 | gzip -cn | shasum
   dc6dc6ba9636d522799085d0d77ab6a110bcc141  -

[1] https://git.savannah.gnu.org/cgit/gzip.git/tree/tailor.h
[2] https://github.com/madler/zlib/blob/master/zutil.h
[3] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

Helped-by: Johannes Schindelin <[email protected]>
Signed-off-by: René Scharfe <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 76d7602 commit 23fcf8b

1 file changed: +7 -0 lines changed
archive-tar.c

@@ -463,6 +463,9 @@ static const char internal_gzip_command[] = "git archive gzip";
 static int write_tar_filter_archive(const struct archiver *ar,
 				    struct archiver_args *args)
 {
+#if ZLIB_VERNUM >= 0x1221
+	struct gz_header_s gzhead = { .os = 3 }; /* Unix, for reproducibility */
+#endif
 	struct strbuf cmd = STRBUF_INIT;
 	struct child_process filter = CHILD_PROCESS_INIT;
 	int r;
@@ -473,6 +476,10 @@ static int write_tar_filter_archive(const struct archiver *ar,
 	if (!strcmp(ar->filter_command, internal_gzip_command)) {
 		write_block = tgz_write_block;
 		git_deflate_init_gzip(&gzstream, args->compression_level);
+#if ZLIB_VERNUM >= 0x1221
+		if (deflateSetHeader(&gzstream.z, &gzhead) != Z_OK)
+			BUG("deflateSetHeader() called too late");
+#endif
 		gzstream.next_out = outbuf;
 		gzstream.avail_out = sizeof(outbuf);
