Commit dcd58c5

pythongh-125260: Change the default gzip.compress() mtime to 0 (python#125261)

This follows GNU gzip, which defaults to using 0 as the mtime for compressing stdin, where no file mtime is involved. This makes the output of gzip.compress() deterministic by default, greatly helping reproducible builds.

Co-authored-by: Adam Turner <[email protected]>
1 parent 9944ad3 commit dcd58c5
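
As a rough illustration of what "deterministic" means here, the sketch below pins mtime to 0 explicitly, so it should behave the same on interpreters that predate this commit. The payload and variable names are made up for the example; the only assumption about the format is that MTIME is the 4-byte little-endian field at offset 4 of the gzip header, as laid out in RFC 1952.

```python
import gzip
import struct

# Illustrative payload; any bytes object would do.
payload = b"reproducible builds need stable bytes\n" * 4

# Passing mtime=0 explicitly matches the new default and also works on
# versions where the default was still the current time.
first = gzip.compress(payload, mtime=0)
second = gzip.compress(payload, mtime=0)

# MTIME occupies bytes 4..7 of the gzip header, little-endian (RFC 1952).
(header_mtime,) = struct.unpack("<I", first[4:8])

print("identical output:", first == second)
print("header mtime:", header_mtime)
```

Build scripts that must run on both old and new Python versions can keep passing mtime=0 explicitly rather than relying on the default.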

4 files changed, +23 -5 lines changed

Doc/library/gzip.rst

Lines changed: 7 additions & 2 deletions
@@ -184,11 +184,12 @@ The module defines the following items:
       attribute instead.
 
 
-.. function:: compress(data, compresslevel=9, *, mtime=None)
+.. function:: compress(data, compresslevel=9, *, mtime=0)
 
    Compress the *data*, returning a :class:`bytes` object containing
    the compressed data. *compresslevel* and *mtime* have the same meaning as in
-   the :class:`GzipFile` constructor above.
+   the :class:`GzipFile` constructor above,
+   but *mtime* defaults to 0 for reproducible output.
 
    .. versionadded:: 3.2
    .. versionchanged:: 3.8
@@ -203,6 +204,10 @@ The module defines the following items:
    .. versionchanged:: 3.13
       The gzip header OS byte is guaranteed to be set to 255 when this function
       is used as was the case in 3.10 and earlier.
+   .. versionchanged:: 3.14
+      The *mtime* parameter now defaults to 0 for reproducible output.
+      For the previous behaviour of using the current time,
+      pass ``None`` to *mtime*.
 
 .. function:: decompress(data)

Lib/gzip.py

Lines changed: 3 additions & 3 deletions
@@ -580,12 +580,12 @@ def _rewind(self):
         self._new_member = True
 
 
-def compress(data, compresslevel=_COMPRESS_LEVEL_BEST, *, mtime=None):
+def compress(data, compresslevel=_COMPRESS_LEVEL_BEST, *, mtime=0):
     """Compress data in one shot and return the compressed string.
 
     compresslevel sets the compression level in range of 0-9.
-    mtime can be used to set the modification time. The modification time is
-    set to the current time by default.
+    mtime can be used to set the modification time.
+    The modification time is set to 0 by default, for reproducibility.
     """
     # Wbits=31 automatically includes a gzip header and trailer.
     gzip_data = zlib.compress(data, level=compresslevel, wbits=31)

Lib/test/test_gzip.py

Lines changed: 11 additions & 0 deletions
@@ -713,6 +713,17 @@ def test_compress_mtime(self):
                     f.read(1) # to set mtime attribute
                     self.assertEqual(f.mtime, mtime)
 
+    def test_compress_mtime_default(self):
+        # test for gh-125260
+        datac = gzip.compress(data1, mtime=0)
+        datac2 = gzip.compress(data1)
+        self.assertEqual(datac, datac2)
+        datac3 = gzip.compress(data1, mtime=None)
+        self.assertNotEqual(datac, datac3)
+        with gzip.GzipFile(fileobj=io.BytesIO(datac3), mode="rb") as f:
+            f.read(1) # to set mtime attribute
+            self.assertGreater(f.mtime, 1)
+
     def test_compress_correct_level(self):
         for mtime in (0, 42):
             with self.subTest(mtime=mtime):

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+The :func:`gzip.compress` *mtime* parameter now defaults to 0 for reproducible output.
+Patch by Bernhard M. Wiedemann and Adam Turner.
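
For callers who relied on the old timestamped output, the documented escape hatch is to pass None explicitly. The sketch below mirrors the approach of test_compress_mtime_default outside the test suite; read_gzip_mtime is a hypothetical helper written for this example and is not part of the gzip module.

```python
import gzip
import io
import struct


def read_gzip_mtime(blob: bytes) -> int:
    """Return the MTIME header field of a gzip member (hypothetical helper)."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob), mode="rb") as f:
        f.read(1)  # the header is parsed on the first read, which sets f.mtime
        return f.mtime


payload = b"example payload"

# mtime=None opts back in to the pre-3.14 default: stamp the current time.
stamped = gzip.compress(payload, mtime=None)
# mtime=0 is the new default, spelled out so older versions behave the same.
pinned = gzip.compress(payload, mtime=0)

# Cross-check the helper against the raw header bytes (offset 4, little-endian).
(raw_mtime,) = struct.unpack("<I", stamped[4:8])

print("pinned mtime:", read_gzip_mtime(pinned))
print("stamped mtime matches header:", read_gzip_mtime(stamped) == raw_mtime)
```

Spelling out the mtime argument in both directions keeps the behaviour independent of which default the running interpreter happens to use.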
