@@ -30,10 +30,11 @@ Motivation
==========

CPython has modules for several different compression formats, such as
- :mod:`zlib (DEFLATE) <zlib>`, :mod:`bzip2 <bz2>`, and :mod:`lzma <lzma>`,
- each widely used. Including popular compression algorithms matches Python's
- "batteries included" philosophy of incorporating widely useful standards and
- utilities. :mod:`!lzma` is the most recent such module, added in Python 3.3.
+ :mod:`zlib (DEFLATE) <zlib>`, :mod:`gzip <gzip>`, :mod:`bzip2 <bz2>`, and
+ :mod:`lzma <lzma>`, each widely used. Including popular compression algorithms
+ matches Python's "batteries included" philosophy of incorporating widely useful
+ standards and utilities. :mod:`!lzma` is the most recent such module, added in
+ Python 3.3.

Since then, Zstandard has become the modern *de facto* preferred compression
library for both high performance compression and decompression attaining high
@@ -216,9 +217,10 @@ used to build libraries CPython depends on for Windows.
Other compression modules
-------------------------

- New import names ``compression.lzma``, ``compression.bz2``, and
- ``compression.zlib`` will be introduced in Python 3.14 re-exporting the
- contents of the existing ``lzma``, ``bz2``, and ``zlib`` modules respectively.
+ New import names ``compression.lzma``, ``compression.bz2``,
+ ``compression.gzip``, and ``compression.zlib`` will be introduced in Python 3.14,
+ re-exporting the contents of the existing ``lzma``, ``bz2``, ``gzip``, and
+ ``zlib`` modules respectively.

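As a rough illustration of what the re-exported names mean for calling code,
the sketch below prefers the ``compression`` namespace and falls back to the
top-level module on interpreters that lack it. The helper
``load_compression_module`` is hypothetical and not part of this PEP; it
assumes only that ``compression.<name>`` re-exports the contents of ``<name>``
as described above.

```python
import importlib


def load_compression_module(name):
    """Hypothetical helper: prefer ``compression.<name>``, else ``<name>``."""
    try:
        # Assumed to exist on Python 3.14+ per this PEP, e.g. compression.zlib
        return importlib.import_module(f"compression.{name}")
    except ImportError:
        # Older interpreters only ship the top-level module, e.g. zlib
        return importlib.import_module(name)


zlib = load_compression_module("zlib")
payload = b"batteries included " * 8

# The re-export is expected to expose the same functions under either name.
restored = zlib.decompress(zlib.compress(payload))
```

Because the new names only re-export existing contents, code written this way
should behave the same regardless of which import succeeds.
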
The ``_compression`` module, given that it is marked private, will be
immediately renamed to ``compression._common.streams``. The new name was
@@ -289,17 +291,80 @@ decision is reached regarding the open issues.
Rejected Ideas
==============

- Name the module ``libzstd`` and do not make a new ``compression`` namespace
+ Name the module ``zstdlib`` and do not make a new ``compression`` namespace
---------------------------------------------------------------------------

One option instead of making a new ``compression`` namespace would be to find
- a different name, such as ``libzstd``, as the import name. However, the issue
- of existing import names is likely to persist for future compression formats
- added to the standard library. LZ4, a common high speed compression format,
- has `a package on PyPI <https://pypi.org/project/lz4/>`_, ``lz4``, with the
- import name ``lz4``. Instead of solving this issue for each compression format,
- it is better to solve it once and for all by using the already-claimed
- ``compression`` namespace.
+ a different name, such as ``zstdlib``, as the import name. Several other names,
+ such as ``zst``, ``libzstd``, and ``zstdcomp``, were proposed as well. In
+ discussion, the names were found to be either too easy to typo or unintuitive.
+ Furthermore, the issue of existing import names is likely to persist for future
+ compression formats added to the standard library. LZ4, a common high-speed
+ compression format, has `a package on PyPI <https://pypi.org/project/lz4/>`_,
+ ``lz4``, with the import name ``lz4``. Instead of solving this issue for each
+ compression format, it is better to solve it once and for all by using the
+ already-claimed ``compression`` namespace.
+
+ Introduce an experimental ``_zstd`` package in Python 3.14
+ ----------------------------------------------------------
+
+ Since this PEP was published close to the beta cutoff for new features for
+ Python 3.14, one proposal was to ship the package as a private module,
+ ``_zstd``, so that packaging tools could use it sooner without committing to a
+ final name. This would allow more time for discussion of the final module name
+ during the 3.15 development window. However, introducing a private module was
+ not popular. The expectations and contract for external usage of a private
+ module in the standard library are unclear.
+
+ Introduce a standard library namespace instead of ``compression``
+ -----------------------------------------------------------------
+
+ One alternative to a ``compression`` namespace would be to introduce a
+ ``std`` namespace for the entire standard library. However, this was seen as
+ too significant a change for 3.14, with no agreed-upon semantics, migration
+ path, or name for the package. Furthermore, a future PEP introducing a ``std``
+ namespace could always define that the ``compression`` sub-modules be flattened
+ into the ``std`` namespace.
+
+ Include ``zipfile`` and ``tarfile`` in ``compression``
+ ------------------------------------------------------
+
+ Compression is often used with archiving tools, so putting both :mod:`zipfile`
+ and :mod:`tarfile` under the ``compression`` namespace is appealing. However,
+ compression can be used beyond just archiving tools. For example, network
+ requests can be gzip compressed. Furthermore, formats like tar do not include
+ compression themselves, instead relying on external compression. Therefore,
+ this PEP does not propose moving :mod:`!zipfile` or :mod:`!tarfile` under
+ ``compression``.
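
The layering described above can be sketched with the standard library alone:
the tar archive is built uncompressed, and gzip is applied to the resulting
bytes as a separate step. The member name and payload are made up for
illustration.

```python
import gzip
import io
import tarfile

# Build an uncompressed tar archive in memory (mode "w" applies no compression).
data = b"hello from inside the archive\n"
raw = io.BytesIO()
with tarfile.open(fileobj=raw, mode="w") as archive:
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))

# Compression is a separate layer wrapped around the finished tar bytes.
compressed = gzip.compress(raw.getvalue())

# tarfile can peel off both layers in one call with the "r:gz" mode.
with tarfile.open(fileobj=io.BytesIO(compressed), mode="r:gz") as archive:
    restored = archive.extractfile("hello.txt").read()
```

The same tar bytes could be wrapped with a different compressor instead, which
is why archiving and compression are treated as separate concerns here.
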
+
+ Do not include ``gzip`` under ``compression``
+ ---------------------------------------------
+
+ The :rfc:`GZip format RFC <1952>` defines a format which can include multiple
+ blocks and metadata about its contents. In this way GZip is rather similar to
+ archive formats like ZIP and tar. Despite that, in usage GZip is often treated
+ as a compression format rather than an archive format. Looking at how different
+ languages classify GZip, the prevailing trend is to classify it as a
+ compression format and not an archiving format.
+
+ ==========  ========================  ==============================================================================
+ Language    Compression or Archive    Documentation Link
+ ==========  ========================  ==============================================================================
+ Golang      Compression               https://pkg.go.dev/compress/gzip
+ Ruby        Compression               https://docs.ruby-lang.org/en/master/Zlib/GzipFile.html
+ Rust        Compression               https://github.com/rust-lang/flate2-rs
+ Haskell     Compression               https://hackage.haskell.org/package/zlib
+ C#          Compression               https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.gzipstream
+ Java        Archive                   https://docs.oracle.com/javase/8/docs/api/java/util/zip/package-summary.html
+ NodeJS      Compression               https://nodejs.org/api/zlib.html
+ Web APIs    Compression               https://developer.mozilla.org/en-US/docs/Web/API/Compression_Streams_API
+ PHP         Compression               https://www.php.net/manual/en/function.gzcompress.php
+ Perl        Compression               https://perldoc.perl.org/IO::Compress::Gzip
+ ==========  ========================  ==============================================================================
+
+ In addition, the :mod:`!gzip` module in Python mostly focuses on single block
+ content and has an API similar to other compression modules, making it a good
+ fit for the ``compression`` namespace.
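
The API similarity mentioned above can be seen in a short sketch: the one-shot
``compress()`` and ``decompress()`` functions have the same shape in ``zlib``,
``gzip``, ``bz2``, and ``lzma``. The last lines illustrate, as an aside, that a
gzip stream may contain more than one member. The payloads are arbitrary
examples.

```python
import bz2
import gzip
import lzma
import zlib

payload = b"network response body " * 16

# Each module exposes the same one-shot compress()/decompress() pair.
round_trips = {
    module.__name__: module.decompress(module.compress(payload))
    for module in (zlib, gzip, bz2, lzma)
}

# A gzip stream may hold several members; gzip.decompress() joins their contents.
two_members = gzip.compress(b"first,") + gzip.compress(b"second")
joined = gzip.decompress(two_members)
```
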


Copyright