Commit 35c087e

Merge pull request #1281 from encukou/pep-721
PEP 721 — Using `tarfile.data_filter` for source distribution extraction
2 parents f858ac8 + 1b2b35e commit 35c087e

1 file changed

+80
-0
lines changed

source/specifications/source-distribution-format.rst

Lines changed: 80 additions & 0 deletions
@@ -67,3 +67,83 @@ whatever information they need in the sdist to build the project.
The tarball should use the modern POSIX.1-2001 pax tar format, which specifies
UTF-8 based file names. In particular, source distribution files must be readable
using the standard library tarfile module with the open flag 'r:gz'.
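
As a rough illustration of the readability requirement, the sketch below builds
a tiny pax-format archive in memory and reads it back with the ``'r:gz'`` open
flag. The project name ``example-1.0`` and both helper functions are
hypothetical and exist only for this example.

```python
import io
import tarfile


def build_example_sdist() -> bytes:
    """Build a tiny gzip-compressed pax tarball in memory (illustrative only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz", format=tarfile.PAX_FORMAT) as tar:
        payload = b"Metadata-Version: 2.2\nName: example\nVersion: 1.0\n"
        info = tarfile.TarInfo(name="example-1.0/PKG-INFO")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def list_sdist_members(data: bytes) -> list[str]:
    """Open the archive with the 'r:gz' flag and list the member names."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        return tar.getnames()


names = list_sdist_members(build_example_sdist())
```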


.. _sdist-archive-features:

Source distribution archive features
====================================

Because extracting tar files as-is is dangerous, and the results are
platform-specific, archive features of source distributions are limited.

Unpacking with the data filter
------------------------------

When extracting a source distribution, tools MUST either use
:py:func:`tarfile.data_filter` (e.g. :py:meth:`TarFile.extractall(..., filter='data') <tarfile.TarFile.extractall>`), OR
follow the *Unpacking without the data filter* section below.

As an exception, on Python interpreters without :py:func:`hasattr(tarfile, 'data_filter') <tarfile.data_filter>`
(:pep:`706`), tools that normally use that filter (directly or indirectly)
MAY warn the user and ignore this specification.
The trade-off between usability (e.g. fully trusting the archive) and
security (e.g. refusing to unpack) is left up to the tool in this case.
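
A minimal sketch of this rule might look as follows. The names
``extract_sdist`` and ``demo_extract`` are hypothetical, and the fallback
policy (warn, then refuse to unpack) is only one of the choices the
specification leaves open; a tool could equally warn and extract anyway.

```python
import io
import tarfile
import tempfile
import warnings
from pathlib import Path


def extract_sdist(archive: tarfile.TarFile, destination: str) -> None:
    """Extract with the data filter when the interpreter provides it."""
    if hasattr(tarfile, "data_filter"):
        archive.extractall(destination, filter="data")
    else:
        # Assumed policy for this sketch: favour security over usability.
        warnings.warn("tarfile.data_filter is unavailable; refusing to unpack")
        raise RuntimeError("cannot safely extract the source distribution")


def demo_extract() -> str:
    """Build a one-file archive in memory, extract it, and read the file back."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz", format=tarfile.PAX_FORMAT) as tar:
        payload = b"print('hello')\n"
        info = tarfile.TarInfo(name="example-1.0/setup.py")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(fileobj=buffer, mode="r:gz") as tar:
            extract_sdist(tar, tmp)
        return (Path(tmp) / "example-1.0" / "setup.py").read_text()
```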


Unpacking without the data filter
---------------------------------

Tools that do not use the ``data`` filter directly (e.g. for backwards
compatibility, allowing additional features, or not using Python) MUST follow
this section.
(At the time of this writing, the ``data`` filter also follows this section,
but it may get out of sync in the future.)

The following files are invalid in an *sdist* archive.
Upon encountering such an entry, tools SHOULD notify the user,
MUST NOT unpack the entry, and MAY abort with a failure:

- Files that would be placed outside the destination directory.
- Links (symbolic or hard) pointing outside the destination directory.
- Device files (including pipes).
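
The three checks above could be sketched roughly as below. The function name
``is_forbidden`` is hypothetical, the path handling assumes a POSIX-style
filesystem, and the sketch is not a substitute for the ``data`` filter.

```python
import os
import tarfile


def is_forbidden(member: tarfile.TarInfo, destination: str) -> bool:
    """Return True for entries that MUST NOT be unpacked (illustrative only)."""
    dest = os.path.realpath(destination)

    def escapes(path: str) -> bool:
        # Resolve the path under the destination and see whether it stays there.
        resolved = os.path.realpath(os.path.join(dest, path))
        return os.path.commonpath([dest, resolved]) != dest

    # Device files, including pipes (FIFOs).
    if member.isdev():
        return True
    # Files that would be placed outside the destination directory.
    # Leading slashes are dropped before the check.
    name = member.name.lstrip("/")
    if escapes(name):
        return True
    # Links pointing outside the destination directory.
    if member.issym():
        # Symlink targets are relative to the directory containing the link.
        return escapes(os.path.join(os.path.dirname(name), member.linkname))
    if member.islnk():
        # Hard link targets are relative to the archive root.
        return escapes(member.linkname.lstrip("/"))
    return False
```

A tool would call this for every member before unpacking, notify the user about
each rejected entry, and then either skip it or abort.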

The following are also invalid. Tools MAY treat them as above,
but are NOT REQUIRED to do so:

- Files with a ``..`` component in the filename or link target.
- Links pointing to a file that is not part of the archive.
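
One possible way to implement these optional checks is sketched below. The
helpers ``has_dotdot``, ``normalise`` and ``optional_problems`` are
hypothetical names, and the sketch works purely on member names without
touching the filesystem.

```python
import posixpath
import tarfile


def has_dotdot(path: str) -> bool:
    """Return True if any component of a tar path is exactly '..'."""
    return ".." in path.split("/")


def normalise(path: str) -> str:
    """Drop leading slashes and collapse redundant separators."""
    return posixpath.normpath(path.lstrip("/"))


def optional_problems(members: list[tarfile.TarInfo]) -> list[str]:
    """Collect the names of entries that the optional checks would flag."""
    names = {normalise(m.name) for m in members}
    flagged = []
    for m in members:
        is_link = m.issym() or m.islnk()
        if has_dotdot(m.name) or (is_link and has_dotdot(m.linkname)):
            flagged.append(m.name)
            continue
        if m.issym():
            # Symlink targets are relative to the directory containing the link.
            target = posixpath.join(posixpath.dirname(m.name), m.linkname)
        elif m.islnk():
            # Hard link targets are relative to the archive root.
            target = m.linkname
        else:
            continue
        if normalise(target) not in names:
            flagged.append(m.name)
    return flagged
```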

Tools MAY unpack links (symbolic or hard) as regular files,
using content from the archive.
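
As a hedged sketch of this option, ``TarFile.extractfile`` can be used to read
the content a link refers to from the archive itself, so that a tool can write
it out as a regular file. The helper ``read_link_as_file`` and the member names
are hypothetical.

```python
import io
import tarfile


def read_link_as_file(archive: tarfile.TarFile, member: tarfile.TarInfo) -> bytes:
    """Return the bytes a link refers to, taken from the archive itself."""
    # For link members, extractfile() looks the target up inside the archive
    # and raises KeyError if the target is not an archive member.
    stream = archive.extractfile(member)
    if stream is None:
        raise ValueError(f"{member.name} is not a file or link")
    with stream:
        return stream.read()


# Build a small archive holding one file and one symbolic link to it.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz", format=tarfile.PAX_FORMAT) as tar:
    payload = b"hello\n"
    info = tarfile.TarInfo("example-1.0/README")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
    link = tarfile.TarInfo("example-1.0/README.link")
    link.type = tarfile.SYMTYPE
    link.linkname = "README"
    tar.addfile(link)

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tar:
    content = read_link_as_file(tar, tar.getmember("example-1.0/README.link"))
```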

When extracting *sdist* archives:

- Leading slashes in file names MUST be dropped.
  (This is nowadays standard behaviour for ``tar`` unpacking.)
- For each ``mode`` (Unix permission) bit, tools MUST either:

  - use the platform's default for a new file/directory (respectively),
  - set the bit according to the archive, or
  - use the bit from ``rw-r--r--`` (``0o644``) for non-executable files or
    ``rwxr-xr-x`` (``0o755``) for executable files and directories.

- High ``mode`` bits (setuid, setgid, sticky) MUST be cleared.
- It is RECOMMENDED to preserve the user *executable* bit.
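
The name and mode rules above can be sketched as two small helpers. Both
function names are hypothetical, and ``sdist_mode`` picks the third of the
permitted options (fixed ``0o644`` / ``0o755`` values) for every bit; the other
two options are equally valid.

```python
import stat


def drop_leading_slashes(name: str) -> str:
    """Turn an absolute member name into a relative one."""
    return name.lstrip("/")


def sdist_mode(archive_mode: int, is_dir: bool) -> int:
    """Choose permission bits for an extracted entry (illustrative only)."""
    # Preserve the user executable bit from the archive, as recommended.
    executable = bool(archive_mode & stat.S_IXUSR)
    # Returning a fixed 0o755 or 0o644 means the setuid, setgid and sticky
    # bits are never set, whatever the archive says.
    if is_dir or executable:
        return 0o755
    return 0o644
```

For example, an archive entry with mode ``0o4755`` (setuid executable) would be
written with ``0o755``, and a private ``0o600`` file with ``0o644``.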


Further hints
-------------

Tool authors are encouraged to consider how *hints for further
verification* in ``tarfile`` documentation apply to their tool.


History
=======

* August 2023: Standardized the source distribution archive features (:pep:`721`)
* September 2022: Standardized the filename of a source distribution (:pep:`625`)
* July 2021: Defined what a source tree is
* November 2020: :pep:`643` converted to this specification
* December 2000: Source distributions standardized in :pep:`643`
