Commit 2d333a0

emmatyping, hugovk, and warsaw authored
PEP 778: Supporting Symlinks in Wheels (#3786)
* Add PEP 778 and reserve 777
* Add Paul Moore as PEP delegate
  Co-authored-by: Hugo van Kemenade <[email protected]>
* Add Paul Moore as PEP delegate
  Co-authored-by: Hugo van Kemenade <[email protected]>
* Update peps/pep-0778.rst
  Co-authored-by: Hugo van Kemenade <[email protected]>
* Remove extra underline
  Co-authored-by: Hugo van Kemenade <[email protected]>
* Fix misspelling
  Co-authored-by: Hugo van Kemenade <[email protected]>
* Remove PEP 777
  The PEP has been split out into #4036
* Add codeowners
* Remove third p from suppport
  Co-authored-by: Hugo van Kemenade <[email protected]>
* Mark PEP 778 deferred
* Apply suggestions from Barry to PEP text
  Co-authored-by: Barry Warsaw <[email protected]>
* Fix lint error

---------

Co-authored-by: Hugo van Kemenade <[email protected]>
Co-authored-by: Barry Warsaw <[email protected]>
1 parent 7424d30 commit 2d333a0

2 files changed

+278
-0
lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -655,6 +655,7 @@ peps/pep-0774.rst @savannahostrowski
655655
peps/pep-0775.rst @encukou
656656
peps/pep-0776.rst @hoodmane @ambv
657657
peps/pep-0777.rst @warsaw
658+
peps/pep-0778.rst @warsaw
658659
# ...
659660
peps/pep-0779.rst @Yhg1s @colesbury @mpage
660661
peps/pep-0780.rst @lysnikolaou

peps/pep-0778.rst

Lines changed: 277 additions & 0 deletions
@@ -0,0 +1,277 @@
1+
PEP: 778
2+
Title: Supporting Symlinks in Wheels
3+
Author: Emma Harper Smith <[email protected]>
4+
Sponsor: Barry Warsaw <[email protected]>
5+
PEP-Delegate: Paul Moore <[email protected]>
6+
Discussions-To: https://discuss.python.org/t/pep-778-supporting-symlinks-in-wheels/53824
7+
Status: Deferred
8+
Type: Standards Track
9+
Topic: Packaging
10+
Requires: 777
11+
Created: 18-May-2024
12+
Post-History: `10-Oct-2024 <https://discuss.python.org/t/pep-778-supporting-symlinks-in-wheels/53824>`__
13+
14+
Abstract
15+
========
16+
17+
Wheels currently do not handle symlinks well, copying content instead of making symlinks when
18+
installed. To properly handle distributing libraries in wheels, we propose a new ``LINKS``
19+
metadata file to handle symlinks in a platform portable manner. This specification requires
20+
a new wheel major version, discussed in :pep:`777`.
21+
22+
PEP Deferral
23+
============
24+
25+
This PEP has been deferred until a better compatibility story for major changes to the wheel
26+
format is established. Once a compatibility story is established for wheels which allows backwards
27+
incompatible behavior in an unobtrusive way, the following points should be addressed in this PEP:
28+
29+
- Re-focus this topic to just symlinks for shared libraries on POSIX platforms, perhaps tied to
30+
platform tags?
31+
- Should the symlinks be materialized as file attributes in the archive or a ``LINKS`` file?
32+
Could it be encoded in ``RECORD``?
33+
- Clarify that this PEP is insufficient to be useful for :pep:`660` editable installs since it will no
34+
longer be cross platform.
35+
- Describe fallback behavior in instances where symlinks are unavailable on POSIX platforms.
36+
37+
Motivation
38+
==========
39+
40+
Today, symlinks in wheels get created as copies of files, as `the zipfile module
41+
<https://docs.python.org/3/library/zipfile.html>`_ in CPython `does not support handling symlinks
42+
in-place <https://github.com/python/cpython/issues/82102>`_ for security reasons.
43+
44+
This `presents problems to projects that would like to ship large compiled libraries
45+
<https://pypackaging-native.github.io/other_issues/#lack-of-support-for-symlinks-in-wheels>`_ in
46+
wheels, as they must choose to either greatly increase the install size of the project on disk,
47+
or omit the symlink and potentially break some downstream use cases.
48+
49+
To ship a library that can properly be loaded for runtime use or build time linking on POSIX, a
50+
library should follow the conventions of POSIX-style loader and linker search. The two main file names for
51+
the loader to use are the "soname" and the "real name". The "soname" is a file like
52+
``libfoo.so.3`` where ``3`` is a number that is incremented when the interface of the library
53+
changes. The "real name" is a file named like ``libfoo.so.3.1.4``, where the extra version
54+
information lets the loader find a specific version of a library. Finally, when compiling code to
55+
link against a library, the linker searches for a "linker name", named like ``libfoo.so``. A more
56+
detailed description is available in `this Linux documentation on shared libraries
57+
<https://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html>`_. To fully support all
58+
runtime and build time use cases, a project requires shipping all 3 files. Normally, this is
59+
handled on POSIX platforms by using symlinks, so that the library is not duplicated on disk 3 times.
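
As a rough illustration of the naming scheme described above, the following Python sketch builds the usual symlink chain (linker name to soname to real name) next to a placeholder library file. The helper name ``make_library_links`` and the ``libfoo`` file names are hypothetical, and the sketch assumes a POSIX system where the current user may create symlinks.

```python
import os
import tempfile


def make_library_links(directory, real_name="libfoo.so.3.1.4"):
    """Create soname and linker-name symlinks next to a "real name" file.

    Hypothetical helper: it assumes the real name has the shape
    lib<name>.so.<major>.<minor>.<patch>.
    """
    parts = real_name.split(".")
    soname = ".".join(parts[:3])       # e.g. libfoo.so.3
    linker_name = ".".join(parts[:2])  # e.g. libfoo.so
    # Relative links: the soname points at the real file, and the
    # linker name points at the soname.
    os.symlink(real_name, os.path.join(directory, soname))
    os.symlink(soname, os.path.join(directory, linker_name))
    return soname, linker_name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A placeholder stands in for the compiled library.
        with open(os.path.join(tmp, "libfoo.so.3.1.4"), "wb") as f:
            f.write(b"placeholder")
        for name in make_library_links(tmp):
            print(name, "->", os.readlink(os.path.join(tmp, name)))
```

Only one copy of the library exists on disk; the other two names are links, which is the layout a wheel cannot currently preserve.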
60+
61+
Returning to Python packaging, there are many popular projects which ship binary libraries, such as
62+
``numpy``, ``scipy``, and ``pyarrow``. Other packages ``dlopen`` libraries shipped in other wheels, such as
63+
``pytorch`` and ``jax``. These projects currently rely on a single library in the wheel, but
64+
this can cause the linker to find the wrong library if there are system libraries that have a
65+
"real name" library version available.
66+
67+
There is also the potential benefit that symlinks in wheels would allow for simpler editable
68+
installs by simply placing a symlink in the user's ``site-packages`` directory, but this PEP
69+
leaves that as an open question to be explored in a future PEP.
70+
71+
Rationale
72+
=========
73+
74+
To support the 3 main namings of a library used in loading and library linking on POSIX, we
75+
propose adding support for symlinks in Python wheels. To allow for tracking symlinks made, and to
76+
potentially support other platforms that may not support POSIX symlinks directly, we propose the
77+
use of a new wheel metadata file ``LINKS``, which will exist in the ``.dist-info`` directory alongside
78+
``METADATA``, ``RECORD``, and other metadata files.
79+
80+
Using a ``LINKS`` file will allow symlink-like behavior on more platforms. On Windows,
81+
symlinks require either `a group policy allowing the user to make symlinks
82+
<https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-10/security/threat-protection/security-policy-settings/create-symbolic-links>`_
83+
(e.g. by enabling `Dev Mode
84+
<https://learn.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development>`_)
85+
or Administrative permissions. This means that symlinks may be unsupported on
86+
some user systems. By using a ``LINKS`` file, installers will be able to potentially use other
87+
methods for handling symlinks, such as junctions on Windows, where otherwise the installer would
88+
have to fail.
89+
90+
This PEP also describes checks that installers must make when installing an updated wheel. These
91+
checks exist to handle security risks from allowing wheels to install symlinks. For more
92+
information on why these checks are important, see `Security Implications`_.
93+
94+
Specification
95+
=============
96+
97+
Wheel Major Version Bump
98+
------------------------
99+
100+
This PEP requires a wheel major version bump, so the ``Wheel-Version`` for wheels generated with
101+
``LINKS`` **MUST** be at least version ``2.0``, so that older installers do not silently fail to
102+
install symlinks and break user environments. For more see :pep:`777`.
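
A minimal sketch of how an installer might enforce this rule is shown below. It assumes the ``WHEEL`` file keeps its current ``key: value`` layout, which the standard library's email parser can read; the function names ``wheel_major_version`` and ``check_links_allowed`` are hypothetical and not part of any existing tool.

```python
from email.parser import Parser


def wheel_major_version(wheel_file_text):
    """Return the major component of Wheel-Version from WHEEL file text."""
    headers = Parser().parsestr(wheel_file_text)
    version = headers["Wheel-Version"]
    if version is None:
        raise ValueError("WHEEL file has no Wheel-Version field")
    return int(version.strip().split(".")[0])


def check_links_allowed(wheel_file_text, has_links_file):
    """Raise ValueError if LINKS is present in a wheel older than 2.0."""
    if has_links_file and wheel_major_version(wheel_file_text) < 2:
        raise ValueError("a LINKS file requires Wheel-Version 2.0 or later")
```

For example, ``check_links_allowed("Wheel-Version: 1.0\n", True)`` raises, while the same call with ``Wheel-Version: 2.0`` returns silently.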
103+
104+
New ``LINKS`` Metadata File
105+
---------------------------
106+
107+
To enable cross-platform symlinks, this PEP introduces a new wheel metadata file, ``LINKS``. An
108+
example of a ``LINKS`` file is below::
109+
110+
my_package/libfoo.so.3.1.4,my_package/libfoo.so.3
111+
my_package/libfoo.so.3,my_package/libfoo.so
112+
113+
The format of ``LINKS``, as can be seen above, is ``source_path,target_path`` where ``source_path``
114+
is a path relative to the root of any namespace or package root in the wheel. ``target_path`` is a
115+
*non-dangling* path (i.e. a path that exists on the filesystem after the wheel's contents are
116+
extracted) in a package or namespace of any package in the wheel. This means that if a wheel
117+
contains multiple packages, all paths in packages in the wheel are acceptable.
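
To make the format concrete, here is a hedged sketch of a ``LINKS`` reader. Since the format is derived from ``RECORD``, the sketch assumes CSV-style rows with exactly two fields; the PEP does not specify quoting rules, so that choice and the name ``parse_links`` are assumptions made for illustration.

```python
import csv
import io


def parse_links(text):
    """Parse LINKS file content into (source_path, target_path) tuples."""
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # tolerate blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 fields, got {len(row)}: {row!r}")
        pairs.append((row[0], row[1]))
    return pairs


if __name__ == "__main__":
    example = (
        "my_package/libfoo.so.3.1.4,my_package/libfoo.so.3\n"
        "my_package/libfoo.so.3,my_package/libfoo.so\n"
    )
    for source_path, target_path in parse_links(example):
        print(source_path, target_path)
```

The parser only splits rows into pairs; checking that the paths are acceptable is a separate step for the installer.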
118+
119+
Installer Behavior Specification
120+
--------------------------------
121+
122+
Installers **MUST** resolve the paths of any link contained in the ``LINKS`` file *before*
123+
deciding if any ``source_path`` or ``target_path`` are valid. Installers **MUST** verify that
124+
``source_path`` and ``target_path`` are located inside any namespace or package coming from the
125+
wheel. Installers **MUST** reject cyclic symlinks in wheels. Installers **MAY** error if a long
126+
chain of symlinks (symlinks pointing to symlinks many times repeated) exceeds a limit set by the
127+
installer.
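
The checks in the preceding paragraph could be sketched roughly as follows. This is not a reference implementation: it assumes both paths are POSIX-style and relative to the installation root, that ``source_path`` names the link while ``target_path`` names what it points to, and that the installer already knows the wheel's top-level package names. ``validate_links``, ``MAX_CHAIN``, and ``package_roots`` are hypothetical names.

```python
import posixpath

MAX_CHAIN = 16  # hypothetical limit chosen by the installer


def _normalize(path, package_roots):
    """Collapse '..' segments and require the result to stay in a known root."""
    norm = posixpath.normpath(path)
    if posixpath.isabs(norm) or norm == ".." or norm.startswith("../"):
        raise ValueError(f"path escapes the install root: {path!r}")
    if norm.split("/")[0] not in package_roots:
        raise ValueError(f"path is outside the wheel's packages: {path!r}")
    return norm


def validate_links(pairs, package_roots, max_chain=MAX_CHAIN):
    """Return a {source: target} mapping, or raise ValueError if invalid."""
    # Resolve every path before judging whether any of them is valid.
    links = {}
    for source, target in pairs:
        links[_normalize(source, package_roots)] = _normalize(target, package_roots)
    # Walk each chain of links to detect cycles and overly long chains.
    for start in links:
        seen = {start}
        current = links[start]
        hops = 1
        while current in links:
            if current in seen:
                raise ValueError(f"cyclic link chain starting at {start!r}")
            if hops >= max_chain:
                raise ValueError(f"link chain too long starting at {start!r}")
            seen.add(current)
            current = links[current]
            hops += 1
    return links
```

Working on the declared pairs rather than on the filesystem means a malicious wheel is rejected before any link is created.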
128+
129+
Installers **MUST** perform the following steps when handling a wheel with symlinks:
130+
131+
1. Check for the existence of a ``LINKS`` file in the ``.dist-info``. If it does not exist,
132+
no further steps are required.
133+
2. Extract all files in the wheel packages and data directory as in wheel 1.x.
134+
3. Verify that for each ``source_path`` and ``target_path`` pair, the ``target_path`` exists in
135+
one of the package namespaces just extracted.
136+
4. Next, check that the installer can make some kind of link for each pair in the site directory.
137+
If the installer cannot make a link for the file/folder ``target_path`` for the current
138+
platform, an error **MUST** be raised. An example of a failure mode would be a POSIX symlink to
139+
a file target, where the installer is running on Windows and the installer cannot make
140+
symlinks but can make junctions. In this case the installer **MUST** error because it cannot
141+
handle the link.
142+
5. Finally, the installer **MUST** add a platform-relevant link between ``source_path`` and
143+
``target_path``.
144+
145+
Installers **MUST NOT** by default copy files instead of generating a symlink when handling
146+
symlinks. Installers **MAY** have such behavior available under an alternate configuration or
147+
command line flag.
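
The installation steps and the opt-in copy fallback might look roughly like the sketch below. It assumes the wheel's regular files are already extracted under ``site_dir``, that the pairs were validated beforehand, and that ``source_path`` is the link to create; ``install_links`` and its ``allow_copy`` flag are hypothetical and do not describe any existing installer.

```python
import os
import shutil


def install_links(site_dir, pairs, allow_copy=False):
    """Create a link for each (source_path, target_path) pair under site_dir."""
    link_map = dict(pairs)
    created = []
    for source, target in pairs:
        # Follow declared links until reaching a path that must be a real file.
        final = target
        for _ in range(len(link_map)):
            if final not in link_map:
                break
            final = link_map[final]
        final_abs = os.path.join(site_dir, final)
        if final in link_map or not os.path.exists(final_abs):
            raise FileNotFoundError(f"unresolvable link target: {target!r}")
        link_abs = os.path.join(site_dir, source)
        # A relative link keeps the installed tree relocatable.
        rel_target = os.path.relpath(
            os.path.join(site_dir, target), os.path.dirname(link_abs)
        )
        try:
            os.symlink(
                rel_target, link_abs, target_is_directory=os.path.isdir(final_abs)
            )
        except (OSError, NotImplementedError) as exc:
            if not allow_copy:
                # Default behavior: fail rather than silently copy.
                raise RuntimeError(f"cannot create link {source!r}") from exc
            if os.path.isdir(final_abs):
                shutil.copytree(final_abs, link_abs)
            else:
                shutil.copy2(final_abs, link_abs)
        created.append(source)
    return created
```

Copying happens only when the caller passes ``allow_copy=True``, mirroring the requirement that copying is never the default.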
148+
149+
Build Backend Specification
150+
---------------------------
151+
152+
When creating a wheel, build backends **MUST** treat a symlink in the same way as its target when
153+
deciding whether to include the symlink in a wheel. Build backends **MUST** verify that there are
154+
no dangling symlinks in the ``LINKS`` file. Build backends **SHOULD** recognize platform-relevant
155+
symlinks that would be included in builds. On POSIX systems this is typically symlinks; on Windows this
156+
includes symlinks and junctions.
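
On the build side, a backend could gather ``LINKS`` entries along the lines of the sketch below. It only handles POSIX-style symlinks found in a single package directory, writes paths relative to that directory's parent, and again treats ``source_path`` as the link; ``collect_links`` is a hypothetical helper, and junction detection on Windows is not attempted.

```python
import os


def collect_links(package_dir):
    """Return sorted (source_path, target_path) pairs for symlinks in a package."""
    package_dir = os.path.abspath(package_dir)
    root = os.path.dirname(package_dir)
    pairs = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(package_dir):
        for name in dirnames + filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            if not os.path.exists(link):
                raise ValueError(f"dangling symlink: {link}")
            target = os.path.normpath(os.path.join(dirpath, os.readlink(link)))
            rel_target = os.path.relpath(target, root)
            if rel_target == os.pardir or rel_target.startswith(os.pardir + os.sep):
                raise ValueError(f"symlink points outside the package: {link}")
            rel_link = os.path.relpath(link, root)
            # LINKS paths use forward slashes regardless of platform.
            pairs.append(
                (rel_link.replace(os.sep, "/"), rel_target.replace(os.sep, "/"))
            )
    return sorted(pairs)
```

Rejecting dangling links at build time means the problem is reported to the package author rather than to every user at install time.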
157+
158+
Backwards Compatibility
159+
=======================
160+
161+
Introducing symlinks would require an increment to the wheel format major version. This would mean
162+
new wheels that use the new wheel format would raise an error on older installer tools, per the
163+
`wheel specification
164+
<https://packaging.python.org/en/latest/specifications/binary-distribution-format/#file-contents>`_.
165+
166+
Please see :pep:`777` on "Wheel 2.0".
167+
168+
Security Implications
169+
=====================
170+
171+
Symlinks can be quite dangerous if not handled carefully. A simple example would be if a user were
172+
to run ``sudo pip install malicious``, and there were no protections, then the malicious package
173+
could overwrite ``/etc/shadow`` and replace the password hash on the system, allowing malicious
174+
logins.
175+
176+
This PEP lists several requirements on checks to run by installers on symlinks in wheels to ensure
177+
attacks like the one described above cannot happen. This means it is **critical** that installers
178+
carefully implement these security safeguards and prevent malicious use on package installation.
179+
180+
In particular, the following checks **MUST** be made by installers:
181+
182+
1. That the symlinks do not point outside of any packages or namespaces coming from the wheel
183+
2. That the symlinks are not dangling (the target exists at install time)
184+
3. That the symlinks are not cyclical, stopping after a certain depth of checking to avoid denial
185+
of service attacks
186+
187+
Installers should also not follow symlinks when removing an installed package.
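
One way to read the removal rule is sketched below: when an installed path is a symlink, only the link itself is deleted and its target is left alone. The function name ``remove_installed_path`` is hypothetical, and the sketch is written with POSIX behavior in mind.

```python
import os
import shutil


def remove_installed_path(path):
    """Remove an installed path without following symlinks."""
    if os.path.islink(path):
        # Deletes the link only; whatever it points to is untouched.
        os.unlink(path)
    elif os.path.isdir(path):
        # shutil.rmtree does not follow symlinks found inside the tree.
        shutil.rmtree(path)
    elif os.path.lexists(path):
        os.remove(path)
```

Checking ``os.path.islink`` first matters because ``os.path.isdir`` follows symlinks and would otherwise treat a link to a directory as the directory itself.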
188+
189+
How to Teach This
190+
=================
191+
192+
End users should, once the changes have propagated through the ecosystem, transparently experience
193+
the benefits of symlinks in wheels. It is important for installers to give clear error messages if
194+
symlinks are unsupported on the platform, and explain why installation has failed.
195+
196+
For people building libraries, documentation on ``packaging.python.org`` should describe the use
197+
cases and caveats (especially platform support) of symlinks in wheels. Otherwise it should be
198+
handled transparently by build backends in the same way any normal file would be handled.
199+
200+
Reference Implementation
201+
========================
202+
203+
TODO
204+
205+
Rejected Ideas
206+
==============
207+
208+
Just Use POSIX Symlinks Everywhere
209+
----------------------------------
210+
211+
This PEP wants to allow for ``LINKS`` to be used for a potential future :pep:`660` editable
212+
installation. This future PEP should support Windows, so it may need to use junctions.
213+
214+
Don't Use Junctions in ``LINKS``
215+
--------------------------------
216+
217+
Junctions are a limited way to support symlinks between folders on Windows. They do not support
218+
files. This PEP allows for junctions as users may wish to only link folders to a different
219+
location, and future :pep:`660` implementations may need to rely on this feature.
220+
221+
Put symlinks in the ``RECORD`` Metadata File
222+
--------------------------------------------
223+
224+
While this could be done, it would clutter the ``RECORD`` file. Furthermore the most
225+
straightforward implementation would place the target at the end of the record. This would
226+
make it harder to scan across the line and visually see what symlinks exist in the wheel.
227+
228+
Library Maintainers Should Use Python to Locate Libraries
229+
---------------------------------------------------------
230+
231+
Using Python to locate libraries would be much easier. However, some libraries like ``libtorch``
232+
are used by extension modules and themselves require loading dependencies. Some compiled libraries
233+
cannot use Python to find their loader dependencies.
234+
235+
Include Support for Hardlinks
236+
-----------------------------
237+
238+
This PEP does not specify any behavior around hardlinks. This is intentional. This is left as an
239+
extension to a future PEP.
240+
241+
Open Issues
242+
===========
243+
244+
PEP 660 and Deferring Editable Installation Support
245+
---------------------------------------------------
246+
247+
This PEP leaves the specification and implementation of a :pep:`660` editable installation
248+
mechanism as unresolved for a later PEP; should that be specified in this PEP?
249+
250+
Security
251+
--------
252+
253+
This PEP needs to be reviewed to make sure it would not allow for new security vulnerabilities.
254+
Are there other restrictions we should place on the source or target of symlinks to protect users?
255+
256+
Allow inter-package symlinks
257+
----------------------------
258+
259+
This could be useful for projects that want to shard dependencies such as large libraries between
260+
wheels but make them available in the main parent wheel.
261+
262+
The Format of ``LINKS``
263+
-----------------------
264+
265+
Currently the format is derived from ``RECORD``, but perhaps a better format exists.
266+
267+
Previous Discussion
268+
===================
269+
270+
https://discuss.python.org/t/symbolic-links-in-wheels/1945/25
271+
272+
273+
Copyright
274+
=========
275+
276+
This document is placed in the public domain or under the
277+
CC0-1.0-Universal license, whichever is more permissive.
