Commit 091e45c

Merge pull request #1397 from jeanas/formats-discussion
Add discussion of package formats, expanding "Wheel vs Egg" discussion
2 parents 4295ec8 + 5af20bf commit 091e45c

5 files changed

+204
-67
lines changed

source/discussions/index.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ specific topic. If you're just trying to get stuff done, see
1212
deploying-python-applications
1313
pip-vs-easy-install
1414
install-requires-vs-requirements
15-
wheel-vs-egg
1615
distribution-package-vs-import-package
16+
package-formats
1717
src-layout-vs-flat-layout
1818
setup-py-deprecated
Lines changed: 193 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,193 @@
1+
.. _package-formats:
2+
3+
===============
4+
Package Formats
5+
===============
6+
7+
This page discusses the file formats that are used to distribute Python packages
8+
and the differences between them.
9+
10+
You will find files in two formats on package indices such as PyPI_: **source
11+
distributions**, or **sdists** for short, and **binary distributions**, commonly
12+
called **wheels**. For example, the `PyPI page for pip 23.3.1 <pip-pypi_>`_
13+
lets you download two files, ``pip-23.3.1.tar.gz`` and
14+
``pip-23.3.1-py3-none-any.whl``. The former is an sdist, the latter is a
15+
wheel. As explained below, these serve different purposes. When publishing a
16+
package on PyPI (or elsewhere), you should always upload both an sdist and one
17+
or more wheels.
18+
19+
20+
What is a source distribution?
21+
==============================
22+
23+
Conceptually, a source distribution is an archive of the source code in raw
24+
form. Concretely, an sdist is a ``.tar.gz`` archive containing the source code
25+
plus an additional special file called ``PKG-INFO``, which holds the project
26+
metadata. The presence of this file helps packaging tools to be more efficient
27+
by not needing to compute the metadata themselves. The ``PKG-INFO`` file follows
28+
the format specified in :ref:`core-metadata` and is not intended to be written
29+
by hand [#core-metadata-format]_.
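
Because the core metadata format is email-based (see the footnote), a
``PKG-INFO`` file can be read with the standard library's ``email`` package.
The following is a minimal sketch, not how any particular packaging tool does
it. The project name ``demo-pkg`` and all field values are hypothetical and
exist only for illustration.

```python
from email.parser import HeaderParser

# Hypothetical PKG-INFO contents; a real file is generated by the build backend.
PKG_INFO = """\
Metadata-Version: 2.1
Name: demo-pkg
Version: 1.0
Summary: A hypothetical project used only for illustration
Requires-Python: >=3.8
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
"""

# Core metadata uses "Key: value" header lines, so the email header parser
# can read it. Fields that may repeat (like Classifier) need get_all().
metadata = HeaderParser().parsestr(PKG_INFO)

print(metadata["Name"], metadata["Version"])
print(metadata.get_all("Classifier"))
```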
30+
31+
You can thus inspect the contents of an sdist by unpacking it using standard
32+
tools to work with tar archives, such as ``tar -xvf`` on UNIX platforms (like
33+
Linux and macOS), or :ref:`the command line interface of Python's tarfile module
34+
<python:tarfile-commandline>` on any platform.
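
The same inspection can be done from Python code with the ``tarfile`` module.
The sketch below builds a tiny in-memory archive shaped like an sdist so that
it runs without downloading anything. The project ``demo_pkg`` and its files
are hypothetical; with a real sdist you would pass an open file instead.

```python
import io
import tarfile


def make_demo_sdist():
    """Build a small in-memory .tar.gz laid out like an sdist (hypothetical project)."""
    files = {
        "demo_pkg-1.0/PKG-INFO": b"Metadata-Version: 2.1\nName: demo-pkg\nVersion: 1.0\n",
        "demo_pkg-1.0/pyproject.toml": b"[project]\nname = 'demo-pkg'\nversion = '1.0'\n",
        "demo_pkg-1.0/src/demo_pkg/__init__.py": b"",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


def list_sdist_members(fileobj):
    """Return the names of all members of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return archive.getnames()


names = list_sdist_members(make_demo_sdist())
for name in names:
    print(name)
```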
35+
36+
Sdists serve several purposes in the packaging ecosystem. When :ref:`pip`, the
37+
standard Python package installer, cannot find a wheel to install, it will fall
38+
back on downloading a source distribution, compiling a wheel from it, and
39+
installing the wheel. Furthermore, sdists are often used as the package source
40+
by downstream packagers (such as Linux distributions, Conda, Homebrew and
41+
MacPorts on macOS, ...), who, for various reasons, may prefer them over, e.g.,
42+
pulling from a Git repository.
43+
44+
A source distribution is recognized by its file name, which has the form
45+
:samp:`{package_name}-{version}.tar.gz`, e.g., ``pip-23.3.1.tar.gz``.
46+
47+
.. TODO: provide clear guidance on whether sdists should contain docs and tests.
48+
Discussion: https://discuss.python.org/t/should-sdists-include-docs-and-tests/14578
49+
50+
If you want technical details on the sdist format, read the :ref:`sdist
51+
specification <source-distribution-format>`.
52+
53+
54+
What is a wheel?
55+
================
56+
57+
Conceptually, a wheel contains exactly the files that need to be copied when
58+
installing the package.
59+
60+
There is a big difference between sdists and wheels for packages with
61+
:term:`extension modules <extension module>`, written in compiled languages like
62+
C, C++ and Rust, which need to be compiled into platform-dependent machine code.
63+
With these packages, wheels do not contain source code (like C source files) but
64+
compiled, executable code (like ``.so`` files on Linux or DLLs on Windows).
65+
66+
Furthermore, while there is only one sdist per version of a project, there may
67+
be many wheels. Again, this is most relevant in the context of extension
68+
modules. The compiled code of an extension module is tied to an operating system
69+
and processor architecture, and often also to the version of the Python
70+
interpreter (unless the :ref:`Python stable ABI <cpython-stable-abi>` is used).
71+
72+
For pure-Python packages, the difference between sdists and wheels is less
73+
marked. There is normally one single wheel, for all platforms and Python
74+
versions. Python is an interpreted language, which does not need ahead-of-time
75+
compilation, so wheels contain ``.py`` files just like sdists.
76+
77+
If you are wondering about ``.pyc`` bytecode files: they are not included in
78+
wheels, since they are cheap to generate, and including them would unnecessarily
79+
force a huge number of packages to distribute one wheel per Python version
80+
instead of one single wheel. Instead, installers like :ref:`pip` generate them
81+
while installing the package.
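
To see why bytecode is tied to the interpreter version, you can compile a
module yourself and look at the resulting file name. This is only an
illustration using the standard ``py_compile`` module, not a description of
what pip does internally; ``demo_module.py`` is a hypothetical file created in
a temporary directory.

```python
import pathlib
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = pathlib.Path(tmp) / "demo_module.py"
    source.write_text("VALUE = 42\n")

    # compile() returns the path of the bytecode file it wrote.
    compiled = pathlib.Path(py_compile.compile(str(source), doraise=True))

    compiled_name = compiled.name
    in_pycache = compiled.parent.name == "__pycache__"

# The file name embeds an interpreter-specific tag, which is why shipping
# .pyc files would require one wheel per Python version.
print(compiled_name)
```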
82+
83+
With that being said, there are still important differences between sdists and
84+
wheels, even for pure Python projects. Wheels are meant to contain exactly what
85+
is to be installed, and nothing more. In particular, wheels should never include
86+
tests and documentation, while sdists commonly do. Also, the wheel format is
87+
more complex than the sdist format. For example, it includes a special file -- called
88+
``RECORD`` -- that lists all files in the wheel along with a hash of their
89+
content, as a safety check of the download's integrity.
90+
91+
At a glance, you might wonder if wheels are really needed for "plain and basic"
92+
pure Python projects. Keep in mind that due to the flexibility of sdists,
93+
installers like pip cannot install from sdists directly -- they need to first
94+
build a wheel, by invoking the :term:`build backend` that the sdist specifies
95+
(the build backend may do all sorts of transformations while building the wheel,
96+
such as compiling C extensions). For this reason, even for a pure Python
97+
project, you should always upload *both* an sdist and a wheel to PyPI or other
98+
package indices. This makes installation much faster for your users, since a
99+
wheel is directly installable. By only including files that must be installed,
100+
wheels also make for smaller downloads.
101+
102+
On the technical level, a wheel is a ZIP archive (unlike sdists which are TAR
103+
archives). You can inspect its contents by unpacking it as a normal ZIP archive,
104+
e.g., using ``unzip`` on UNIX platforms like Linux and macOS, ``Expand-Archive``
105+
in PowerShell on Windows, or :ref:`the command line interface of Python's
106+
zipfile module <python:zipfile-commandline>`. This can be very useful to check
107+
that the wheel includes all the files you need it to.
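
From Python code, the ``zipfile`` module gives the same view. The sketch below
creates a small in-memory ZIP archive laid out like a wheel, then lists its
members. The project ``demo_pkg`` and its contents are hypothetical; with a
real wheel you would pass the path to the ``.whl`` file instead.

```python
import io
import zipfile


def make_demo_wheel():
    """Build a small in-memory ZIP laid out like a wheel (hypothetical project)."""
    files = {
        "demo_pkg/__init__.py": b"VALUE = 42\n",
        "demo_pkg-1.0.dist-info/METADATA": b"Metadata-Version: 2.1\nName: demo-pkg\nVersion: 1.0\n",
        "demo_pkg-1.0.dist-info/RECORD": b"",
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        for name, data in files.items():
            wheel.writestr(name, data)
    buffer.seek(0)
    return buffer


def list_wheel_members(fileobj):
    """Return the names of all files stored in a wheel (a ZIP archive)."""
    with zipfile.ZipFile(fileobj) as wheel:
        return wheel.namelist()


names = list_wheel_members(make_demo_wheel())

# Top-level directories ending in ".dist-info" hold the wheel's metadata.
top_level = {name.split("/")[0] for name in names}
dist_info_dirs = sorted(d for d in top_level if d.endswith(".dist-info"))

print(names)
print(dist_info_dirs)
```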
108+
109+
Inside a wheel, you will find the package's files, plus an additional directory
110+
called :samp:`{package_name}-{version}.dist-info`. This directory contains
111+
various files, including a ``METADATA`` file which is the equivalent of
112+
``PKG-INFO`` in sdists, as well as ``RECORD``. This can be useful to ensure no
113+
files are missing from your wheels.
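
As an illustration of how ``RECORD`` can serve as an integrity check, the
following sketch recomputes the hash of every file in a wheel and compares it
with the recorded value. It assumes that each ``RECORD`` row has the form
``path,sha256=<urlsafe base64 digest without padding>,size`` and handles only
SHA-256; the demo wheel, the ``demo_pkg`` project, and the helper names are
hypothetical.

```python
import base64
import csv
import hashlib
import io
import zipfile

RECORD_NAME = "demo_pkg-1.0.dist-info/RECORD"  # hypothetical project


def record_hash(data):
    """Format a SHA-256 digest the way this sketch assumes RECORD stores it."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_demo_wheel(tampered=False):
    """Build an in-memory wheel-like ZIP; optionally alter a file after hashing."""
    files = {
        "demo_pkg/__init__.py": b"VALUE = 42\n",
        "demo_pkg-1.0.dist-info/METADATA": b"Metadata-Version: 2.1\nName: demo-pkg\nVersion: 1.0\n",
    }
    rows = [(name, record_hash(data), str(len(data))) for name, data in files.items()]
    rows.append((RECORD_NAME, "", ""))  # RECORD cannot contain its own hash
    record = io.StringIO()
    csv.writer(record, lineterminator="\n").writerows(rows)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        for name, data in files.items():
            if tampered and name.endswith("__init__.py"):
                data += b"# modified after RECORD was written\n"
            wheel.writestr(name, data)
        wheel.writestr(RECORD_NAME, record.getvalue())
    buffer.seek(0)
    return buffer


def verify_record(fileobj, record_name=RECORD_NAME):
    """Return the paths whose contents do not match the hash listed in RECORD."""
    mismatches = []
    with zipfile.ZipFile(fileobj) as wheel:
        text = wheel.read(record_name).decode("utf-8")
        for path, expected, _size in csv.reader(io.StringIO(text)):
            if not expected:  # the RECORD file itself has no hash
                continue
            if record_hash(wheel.read(path)) != expected:
                mismatches.append(path)
    return mismatches


print(verify_record(make_demo_wheel()))
print(verify_record(make_demo_wheel(tampered=True)))
```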
114+
115+
The file name of a wheel (ignoring some rarely used features) looks like this:
116+
:samp:`{package_name}-{version}-{python_tag}-{abi_tag}-{platform_tag}.whl`.
117+
This naming convention identifies which platforms and Python versions the wheel
118+
is compatible with. For example, the name ``pip-23.3.1-py3-none-any.whl`` means
119+
that:
120+
121+
- (``py3``) This wheel can be installed on any implementation of Python 3,
122+
whether CPython, the most widely used Python implementation, or an alternative
123+
implementation like PyPy_;
124+
- (``none``) It does not depend on a specific Python ABI, and hence not on the Python version;
125+
- (``any``) It does not depend on the platform.
126+
127+
The pattern ``py3-none-any`` is common for pure Python projects. Packages with
128+
extension modules typically ship multiple wheels with more complex tags.
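
The naming convention above can be taken apart with plain string handling. The
sketch below is deliberately simplified: it accepts only the five-component
form shown above and rejects anything else, including file names that use the
rarely used features mentioned earlier. The function name
``parse_wheel_filename`` is chosen here for illustration.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name of the five-component form into its parts."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")

    parts = filename[: -len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dash-separated components: {filename!r}")

    package_name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "package_name": package_name,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


tags = parse_wheel_filename("pip-23.3.1-py3-none-any.whl")
print(tags)
```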
129+
130+
All technical details on the wheel format can be found in the :ref:`wheel
131+
specification <binary-distribution-format>`.
132+
133+
134+
.. _egg-format:
135+
.. _`Wheel vs Egg`:
136+
137+
What about eggs?
138+
================
139+
140+
"Egg" is an old package format that has been replaced with the wheel format. It
141+
should not be used anymore. Since August 2023, PyPI `rejects egg uploads
142+
<pypi-eggs-deprecation_>`_.
143+
144+
Here's a breakdown of the important differences between wheel and egg.
145+
146+
* The egg format was introduced by :ref:`setuptools` in 2004, whereas the wheel
147+
format was introduced by :pep:`427` in 2012.
148+
149+
* Wheel has an :doc:`official standard specification
150+
</specifications/binary-distribution-format>`. Egg did not.
151+
152+
* Wheel is a :term:`distribution <Distribution Package>` format, i.e. a packaging
153+
format. [#wheel-importable]_ Egg was both a distribution format and a runtime
154+
installation format (if left zipped), and was designed to be importable.
155+
156+
* Wheel archives do not include ``.pyc`` files. Therefore, when the distribution
157+
only contains Python files (i.e. no compiled extensions), and is compatible
158+
with Python 2 and 3, it's possible for a wheel to be "universal", similar to
159+
an :term:`sdist <Source Distribution (or "sdist")>`.
160+
161+
* Wheel uses standard :ref:`.dist-info directories
162+
<recording-installed-packages>`. Egg used ``.egg-info``.
163+
164+
* Wheel has a :ref:`richer file naming convention <wheel-file-name-spec>`. A
165+
single wheel archive can indicate its compatibility with a number of Python
166+
language versions and implementations, ABIs, and system architectures.
167+
168+
* Wheel is versioned. Every wheel file contains the version of the wheel
169+
specification and the implementation that packaged it.
170+
171+
* Wheel is internally organized by `sysconfig path type
172+
<https://docs.python.org/2/library/sysconfig.html#installation-paths>`_,
173+
therefore making it easier to convert to other formats.
174+
175+
--------------------------------------------------------------------------------
176+
177+
.. [#core-metadata-format] This format is email-based. Although this would
178+
be unlikely to be chosen today, backwards compatibility considerations lead to
179+
it being kept as the canonical format. From the user point of view, this
180+
is mostly invisible, since the metadata is specified by the user in a way
181+
understood by the build backend, typically ``[project]`` in ``pyproject.toml``,
182+
and translated by the build backend into ``PKG-INFO``.
183+
184+
.. [#wheel-importable] Circumstantially, in some cases, wheels can be used
185+
as an importable runtime format, although :ref:`this is not officially supported
186+
at this time <binary-distribution-format-import-wheel>`.
187+
188+
189+
190+
.. _pip-pypi: https://pypi.org/project/pip/23.3.1/#files
191+
.. _pypi: https://pypi.org
192+
.. _pypi-eggs-deprecation: https://blog.pypi.org/posts/2023-06-26-deprecate-egg-uploads/
193+
.. _pypy: https://pypy.org

source/discussions/wheel-vs-egg.rst

Lines changed: 0 additions & 55 deletions
This file was deleted.

source/glossary.rst

Lines changed: 8 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -47,11 +47,11 @@ Glossary
4747
A :term:`Distribution <Distribution Package>` format containing files
4848
and metadata that only need to be moved to the correct location on the
4949
target system, to be installed. :term:`Wheel` is such a format, whereas
50-
distutil's :term:`Source Distribution <Source Distribution (or
50+
:term:`Source Distribution <Source Distribution (or
5151
"sdist")>` is not, in that it requires a build step before it can be
5252
installed. This format does not imply that Python files have to be
5353
precompiled (:term:`Wheel` intentionally does not include compiled
54-
Python files).
54+
Python files). See :ref:`package-formats` for more information.
5555

5656

5757
Distribution Package
@@ -73,9 +73,8 @@ Glossary
7373
Egg
7474

7575
A :term:`Built Distribution` format introduced by :ref:`setuptools`,
76-
which is being replaced by :term:`Wheel`. For details, see
77-
:doc:`The Internal Structure of Python Eggs <setuptools:deprecated/python_eggs>` and
78-
`Python Eggs <http://peak.telecommunity.com/DevCenter/PythonEggs>`_
76+
which has been replaced by :term:`Wheel`. For details, see
77+
:ref:`egg-format`.
7978

8079
Extension Module
8180

@@ -240,7 +239,8 @@ Glossary
240239
A :term:`distribution <Distribution Package>` format (usually generated
241240
using ``python -m build --sdist``) that provides metadata and the
242241
essential source files needed for installing by a tool like :ref:`pip`,
243-
or for generating a :term:`Built Distribution`.
242+
or for generating a :term:`Built Distribution`. See :ref:`package-formats`
243+
for more information.
244244

245245

246246
System Package
@@ -266,11 +266,8 @@ Glossary
266266

267267
Wheel
268268

269-
A :term:`Built Distribution` format introduced by an official
270-
:doc:`standard specification
271-
</specifications/binary-distribution-format/>`,
272-
which is intended to replace the :term:`Egg` format. Wheel is currently
273-
supported by :ref:`pip`.
269+
The standard :term:`Built Distribution` format.
270+
See :ref:`package-formats` for more information.
274271

275272
Working Set
276273

source/specifications/binary-distribution-format.rst

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,8 @@ Place ``.dist-info`` at the end of the archive.
8484
File Format
8585
-----------
8686

87+
.. _wheel-file-name-spec:
88+
8789
File name convention
8890
''''''''''''''''''''
8991
