From 35c60f8ceac4cfe6d5f7d300f91c414a41f19259 Mon Sep 17 00:00:00 2001 From: Ofek Lev Date: Wed, 9 Jul 2025 00:18:27 -0400 Subject: [PATCH 1/2] PEP 752: address maintainer feedback --- peps/pep-0752.rst | 374 +++++++++++++++++++--------------------------- 1 file changed, 156 insertions(+), 218 deletions(-) diff --git a/peps/pep-0752.rst b/peps/pep-0752.rst index 2f33c5438f0..7685393ce20 100644 --- a/peps/pep-0752.rst +++ b/peps/pep-0752.rst @@ -25,13 +25,11 @@ Motivation ========== The current ecosystem lacks a way for projects with many packages to signal a -verified pattern of ownership. Such projects fall into two categories. - -The first category is projects [1]_ that want complete control over their -namespace. A few examples: +verified pattern of ownership. Such projects desire complete control over their +namespace for safety and branding reasons. A few examples: * Major cloud providers like Amazon, Google and Microsoft have a common prefix - for each feature's corresponding package [3]_. For example, most of Google's + for each feature's corresponding package [1]_. For example, most of Google's packages are prefixed by ``google-cloud-`` e.g. ``google-cloud-compute`` for `using virtual machines `__. * `OpenTelemetry `__ is an open standard for @@ -43,28 +41,16 @@ namespace. A few examples: * `Apache Airflow `__ is a platform to programmatically author, schedule and monitor workflows. It has providers, where each provider package is prefixed by ``apache-airflow-providers-``. +* `Typeshed `__ is a community effort to + maintain type stubs for various packages. The stub packages they maintain + mirror the package name they target and are prefixed by ``types-``. For + example, the package ``requests`` has a stub that users would depend on + called ``types-requests``. Unofficial stubs are not supposed to use the + ``types-`` prefix and are expected to use a ``-stubs`` suffix instead. 
__ https://github.com/open-telemetry/opentelemetry-python __ https://github.com/open-telemetry/opentelemetry-python-contrib -The second category is projects [2]_ that want to share their namespace such -that some packages are officially maintained and third-party developers are -encouraged to participate by publishing their own. Some examples: - -* `Project Jupyter `__ is devoted to the development of - tooling for sharing interactive documents. They support `extensions`__ - which in most cases (and in all cases for officially maintained - extensions) are prefixed by ``jupyter-``. -* `Django `__ is one of the most widely used web - frameworks in existence. They have the concept of `reusable apps`__, which - are commonly installed via - `third-party packages `__ that implement a subset - of functionality to extend Django-based websites. These packages are by - convention prefixed by ``django-`` or ``dj-``. - -__ https://jupyterlab.readthedocs.io/en/stable/user/extensions.html -__ https://docs.djangoproject.com/en/5.1/intro/reusable-apps/ - Such projects are uniquely vulnerable to name-squatting attacks which can ultimately result in `dependency confusion`__. @@ -77,35 +63,29 @@ official integration. It takes a nontrivial amount of time to deliver such an integration due to roadmap prioritization and the time required for implementation. It would be impossible to reserve the name of every potential package so in the interim an attacker may create a package that appears -legitimate which would execute malicious code at runtime. Not only are users -more likely to install such packages but doing so taints the perception of the -entire project. +legitimate which would execute malicious code (like secret exfiltration) at +runtime. Not only are users more likely to install such packages but doing so +taints the perception of the entire project. Community projects like Apache +Airflow have also `experienced this `__. 
Although :pep:`708` attempts to address this attack vector, it is specifically about the case of multiple repositories being considered during dependency resolution and does not offer any protection to the aforementioned use cases. -Namespacing also would drastically reduce the incidence of +In recent years, `typosquatting `__ -because typos would have to be in the prefix itself which is -`normalized `_ and likely to be a short, well-known identifier like -``aws-``. In recent years, typosquatting has become a popular attack vector -[4]_. - -The `current protection`__ against typosquatting used by PyPI is to normalize -similar characters but that is insufficient for these use cases. +has become a popular attack vector [2]_. The `current protection`__ against +this used by PyPI is to normalize similar characters but that is +insufficient for these use cases. Namespacing would drastically reduce the +incidence of typosquatting: __ https://github.com/pypi/warehouse/blob/8615326918a180eb2652753743eac8e74f96a90b/warehouse/migrations/versions/d18d443f89f0_ultranormalize_name_function.py#L29-L42 -Another problem that namespacing would solve is the issue of choosing new names -for packages following the agreed patterns of naming. Often (this is the case -for Apache Airflow for example), there are public discussions that precede -the decision to create a new package. The decision is based on the agreed -name and follow the pattern of the existing packages. If more package names are -considered during the discussion, all the names have to be reserved via a PyPI -interface before the discussion is public, otherwise the names can be taken by -other users. This has happened in the past as explained -in the associated `discussion `__. +* Typos would have to be in the prefix itself which is `normalized `_ + and likely to be a short, well-known identifier like ``aws-``. 
+* An index may require namespaces to be applied for and approved, reducing the + likelihood of typosquatting. +* An attacker would be unable to squat a name that includes a namespace. Rationale ========= @@ -159,39 +139,20 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in :rfc:`2119`. -Organization - `Organizations `_ are entities that own projects and have various - users associated with them. +Owner + Owners are entities that are allowed to upload certain package names. Grant A grant is a reservation of a namespace for a package repository. -Open Namespace - An `open `_ namespace allows for uploads from any project - owner. -Restricted Namespace - A restricted namespace only allows uploads from an owner of the namespace. Parent Namespace A namespace's parent refers to the namespace without the trailing hyphenated component e.g. the parent of ``foo-bar`` is ``foo``. Child Namespace - A namespace's child refers to the namespace with additional trailing - hyphenated components e.g. ``foo-bar`` is a valid child of ``foo`` as is - ``foo-bar-baz``. + A namespace's child refers to the namespace with a single trailing + hyphenated component e.g. ``foo-bar`` is a valid child of ``foo``. Specification ============= -.. _orgs: - -Organizations -------------- - -Any package repository that allows for the creation of projects (e.g. -non-mirrors) MAY offer the concept of organizations [6]_. Organizations are -entities that own projects and have various users associated with them. - -Organizations MAY reserve one or more namespaces. Such reservations neither -confer ownership nor grant special privileges to existing projects. - .. _naming: Naming @@ -208,8 +169,7 @@ Semantics A namespace grant bestows ownership over the following: -1. A project matching the namespace itself such as the placeholder package - `microsoft `__. +1.
A project that exactly matches the namespace itself. 2. Projects that start with the namespace followed by a hyphen. For example, the namespace ``foo`` would match the normalized project name ``foo-bar`` but not the project name ``foobar``. @@ -217,69 +177,48 @@ A namespace grant bestows ownership over the following: Package name matching acts upon the `normalized `_ namespace. Namespaces are per-package repository and SHALL NOT be shared between -repositories. For example, if PyPI has a namespace ``microsoft`` that is owned -by the company Microsoft, packages starting with ``microsoft-`` that come from -other non-PyPI mirror repositories do not confer the same level of trust. - -Grants MUST NOT overlap. For example, if there is an existing grant -for ``foo-bar`` then a new grant for ``foo`` would be forbidden. An overlap is -determined by comparing the `normalized `_ proposed namespace with the -normalized namespace of every existing root grant. Every comparison must append -a hyphen to the end of the proposed and existing namespace. An overlap is -detected when any existing namespace starts with the proposed namespace. +repositories. For example, if PyPI has a namespace ``acme`` that is owned by +the company Acme, packages starting with ``acme-`` that come from other +non-PyPI mirror repositories do not confer the same level of trust. + +Grants MUST NOT overlap ownership. For example, if there is an existing grant +for ``foo-bar`` then a new grant for ``foo`` would only be possible for the +owner of the former. An overlap is determined by comparing the +`normalized `_ proposed namespace with the normalized namespace of +every existing root grant. Every comparison must append a hyphen to the end of +the proposed and existing namespace. An overlap is detected when any existing +namespace starts with the proposed namespace. + +Repositories SHOULD impose a depth limit on the number of hyphens in a namespace. 
+For example, if the depth limit is ``1`` then the namespace ``foo-bar`` would be +allowed but ``foo-bar-baz`` could not be granted. .. _uploads: Uploads ------- -If the name of a package being uploaded matches a reserved namespace and either -of the following criteria are true: - -* The project does not yet exist. -* The project is not owned by an organization with an active grant for the - namespace. - -Then the upload MUST fail with a 403 HTTP status code. - -.. _open-namespaces: - -Open Namespaces ------------------ - -The owner of a grant may choose to allow others the ability to release new -projects with the associated namespace. Doing so MUST allow -`uploads `_ for new projects matching the namespace from any user. - -It is possible for the owner of a namespace to both make it open and allow -other organizations to use the grant. In this case, the authorized -organizations have no special permissions and are equivalent to an open grant -without ownership. +Uploads MUST fail with a :rfc:`409 Conflict <9110#name-409-conflict>` HTTP +status code if the name of a package being uploaded matches a reserved namespace +and the project owner does not have an active grant for the namespace. -.. _hidden-grants: - -Hidden Grants -------------- - -Repositories MAY create hidden grants that are not visible to the public which -prevent their namespaces from being claimed by others. Such grants MUST NOT be -`open `_ and SHOULD NOT be exposed in the -`API `_. - -Hidden grants are useful for repositories that wish to enforce upload -restrictions without the need to expose the namespace to the public. +Repositories SHOULD have an exception to this rule for projects that existed +before the namespace was reserved. .. _repository-metadata: Repository Metadata ------------------- -The :pep:`JSON API <691>` version will be incremented from ``1.2`` to ``1.3``. +The :pep:`JSON API <691>` version will be incremented from ``1.3`` to ``1.4``. 
The following API changes MUST be implemented by repositories that support this PEP. Repositories that do not support this PEP MUST NOT implement these changes so that consumers of the API are able to determine whether the repository supports this PEP. +The following API changes would allow installers to offer users extra +`security policies `_. + .. _project-detail: Project Detail @@ -288,21 +227,26 @@ Project Detail The :pep:`project detail <691#project-detail>` response will be modified as follows. -The ``namespace`` key MUST be ``null`` if the project does not match an active +The ``namespaces`` key MUST be ``null`` if the project does not match an active namespace grant. If the project does match a namespace grant, the value MUST be -a mapping with the following keys: - -* ``prefix``: This is the associated `normalized `_ namespace e.g. - ``foo-bar``. If the owner of the project owns multiple matching grants then - this MUST be the namespace with the most number of characters. For example, - if the project name matched both ``foo-bar`` and ``foo-bar-baz`` then this - key would be the latter. -* ``authorized``: This is a boolean and will be true if the project owner - is an organization and is one of the current owners of the grant. This is - useful for tools that wish to make a distinction between official and - community packages. -* ``open``: This is a boolean indicating whether the namespace is - `open `_. +an array of mappings representing each matching namespace. Every mapping MUST +have the following keys: + +* ``name``: This is the associated `normalized `_ namespace e.g. + ``foo-bar``. +* ``owned``: This is a boolean and will be true if the project owner is + one of the current owners of the grant. This will only be false if the + project existed before the namespace was reserved and the repository + `allows `_ continued uploads. + +Namespace List +'''''''''''''' + +The format of this URL is ``/namespaces``. 
+ +The response MUST be an array of mappings representing each reserved namespace. +Every mapping MUST have a ``name`` key that is the `normalized `_ +namespace e.g. ``foo-bar``. Namespace Detail '''''''''''''''' @@ -311,28 +255,25 @@ The format of this URL is ``/namespace/`` where ```` is the `normalized `_ namespace. For example, the URL for the namespace ``foo.bar`` would be ``/namespace/foo-bar``. -The response will be a mapping with the following keys: +The response MUST be a mapping with the following keys: -* ``prefix``: This is the `normalized `_ version of the namespace e.g. +* ``name``: This is the `normalized `_ version of the namespace e.g. ``foo-bar``. -* ``owner``: This is the organization that is responsible for the namespace. -* ``open``: This is a boolean indicating whether the namespace is - `open `_. * ``parent``: This is the parent namespace if it exists. For example, if the namespace is ``foo-bar`` and there is an active grant for ``foo``, then this would be ``"foo"``. If there is no parent then this key will be ``null``. -* ``children``: This is an array of any child namespaces. For example, if the - namespace is ``foo`` and there are active grants for ``foo-bar`` and - ``foo-bar-baz`` then this would be ``["foo-bar", "foo-bar-baz"]``. +* ``children``: This is an array of direct child namespaces. For example, + if the namespace is ``foo`` and there are active grants for ``foo-bar`` and + ``foo-bar-baz`` then this would be ``["foo-bar"]``. + +The mapping MAY have an ``owner`` key that refers to the current owner of the +namespace. Grant Removal ------------- When a reserved namespace becomes unclaimed, repositories MUST set the -``namespace`` key to ``null`` in the `API `_. - -Namespaces that were previously claimed but are now not SHOULD be eligible for -claiming again by any organization. +``namespaces`` key to ``null`` in the `API `_. 
Community Buy-in ================ @@ -355,9 +296,12 @@ this PEP (with a link to the discussion): Backwards Compatibility ======================= -There are no intrinsic concerns because there is still a flat namespace and -installers need no modification. Additionally, many projects have already -chosen to signal a shared purpose with a prefix like `typeshed has done`__. +There are no intrinsic concerns because projects continue to use existing +naming semantics. Projects with or without a namespace are indistinguishable +from the perspective of the user. Installers need no modification. + +Additionally, many projects have already chosen to signal a shared purpose with +a prefix like `typeshed has done`__. __ https://github.com/python/typeshed/issues/2491#issuecomment-578456045 @@ -366,18 +310,20 @@ __ https://github.com/python/typeshed/issues/2491#issuecomment-578456045 Security Implications ===================== -* There is an opportunity to build on top of :pep:`740` and :pep:`480` so that - one could prove cryptographically that a specific release came from an owner - of the associated namespace. This PEP makes no effort to describe how this - will happen other than that work is planned for the future. +Installers could support enabling a security policy that would only allow +packages that match a specific set of namespaces and whose owner has an active +grant for the namespace. How to Teach This ================= -For consumers of packages we will document how metadata is exposed in the -`API `_ and potentially in future note tooling that -supports utilizing namespaces to provide extra security guarantees during -installation. +We will update the `PyPUG documentation`__ to describe the new +`metadata `_ that is returned by the API. + +__ https://packaging.python.org/en/latest/specifications/simple-repository-api/ + +In future we could also note tooling that supports utilizing namespaces to +provide extra security guarantees during installation. 
Reference Implementation ======================== @@ -388,8 +334,8 @@ A complete reference implementation of this PEP is available in Rejected Ideas ============== -Granting Reservations to Users ------------------------------- +Explicit Non-User Ownership +--------------------------- As package repositories have a flat namespace, allowing any user to reserve a namespace would be untenable not just because there would be @@ -398,16 +344,13 @@ human operators to manage the vetting of an arbitrary number of users. __ https://en.wikipedia.org/wiki/Tragedy_of_the_commons -.. _artifact-level-association: +An earlier version of this PEP proposed that only `organizations`__ could +reserve namespaces because of these practical considerations. However, +this was rejected as the organization concept has not been specified and +imposing such restrictions based on the anticipated PyPI implementation is +unnecessary. -Artifact-level Namespace Association ------------------------------------- - -An earlier version of this PEP proposed that metadata be associated with -individual artifacts at the point of release. This was rejected because it -had the potential to cause confusion for users who would expect the namespace -authorization guarantee to be at the project level based on current grants -rather than the time at which a given release occurred. +__ https://blog.pypi.org/posts/2023-04-23-introducing-pypi-organizations/ .. _organization-scoping: @@ -425,9 +368,7 @@ be a regression. The runtime environment of Python is also not conducive to scoping. Whereas multiple versions of the same JavaScript package may coexist, Python only allows a single global namespace. Barring major changes to the language itself, -this is nearly impossible to change. Additionally, users have come to expect -that the package name is usually the same as what they would import and -eliminating the flat namespace would do away with that convention. +this is nearly impossible to change. 
Scoping would be particularly affected by organization changes which are bound to happen over time. An organization may change their name due to internal @@ -441,6 +382,36 @@ packages released with the scoping would be incompatible with older tools and would cause confusion for users along with frustration from maintainers having to triage such complaints. +.. _artifact-level-association: + +Artifact-level Namespace Association +------------------------------------ + +An earlier version of this PEP proposed that metadata be associated with +individual artifacts at the point of release. This was rejected because it +had the potential to cause confusion for users who would expect the namespace +authorization guarantee to be at the project level based on current grants +rather than the time at which a given release occurred. + +Support HTML Simple API +----------------------- + +Exposing project-level metadata in the HTML version of the Simple API could +happen in one of two ways. + +The first is exposing a ``data-`` attribute on the ``/simple/`` page that +enumerates every project. There is no precedent for this, and installers +generally do not use this page. Additionally, this page is often cached for +long periods of time (24 hours in the case of PyPI). + +The other is to add a ``data-`` attribute on every artifact. This is suboptimal +because it may introduce confusion similar to the rejected +`artifact-level association `_ idea. Another +consideration is that in practice many private indices are implemented as +static pages served by cloud storage backed by a CDN. In this scenario, every +namespace change would require a mass update of all artifacts of matching +projects. + .. _dedicated-repositories: Encourage Dedicated Package Repositories @@ -467,12 +438,28 @@ and ``Y``. 
If each repository has both packages but one is malicious on ``X`` and the other is malicious on ``Y`` then the user would be unable to satisfy their requirements without encountering a malicious package. +Open Namespaces +--------------- + +An earlier version of this PEP proposed that the owner of a grant may choose +to allow others the ability to release new projects with the associated +namespace. This was removed due to insufficient motivation and the fact that +repositories could technically satisfy such use cases with standard grant +semantics. + +Hidden Grants +------------- + +An earlier version of this PEP proposed that repositories could create hidden +grants that are not visible to the public which prevent their namespaces from +being claimed by others. This was removed due to insufficient motivation. + .. _provenance-assertions: Exclusive Reliance on Provenance Assertions ------------------------------------------- -The idea here [5]_ would be to design a general purpose way for clients to make +The idea here [3]_ would be to design a general purpose way for clients to make provenance assertions to verify certain properties of dependencies, each with custom syntax. Some examples: @@ -677,9 +664,6 @@ Another issue with this approach is that projects often have branding in mind __ https://github.com/apache/airflow/discussions/41657#discussioncomment-10417439 -It's unrealistic to expect every company and project to voluntarily change -their existing and future package names. - Use DNS ------- @@ -703,50 +687,14 @@ None at this time. Footnotes ========= -.. [1] Additional examples of projects with restricted namespaces: - - - `Typeshed `__ is a community effort to - maintain type stubs for various packages. The stub packages they maintain - mirror the package name they target and are prefixed by ``types-``. For - example, the package ``requests`` has a stub that users would depend on - called ``types-requests``. 
Unofficial stubs are not supposed to use the - ``types-`` prefix and are expected to use a ``-stubs`` suffix instead. - - `Sphinx `__ is a documentation framework - popular for large technical projects such as - `Swift `__ and Python itself. They have - the concept of `extensions`__ which are prefixed by ``sphinxcontrib-``, - many of which are maintained within a - `dedicated organization `__. - - `Apache Airflow `__ is a platform to - programmatically orchestrate tasks as directed acyclic graphs (DAGs). - They have the concept of `plugins`__, and also `providers`__ which are - prefixed by ``apache-airflow-providers-``. - -.. [2] Additional examples of projects with open namespaces: - - - `pytest `__ is Python's most popular testing - framework. They have the concept of `plugins`__ which may be developed by - anyone and by convention are prefixed by ``pytest-``. - - `MkDocs `__ is a documentation framework based on - Markdown files. They also have the concept of - `plugins `__ which may be - developed by anyone and are usually prefixed by ``mkdocs-``. - - `Datadog `__ offers observability as a service. - The `Datadog Agent `__ ships - out-of-the-box with - `official integrations `__ - for many products, like various databases and web servers, which are - distributed as Python packages that are prefixed by ``datadog-``. There is - support for creating `third-party integrations`__ which customers may run. - -.. [3] The following shows the package prefixes for the major cloud providers: +.. [1] The following shows the package prefixes for the major cloud providers: - Amazon: `aws-cdk- `__ - Google: `google-cloud- `__ and others based on ``google-`` - Microsoft: `azure- `__ -.. [4] Examples of typosquatting attacks targeting Python users: +.. [2] Examples of typosquatting attacks targeting Python users: - ``django-`` namespace was squatted, among other packages, leading to a `postmortem `__ @@ -759,21 +707,11 @@ Footnotes among other packages. 
Notice how packages with a known prefix are much more prone to successful attacks. - ``typing-`` namespace was - `squatted `__ - and this would be useful to prevent as a `hidden grant `__. + `squatted `__. -.. [5] `Detailed write-up `__ of the +.. [3] `Detailed write-up `__ of the potential for provenance assertions. -.. [6] As an example, PyPI's concept of organizations is described - `here `__. - -__ https://www.sphinx-doc.org/en/master/usage/extensions/index.html -__ https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/plugins.html -__ https://airflow.apache.org/docs/apache-airflow-providers/index.html -__ https://docs.pytest.org/en/stable/how-to/writing_plugins.html -__ https://docs.datadoghq.com/developers/integrations/agent_integration/ - Copyright ========= From 1fbbd0bf2f39bb50000295913f386295bd5327fb Mon Sep 17 00:00:00 2001 From: Ofek Lev Date: Mon, 11 Aug 2025 15:53:29 -0400 Subject: [PATCH 2/2] Accept suggestion Co-authored-by: Dustin Ingram --- peps/pep-0752.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0752.rst b/peps/pep-0752.rst index 7685393ce20..f21e9fb2724 100644 --- a/peps/pep-0752.rst +++ b/peps/pep-0752.rst @@ -210,7 +210,7 @@ before the namespace was reserved. Repository Metadata ------------------- -The :pep:`JSON API <691>` version will be incremented from ``1.3`` to ``1.4``. +The :pep:`JSON API <691>` version will be incremented from ``1.4`` to ``1.5``. The following API changes MUST be implemented by repositories that support this PEP. Repositories that do not support this PEP MUST NOT implement these changes so that consumers of the API are able to determine whether the