diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 67a6f1bdc51..6ac66492831 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -646,6 +646,7 @@ peps/pep-0765.rst @iritkatriel @ncoghlan
 peps/pep-0766.rst @warsaw
 peps/pep-0767.rst @carljm
 peps/pep-0768.rst @pablogsal
+peps/pep-0769.rst @facundobatista
 peps/pep-0770.rst @sethmlarson @brettcannon
 # ...
 peps/pep-0777.rst @warsaw
diff --git a/peps/pep-0769.rst b/peps/pep-0769.rst
new file mode 100644
index 00000000000..8059175467b
--- /dev/null
+++ b/peps/pep-0769.rst
@@ -0,0 +1,363 @@
+PEP: 769
+Title: Add a 'default' keyword argument to 'attrgetter' and 'itemgetter'
+Author: Facundo Batista
+Status: Draft
+Type: Standards Track
+Created: 22-Dec-2024
+Python-Version: 3.14
+
+
+Abstract
+========
+
+This proposal aims to enhance the ``operator`` module by adding a
+``default`` keyword argument to the ``attrgetter`` and ``itemgetter``
+functions. This addition would allow these functions to return a
+specified default value when the targeted attribute or item is missing,
+thereby preventing exceptions and simplifying code that handles optional
+attributes or items.
+
+
+Motivation
+==========
+
+Currently, ``attrgetter`` and ``itemgetter`` raise exceptions if the
+specified attribute or item is absent. This limitation requires
+developers to implement additional error handling, leading to more
+complex and less readable code.
+
+Introducing a ``default`` parameter would streamline operations involving
+optional attributes or items, reducing boilerplate code and enhancing
+code clarity.
+
+
+Rationale
+=========
+
+The primary design decision is to introduce a single ``default`` parameter
+applicable to all specified attributes or items.
+
+This approach maintains simplicity and avoids the complexity of assigning
+individual default values to multiple attributes or items.
+While some discussions considered allowing multiple defaults, the increased
+complexity and potential for confusion led to favoring a single default
+value for all cases (more about this below in `Rejected Ideas
+<PEP 769 Rejected Ideas_>`__).
+
+
+Specification
+=============
+
+Proposed behaviours:
+
+- **attrgetter**: ``f = attrgetter("name", default=XYZ)`` followed by
+  ``f(obj)`` would return ``obj.name`` if the attribute exists, else
+  ``XYZ``.
+
+- **itemgetter**: ``f = itemgetter(2, default=XYZ)`` followed by
+  ``f(obj)`` would return ``obj[2]`` if that is valid, else ``XYZ``.
+
+This enhancement applies to single and multiple attribute/item
+retrievals, with the default value returned for any missing attribute or
+item.
+
+No functionality change is incorporated if ``default`` is not used.
+
+
+Examples for attrgetter
+-----------------------
+
+Current behaviour, no changes introduced::
+
+    >>> from operator import attrgetter
+    >>> class C:
+    ...     class D:
+    ...         class X:
+    ...             pass
+    ...     class E:
+    ...         pass
+    ...
+    >>> attrgetter("D")(C)
+    <class '__main__.C.D'>
+    >>> attrgetter("badname")(C)
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in <module>
+    AttributeError: type object 'C' has no attribute 'badname'
+    >>> attrgetter("D", "E")(C)
+    (<class '__main__.C.D'>, <class '__main__.C.E'>)
+    >>> attrgetter("D", "badname")(C)
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in <module>
+    AttributeError: type object 'C' has no attribute 'badname'
+    >>> attrgetter("D.X")(C)
+    <class '__main__.C.D.X'>
+    >>> attrgetter("D.badname")(C)
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in <module>
+    AttributeError: type object 'D' has no attribute 'badname'
+
+Using ``default``::
+
+    >>> attrgetter("D", default="noclass")(C)
+    <class '__main__.C.D'>
+    >>> attrgetter("badname", default="noclass")(C)
+    'noclass'
+    >>> attrgetter("D", "E", default="noclass")(C)
+    (<class '__main__.C.D'>, <class '__main__.C.E'>)
+    >>> attrgetter("D", "badname", default="noclass")(C)
+    (<class '__main__.C.D'>, 'noclass')
+    >>> attrgetter("D.X", default="noclass")(C)
+    <class '__main__.C.D.X'>
+    >>> attrgetter("D.badname", default="noclass")(C)
+    'noclass'
+
+
+Examples for itemgetter
+-----------------------
+
+Current behaviour, no changes
+introduced::
+
+    >>> from operator import itemgetter
+    >>> obj = ["foo", "bar", "baz"]
+    >>> itemgetter(1)(obj)
+    'bar'
+    >>> itemgetter(5)(obj)
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in <module>
+    IndexError: list index out of range
+    >>> itemgetter(1, 0)(obj)
+    ('bar', 'foo')
+    >>> itemgetter(1, 5)(obj)
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in <module>
+    IndexError: list index out of range
+
+
+Using ``default``::
+
+    >>> itemgetter(1, default="XYZ")(obj)
+    'bar'
+    >>> itemgetter(5, default="XYZ")(obj)
+    'XYZ'
+    >>> itemgetter(1, 0, default="XYZ")(obj)
+    ('bar', 'foo')
+    >>> itemgetter(1, 5, default="XYZ")(obj)
+    ('bar', 'XYZ')
+
+
+.. _PEP 769 About Possible Implementations:
+
+About Possible Implementations
+------------------------------
+
+For ``attrgetter`` the implementation is quite direct: it implies using
+``getattr`` and catching a possible ``AttributeError``. So
+``attrgetter("name", default=XYZ)(obj)`` would be like::
+
+    try:
+        value = getattr(obj, "name")
+    except AttributeError:
+        value = XYZ
+
+Note that we cannot rely on calling ``getattr`` with a default value, as it
+would be impossible to distinguish what it returned on each step when an
+attribute chain is specified (e.g.
+``attrgetter("foo.bar.baz", default=XYZ)``).
+
+For ``itemgetter`` it is not that easy. The most straightforward way is
+similar to the above, and also simple to define and understand: attempting
+``__getitem__`` and catching a possible exception (any of the three
+indicated in the ``__getitem__`` reference). This way,
+``itemgetter(123, default=XYZ)(obj)`` would be equivalent to::
+
+    try:
+        value = obj[123]
+    except (TypeError, IndexError, KeyError):
+        value = XYZ
+
+However, this would not be as efficient as we would want for particular
+cases, e.g. using dictionaries where particularly good performance is
+desired.
+A more complex alternative would be::
+
+    if isinstance(obj, dict):
+        value = obj.get(123, XYZ)
+    else:
+        try:
+            value = obj[123]
+        except (TypeError, IndexError, KeyError):
+            value = XYZ
+
+This gives better performance, but is more complicated to implement and
+explain. This is the first case in the `Open Issues
+<PEP 769 Open Issues_>`__ section later.
+
+
+Corner Cases
+------------
+
+Providing a ``default`` option would only work when accessing the
+item/attribute would fail in a regular situation. In other words, the
+object accessed should not handle defaults itself.
+
+For example, the following would be redundant/confusing because
+``defaultdict`` will never error out when accessing the item::
+
+    >>> from collections import defaultdict
+    >>> from operator import itemgetter
+    >>> dd = defaultdict(int)
+    >>> itemgetter("foo", default=-1)(dd)
+    0
+
+The same applies to any user-built object that overloads ``__getitem__``
+or ``__getattr__`` implementing fallbacks.
+
+
+.. _PEP 769 Rejected Ideas:
+
+Rejected Ideas
+==============
+
+Multiple Default Values
+-----------------------
+
+The idea of allowing multiple default values for multiple attributes or
+items was considered.
+
+Two alternatives were discussed: using an iterable that must have the
+same number of items as the parameters given to
+``attrgetter``/``itemgetter``, or using a dictionary with keys matching
+the names passed to ``attrgetter``/``itemgetter``.
+
+The really complex thing to solve in these cases, which would make the
+feature hard to explain and give it confusing corners, is what would
+happen if an iterable or dictionary is the *unique* default desired for
+all items. For example::
+
+    >>> itemgetter("a", default=(1, 2))({})
+    (1, 2)
+    >>> itemgetter("a", "b", default=(1, 2))({})
+    ((1, 2), (1, 2))
+
+If we allow "multiple default values" using ``default``, the first case
+in the example above would raise an exception because there are more items
+in the default than names, and the second case would return ``(1, 2)``.
+This is why the possibility emerged of using a different name for multiple
+defaults (``defaults``, which is expressive but maybe error prone because
+it is too similar to ``default``).
+
+As part of this conversation there was another proposal that would enable
+multiple defaults, which is allowing combinations of ``attrgetter`` and
+``itemgetter``, e.g.::
+
+    >>> ig_a = itemgetter("a", default=1)
+    >>> ig_b = itemgetter("b", default=2)
+    >>> ig_combined = itemgetter(ig_a, ig_b)
+    >>> ig_combined({"a": 999})
+    (999, 2)
+    >>> ig_combined({})
+    (1, 2)
+
+However, combining ``itemgetter`` or ``attrgetter`` is a totally new
+behaviour that is very complex to define; not impossible, but beyond the
+scope of this PEP.
+
+In the end, having multiple default values was deemed overly complex and
+potentially confusing, and a single ``default`` parameter was favored for
+simplicity and predictability.
+
+
+Tuple Return Consistency
+------------------------
+
+Another rejected proposal was adding a flag to always return a tuple
+regardless of how many keys/names/indices were passed as arguments.
+E.g.::
+
+    >>> letters = ["a", "b", "c"]
+    >>> itemgetter(1, return_tuple=True)(letters)
+    ('b',)
+    >>> itemgetter(1, 2, return_tuple=True)(letters)
+    ('b', 'c')
+
+This would be of a little help for multiple default values consistency,
+but requires further discussion and is certainly out of the scope of this
+PEP.
+
+
+.. _PEP 769 Open Issues:
+
+Open Issues
+===========
+
+Behaviour Equivalence for ``itemgetter``
+----------------------------------------
+
+We need to define how ``itemgetter`` would behave: either just attempt to
+access the item and capture exceptions no matter what the object is, or
+first check whether the object provides a ``get`` method and use it to
+retrieve the item with a default. See examples in the `About Possible
+Implementations <PEP 769 About Possible Implementations_>`__ subsection
+above.
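As a rough illustration of the two strategies in this open issue, the sketch below emulates single-item lookup in plain Python. The helper names ``lookup_try_except`` and ``lookup_dict_fast_path`` are hypothetical; they are not part of the PEP or of the ``operator`` module. The last two calls show one place where the strategies would diverge: ``dict.get`` does not invoke a ``defaultdict`` factory, so special-casing ``dict`` instances would change the result shown under Corner Cases.

```python
from collections import defaultdict


def lookup_try_except(obj, key, default):
    # Strategy 1 (hypothetical helper): attempt the access and catch
    # the three exceptions named in the PEP.
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        return default


def lookup_dict_fast_path(obj, key, default):
    # Strategy 2 (hypothetical helper): use .get only for dict instances,
    # otherwise fall back to the try/except strategy.
    if isinstance(obj, dict):
        return obj.get(key, default)
    return lookup_try_except(obj, key, default)


# Both strategies agree for plain dicts and lists.
print(lookup_try_except({"a": 1}, "b", "XYZ"))           # XYZ
print(lookup_dict_fast_path(["foo", "bar"], 5, "XYZ"))   # XYZ

# They disagree for defaultdict, which is a dict subclass:
# indexing calls the factory, while .get does not.
print(lookup_try_except(defaultdict(int), "foo", -1))      # 0
print(lookup_dict_fast_path(defaultdict(int), "foo", -1))  # -1
```

Whichever strategy is chosen, this suggests the specification may need to state the expected result for ``dict`` subclasses explicitly.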
+
+This would help performance for the case of dictionaries, but would make
+the ``default`` feature somewhat more difficult to explain, and a little
+confusing if some object that is not a dictionary but provides a ``get``
+method is used. Alternatively, we could call ``.get`` *only* if the
+object is an instance of ``dict``.
+
+In any case, a desirable situation is that we do *not* affect performance
+at all if the ``default`` is not triggered. Checking for ``.get`` would
+get the default faster in case of dicts, but implies doing a verification
+in all cases. Using the try/except model would make it not as fast as it
+could be in the case of dictionaries, but would not introduce delays if
+the default is not triggered.
+
+
+Add a Default to ``getitem``
+----------------------------
+
+It was proposed that we could also enhance ``getitem``, as part of
+this PEP, adding ``default`` to it as well.
+
+This would not only improve ``getitem`` itself, but we would also gain
+internal consistency in the ``operator`` module and in comparison with
+the ``getattr`` builtin function that also has a default.
+
+The definition could be as simple as the try/except proposed above, so
+doing ``getitem(obj, name, default)`` would be equivalent to::
+
+    try:
+        result = obj[name]
+    except (TypeError, IndexError, KeyError):
+        result = default
+
+(However, see the previous open issue about the special case for
+dictionaries.)
+
+
+How to Teach This
+=================
+
+As the basic behaviour is not modified, this new ``default`` can be
+avoided when teaching ``attrgetter`` and ``itemgetter`` for the first
+time, and can be introduced only when the need for the functionality
+arises.
+
+
+Backwards Compatibility
+=======================
+
+The proposed changes are backward-compatible. The ``default`` parameter
+is optional; existing code without this parameter will function as
+before.
+Only code that explicitly uses the new ``default`` parameter will
+exhibit the new behavior, ensuring no disruption to current
+implementations.
+
+
+Security Implications
+=====================
+
+Introducing a ``default`` parameter does not inherently introduce
+security vulnerabilities.
+
+
+Copyright
+=========
+
+This document is placed in the public domain or under the
+CC0-1.0-Universal license, whichever is more permissive.