ENH: Change ExtensionOpsMixin behaviour to not add new operator method if one is already defined on the ExtensionArray class #50767

@Finndersen

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

ExtensionScalarOpsMixin is convenient for automatically adding arithmetic and comparison operator support to custom Extension Type ExtensionArrays, using scalar type implementations.

However, element-wise operations are slow, so it is often desirable to implement custom arithmetic or comparison methods on the ExtensionArray class that use vectorised approaches for much better performance.

Currently, MyExtensionArray._add_arithmetic_ops() and MyExtensionArray._add_comparison_ops() override any operator overload methods already defined on MyExtensionArray. A developer who wants a custom implementation of even one operator for better performance must therefore abandon ExtensionScalarOpsMixin entirely and define ALL arithmetic and comparison operators themselves. It is not possible to combine manually defined operators with the element-wise fallback for the rest.
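To make the overriding behaviour concrete, here is a minimal sketch that does not use pandas at all. `ToyOpsMixin` and `ToyArray` are hypothetical stand-ins that only mimic the unconditional `setattr` pattern described above; they are not the real pandas classes.

```python
import operator


class ToyOpsMixin:
    """Hypothetical stand-in for the ops mixin; not the pandas class."""

    @classmethod
    def _create_comparison_method(cls, op):
        def method(self, other):
            # Element-wise fallback: apply `op` to each pair of scalars.
            return [op(a, b) for a, b in zip(self.data, other.data)]

        method.is_fallback = True  # marker so the two methods can be told apart
        return method

    @classmethod
    def _add_comparison_ops(cls):
        # Unconditional setattr: whatever the class defined is replaced.
        setattr(cls, "__eq__", cls._create_comparison_method(operator.eq))
        setattr(cls, "__lt__", cls._create_comparison_method(operator.lt))


class ToyArray(ToyOpsMixin):
    def __init__(self, data):
        self.data = list(data)

    def __eq__(self, other):
        # Custom implementation the developer would like to keep.
        return "custom-eq"


ToyArray._add_comparison_ops()

# The custom __eq__ is gone; the element-wise fallback runs instead.
result = ToyArray([1, 2]) == ToyArray([1, 3])
print(result)
```

After `_add_comparison_ops()` runs, `ToyArray.__eq__` is the generated fallback, so the hand-written method is never called.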

Feature Description

I think it would be very useful for the _add_*_ops() methods of ExtensionOpsMixin to first check if each operator overload method is already defined on the class before adding it. This would allow the developer to define some particular custom operator methods on the custom ExtensionArray for enhanced performance, but with fallback to the element-wise implementation from ExtensionScalarOpsMixin for those that are not defined.

An example implementation could look like:

Original:

    @classmethod
    def _add_comparison_ops(cls) -> None:
        setattr(cls, "__eq__", cls._create_comparison_method(operator.eq))
        setattr(cls, "__ne__", cls._create_comparison_method(operator.ne))
        setattr(cls, "__lt__", cls._create_comparison_method(operator.lt))
        setattr(cls, "__gt__", cls._create_comparison_method(operator.gt))
        setattr(cls, "__le__", cls._create_comparison_method(operator.le))
        setattr(cls, "__ge__", cls._create_comparison_method(operator.ge))

New:

    @classmethod
    def _add_comparison_ops(cls) -> None:
        for name, op in [
            ("__eq__", operator.eq),
            ("__ne__", operator.ne),
            ("__lt__", operator.lt),
            ("__gt__", operator.gt),
            ("__le__", operator.le),
            ("__ge__", operator.ge),
        ]:
            # Only add the element-wise fallback if the operator is not already defined.
            if not hasattr(cls, name):
                setattr(cls, name, cls._create_comparison_method(op))

This check could also potentially be controlled by an optional flag argument to _add_*_ops() methods to maintain backwards compatible behaviour if desired.
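As a rough sketch of how the optional flag might work, the toy example below adds a hypothetical `overwrite` parameter (the name is an assumption, not an existing pandas argument) that defaults to the current behaviour. One caveat worth noting: `object` already defines all six rich-comparison methods, so `hasattr(cls, "__eq__")` is true for every class. This sketch therefore checks the class's own namespace with `vars(cls)` instead. `ToyOpsMixin` and `ToyArray` are illustrative stand-ins, not the pandas classes.

```python
import operator

COMPARISON_OPS = [
    ("__eq__", operator.eq),
    ("__ne__", operator.ne),
    ("__lt__", operator.lt),
    ("__gt__", operator.gt),
    ("__le__", operator.le),
    ("__ge__", operator.ge),
]


class ToyOpsMixin:
    """Hypothetical stand-in for the ops mixin; not the pandas class."""

    @classmethod
    def _create_comparison_method(cls, op):
        def method(self, other):
            # Element-wise fallback: apply `op` to each pair of scalars.
            return [op(a, b) for a, b in zip(self.data, other.data)]

        return method

    @classmethod
    def _add_comparison_ops(cls, overwrite=True):
        # `overwrite` is an assumed flag name; True keeps today's behaviour.
        for name, op in COMPARISON_OPS:
            # vars(cls) only sees methods defined directly on this class,
            # so the defaults inherited from `object` do not count.
            if overwrite or name not in vars(cls):
                setattr(cls, name, cls._create_comparison_method(op))


class ToyArray(ToyOpsMixin):
    def __init__(self, data):
        self.data = list(data)

    def __eq__(self, other):
        # Stand-in for a faster, vectorised implementation.
        return "custom-eq"


ToyArray._add_comparison_ops(overwrite=False)

a, b = ToyArray([1, 5]), ToyArray([1, 3])
eq_result = a == b  # custom __eq__ is preserved
lt_result = a < b   # element-wise fallback fills the gap
```

Checking `vars(cls)` ignores operators inherited from an intermediate base class. A real implementation would need to decide whether those should count as "already defined", for example by walking the MRO up to the mixin.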

Alternative Solutions

None that I know of

Additional Context

No response
