Description
#62353 is about EA arithmetic with a list, which is not in general supported (lists are cast to ndarrays by the Series/Index/DataFrame op) but is for some EA subclasses because we are not consistent. This got me thinking: what would it take to get consistent and what else would that address?
Best case is to stick more boilerplate into unpack_zerodim_and_defer. Basically all our internal methods use this, and 3rd party EAs use it indirectly if they mix in OpsMixin.
I propose we:
- Make official EA arithmetic/comparison/logical ops accept only ArrayLike or scalar-like others (in this context, if we see e.g. a tuple or list, we treat it as a scalar).
- Add to unpack_zerodim_and_defer a check along the lines of:
    if isinstance(other, (np.ndarray, ExtensionArray)):
        if not self._supports_array_op(other.dtype, op):
            raise TypeError("Consistent message about dtypes")
    else:
        if not self._supports_scalar_op(other, op):
            raise TypeError("Consistent message about types")
There will be some cases where the correct thing to do is to return NotImplemented, in which case these validations will not raise and potential raising will be left to the reversed operation.
We can implement OpsMixin._supports_(scalar|array)_op to always return True, so it is a no-op if the subclass doesn't override it.
- Add to unpack_zerodim_and_defer a check like:
    if isinstance(other, (np.ndarray, ExtensionArray)) and other.shape != self.shape:
        raise ValueError("Consistent message about matching shapes/lengths")
Note that by putting the type check before the shape check, we'll also make it consistent which exception is raised when both the type and the shape are wrong, which ATM is haphazard.
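To make the proposal concrete, here is a minimal pure-Python sketch of how the two checks could fit together. It is not pandas code: ToyArray, IntArray and validate_other are hypothetical stand-ins for ExtensionArray, an EA subclass and the extra logic in unpack_zerodim_and_defer, and the exception messages are placeholders. Returning NotImplemented is not modelled.

```python
import functools
import operator


class ToyArray:
    """Hypothetical stand-in for an ExtensionArray; not a pandas class."""

    def __init__(self, values, dtype):
        self.values = list(values)
        self.dtype = dtype

    @property
    def shape(self):
        return (len(self.values),)

    # Default hooks accept everything, so validation is a no-op
    # unless a subclass overrides them.
    def _supports_array_op(self, other_dtype, op):
        return True

    def _supports_scalar_op(self, other, op):
        return True


def validate_other(op):
    """Hypothetical decorator sketching the proposed checks."""

    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, other):
            if isinstance(other, ToyArray):
                # Type check first, then shape check, so a wrong dtype
                # always wins over a wrong length.
                if not self._supports_array_op(other.dtype, op):
                    raise TypeError(
                        f"{op.__name__} not supported with dtype {other.dtype}"
                    )
                if other.shape != self.shape:
                    raise ValueError("Lengths must match")
            elif not self._supports_scalar_op(other, op):
                # Lists and tuples land here: they are treated as scalars.
                raise TypeError(
                    f"{op.__name__} not supported with {type(other).__name__}"
                )
            return method(self, other)

        return wrapper

    return decorator


class IntArray(ToyArray):
    """Hypothetical subclass that opts in to stricter validation."""

    def __init__(self, values):
        super().__init__(values, dtype="int")

    def _supports_array_op(self, other_dtype, op):
        return other_dtype == "int"

    def _supports_scalar_op(self, other, op):
        return isinstance(other, int) and not isinstance(other, bool)

    @validate_other(operator.add)
    def __add__(self, other):
        if isinstance(other, ToyArray):
            return IntArray(a + b for a, b in zip(self.values, other.values))
        return IntArray(a + other for a in self.values)
```

With this sketch, IntArray([1, 2, 3]) + [1, 2, 3] raises TypeError because the list is treated as an unsupported scalar, and adding a float-dtype ToyArray of the wrong length raises TypeError rather than ValueError because the dtype check runs first.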
Downsides added as I think of them
- small perf hit
- sometimes we don't know whether an operation is valid until after we do some non-trivial checks inside the method, e.g. ArrowEA._cmp_method
- for e.g. TimedeltaArray with bool dtype, we might want a custom exception message about casting to ints.
- doesn't handle our scalars