Tighten-up NativeSeries Protocol #2111

@dangotbanned

Description

Noticed during (#2110 (comment)) that the current definition is quite prone to false positives:

class NativeSeries(Protocol):
    def __len__(self) -> int: ...

The examples in the attached image all error at runtime, yet list[int] passes static type checking because list.__len__ exists.
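To make the false positive concrete, here is a minimal sketch. The name `LenOnlySeries` is hypothetical and mirrors the current definition; `runtime_checkable` is added only so the structural match can be observed with `isinstance`, whereas the real protocol lives under `TYPE_CHECKING` and is checked statically.

```python
from typing import Protocol, runtime_checkable


# Assumption: runtime_checkable is used purely for demonstration.
# The protocol in narwhals/typing.py is only seen by static type checkers.
@runtime_checkable
class LenOnlySeries(Protocol):  # hypothetical stand-in for the current NativeSeries
    def __len__(self) -> int: ...


def matches(obj: object) -> bool:
    """Return True if obj structurally satisfies the len-only protocol."""
    return isinstance(obj, LenOnlySeries)


# Every sized builtin satisfies the protocol, although none of them is a Series.
false_positives = {
    type(obj).__name__: matches(obj)
    for obj in ([1, 2, 3], (1, 2), "abc", {"a": 1}, {1, 2})
}
print(false_positives)
```

Anything that defines `__len__` is accepted, which is why a plain `list[int]` slips through.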

Solution

For all the cases we support in nw.from_native, we can require __array__ to narrow the protocol further.

Diff

diff --git a/narwhals/typing.py b/narwhals/typing.py
index 9c868694..c13fcbb7 100644
--- a/narwhals/typing.py
+++ b/narwhals/typing.py
@@ -4,6 +4,7 @@ from typing import TYPE_CHECKING
 from typing import Any
 from typing import Callable
 from typing import Generic
+from typing import Iterator
 from typing import Literal
 from typing import Protocol
 from typing import Sequence
@@ -38,6 +39,9 @@ if TYPE_CHECKING:
 
     class NativeSeries(Protocol):
         def __len__(self) -> int: ...
+        def __getitem__(self, key: Any, /) -> Any: ...
+        def __iter__(self) -> Iterator[Any]: ...
+        def __array__(self, *args: Any, **kwds: Any) -> Any: ...
 
     class DataFrameLike(Protocol):
         def __dataframe__(self, *args: Any, **kwargs: Any) -> Any: ...

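The effect of the proposed members can be sketched as follows. `TightSeries` copies the four methods from the diff; `FakeSeries` is a hypothetical class invented for this example, and `runtime_checkable` is again an assumption made only so the match can be shown with `isinstance`. Note that a runtime check only looks for the presence of the methods, not their signatures.

```python
from typing import Any, Iterator, Protocol, runtime_checkable


@runtime_checkable  # assumption: for demonstration only
class TightSeries(Protocol):
    def __len__(self) -> int: ...
    def __getitem__(self, key: Any, /) -> Any: ...
    def __iter__(self) -> Iterator[Any]: ...
    def __array__(self, *args: Any, **kwds: Any) -> Any: ...


class FakeSeries:
    """Hypothetical Series-like class exposing all four required methods."""

    def __init__(self, values: Any) -> None:
        self._values = list(values)

    def __len__(self) -> int:
        return len(self._values)

    def __getitem__(self, key: Any, /) -> Any:
        return self._values[key]

    def __iter__(self) -> Iterator[Any]:
        return iter(self._values)

    def __array__(self, *args: Any, **kwds: Any) -> Any:
        # A real implementation would return a numpy.ndarray here.
        return self._values


# Builtins have __len__, __getitem__ and __iter__ but no __array__.
rejected = [isinstance(obj, TightSeries) for obj in ([1, 2, 3], (1, 2), "abc", {"a": 1})]
accepted = isinstance(FakeSeries([1, 2, 3]), TightSeries)
print(rejected, accepted)
```

With `__array__` required, the builtin containers no longer match while a class that exposes it still does.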
Open questions

  • Would __array__ be too numpy-centric?
  • Are there any other widely-used, third-party Series-like classes we should be aware of?

Other common methods

  • filter
  • unique
  • value_counts
  • equals
  • take
  • to_numpy
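One way to evaluate these candidates is to check which of them a given class actually exposes. The helper below is a rough sketch: `missing_methods` and `PartialSeries` are hypothetical names introduced for illustration, and the check only tests that a callable attribute exists, not that its signature or behaviour is compatible.

```python
# The method names listed above, as candidates for extra protocol members.
CANDIDATE_METHODS = ("filter", "unique", "value_counts", "equals", "take", "to_numpy")


def missing_methods(obj: object) -> list[str]:
    """Return the candidate method names that obj does not expose as callables."""
    return [
        name
        for name in CANDIDATE_METHODS
        if not callable(getattr(obj, name, None))
    ]


class PartialSeries:
    """Hypothetical Series-like class implementing only two of the candidates."""

    def __init__(self, values):
        self._values = list(values)

    def unique(self):
        # Preserve first-seen order while dropping duplicates.
        return list(dict.fromkeys(self._values))

    def take(self, indices):
        return [self._values[i] for i in indices]


print(missing_methods([1, 2, 3]))
print(missing_methods(PartialSeries([1, 1, 2])))
```

Running this against each backend's Series class would show which methods are common to all of them and therefore safe to require.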
