@dangotbanned dangotbanned commented Apr 1, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Had this idea nagging away at me while working on (#2116).

Super high-level

This PR adds a Namespace class that can be created via either:

Namespace.from_backend(...)
Namespace.from_native_object(...)

Supporting these new @classmethods is a new method for Implementation:

Implementation._backend_version()
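
As a rough illustration of the two construction paths, here is a minimal, self-contained sketch. Everything in it is hypothetical: `ToyNamespace`, the two-member `Implementation` enum, and the module-name lookup only stand in for the real classes, which resolve narwhals-compliant namespaces rather than plain enum members.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Any


class Implementation(Enum):
    """Toy stand-in for narwhals' `Implementation` (hypothetical subset)."""

    PANDAS = "pandas"
    POLARS = "polars"

    @classmethod
    def from_backend(cls, backend: str | Implementation) -> Implementation:
        # Accept either an existing member or its string name
        return backend if isinstance(backend, cls) else cls(backend)


@dataclass(frozen=True)
class ToyNamespace:
    """Hypothetical, simplified analogue of the private `Namespace` class."""

    implementation: Implementation

    @classmethod
    def from_backend(cls, backend: str | Implementation) -> ToyNamespace:
        return cls(Implementation.from_backend(backend))

    @classmethod
    def from_native_object(cls, native: Any) -> ToyNamespace:
        # Assumption: the top-level module of the object's type names the backend
        root = type(native).__module__.split(".")[0]
        try:
            implementation = Implementation(root)
        except ValueError:
            msg = f"Unsupported native object: {type(native).__name__!r}"
            raise TypeError(msg) from None
        return cls(implementation)
```

Both constructors end at the same place (an object that knows its `Implementation`), which is what lets later code depend on a single entry point instead of per-backend branches.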

Notes

I've tried to keep things isolated from the rest of narwhals - for now - but the goal would be replacing similar logic from all of:

  • functions.py
  • translate.py
  • utils.py

Important

Not planning to make this a public class

Example

This further simplifies (#2283) to a one-liner, with the added bonus of typing from a backend string

- Currently just a fancy `nw.utils._into_compliant_namespace`
- `_version` defined in the same way as `nw.Schema`
  - Just override in subclass
We *at least* get completions for `self.compliant`, even if they just resolve to `Any`
Will support the pattern from (#2315)
Also checks repr at runtime, which right now is the same as annotations
These are the *kinds* of tasks I'm hoping to simplify and introduce typing to
Might switch more over if they show up as issues in CI
- The stuff in `nw.utils` will be replaced by this afterwards
- Best to avoid depending on them for now
Comment on lines 536 to 554
@property
def _backend_version(self) -> tuple[int, ...]:
    native = self.to_native_namespace()
    into_version: Any
    if self not in {
        Implementation.PYSPARK,
        Implementation.DASK,
        Implementation.SQLFRAME,
    }:
        into_version = native
    elif self is Implementation.PYSPARK:
        into_version = get_pyspark()
    elif self is Implementation.DASK:
        into_version = get_dask()
    else:
        import sqlframe._version

        into_version = sqlframe._version
    return parse_version(into_version)

@dangotbanned dangotbanned Apr 1, 2025

This could be used in place of most calls we currently do to parse_version.

Here we'd only need Implementation and use the same property for all of them.

  • is_sqlframe_dataframe
  • SparkLikeLazyFrame
  • sqlframe._version

elif is_sqlframe_dataframe(native_object):  # pragma: no cover
    from narwhals._spark_like.dataframe import SparkLikeLazyFrame

    if series_only:
        msg = "Cannot only use `series_only` with SQLFrame DataFrame"
        raise TypeError(msg)
    if eager_only or eager_or_interchange_only:
        msg = "Cannot only use `eager_only` or `eager_or_interchange_only` with SQLFrame DataFrame"
        raise TypeError(msg)
    import sqlframe._version

    backend_version = parse_version(sqlframe._version)
    return LazyFrame(
        SparkLikeLazyFrame(
            native_object,
            backend_version=backend_version,
            version=version,
            implementation=Implementation.SQLFRAME,
        ),
        level="lazy",
    )
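
The point that an `Implementation` member alone is enough to recover the backend version can be sketched as follows. This is a simplified, self-contained illustration rather than the code in the PR: `parse_version` is a hypothetical re-implementation, and `_FAKE_MODULES` stands in for real backend modules so that the snippet runs without any of them installed.

```python
from __future__ import annotations

import re
from enum import Enum
from types import SimpleNamespace
from typing import Any


def parse_version(version: Any) -> tuple[int, ...]:
    """Turn "1.2.3" (or an object exposing `__version__`) into (1, 2, 3)."""
    text = version if isinstance(version, str) else version.__version__
    parts: list[int] = []
    for part in text.split("."):
        # Keep only the leading digits, so "0rc1" contributes 0
        match = re.match(r"\d+", part)
        if match is None:
            # Stop at the first non-numeric component, e.g. "dev0"
            break
        parts.append(int(match.group()))
    return tuple(parts)


# Hypothetical registry: fake "modules" that only carry a version string
_FAKE_MODULES = {
    "polars": SimpleNamespace(__version__="1.26.0"),
    "sqlframe": SimpleNamespace(__version__="3.27.1.dev0"),
}


class Implementation(Enum):
    """Toy subset of the real enum."""

    POLARS = "polars"
    SQLFRAME = "sqlframe"

    def _backend_version(self) -> tuple[int, ...]:
        # One lookup for every backend; special cases live in the registry
        return parse_version(_FAKE_MODULES[self.value])
```

With a lookup like this, a branch such as the SQLFrame one above would no longer need its own `import sqlframe._version` and `parse_version(...)` pair; it could ask `Implementation.SQLFRAME` directly.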

If you then combine this PR with (#2315), we can do:

from __future__ import annotations

from typing import TYPE_CHECKING
from typing import cast

from narwhals._namespace import Namespace
import narwhals as nw

if TYPE_CHECKING:
    from narwhals._spark_like.dataframe import SQLFrameDataFrame

native_object = cast("SQLFrameDataFrame", "pretend im a sqlframe")
nw.LazyFrame(
    Namespace.from_native_object(native_object).compliant.from_native(native_object),
    level="lazy",
)

@dangotbanned dangotbanned Apr 8, 2025

For the last nw.LazyFrame part, we could even add something like this to Compliant* classes:

ToNarwhals Protocol

from __future__ import annotations

from typing import Any
from typing import Protocol
from typing import TypeVar

ToNarwhalsT_co = TypeVar("ToNarwhalsT_co", covariant=True)

class ToNarwhals(Protocol[ToNarwhalsT_co]):
    def to_narwhals(self, *args: Any, **kwds: Any) -> ToNarwhalsT_co: ...
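
To show how such a protocol is satisfied structurally, here is a small runnable sketch. `ToyLazyFrame`, `ToyCompliantFrame`, and `wrap` are hypothetical names used only for illustration, and `@runtime_checkable` is added here purely so the check can be demonstrated with `isinstance`; the proposal above does not include it.

```python
from __future__ import annotations

from typing import Any
from typing import Protocol
from typing import TypeVar
from typing import runtime_checkable

T = TypeVar("T")
ToNarwhalsT_co = TypeVar("ToNarwhalsT_co", covariant=True)


@runtime_checkable
class ToNarwhals(Protocol[ToNarwhalsT_co]):
    def to_narwhals(self, *args: Any, **kwds: Any) -> ToNarwhalsT_co: ...


class ToyLazyFrame:
    """Hypothetical stand-in for a narwhals-level wrapper."""

    def __init__(self, compliant: object) -> None:
        self.compliant = compliant


class ToyCompliantFrame:
    """Hypothetical `Compliant*`-style class; it never inherits from `ToNarwhals`."""

    def to_narwhals(self) -> ToyLazyFrame:
        return ToyLazyFrame(self)


def wrap(obj: ToNarwhals[T]) -> T:
    # The return type follows whatever `to_narwhals` is declared to return
    return obj.to_narwhals()


frame = ToyCompliantFrame()
wrapped = wrap(frame)  # a type checker should infer `ToyLazyFrame` here
```

Because the match is structural, each compliant class can opt in just by defining `to_narwhals`, and callers keep a precise return type without a shared base class.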

Since they're already initialized, they have a Version - so we have everything we need:

SparkLikeLazyFrame implementation

from __future__ import annotations

from typing import TYPE_CHECKING
from typing import Any

from narwhals.typing import CompliantLazyFrame
from narwhals.utils import Implementation
from narwhals.utils import Version

if TYPE_CHECKING:
    from sqlframe.base.dataframe import BaseDataFrame

    from narwhals._spark_like.expr import SparkLikeExpr
    from narwhals.dataframe import LazyFrame

    SQLFrameDataFrame = BaseDataFrame[Any, Any, Any, Any, Any]

class SparkLikeLazyFrame(CompliantLazyFrame["SparkLikeExpr", "SQLFrameDataFrame"]):
    _native_frame: SQLFrameDataFrame
    _implementation: Implementation
    _backend_version: tuple[int, ...]
    _version: Version

    def to_narwhals(self) -> LazyFrame[SQLFrameDataFrame]:
        if self._version is Version.MAIN:
            from narwhals.dataframe import LazyFrame

            return LazyFrame(self, level="lazy")
        from narwhals.stable.v1 import LazyFrame as LazyFrameV1

        return LazyFrameV1(self, level="lazy")

Putting it all together

from __future__ import annotations

from typing import TYPE_CHECKING
from typing import cast

from narwhals._namespace import Namespace

if TYPE_CHECKING:
    from narwhals._spark_like.dataframe import SQLFrameDataFrame


native_object = cast("SQLFrameDataFrame", "pretend im a sqlframe")
narwhals_object = (
    Namespace.from_native_object(native_object)
    .compliant.from_native(native_object)
    .to_narwhals()
)

I think this is pretty clean 🙂

Comment on lines +107 to +117
class _NativeDask(Protocol):
    _partition_type: type[pd.DataFrame]

class _NativeCuDF(Protocol):
    def to_pylibcudf(self, *args: Any, **kwds: Any) -> Any: ...

class _ModinDataFrame(Protocol):
    _pandas_class: type[pd.DataFrame]

class _ModinSeries(Protocol):
    _pandas_class: type[pd.Series[Any]]

I tried using the actual types first (acb5787), but they broke the @overload(s).

These seem specific enough to match only the intended target(s)
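
A minimal sketch of why narrow structural protocols work well with `@overload`: each protocol names one marker member, so a type checker can pick a single overload per backend without importing the backend. `backend_name` and the two fake classes are hypothetical, and the `hasattr` probing is only an assumed runtime counterpart, not how narwhals detects backends.

```python
from __future__ import annotations

from typing import Any
from typing import Literal
from typing import Protocol
from typing import overload


class _NativeDask(Protocol):
    _partition_type: type[Any]


class _NativeCuDF(Protocol):
    def to_pylibcudf(self, *args: Any, **kwds: Any) -> Any: ...


@overload
def backend_name(native: _NativeDask) -> Literal["dask"]: ...
@overload
def backend_name(native: _NativeCuDF) -> Literal["cudf"]: ...
def backend_name(native: Any) -> str:
    # Runtime mirror of the overloads: probe for the same marker members
    if hasattr(native, "_partition_type"):
        return "dask"
    if hasattr(native, "to_pylibcudf"):
        return "cudf"
    msg = f"Unsupported native object: {type(native).__name__!r}"
    raise TypeError(msg)


class FakeDaskFrame:
    """Hypothetical object that only looks like a Dask frame."""

    _partition_type = dict


class FakeCuDFFrame:
    """Hypothetical object that only looks like a cuDF frame."""

    def to_pylibcudf(self, *args: Any, **kwds: Any) -> Any:
        return None
```

The trade-off is that any unrelated object exposing the same member would also match, which is why the marker needs to be specific to the intended target.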

@MarcoGorelli MarcoGorelli left a comment

thanks @dangotbanned

@dangotbanned dangotbanned marked this pull request as draft April 19, 2025 11:36
@dangotbanned

> thanks @dangotbanned

thanks @MarcoGorelli!

Just going to make the replacements now in utils and functions - where this'll be used first

@dangotbanned dangotbanned marked this pull request as ready for review April 19, 2025 19:23

dangotbanned commented Apr 19, 2025

@MarcoGorelli after (#2324 (commits)) I'm gonna call it here, too easy to get carried away 😅

Hopefully you get an idea for how much of from_native this can simplify.
Especially if we add ToNarwhals later (#2324 (comment))

@dangotbanned dangotbanned changed the title from "feat(RFC): Adds private Namespace class" to "feat: Adds private Namespace class" Apr 19, 2025
@dangotbanned dangotbanned merged commit 62ba157 into main Apr 19, 2025
30 of 31 checks passed
@dangotbanned dangotbanned deleted the nw-namespace branch April 19, 2025 20:30