refactor: pass ExprMetadata down to compliant Exprs from narwhals.Expr #1848

@MarcoGorelli

Description

Currently, we track expression metadata at the compliant level:

  • whether it aggregates, changes length, transforms
  • the function name

This is quite error-prone, as we repeat logic in quite a few places.

I'd prefer it if:

  • we didn't track the function name at all for most backends. It's only really needed for pandas/pyarrow/dask because of limitations in their group-by APIs, so we should only do it there
  • we tracked, as one piece of metadata, what kind of column selection an expression starts with (nth, col, selector, all), because in group_by-agg, selector and all require some special treatment. This could be done much more simply at the Narwhals level, without tracking function names everywhere in DuckDB / PySpark / Ibis / whatever other backends we add (which I hope would also have a modern enough syntax)
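
To make the second point concrete, here is a minimal sketch of tracking the selection kind once, at the Narwhals level. The names `SelectionKind` and `resolve_agg_columns` are hypothetical and not part of Narwhals. The "special treatment" shown (dropping the group-by keys for `selector` and `all`) is an assumption made for illustration; the issue does not spell out what the treatment is.

```python
from enum import Enum, auto
from typing import List, Sequence


class SelectionKind(Enum):
    """How an expression chain started (hypothetical name, not Narwhals API)."""

    COL = auto()
    NTH = auto()
    SELECTOR = auto()
    ALL = auto()


def resolve_agg_columns(
    kind: SelectionKind, selected: Sequence[str], keys: Sequence[str]
) -> List[str]:
    """Decide which columns an aggregation applies to inside group_by-agg.

    Assumption for illustration: `selector` and `all` should skip the
    group-by keys, while `col` and `nth` keep exactly what was requested.
    """
    if kind in (SelectionKind.SELECTOR, SelectionKind.ALL):
        return [name for name in selected if name not in keys]
    return list(selected)
```

With this shape, a backend would never need to inspect a function name to learn how the expression started; it would only receive the already-resolved column list.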

To do this, I haven't tried, but perhaps:

  • _to_compliant_expr could take a metadata argument: a TypedDict of metadata which we keep track of at the narwhals/expr.py and narwhals/functions.py level
  • compliant expressions just read the metadata passed down from above to determine whether and how to broadcast
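
A rough, untested-against-Narwhals sketch of that idea follows. `ToyExpr`, `ToyCompliantExpr`, the `ExprMetadata` fields, and the broadcasting rule are all stand-ins invented for illustration; only `_to_compliant_expr` and the `ExprMetadata` name come from the issue itself.

```python
from typing import Any, Callable, TypedDict


class ExprMetadata(TypedDict):
    # Hypothetical fields, loosely mirroring the list at the top of the issue.
    aggregates: bool
    changes_length: bool
    selection_kind: str  # one of "col", "nth", "selector", "all"


class ToyCompliantExpr:
    """Stand-in for a backend expression: it only reads metadata."""

    def __init__(self, metadata: ExprMetadata) -> None:
        self._metadata = metadata

    def needs_broadcast(self) -> bool:
        # Assumed rule: an aggregated result must be broadcast before it
        # can be combined with full-length columns.
        return self._metadata["aggregates"]


class ToyExpr:
    """Stand-in for narwhals.Expr: it owns and updates the metadata."""

    def __init__(
        self,
        to_compliant_expr: Callable[[Any, ExprMetadata], ToyCompliantExpr],
        metadata: ExprMetadata,
    ) -> None:
        self._to_compliant_expr = to_compliant_expr
        self._metadata = metadata

    def mean(self) -> "ToyExpr":
        # Metadata is updated here once, rather than in every backend.
        new_metadata: ExprMetadata = {
            "aggregates": True,
            "changes_length": self._metadata["changes_length"],
            "selection_kind": self._metadata["selection_kind"],
        }
        return ToyExpr(self._to_compliant_expr, new_metadata)

    def to_compliant(self, plx: Any) -> ToyCompliantExpr:
        # The metadata is handed down as an explicit argument.
        return self._to_compliant_expr(plx, self._metadata)


def col(name: str) -> ToyExpr:
    metadata: ExprMetadata = {
        "aggregates": False,
        "changes_length": False,
        "selection_kind": "col",
    }
    # `plx` (the backend namespace) and `name` are unused in this toy version.
    return ToyExpr(lambda plx, md: ToyCompliantExpr(md), metadata)
```

For example, `col("a").mean().to_compliant(None).needs_broadcast()` is `True`, while the same call without `.mean()` is `False`; the compliant side never recomputes that fact.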
