Currently, we track expression metadata at the compliant level:
- whether it aggregates, changes length, transforms
- the function name
This is quite error-prone, as we repeat logic in quite a few places.
I'd prefer it if:
- we didn't track the function name at all for most backends. It's only really needed for pandas/pyarrow/dask because of limitations in their group-by APIs, so we should only do it there
- one piece of metadata we could track is what kind of column selection we start with (`nth`, `col`, `selector`, `all`), because in `group_by`-`agg`, `selector` and `all` require some special treatment. But this could be done much more simply at the Narwhals level, without tracking function names everywhere in DuckDB / PySpark / Ibis / whatever else we add, which I hope would also have a modern enough syntax
To do this, I haven't tried, but perhaps:
- `_to_compliant_expr` could take a `metadata` argument: a `TypedDict` of metadata which we keep track of at the `narwhals/expr.py` and `narwhals/functions.py` level
- compliant expressions just read the metadata passed down from above to determine how and whether to broadcast
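
A rough sketch of what that could look like, untested against the real codebase. Every name here (`ExprMetadata`, `SelectionKind`, `with_aggregation`, `ToyCompliantExpr`, and the free function `to_compliant_expr`) is hypothetical and only stands in for the real classes. It also assumes that "aggregated" is the condition that triggers broadcasting, which may be too simple:

```python
from typing import Literal, TypedDict

# Hypothetical names throughout; none of this is existing Narwhals API.
SelectionKind = Literal["nth", "col", "selector", "all"]


class ExprMetadata(TypedDict):
    selection: SelectionKind  # kind of column selection the expression starts with
    aggregates: bool          # reduces each column to a single value
    changes_length: bool      # e.g. filter-like operations


def with_aggregation(metadata: ExprMetadata) -> ExprMetadata:
    """Narwhals level: update the metadata when an aggregation is applied."""
    return ExprMetadata(
        selection=metadata["selection"],
        aggregates=True,
        changes_length=False,
    )


def needs_group_by_expansion(metadata: ExprMetadata) -> bool:
    """Narwhals level: `selector` and `all` need special treatment in group_by-agg."""
    return metadata["selection"] in ("selector", "all")


class ToyCompliantExpr:
    """Stand-in for a backend expression; it never looks at function names."""

    def __init__(self, metadata: ExprMetadata) -> None:
        self.metadata = metadata
        # Assumption: only aggregated results need broadcasting.
        self.broadcast = metadata["aggregates"]


def to_compliant_expr(metadata: ExprMetadata) -> ToyCompliantExpr:
    """Compliant level: read the metadata passed from above, nothing else."""
    return ToyCompliantExpr(metadata)


col_meta = ExprMetadata(selection="col", aggregates=False, changes_length=False)
agg_meta = with_aggregation(col_meta)
```

The point of the split is that the bookkeeping (`with_aggregation`, `needs_group_by_expansion`) lives in one place, and a backend like DuckDB or PySpark would only ever read the resulting dict.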
dangotbanned