Labels
enhancement (New feature or request), help wanted (Extra attention is needed), low priority, pyspark (Issue is related to pyspark backend)
Description
For LazyFrames, we can support Expr.filter so long as it's followed by an aggregation. This should already be validated at the Narwhals level, so I don't think anything needs changing there.
For spark-like backends, Column.filter doesn't exist, but we can use F.expr to accomplish the same thing:
import pandas as pd

from sqlframe.duckdb import DuckDBSession
import sqlframe.duckdb.functions as F

df = DuckDBSession().createDataFrame(pd.DataFrame({'a': [1, 1, 2], 'b': [4, 5, 6]}))
df = df.select(
    # aggregate only the rows matching the predicate, via a SQL FILTER clause
    F.expr('sum(b) filter (where a==1)').alias('c'),
    F.expr('sum(b) filter (where a!=1)').alias('d'),
)
df.show()
+---+---+
| c | d |
+---+---+
| 9 | 6 |
+---+---+
If anyone fancies implementing Expr.filter for _spark_like using the above, then my hope is that it will "just work", similarly to how it currently works for Dask.
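To make the F.expr approach concrete, here is a minimal, hedged sketch of a helper that composes the SQL fragment passed to F.expr. The function name filtered_agg_sql is an assumption for illustration only, not part of Narwhals or sqlframe; a real implementation would build the aggregate and predicate strings from the compiled expressions.

def filtered_agg_sql(agg: str, column: str, predicate: str) -> str:
    """Build a SQL aggregate expression with a FILTER clause.

    Hypothetical helper: composes the string that would be handed
    to F.expr, e.g. 'sum(b) filter (where a==1)'.
    """
    return f"{agg}({column}) filter (where {predicate})"

print(filtered_agg_sql("sum", "b", "a==1"))
# sum(b) filter (where a==1)

The resulting string could then be used as F.expr(filtered_agg_sql("sum", "b", "a==1")).alias("c"), matching the example above.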