Skip to content

Commit a837a8d

Browse files
authored
enh: rewrite duckdb groupby so it avoids group-by-all (#3267)
rewrite duckdb groupby so it avoids group-by-all#
1 parent b8c328b commit a837a8d

File tree

2 files changed

+18
-5
lines changed

2 files changed

+18
-5
lines changed

CONTRIBUTING.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -185,9 +185,16 @@ If you add code that should be tested, please add tests.
185185
- Never assume that your data is ordered in any pre-defined way.
186186
- Never materialise your data (only exception: `collect`).
187187
- Avoid calling the schema / column names unnecessarily.
188-
- For DuckDB, use the Python API as much as possible, only falling
188+
189+
- DuckDB:
190+
191+
In addition to the above:
192+
193+
- Use the Python API as much as possible, only falling
189194
back to SQL as the last resort for operations not yet supported
190195
in their Python API (e.g. `over`).
196+
- Use standard SQL constructs where possible instead of non-standard
197+
ones such as `GROUP BY ALL` or `EXCLUDE`.
191198
192199
### Test Failure Patterns
193200

narwhals/_duckdb/group_by.py

Lines changed: 10 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@
33
from itertools import chain
44
from typing import TYPE_CHECKING
55

6+
from narwhals._duckdb.utils import join_column_names
67
from narwhals._sql.group_by import SQLGroupBy
78

89
if TYPE_CHECKING:
@@ -27,7 +28,12 @@ def __init__(
2728
self._compliant_frame = frame.drop_nulls(self._keys) if drop_null_keys else frame
2829

2930
def agg(self, *exprs: DuckDBExpr) -> DuckDBLazyFrame:
30-
agg_columns = tuple(chain(self._keys, self._evaluate_exprs(exprs)))
31-
return self.compliant._with_native(
32-
self.compliant.native.aggregate(agg_columns) # type: ignore[arg-type]
33-
).rename(dict(zip(self._keys, self._output_key_names)))
31+
agg_columns = tuple(self._evaluate_exprs(exprs))
32+
result = self.compliant.native.aggregate(
33+
tuple(chain(self._keys, agg_columns)), # type: ignore[arg-type]
34+
join_column_names(*self._keys),
35+
)
36+
37+
return self.compliant._with_native(result).rename(
38+
dict(zip(self._keys, self._output_key_names))
39+
)

0 commit comments

Comments
 (0)