Commit 55cdea6

docs: Add page on empty aggregations (#2609)
* docs: Add page on empty aggregations
* suggest workaround
1 parent cf82f73 commit 55cdea6

2 files changed: +43 -0 lines changed

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
# Empty aggregations

What is the sum of zero values? As it turns out, tools disagree:

```python exec="1" result="python" session="empty aggregations" source="above"
import narwhals as nw
import polars as pl
import duckdb

polars_df = pl.DataFrame({"a": [None], "b": [1]}, schema={"a": pl.Int64, "b": pl.Int64})
print("Polars result")
print(polars_df.group_by("b").agg(pl.col("a").sum()))

print("DuckDB result")
print(duckdb.sql("""from polars_df select b, sum(a) as a group by b"""))
```

Polars, pandas, and PyArrow think the result is zero. SQL engines think it's `NULL`. Who's correct?

For now, we respect each backend's opinion and leave this result backend-specific, to avoid
interfering with how aggregations compose with other operations. If it's crucial to you
that an empty sum returns `0` for all backends, you can always follow the sum with
`fill_null(0)`.

```python exec="1" result="python" session="empty aggregations" source="above"
from narwhals.typing import IntoFrameT


def custom_group_by_sum(df_native: IntoFrameT) -> IntoFrameT:
    return (
        nw.from_native(df_native)
        .group_by("b")
        .agg(nw.col("a").sum())
        .with_columns(nw.col("a").fill_null(0))
    )


print("Polars result:")
print(custom_group_by_sum(polars_df))
print("DuckDB result:")
print(custom_group_by_sum(duckdb.table("polars_df")))
```
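
The same disagreement can be reproduced with nothing but the Python standard library. The sketch below is an illustration only and is not part of the commit: it assumes SQLite (via `sqlite3`) as a stand-in for the SQL side, and the table name `t` is hypothetical. It shows SQL `SUM` returning `NULL` for a group with no non-null values, Python's built-in `sum` returning `0` for an empty input, and `COALESCE` playing the role that `fill_null(0)` plays in the workaround.

```python
import sqlite3

# In-memory table mirroring the example data: one group (b = 1)
# whose only value of `a` is NULL. The table name `t` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.execute("INSERT INTO t VALUES (NULL, 1)")

# Plain SUM: SQLite yields NULL (None in Python) when a group
# contains no non-NULL inputs.
sql_sum = conn.execute("SELECT b, SUM(a) FROM t GROUP BY b").fetchall()

# COALESCE on the aggregated column is the SQL analogue of fill_null(0).
sql_sum_filled = conn.execute(
    "SELECT b, COALESCE(SUM(a), 0) FROM t GROUP BY b"
).fetchall()

# Python's built-in sum returns the additive identity for an empty input.
python_sum = sum(x for x in [None] if x is not None)

print(sql_sum)         # [(1, None)]
print(sql_sum_filled)  # [(1, 0)]
print(python_sum)      # 0
conn.close()
```

Applying the fill after the aggregation, rather than before it, keeps the aggregation itself untouched and only normalises the final column.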

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ nav:
 - concepts/column_names.md
 - concepts/boolean.md
 - concepts/null_handling.md
+- concepts/empty_aggregations.md
 - Overhead: overhead.md
 - Perfect backwards compatibility policy: backcompat.md
 - Supported libraries and extending Narwhals: extending.md