Commit 5da9295

committed
Add demo.
1 parent 1c19f7f commit 5da9295

1 file changed

+13
-1
lines changed

src/posts/flox-smart/index.md

Lines changed: 13 additions & 1 deletion
@@ -61,7 +61,7 @@ For example, with a chunk size of 4, monthly mean input data for the "cohort" Ja
6161
Here is a schematic illustration where each month is represented by a different shade of red and a single chunk contains 4 months:
6262
![monthly cohorts](https://flox.readthedocs.io/en/latest/_images/cohorts-month-chunk4.png)
6363
This means that we can run the tree reduction for each cohort (three cohorts in total: `JFMA | MJJA | SOND`) independently and expose more parallelism.
64-
Doing so can significantly reduce compute times and in particular memory required for the computation.
64+
Doing so can significantly reduce memory required for the computation.
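
As a rough illustration of the cohort idea, here is a minimal sketch in plain Python. It is not flox's actual implementation: `find_cohorts` and the single-letter month labels are hypothetical names used only for this example. It assumes two years of monthly data and a chunk size of 4, and groups together the months that occupy exactly the same set of chunks.

```python
from collections import defaultdict

MONTHS = "JFMAMJJASOND"  # one letter per calendar month (illustrative labels)


def find_cohorts(labels, chunk_size):
    """Group labels that occupy exactly the same set of chunks.

    Hypothetical helper for illustration only, not part of flox.
    """
    # Record which chunks each label appears in.
    chunks_of = defaultdict(set)
    for position, label in enumerate(labels):
        chunks_of[label].add(position // chunk_size)

    # Labels sharing an identical chunk set form one cohort.
    cohorts = defaultdict(list)
    for label, chunks in chunks_of.items():
        cohorts[frozenset(chunks)].append(label)
    return dict(cohorts)


# Two years of monthly data, labelled by month number 1..12.
labels = [month for _ in range(2) for month in range(1, 13)]
cohorts = find_cohorts(labels, chunk_size=4)

for chunks, members in cohorts.items():
    print(sorted(chunks), "".join(MONTHS[m - 1] for m in members))
```

Under these assumptions the months fall into three cohorts, each touching only two of the six chunks, so each cohort could be reduced without reading the others' chunks. If the labels were scattered randomly across chunks, most labels would end up with a different chunk set and there would be little to separate.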
6565

6666
Finally, if there isn't much separation of groups into cohorts, like when groups are randomly distributed across chunks, then it's hard to do better than the standard `method="map-reduce"`.
6767

@@ -72,6 +72,18 @@ Worse, they are hard to explain conceptually! I've tried! ([example 1](https://d
7272

7373
What we need is to choose the appropriate strategy automatically.
7474

75+
## Demo
76+
77+
Here's a quick demo of computing monthly mean climatologies with the National Water Model.
78+
79+
For this input dataset, chunked so that approximately a month of data is in a single chunk,
80+
81+
<RawHTML filePath='/posts/flox-smart/dataset-repr.html' />
82+
we run

```
mean_mapreduce = ds.groupby("time.month").mean(method="map-reduce")
mean_cohorts = ds.groupby("time.month").mean()  # this is auto-detected!
```

Using the algorithm described below, flox will **automatically** set `method="cohorts"` for this dataset unless a method is specified, yielding a 5X decrease in memory use at the cost of a 2X longer run time.

![](/posts/flox-smart/mem.png)
86+
7587
## Problem statement
7688

7789
Fundamentally, we know:
