Commit d8b7b23

Implement new "most common" regridder. (#46)
* Implement new "most common" regridder.
* Add 'regrid.stat' for statistical reductions other than the mode
* Add fill_value & ensure monotonic sorts only when not sorted
* Move from list[str] to list[Hashable]
* Refactor reduction methods w/ format_for_regrid, remove duplicate sortbys
* Rename expected_groups -> values
* Remove second sortby
* Add basic tests for regrid.stats
* Disable lat/lon coord formatting for stats-based methods
* Update demo notebooks
* Update docs, changelog
* test statistical padding, add extra longitude monotonicity
* fix dtype comparison

---------

Co-authored-by: Sam Levang <[email protected]>
1 parent bc7be5b commit d8b7b23

14 files changed

+1891
-335
lines changed

CHANGELOG.md

Lines changed: 6 additions & 1 deletion
@@ -6,10 +6,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 
 ## Unreleased
 
+Changed:
+- the "most common" routine has been overhauled, thanks to [@dcherian](https://github.com/dcherian). It is now much more efficient, and can operate fully lazily on dask arrays. Users do need to provide the expected groups (i.e., unique labels in the data), and the regridder is only available for `xr.DataArray` currently ([#46](https://github.com/xarray-contrib/xarray-regrid/pull/46)).
+- you can now use `None` as input to the `time_dim` kwarg in the regridding methods to force regridding over the time dimension (as long as it's numeric) ([#46](https://github.com/xarray-contrib/xarray-regrid/pull/46)).
+
 Added:
+- `.regrid.stat` for reducing datasets using statistical methods such as the variance or median ([#46](https://github.com/xarray-contrib/xarray-regrid/pull/46)).
+- a "least common" routine (i.e. anti-mode), which is the inverse of the most common value ([#46](https://github.com/xarray-contrib/xarray-regrid/pull/46)).
 - If latitude/longitude coordinates are detected and the domain is global, apply automatic padding at the boundaries, which gives behavior more consistent with common tools like ESMF and CDO ([#45](https://github.com/xarray-contrib/xarray-regrid/pull/45)).
 - Conservative regridding weights are converted to sparse matrices if the optional [sparse](https://github.com/pydata/sparse) package is installed, which improves compute and memory performance in most cases ([#49](https://github.com/xarray-contrib/xarray-regrid/pull/49)).
-
 
 ## 0.3.0 (2024-09-05)
 
README.md

Lines changed: 2 additions & 2 deletions
@@ -8,9 +8,9 @@ With xarray-regrid it is possible to regrid between two rectilinear grids. The f
 - Nearest-neighbor
 - Conservative
 - Cubic
-- "Most common value" (zonal statistics)
+- "Most common value", as well as other zonal statistics (e.g., variance or median).
 
-All regridding methods, except for the "most common value" can operate lazily on [Dask arrays](https://docs.xarray.dev/en/latest/user-guide/dask.html).
+All regridding methods can operate lazily on [Dask arrays](https://docs.xarray.dev/en/latest/user-guide/dask.html).
 
 Note that "Most common value" is designed to regrid categorical data to a coarse resolution. For regridding categorical data to a finer resolution, please use "nearest-neighbor" regridder.
 
docs/getting_started.rst

Lines changed: 5 additions & 3 deletions
@@ -32,7 +32,9 @@ Multiple regridding methods are available:
 * `nearest-neighbor <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.conservative>`_ (``.regrid.nearest``)
 * `cubic interpolation <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.cubic>`_ (``.regrid.cubic``)
 * `conservative regridding <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.conservative>`_ (``.regrid.conservative``)
+* `zonal statistics <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.stat>`_ (``.regrid.stat``) is available to compute statistics such as the maximum value, or variance.
 
-Additionally, a zonal statistics `method to compute the most common value <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.most_common>`_
-is available (``.regrid.most_common``).
-This can be used to upscale very fine categorical data to a more course resolution.
+Additionally, there are separate methods available to compute the
+`most common value <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.most_common>`_
+(``.regrid.most_common``) and `least common value <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.least_common>`_
+(``.regrid.least_common``). This can be used to upscale very fine categorical data to a more course resolution.

docs/index.rst

Lines changed: 3 additions & 1 deletion
@@ -37,9 +37,11 @@ The following methods are supported:
 * `Nearest-neighbor <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.nearest>`_
 * `Conservative <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.conservative>`_
 * `Cubic <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.cubic>`_
+* `Zonal statistics <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.stat>`_
 * `"Most common value" (zonal statistics) <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.most_common>`_
+* `"Least common value" (zonal statistics) <autoapi/xarray_regrid/regrid/index.html#xarray_regrid.regrid.Regridder.least_common>`_
 
-Note that "Most common value" is designed to regrid categorical data to a coarse resolution. For regridding categorical data to a finer resolution, please use "nearest-neighbor" regridder.
+Note that "Most/least common value" is designed to regrid categorical data to a coarse resolution. For regridding categorical data to a finer resolution, please use "nearest-neighbor" regridder.
 
 For usage examples, please refer to the `quickstart guide <getting_started>`_ and the `example notebooks <notebooks/index>`_.
 
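For readers unfamiliar with the "most common value" reduction this commit reworks, the idea can be sketched in plain NumPy. This is an illustration only, not the implementation in xarray-regrid: the helper name `most_common_coarsen` and its `factor` argument are hypothetical, and the sketch assumes a 2-D array whose shape is an exact multiple of the coarsening factor. It does mirror one point from the changelog, namely that the caller supplies the expected labels (`values`) up front.

```python
import numpy as np


def most_common_coarsen(labels: np.ndarray, values: np.ndarray, factor: int) -> np.ndarray:
    """Coarsen a 2-D categorical array by taking the most common label per block.

    Hypothetical helper for illustration; not part of xarray-regrid.
    """
    ny, nx = labels.shape
    if ny % factor or nx % factor:
        raise ValueError("array shape must be divisible by the coarsening factor")

    # Split into (factor x factor) blocks: shape (ny/f, f, nx/f, f).
    blocks = labels.reshape(ny // factor, factor, nx // factor, factor)

    # Count how often each expected label occurs in every block.
    # Resulting shape: (n_values, ny/f, nx/f).
    counts = np.stack([(blocks == v).sum(axis=(1, 3)) for v in values])

    # Pick the label with the highest count; on ties, argmax returns the
    # first maximum, so the label listed first in `values` wins.
    return values[counts.argmax(axis=0)]


fine = np.array(
    [
        [1, 1, 2, 2],
        [1, 3, 2, 2],
        [3, 3, 1, 2],
        [3, 3, 1, 1],
    ]
)
coarse = most_common_coarsen(fine, values=np.array([1, 2, 3]), factor=2)
print(coarse)
```

Knowing the labels in advance means the per-label counts have a fixed shape, which is one plausible reason a lazy, chunked computation benefits from having them supplied by the user.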