diff --git a/docs/_static/examples/weightedmean/edge_face_weightedmean.png b/docs/_static/examples/weightedmean/edge_face_weightedmean.png new file mode 100644 index 000000000..868264662 Binary files /dev/null and b/docs/_static/examples/weightedmean/edge_face_weightedmean.png differ diff --git a/docs/api.rst b/docs/api.rst index 10307a16f..c066dac8c 100644 --- a/docs/api.rst +++ b/docs/api.rst @@ -387,6 +387,7 @@ Dual Mesh Construction Aggregations ------------ + Topological ~~~~~~~~~~~ @@ -413,16 +414,13 @@ on each face. UxDataArray.topological_all UxDataArray.topological_any - - -Intersections -~~~~~~~~~~~~~ - +Weighted +~~~~~~~~ .. autosummary:: :toctree: generated/ - grid.intersections.gca_gca_intersection - grid.intersections.gca_const_lat_intersection + UxDataArray.weighted_mean + Spherical Geometry diff --git a/docs/user-guide/weighted_mean.ipynb b/docs/user-guide/weighted_mean.ipynb new file mode 100644 index 000000000..4b6ec330c --- /dev/null +++ b/docs/user-guide/weighted_mean.ipynb @@ -0,0 +1,208 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "23f93b0f-62ac-4efd-975c-368477c9ebbe", + "metadata": {}, + "source": [ + "# Weighted Mean\n", + "\n", + "UXarray supports computing weighted means based on geometric properties such as face areas or edge lengths. In this section of the user guide, we explain how to calculate variable means using unstructured grid geometries as weights. The examples below demonstrate the application of weighted means using grid face areas and grid edge lengths as weights.\n" + ] + }, + { + "cell_type": "code", + "id": "ed637e46-672c-45ba-a5f1-ed3c3248c8a1", + "metadata": {}, + "source": [ + "import uxarray as ux\n", + "import xarray as xr\n", + "import cartopy.crs as ccrs\n", + "\n", + "import warnings\n", + "\n", + "warnings.filterwarnings(\"ignore\")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "8518335d-2f75-4699-bc3d-862ab25830d1", + "metadata": {}, + "source": [ + "## Data\n", + "\n", + "Data used in this section is a 3-random-variable dataset on a quad hexagon mesh mapped to the edges and faces." + ] + }, + { + "cell_type": "code", + "id": "a2f58f07d7881d4f", + "metadata": {}, + "source": [ + "grid_path = \"../../test/meshfiles/ugrid/quad-hexagon/grid.nc\"\n", + "quad_hex_data_path_edge_centered = (\n", + " \"../../test/meshfiles/ugrid/quad-hexagon/random-edge-data.nc\"\n", + ")\n", + "quad_hex_data_path_face_centered = (\n", + " \"../../test/meshfiles/ugrid/quad-hexagon/random-face-data.nc\"\n", + ")\n", + "\n", + "uxds_face = ux.open_mfdataset(grid_path, quad_hex_data_path_face_centered)\n", + "uxds_edge = ux.open_mfdataset(grid_path, quad_hex_data_path_edge_centered)" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "6fac99462efcccdc", + "metadata": {}, + "source": [ + "(\n", + " uxds_face.uxgrid.plot(line_color=\"black\")\n", + " * uxds_edge[\"random_data_edge\"]\n", + " .plot.points(\n", + " cmap=\"inferno\", size=150, marker=\"square\", clabel=None, tools=[\"hover\"]\n", + " )\n", + " .relabel(\"Edge Data\")\n", + " * uxds_face[\"random_data_face\"]\n", + " .plot.points(\n", + " cmap=\"inferno\", size=150, marker=\"triangle\", clabel=None, tools=[\"hover\"]\n", + " )\n", + " .relabel(\"Face Data\")\n", + ").opts(legend_position=\"top_right\")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "bb77e5067497a0f5", + "metadata": {}, + "source": [ + "## Weighted Mean based on Face Areas\n", + "\n", + "Here we first look at the data values on each face and the faces' respective areas. " + ] + }, + { + "cell_type": "code", + "id": "c87fb018ca75ba5e", + "metadata": {}, + "source": [ + "uxds_face[\"random_data_face\"].values" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "def2b952ba7f1740", + "metadata": {}, + "source": "We can simply call `UxDataArray.weighted_mean()` on the UXDataArray to compute the weighted mean. The differences between the weighted mean and the regular mean is small since the area differences across the faces are small. " + }, + { + "cell_type": "code", + "id": "bd76b23993d9967f", + "metadata": {}, + "source": [ + "result = uxds_face[\"random_data_face\"].weighted_mean()\n", + "result.values" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "a92f056d6b2a567b", + "metadata": {}, + "source": [ + "unweighted_result = uxds_face[\"random_data_face\"].mean()\n", + "unweighted_result.values" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "e8ddb288f83486b8", + "metadata": {}, + "source": [ + "## Weighted Mean Based on Edge Length\n", + "\n", + "Here we show the similar steps but for edge-centered data and the edge lengths. " + ] + }, + { + "cell_type": "code", + "id": "1b01581da7ebacc", + "metadata": {}, + "source": [ + "uxds_edge[\"random_data_edge\"].values" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "a50c11e691337cff", + "metadata": {}, + "source": [ + "uxds_edge.uxgrid.edge_node_distances.data" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "647630d11f001fcf", + "metadata": {}, + "source": "The differences between weighted and unweighted mean is more drastic (~0.1 value difference) since the edge lengths have a larger variance. " + }, + { + "cell_type": "code", + "id": "6bace8386234b942", + "metadata": {}, + "source": [ + "result = uxds_edge[\"random_data_edge\"].weighted_mean()\n", + "result.values" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "d81bc60c195be6ec", + "metadata": {}, + "source": [ + "unweighted_result = uxds_edge[\"random_data_edge\"].mean()\n", + "unweighted_result.values" + ], + "outputs": [], + "execution_count": null + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/userguide.rst b/docs/userguide.rst index b4e6249a0..b62cce86c 100644 --- a/docs/userguide.rst +++ b/docs/userguide.rst @@ -55,6 +55,9 @@ These user guides provide detailed explanations of the core functionality in UXa `Topological Aggregations `_ Aggregate data across grid dimensions +`Weighted Mean `_ + Compute the weighted average + `Calculus Operators `_ Apply calculus operators (gradient, integral) on unstructured grid data @@ -100,6 +103,7 @@ These user guides provide additional details about specific features in UXarray. user-guide/cross-sections.ipynb user-guide/remapping.ipynb user-guide/topological-aggregations.ipynb + user-guide/weighted_mean.ipynb user-guide/calculus.ipynb user-guide/tree_structures.ipynb user-guide/area_calc.ipynb diff --git a/test/test_weighted_mean.py b/test/test_weighted_mean.py new file mode 100644 index 000000000..f08b1c2a8 --- /dev/null +++ b/test/test_weighted_mean.py @@ -0,0 +1,147 @@ +import os +from pathlib import Path + +import pytest + +import dask.array as da +import numpy as np +import numpy.testing as nt + +import uxarray as ux + +current_path = Path(os.path.dirname(os.path.realpath(__file__))) + + +csne30_grid_path = current_path / 'meshfiles' / "ugrid" / "outCSne30" / "outCSne30.ug" +csne30_data_path = current_path / 'meshfiles' / "ugrid" / "outCSne30" / "outCSne30_vortex.nc" + +quad_hex_grid_path = current_path / 'meshfiles' / "ugrid" / "quad-hexagon" / "grid.nc" +quad_hex_data_path_face_centered = current_path / 'meshfiles' / "ugrid" / "quad-hexagon" / "data.nc" +quad_hex_data_path_edge_centered = current_path / 'meshfiles' / "ugrid" / "quad-hexagon" / "random-edge-data.nc" + + +def test_quad_hex_face_centered(): + """Compares the weighted average computation for the quad hexagon grid + using a face centered data variable to the expected value computed by + hand.""" + uxds = ux.open_dataset(quad_hex_grid_path, quad_hex_data_path_face_centered) + + # expected weighted average computed by hand + expected_weighted_mean = 297.55 + + # compute the weighted mean + result = uxds['t2m'].weighted_mean() + + # ensure values are within 3 decimal points of each other + nt.assert_almost_equal(result.values, expected_weighted_mean, decimal=3) + +def test_quad_hex_face_centered_dask(): + """Compares the weighted average computation for the quad hexagon grid + using a face centered data variable on a dask-backed UxDataset & Grid to the expected value computed by + hand.""" + uxds = ux.open_dataset(quad_hex_grid_path, quad_hex_data_path_face_centered) + + # data to be dask + uxda = uxds['t2m'].chunk(n_face=1) + + # weights to be dask + uxda.uxgrid.face_areas = uxda.uxgrid.face_areas.chunk(n_face=1) + + # create lazy result + lazy_result = uxda.weighted_mean() + + assert isinstance(lazy_result.data, da.Array) + + # compute result + computed_result = lazy_result.compute() + + assert isinstance(computed_result.data, np.ndarray) + + expected_weighted_mean = 297.55 + + # ensure values are within 3 decimal points of each other + nt.assert_almost_equal(computed_result.values, expected_weighted_mean, decimal=3) + +def test_quad_hex_edge_centered(): + """Compares the weighted average computation for the quad hexagon grid + using an edge centered data variable to the expected value computed by + hand.""" + uxds = ux.open_dataset(quad_hex_grid_path, quad_hex_data_path_edge_centered) + + # expected weighted average computed by hand + expected_weighted_mean = (uxds['random_data_edge'].values * uxds.uxgrid.edge_node_distances).sum() / uxds.uxgrid.edge_node_distances.sum() + + # compute the weighted mean + result = uxds['random_data_edge'].weighted_mean() + + nt.assert_equal(result, expected_weighted_mean) + +def test_quad_hex_edge_centered_dask(): + """Compares the weighted average computation for the quad hexagon grid + using an edge centered data variable on a dask-backed UxDataset & Grid to the expected value computed by + hand.""" + uxds = ux.open_dataset(quad_hex_grid_path, quad_hex_data_path_edge_centered) + + # data to be dask + uxda = uxds['random_data_edge'].chunk(n_edge=1) + + # weights to be dask + uxda.uxgrid.edge_node_distances = uxda.uxgrid.edge_node_distances.chunk(n_edge=1) + + # create lazy result + lazy_result = uxds['random_data_edge'].weighted_mean() + + assert isinstance(lazy_result.data, da.Array) + + # compute result + computed_result = lazy_result.compute() + + assert isinstance(computed_result.data, np.ndarray) + + # expected weighted average computed by hand + expected_weighted_mean = (uxds[ + 'random_data_edge'].values * uxds.uxgrid.edge_node_distances).sum() / uxds.uxgrid.edge_node_distances.sum() + + # ensure values are within 3 decimal points of each other + nt.assert_almost_equal(computed_result.values, expected_weighted_mean, decimal=3) + +def test_csne30_equal_area(): + """Compute the weighted average with a grid that has equal-area faces and + compare the result to the regular mean.""" + uxds = ux.open_dataset(csne30_grid_path, csne30_data_path) + face_areas = uxds.uxgrid.face_areas + + # set the area of each face to be one + uxds.uxgrid._ds['face_areas'].data = np.ones(uxds.uxgrid.n_face) + + + weighted_mean = uxds['psi'].weighted_mean() + unweighted_mean = uxds['psi'].mean() + + # with equal area, both should be equal + nt.assert_equal(weighted_mean, unweighted_mean) + +@pytest.mark.parametrize("chunk_size", [1, 2, 4]) +def test_csne30_equal_area_dask(chunk_size): + """Compares the weighted average computation for the quad hexagon grid + using a face centered data variable on a dask-backed UxDataset & Grid to the expected value computed by + hand.""" + uxds = ux.open_dataset(csne30_grid_path, csne30_data_path) + + # data and weights to be dask + uxda = uxds['psi'].chunk(n_face=chunk_size) + uxda.uxgrid.face_areas = uxda.uxgrid.face_areas.chunk(n_face=chunk_size) + + # Calculate lazy result + lazy_result = uxds['psi'].weighted_mean() + assert isinstance(lazy_result.data, da.Array) + + # compute result + computed_result = lazy_result.compute() + assert isinstance(computed_result.data, np.ndarray) + + # expected weighted average computed by hand + expected_weighted_mean = (uxds['psi'].values * uxds.uxgrid.face_areas).sum() / uxds.uxgrid.face_areas.sum() + + # ensure values are within 3 decimal points of each other + nt.assert_almost_equal(computed_result.values, expected_weighted_mean, decimal=3) diff --git a/uxarray/core/dataarray.py b/uxarray/core/dataarray.py index bfec83280..5c07a70a5 100644 --- a/uxarray/core/dataarray.py +++ b/uxarray/core/dataarray.py @@ -429,6 +429,77 @@ def integrate( return uxda + def weighted_mean(self, weights=None): + """Computes a weighted mean. + + This function calculates the weighted mean of a variable, + using the specified `weights`. If no weights are provided, it will automatically select + appropriate weights based on whether the variable is face-centered or edge-centered. If + the variable is neither face nor edge-centered a warning is raised, and an unweighted mean is computed instead. + + Parameters + ---------- + weights : np.ndarray or None, optional + The weights to use for the weighted mean calculation. If `None`, the function will + determine weights based on the variable's association: + - For face-centered variables: uses `self.uxgrid.face_areas.data` + - For edge-centered variables: uses `self.uxgrid.edge_node_distances.data` + If the variable is neither face-centered nor edge-centered, a warning is raised, and + an unweighted mean is computed instead. User-defined weights should match the shape + of the data variable's last dimension. + + Returns + ------- + UxDataArray + A new `UxDataArray` object representing the weighted mean of the input variable. The + result is attached to the same `uxgrid` attribute as the original variable. + + Example + ------- + >>> weighted_mean = uxds["t2m"].weighted_mean() + + + Raises + ------ + AssertionError + If user-defined `weights` are provided and the shape of `weights` does not match + the shape of the data variable's last dimension. + + Warnings + -------- + UserWarning + Raised when attempting to compute a weighted mean on a variable without associated + weights. An unweighted mean will be computed in this case. + + Notes + ----- + - The weighted mean is computed along the last dimension of the data variable, which is + assumed to be the geometry dimension (e.g., faces, edges, or nodes). + """ + if weights is None: + if self._face_centered(): + weights = self.uxgrid.face_areas.data + elif self._edge_centered(): + weights = self.uxgrid.edge_node_distances.data + else: + warnings.warn( + "Attempting to perform a weighted mean calculation on a variable that does not have" + "associated weights. Weighted mean is only supported for face or edge centered " + "variables. Performing an unweighted mean." + ) + else: + # user-defined weights + assert weights.shape[-1] == self.shape[-1] + + # compute the total weight + total_weight = weights.sum() + + # compute the weighted mean, with an assumption on the index of dimension (last one is geometry) + weighted_mean = (self * weights).sum(axis=-1) / total_weight + + # create a UxDataArray and return it + return UxDataArray(weighted_mean, uxgrid=self.uxgrid) + def topological_mean( self, destination: Literal["node", "edge", "face"],