Implementing convert.nemo_to_sgrid()
#2460
Conversation
And update of tutorial_nemo_curvilinear
And updating tutorial
And expanding comments on loading in the dataset
To use same ordering as in XLinear
This requires a smaller selection_dict for the isel, so hopefully faster code
And also separating offset calculation into its own helper function
It's not clear why this is here, nor why removing it causes a test failure. To be investigated another time. EDIT: This was introduced in Parcels-code@c311fba - though we're investigating if this can be implemented another way since there should be no particular difference with NEMO
Gradually reducing the dependency on the `mesh` param
Updates the API for conversion to be more closely aligned with the input data. Also handles the U and V fields separately - correctly assigning the dimension naming before merging into a single dataset.
Update tests
Avoid assuming there's a U and V field. Maybe this should be refactored later...
I'll do a full self review next week. Feel free to take a look in the meantime @erikvansebille if you'd like.
Also still a few failing tests that I need to sort out...
erikvansebille left a comment
Looks nice already. Some first comments below
This and other tutorials still need to be updated to use the new convert workflow? Do you want me to do that? Could be a good test case for how user-friendly the convert functions are.
Yes, they would need to be updated.
> Do you want me to do that? Could be a good test case for how user-friendly the convert functions are.

That would be great!
I've tried to change the netCDF ingestion in the Nemo3D tutorial, but I get the error below.
Can you check what's going on, @VeckoTheGecko?
data_folder = parcels.download_example_dataset("NemoNorthSeaORCA025-N006_data")
ds_fields = xr.open_mfdataset(
data_folder.glob("ORCA*.nc"),
data_vars="minimal",
coords="minimal",
compat="override",
)
ds_coords = xr.open_dataset(data_folder / "coordinates.nc", decode_times=False)
ds_fset = parcels.convert.nemo_to_sgrid(U=ds_fields["uo"], V=ds_fields["vo"], W=ds_fields["wo"], coords=ds_coords)
fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)
---------------------------------------------------------------------------
CoordinateValidationError Traceback (most recent call last)
Cell In[7], line 9
2 ds_fields = xr.open_mfdataset(
3 data_folder.glob("ORCA*.nc"),
4 data_vars="minimal",
5 coords="minimal",
6 compat="override",
7 )
8 ds_coords = xr.open_dataset(data_folder / "coordinates.nc", decode_times=False)
----> 9 ds_fset = parcels.convert.nemo_to_sgrid(U=ds_fields["uo"], V=ds_fields["vo"], coords=ds_coords)
10 fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)
File ~/Codes/ParcelsCode/src/parcels/convert.py:224, in nemo_to_sgrid(coords, **fields)
222 ds = _discover_U_and_V(ds, _NEMO_CF_STANDARD_NAME_FALLBACKS)
223 ds = _maybe_create_depth_dim(ds)
--> 224 ds = _maybe_bring_UV_depths_to_depth(ds)
225 ds = _drop_unused_dimensions_and_coords(ds, _NEMO_DIMENSION_COORD_NAMES)
226 ds = _assign_dims_as_coords(ds, _NEMO_DIMENSION_COORD_NAMES)
File ~/Codes/ParcelsCode/src/parcels/convert.py:58, in _maybe_bring_UV_depths_to_depth(ds)
56 def _maybe_bring_UV_depths_to_depth(ds):
57 if "U" in ds.variables and "depthu" in ds.U.coords and "depth" in ds.coords:
---> 58 ds["U"] = ds["U"].assign_coords(depthu=ds["depth"].values).rename({"depthu": "depth"})
59 if "V" in ds.variables and "depthv" in ds.V.coords and "depth" in ds.coords:
60 ds["V"] = ds["V"].assign_coords(depthv=ds["depth"].values).rename({"depthv": "depth"})
File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/common.py:664, in DataWithCoords.assign_coords(self, coords, **coords_kwargs)
661 else:
662 results = self._calc_assign_results(coords_combined)
--> 664 data.coords.update(results)
665 return data
File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/coordinates.py:636, in Coordinates.update(self, other)
630 # special case for PandasMultiIndex: updating only its dimension coordinate
631 # is still allowed but depreciated.
632 # It is the only case where we need to actually drop coordinates here (multi-index levels)
633 # TODO: remove when removing PandasMultiIndex's dimension coordinate.
634 self._drop_coords(self._names - coords_to_align._names)
--> 636 self._update_coords(coords, indexes)
File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/coordinates.py:1104, in DataArrayCoordinates._update_coords(self, coords, indexes)
1101 def _update_coords(
1102 self, coords: dict[Hashable, Variable], indexes: dict[Hashable, Index]
1103 ) -> None:
-> 1104 validate_dataarray_coords(
1105 self._data.shape, Coordinates._construct_direct(coords, indexes), self.dims
1106 )
1108 self._data._coords = coords
1109 self._data._indexes = indexes
File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/coordinates.py:1328, in validate_dataarray_coords(shape, coords, dim)
1326 for d, s in v.sizes.items():
1327 if d in sizes and s != sizes[d]:
-> 1328 raise CoordinateValidationError(
1329 f"conflicting sizes for dimension {d!r}: "
1330 f"length {sizes[d]} on the data but length {s} on "
1331 f"coordinate {k!r}"
1332 )
CoordinateValidationError: conflicting sizes for dimension 'depthu': length 75 on the data but length 1 on coordinate 'depthu'
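The error suggests that the placeholder depth coordinate created earlier in the pipeline has length 1 while the U data carries 75 depthu levels. A minimal sketch of a guard in _maybe_bring_UV_depths_to_depth, based only on the function body visible in the traceback above (whether to skip or raise here is a design choice for this PR, not the actual fix):

def _maybe_bring_UV_depths_to_depth(ds):
    # Sketch only: refuse to force a length-1 placeholder "depth" onto a
    # multi-level "depthu"/"depthv" dimension, giving a clearer error than the
    # CoordinateValidationError above.
    for var, dim in [("U", "depthu"), ("V", "depthv")]:
        if var in ds.variables and dim in ds[var].coords and "depth" in ds.coords:
            if ds[var].sizes[dim] != ds["depth"].size:
                raise ValueError(
                    f"Cannot map {dim!r} (length {ds[var].sizes[dim]}) onto 'depth' "
                    f"(length {ds['depth'].size}); check the coords dataset."
                )
            ds[var] = ds[var].assign_coords({dim: ds["depth"].values}).rename({dim: "depth"})
    return ds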
vector_interp_method=CGrid_Velocity,
# ^Seems to work with AGrid as well? (at least, tests aren't failing -
# either logic needs to be added to choose interpolator, or this interpolator should be renamed)
| fields["W"] = Field("W", ds["W"], grid, XLinear) | ||
| fields["UVW"] = VectorField("UVW", fields["U"], fields["V"], fields["W"]) | ||
| fields["UVW"] = VectorField( | ||
| "UVW", fields["U"], fields["V"], fields["W"], vector_interp_method=CGrid_Velocity |
Also here, it would depend on the grid topology whether we use XLinear_Velocity or CGrid_Velocity
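A rough sketch of what that selection could look like (grid_is_c_grid is a hypothetical helper reading the SGRID topology; the VectorField call itself is as in the diff above):

# Sketch, not the implementation in this PR: choose the velocity interpolator from
# the grid topology instead of hard-coding CGrid_Velocity. grid_is_c_grid is a
# hypothetical helper that inspects the SGRID "grid" metadata variable.
interp = CGrid_Velocity if grid_is_c_grid(ds["grid"]) else XLinear_Velocity
fields["UVW"] = VectorField(
    "UVW", fields["U"], fields["V"], fields["W"], vector_interp_method=interp
)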
if mesh == "spherical":
    if "U" in fieldset.fields:
        fieldset.U.units = GeographicPolar()
    if "V" in fieldset.fields:
        fieldset.V.units = Geographic()
This would all go if we implement #2459. We need to be careful when fixing merge conflicts...
pytest.param(
    AdvectionRK4_3D,
    marks=pytest.mark.xfail(
        reason="from_nemo had 'fieldset.V.units = GeographicPolar()', I'm not sure _why_ this code is needed to get this to pass. To be further investigated.",
Co-authored-by: Erik van Sebille <[email protected]>
Avoiding a FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
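For reference, a minimal illustration of the change described in that commit (any xarray Dataset):

import numpy as np
import xarray as xr

ds = xr.Dataset({"U": (("depth", "y", "x"), np.zeros((2, 3, 4)))})

n_depth = ds.sizes["depth"]   # preferred: mapping from dimension name to length
# n_depth = ds.dims["depth"]  # emits the FutureWarning quoted above on recent xarray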
I'm also getting problems with the curvilinear grid. The lon and lat are defaulted to range(x) and range(y), for some reason
This leads to the plot below, where the right panel does not have lon and lat on the x- and y-axes
Any idea what's going on here, @VeckoTheGecko ?
data_folder = parcels.download_example_dataset("NemoCurvilinear_data")
ds_fields = xr.open_mfdataset(
data_folder.glob("*_purely_zonal-ORCA025_grid_*.nc4"),
data_vars="minimal",
coords="minimal",
compat="override",
)
ds_coords = xr.open_dataset(data_folder / "mesh_mask.nc4", decode_times=False)
ds_fset = parcels.convert.nemo_to_sgrid(U=ds_fields["U"], V=ds_fields["V"], coords=ds_coords)
display(ds_fset)
fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)
print(fieldset.U.grid._ds)
<xarray.Dataset> Size: 35MB
Dimensions: (depth: 1, time: 1, y_center: 1021, x: 1442, y: 1021,
x_center: 1442)
Coordinates:
* depth (depth) int64 8B 0
* time (time) float64 8B 3.0
* y_center (y_center) int64 8kB 0 1 2 3 4 5 ... 1015 1016 1017 1018 1019 1020
* x (x) int64 12kB 0 1 2 3 4 5 6 ... 1436 1437 1438 1439 1440 1441
* y (y) int64 8kB 0 1 2 3 4 5 6 ... 1014 1015 1016 1017 1018 1019 1020
* x_center (x_center) int64 12kB 0 1 2 3 4 5 ... 1437 1438 1439 1440 1441
gphif (y, x) float64 12MB -76.98 -76.98 -76.98 ... 50.0 49.99 49.99
glamf (y, x) float64 12MB 72.88 73.12 73.38 73.62 ... 73.0 73.0 73.0
Data variables:
U (depth, time, y_center, x) float32 6MB dask.array<chunksize=(1, 1, 511, 721), meta=np.ndarray>
V (depth, time, y, x_center) float32 6MB dask.array<chunksize=(1, 1, 511, 721), meta=np.ndarray>
grid int64 8B 0
lon (x) int64 12kB 0 1 2 3 4 5 6 ... 1436 1437 1438 1439 1440 1441
lat (y) int64 8kB 0 1 2 3 4 5 6 ... 1014 1015 1016 1017 1018 1019 1020
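The repr above shows the symptom directly: lon and lat are integer index ranges, while the real curvilinear coordinates still sit in glamf/gphif. A quick check (names as printed above):

print(ds_fset["lon"].values[:5])        # 0 1 2 3 4  -> index fallback, not longitudes
print(ds_fset["glamf"].values[0, :5])   # 72.88 73.12 ...  -> actual longitudes from mesh_mask.nc4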
The following patch works
commit a985573e9d6874e11354e3af14c98ae186b4231e
Author: Vecko <[email protected]>
Date: Mon Jan 12 17:04:56 2026 +0100
Fix glamf and gphif
diff --git a/docs/user_guide/examples/tutorial_nemo_curvilinear.ipynb b/docs/user_guide/examples/tutorial_nemo_curvilinear.ipynb
index 3defa1f8..24422013 100644
--- a/docs/user_guide/examples/tutorial_nemo_curvilinear.ipynb
+++ b/docs/user_guide/examples/tutorial_nemo_curvilinear.ipynb
@@ -71,10 +71,12 @@
" compat=\"override\",\n",
")\n",
"\n",
- "# Drop some auxiliary dimensions - otherwise Parcels will complain\n",
- "ds_fields = ds_fields.isel(time=0, z_a=0, z=0, drop=True)\n",
+ "ds_coords = xr.open_dataset(data_folder / \"mesh_mask.nc4\", decode_times=False)\n",
+ "ds_fset = parcels.convert.nemo_to_sgrid(\n",
+ " U=ds_fields[\"U\"], V=ds_fields[\"V\"], coords=ds_coords\n",
+ ")\n",
"\n",
- "fieldset = parcels.FieldSet.from_nemo(ds_fields)"
+ "fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)"
]
},
{
@@ -272,7 +274,7 @@
],
"metadata": {
"kernelspec": {
- "display_name": "default",
+ "display_name": "test-notebooks-latest",
"language": "python",
"name": "python3"
},
@@ -286,7 +288,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.14.2"
+ "version": "3.13.9"
}
},
"nbformat": 4,
diff --git a/src/parcels/convert.py b/src/parcels/convert.py
index 637f5270..c3818362 100644
--- a/src/parcels/convert.py
+++ b/src/parcels/convert.py
@@ -244,13 +244,17 @@ def nemo_to_sgrid(*, coords: xr.Dataset, **fields: dict[str, xr.Dataset | xr.Dat
raise ValueError(
"Dataset already has a 'grid' variable (according to cf_roles). Didn't expect there to be grid metadata on copernicusmarine datasets - please open an issue with more information about your dataset."
)
+
+ # Update to use lon and lat internally
+
+ ds = ds.rename({"gphif": "lat", "glamf": "lon"}) # TODO: Logging message about rename
ds["grid"] = xr.DataArray(
0,
attrs=sgrid.Grid2DMetadata(
cf_role="grid_topology",
topology_dimension=2,
node_dimensions=("x", "y"),
- node_coordinates=("glamf", "gphif"),
+ node_coordinates=("lon", "lat"),
face_dimensions=(
sgrid.DimDimPadding("x_center", "x", sgrid.Padding.LOW),
sgrid.DimDimPadding("y_center", "y", sgrid.Padding.LOW),
@@ -260,6 +264,6 @@ def nemo_to_sgrid(*, coords: xr.Dataset, **fields: dict[str, xr.Dataset | xr.Dat
)
# NEMO models are always in degrees
- ds["glamf"].attrs["units"] = "degrees"
- ds["gphif"].attrs["units"] = "degrees"
+ ds["lon"].attrs["units"] = "degrees"
+ ds["lat"].attrs["units"] = "degrees"
return ds
diff --git a/tests/test_convert.py b/tests/test_convert.py
index d7777d49..11a7029d 100644
--- a/tests/test_convert.py
+++ b/tests/test_convert.py
@@ -18,7 +18,7 @@ def test_nemo_to_sgrid():
"topology_dimension": 2,
"node_dimensions": "x y",
"face_dimensions": "x_center:x (padding:low) y_center:y (padding:low)",
- "node_coordinates": "glamf gphif",
+ "node_coordinates": "lon lat",
"vertical_dimensions": "z_center:depth (padding:high)",
}
Though I wonder if we should relax the constraint that there are coordinates named "lon" and "lat". Thinking about our conversation @erikvansebille, I don't think we have to do this renaming now that we have SGRID metadata, and in fact keeping the same variable names as in the original dataset would be nice in case users want to debug.
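A minimal sketch of that alternative, assuming the space-separated attribute layout shown in the test expectation above: the consumer resolves the coordinate names from the SGRID metadata instead of requiring them to be called "lon" and "lat":

# Sketch: look up node coordinates via the SGRID metadata rather than assuming
# fixed names. ds_fset and the "grid" variable are as in the examples above.
lon_name, lat_name = ds_fset["grid"].attrs["node_coordinates"].split()
node_lon = ds_fset[lon_name]  # e.g. "glamf" if no renaming is done
node_lat = ds_fset[lat_name]  # e.g. "gphif"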
This PR supersedes #2447, implementing the following additional functionality:

- Moved the `FieldSet.from_nemo()` functionality into a `convert.py` module with a `nemo_to_sgrid()` function. Also added corresponding tests.
- Made the `nemo_to_sgrid` function more closely align with the input datasets.
- Updated `FieldSet.from_sgrid_conventions` to infer `mesh` from the metadata included in the dataset.
- Added a `get_value_by_id()` function to the `Grid*DMetadata` classes, allowing easy reference to parts of the SGRID metadata model.

Future work:

- Refactoring `convert.py` - it's important that we refactor this module so that it's clearer, and add logging so that changes to the input dataset are communicated to the user. To support this, we need […]

I'll make an issue to track this future work.
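For reference, the end-to-end usage of the new API as exercised in the review comments above (curvilinear example; this reflects the workflow under review, not necessarily the final merged interface):

import parcels
import xarray as xr

data_folder = parcels.download_example_dataset("NemoCurvilinear_data")
ds_fields = xr.open_mfdataset(
    data_folder.glob("*_purely_zonal-ORCA025_grid_*.nc4"),
    data_vars="minimal",
    coords="minimal",
    compat="override",
)
ds_coords = xr.open_dataset(data_folder / "mesh_mask.nc4", decode_times=False)

# Convert the NEMO data to an SGRID-conventions dataset, then build the FieldSet.
ds_fset = parcels.convert.nemo_to_sgrid(U=ds_fields["U"], V=ds_fields["V"], coords=ds_coords)
fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)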