-
Notifications
You must be signed in to change notification settings - Fork 38
Remove need for scipy, fix parallel regridder #461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
for more information, see https://pre-commit.ci
Co-authored-by: David Huard <[email protected]>
|
Not sure though I can comment on the change in sparse format and its impact on computations. |
|
Oh boy... I compared my new code with the old one by creating a large regridder with Well. They are different. But, the old ones seem incorrect. I tried the roundtrip import xesmf as xe
import sparse as sps
reg = xe.Regridder(ds1, ds2, 'bilinear', unmapped_to_nan=False)
w2 = sps.COO.from_scipy_sparse(reg.weights.data.to_scipy_sparse().tolil())
np.testing.assert_array_equal(reg.weights.data.data, w2.data)
# Fails, shapes mismatch
|
|
Update, sorry I was too quick to comment. The round trip removes entries in the matrix that have the same "data" as the "fill_value", which reduces the size of the matrix. A good thing I guess! |
|
@aulemahal sorry I'm late to the party, I have a 3 months old to feed and change every couple of hours :) but I'll try to keep up with this PR |
|
No problem! xESMF is much less important than a 3 month old, please feel free to ignore any ping! I added a manual removal of the "unnecessary" matrix entries and now numpy asserts that the new and the old outputs are exactly the same. |
For some reason, two issues popped up in the conda feedstock build that did not appear in the builds here...
See conda-forge/xesmf-feedstock#51
Scipy is not a requirement of
sparseanymore, soadd_nans_to_weightsfailed without it. I rewrote the function without the sparse conversion, preserving the COO object type. I didn't perform a real performance test, but my small test seems to indicate that this new version is faster!For some reason, the parallel regridder test failed with xarray 2025.6.1 on the CI. I don't remember it failing here and the upstream run seems to work fine. However, the issue looks real :
xr.map_blocksrequires that all xarray objects it receive can be aligned. We already had code to manage the spatial dimensions, but not really the others coords. In fact, I don't understand why it worked as the input and output dataset both had "lon_bounds" variables with different values.This PR goes in more depth in renaming the coordinates of the datasets as so to avoid alignment issues.