Zarr support (backend, in esmvalcore.preprocessor._io.py)
#2785
Merged
+640
−1
81 commits
e347f40 add basic zarr support (valeriupredoi)
5c32b55 add basic test (valeriupredoi)
682f46d add sample zarr store (valeriupredoi)
81c254c add sample zarr store (valeriupredoi)
6a02757 turn on gha (valeriupredoi)
84412ab add zarr as dependency (valeriupredoi)
1f8e127 add zarr as dependency (valeriupredoi)
5b97169 account for remote zarrs (valeriupredoi)
8bcc15f add test case for remote zarr (valeriupredoi)
c0b049c functional remote Zarr and cleanup (valeriupredoi)
e5f8c4e add utility and test for remote zarr (valeriupredoi)
9265b0d add intake-esm as dependency (valeriupredoi)
4be6152 add aiohttp as dependency (valeriupredoi)
28f647f fixture (valeriupredoi)
6da4183 remove unwanted (for now) fixture altogether (valeriupredoi)
fb7712a remove unneeded import (valeriupredoi)
95a92c9 add storeage options (valeriupredoi)
872be18 semi-working version for publick bucket for esmvaltool (valeriupredoi)
971cf34 correct bucket with correct permissions and working test (valeriupredoi)
0eeeb50 add yet another test (valeriupredoi)
f5d13c8 adjust test member docstring (valeriupredoi)
fa8b90a make io more robust (valeriupredoi)
cccdb39 change api (valeriupredoi)
1618076 test changed api (valeriupredoi)
e2ed41c add basic test for zarr file (valeriupredoi)
fe7326e add test for file with issues (valeriupredoi)
39df34e reduce pytest runners to 2 (valeriupredoi)
2b44ac9 run only test load (valeriupredoi)
caff216 skip a test (valeriupredoi)
d48418c change skip message (valeriupredoi)
0909770 restore circle ci configuration (valeriupredoi)
e87b12b skip the other test that uses the healpix dataset (valeriupredoi)
37d8a31 removed problematic skipped tests (valeriupredoi)
0d446af add dedicated Zarr IO test module (valeriupredoi)
37fcfff add xr to ncdata test (valeriupredoi)
94d8677 add pytest marker (valeriupredoi)
7ac7b45 run zarr test single proc (valeriupredoi)
caa3657 mark test (valeriupredoi)
b4c6b6f remove pytest marker (valeriupredoi)
48db5f3 restore circleci configuration (valeriupredoi)
0afcec7 unmark test but dont use cf_time flag (valeriupredoi)
0c4a16f set consolidated to False (valeriupredoi)
72d79c2 found hang cause (valeriupredoi)
8e54f1e add Ncdata issue pointer (valeriupredoi)
1572fff replace deprecated use cftime (valeriupredoi)
b1fe4b8 add zar3 test and fixed deprecated call with cftime (valeriupredoi)
f5c5979 add test non existing file (valeriupredoi)
8cddb55 add CMIP6 Zarr store and metadata test for it (valeriupredoi)
0d71de7 add test resources (valeriupredoi)
2ab8fc0 add purely diagnostic test (valeriupredoi)
b01b578 feed the PEP typing moster an actual type (valeriupredoi)
ea9377a cleanup tests (valeriupredoi)
72d87bc cleanup implement (valeriupredoi)
76b32b4 dict typing (valeriupredoi)
90c8963 Merge branch 'main' into zarr_support (valeriupredoi)
7af2ec4 Update esmvalcore/preprocessor/_io.py (valeriupredoi)
c151b57 Update esmvalcore/preprocessor/_io.py (valeriupredoi)
ab78052 Update esmvalcore/preprocessor/_io.py (valeriupredoi)
b5c3301 add mention about backend dict (valeriupredoi)
d514b67 add inline text (valeriupredoi)
2852381 removed all Zarr tests and moved to test_zarr.py (valeriupredoi)
49fb643 moved all tests from test_load here and removed tests that dont test … (valeriupredoi)
6a554d8 add mention about s3 bucket (valeriupredoi)
683b6e8 spruce up zarr tests and add an extra test for local files (valeriupredoi)
f2923e6 add dummy zar plaintext file (valeriupredoi)
8c49e20 dont match to exception string (valeriupredoi)
8b6f221 add info on further testing (valeriupredoi)
63411cb unrun GHA (valeriupredoi)
84a33f2 add str path test (valeriupredoi)
8909b7d Update esmvalcore/preprocessor/_io.py (valeriupredoi)
eff8956 Update esmvalcore/preprocessor/_io.py (valeriupredoi)
a2e31ab Update esmvalcore/preprocessor/_io.py (valeriupredoi)
a387558 Update esmvalcore/preprocessor/_io.py (valeriupredoi)
37266da Update esmvalcore/preprocessor/_io.py (valeriupredoi)
e13a19e Update esmvalcore/preprocessor/_io.py (valeriupredoi)
cef79ce Update esmvalcore/preprocessor/_io.py (valeriupredoi)
63b817f fix pytest msg regex (valeriupredoi)
71ebe4e better handling of exceptions (valeriupredoi)
171ea74 Update tests/integration/preprocessor/_io/test_zarr.py (valeriupredoi)
464c9f3 Update tests/integration/preprocessor/_io/test_zarr.py (valeriupredoi)
66f9811 Update tests/integration/preprocessor/_io/test_zarr.py (valeriupredoi)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,224 @@ | ||
| """ | ||
| Integration tests for :func:`esmvalcore.preprocessor._io._load_zarr`. | ||
|
|
||
| This is a dedicated test module for Zarr files IO; we have identified | ||
| a number of issues with Zarr IO so it deserves its own test module. | ||
|
|
||
| We have a permanent bucket, esmvaltool-zarr, at CEDA's object store | ||
| (URL: https://uor-aces-o.s3-ext.jc.rl.ac.uk/esmvaltool-zarr), | ||
| where we will host a number of test files. The bucket is anon/anon | ||
| (read/GET-only, but PUT can be allowed). Bucket operations are done | ||
| via the usual MinIO client (mc command), e.g. ``mc list``, ``mc du`` etc. | ||
|
|
||
| Further performance investigations are being run with a number of tests | ||
| that look at ncdata, see https://github.com/valeriupredoi/esmvaltool_zarr_tests; | ||
| also see https://github.com/pp-mo/ncdata/issues/139 | ||
| """ | ||
|
|
||
| from importlib.resources import files as importlib_files | ||
| from pathlib import Path | ||
|
|
||
| import cf_units | ||
| import pytest | ||
|
|
||
| from esmvalcore.preprocessor._io import load | ||
|
|
||
|
|
||
| @pytest.mark.parametrize("input_type", [str, Path]) | ||
| def test_load_zarr2_local(input_type): | ||
| """Test loading a Zarr2 store from local FS.""" | ||
| zarr_path = ( | ||
| Path(importlib_files("tests")) | ||
| / "sample_data" | ||
| / "zarr-sample-data" | ||
| / "example_field_0.zarr2" | ||
| ) | ||
|
|
||
| cubes = load(input_type(zarr_path)) | ||
|
|
||
| assert len(cubes) == 1 | ||
| cube = cubes[0] | ||
| assert cube.var_name == "q" | ||
| assert cube.standard_name == "specific_humidity" | ||
| assert cube.long_name is None | ||
| assert cube.units == cf_units.Unit("1") | ||
| coords = cube.coords() | ||
| coord_names = [coord.standard_name for coord in coords] | ||
| assert "longitude" in coord_names | ||
| assert "latitude" in coord_names | ||
|
|
||
|
|
||
| def test_load_zarr2_remote(): | ||
| """Test loading a Zarr2 store from a https Object Store.""" | ||
| zarr_path = ( | ||
| "https://uor-aces-o.s3-ext.jc.rl.ac.uk/" | ||
| "esmvaltool-zarr/example_field_0.zarr2" | ||
| ) | ||
|
|
||
| # with "dummy" storage options | ||
| cubes = load( | ||
| zarr_path, | ||
| ignore_warnings=None, | ||
| backend_kwargs={"storage_options": {}}, | ||
| ) | ||
|
|
||
| assert len(cubes) == 1 | ||
| cube = cubes[0] | ||
| assert cube.var_name == "q" | ||
| assert cube.standard_name == "specific_humidity" | ||
| assert cube.long_name is None | ||
| assert cube.units == cf_units.Unit("1") | ||
| coords = cube.coords() | ||
| coord_names = [coord.standard_name for coord in coords] | ||
| assert "longitude" in coord_names | ||
| assert "latitude" in coord_names | ||
|
|
||
| # without storage_options | ||
| cubes = load(zarr_path) | ||
|
|
||
| assert len(cubes) == 1 | ||
| cube = cubes[0] | ||
| assert cube.var_name == "q" | ||
| assert cube.standard_name == "specific_humidity" | ||
| assert cube.long_name is None | ||
| assert cube.units == cf_units.Unit("1") | ||
| coords = cube.coords() | ||
| coord_names = [coord.standard_name for coord in coords] | ||
| assert "longitude" in coord_names | ||
| assert "latitude" in coord_names | ||
|
|
||
|
|
||
| def test_load_zarr3_remote(): | ||
| """Test loading a Zarr3 store from a https Object Store.""" | ||
| zarr_path = ( | ||
| "https://uor-aces-o.s3-ext.jc.rl.ac.uk/" | ||
| "esmvaltool-zarr/example_field_0.zarr3" | ||
| ) | ||
|
|
||
| # with "dummy" storage options | ||
| cubes = load( | ||
| zarr_path, | ||
| ignore_warnings=None, | ||
| backend_kwargs={"storage_options": {}}, | ||
| ) | ||
|
|
||
| assert len(cubes) == 1 | ||
| cube = cubes[0] | ||
| assert cube.var_name == "q" | ||
| assert cube.standard_name == "specific_humidity" | ||
| assert cube.long_name is None | ||
| assert cube.units == cf_units.Unit("1") | ||
| coords = cube.coords() | ||
| coord_names = [coord.standard_name for coord in coords] | ||
| assert "longitude" in coord_names | ||
| assert "latitude" in coord_names | ||
|
|
||
|
|
||
| def test_load_zarr3_cmip6_metadata(): | ||
| """ | ||
| Test loading a Zarr3 store from a https Object Store. | ||
|
|
||
| This test loads just the metadata, no computations. | ||
|
|
||
| This is an actual CMIP6 dataset (Zarr built from netCDF4 via Xarray) | ||
| - Zarr store on disk: 243 MiB | ||
| - compression: Blosc | ||
| - Dimensions: (lat: 128, lon: 256, time: 2352, axis_nbounds: 2) | ||
| - chunking: time-slices; netCDF4.Dataset.chunking() = [1, 128, 256] | ||
|
|
||
| Test takes 8-9s (median: 8.5s) and needs max Res mem: 1GB | ||
| """ | ||
| zarr_path = ( | ||
| "https://uor-aces-o.s3-ext.jc.rl.ac.uk/" | ||
| "esmvaltool-zarr/pr_Amon_CNRM-ESM2-1_02Kpd-11_r1i1p2f2_gr_200601-220112.zarr3" | ||
| ) | ||
|
|
||
| # with "dummy" storage options | ||
| cubes = load( | ||
| zarr_path, | ||
| ignore_warnings=None, | ||
| backend_kwargs={"storage_options": {}}, | ||
| ) | ||
|
|
||
| assert len(cubes) == 1 | ||
| cube = cubes[0] | ||
| assert cube.var_name == "pr" | ||
| assert cube.standard_name == "precipitation_flux" | ||
| assert cube.long_name == "Precipitation" | ||
| assert cube.units == cf_units.Unit("kg m-2 s-1") | ||
| assert cube.has_lazy_data() | ||
|
|
||
|
|
||
| def test_load_zarr_remote_not_zarr_file(): | ||
| """ | ||
| Test loading a Zarr store from a https Object Store. | ||
|
|
||
| This fails because the file being loaded is not a Zarr file. | ||
| """ | ||
| zarr_path = ( | ||
| "https://uor-aces-o.s3-ext.jc.rl.ac.uk/" | ||
| "esmvaltool-zarr/example_field_0.zarr17" | ||
| ) | ||
|
|
||
| msg = ( | ||
| "File 'https://uor-aces-o.s3-ext.jc.rl.ac.uk/" | ||
| "esmvaltool-zarr/example_field_0.zarr17' can not " | ||
| "be opened as Zarr file at the moment." | ||
| ) | ||
| with pytest.raises(ValueError, match=msg): | ||
| load(zarr_path) | ||
|
|
||
|
|
||
| def test_load_zarr_remote_not_file(): | ||
| """ | ||
| Test loading a Zarr store from a https Object Store. | ||
|
|
||
| This fails because the file does not exist. | ||
| """ | ||
| zarr_path = ( | ||
| "https://uor-aces-o.s3-ext.jc.rl.ac.uk/" | ||
| "esmvaltool-zarr/example_field_0.zarr22" | ||
| ) | ||
|
|
||
| msg = ( | ||
| "File 'https://uor-aces-o.s3-ext.jc.rl.ac.uk/" | ||
| "esmvaltool-zarr/example_field_0.zarr22' can not " | ||
| "be opened as Zarr file at the moment." | ||
| ) | ||
| with pytest.raises(ValueError, match=msg): | ||
| load(zarr_path) | ||
|
|
||
|
|
||
| def test_load_zarr_local_not_file(): | ||
| """ | ||
| Test loading something that has a zarr extension. | ||
|
|
||
| But the file doesn't exist (on local FS). | ||
| """ | ||
| zarr_path = "esmvaltool-zarr/example_field_0.zarr22" | ||
|
|
||
| # "Unable to find group" or "No group found" | ||
| # Zarr keeps changing the exception string so matching | ||
| # is bound to fail the test | ||
| with pytest.raises(FileNotFoundError): | ||
| load(zarr_path) | ||
|
|
||
|
|
||
| def test_load_zarr_local_not_zarr_file(): | ||
| """ | ||
| Test loading something that has a zarr extension. | ||
|
|
||
| But the file is plaintext (on local FS). | ||
| """ | ||
| zarr_path = ( | ||
| Path(importlib_files("tests")) | ||
| / "sample_data" | ||
| / "zarr-sample-data" | ||
| / "example_field_0.zarr17" | ||
| ) | ||
|
|
||
| # "Unable to find group" or "No group found" | ||
| # Zarr keeps changing the exception string so matching | ||
| # is bound to fail the test | ||
| with pytest.raises(FileNotFoundError): | ||
| load(zarr_path) |
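
The remote failure tests above pass a message containing a URL to `pytest.raises(..., match=msg)`. pytest applies `match` as a regular expression via `re.search`, so each `.` in the URL matches any character. The tests still pass, but the check is looser than it looks. A minimal sketch of a stricter variant, using only the standard library; the helper name `expected_error_pattern` is hypothetical and not part of ESMValCore:

```python
import re


def expected_error_pattern(url: str) -> str:
    """Return a regex that matches the error text for `url` literally.

    Hypothetical helper: the message wording is copied from the tests
    in this PR; `re.escape` neutralises regex metacharacters such as '.'.
    """
    message = f"File '{url}' can not be opened as Zarr file at the moment."
    return re.escape(message)


url = (
    "https://uor-aces-o.s3-ext.jc.rl.ac.uk/"
    "esmvaltool-zarr/example_field_0.zarr17"
)
pattern = expected_error_pattern(url)

# The text the loader is expected to raise, per the tests above.
raised_text = f"File '{url}' can not be opened as Zarr file at the moment."

# A near-miss message: one '.' in the file name replaced by 'X'.
lookalike = raised_text.replace(
    "example_field_0.zarr17", "example_field_0Xzarr17"
)

strict_hit = re.search(pattern, raised_text) is not None
strict_near_miss = re.search(pattern, lookalike) is not None
loose_near_miss = re.search(raised_text, lookalike) is not None
```

In a test this would read `with pytest.raises(ValueError, match=expected_error_pattern(zarr_path)):`. Whether the extra strictness is worth it is a judgment call; the unescaped form used in the PR is simpler and sufficient for these URLs.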
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| This is not a Zarr file. Go grab lunch! |
3 changes: 3 additions & 0 deletions
tests/sample_data/zarr-sample-data/example_field_0.zarr2/.zattrs
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| { | ||
| "Conventions": "CF-1.12" | ||
| } |
3 changes: 3 additions & 0 deletions
tests/sample_data/zarr-sample-data/example_field_0.zarr2/.zgroup
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| { | ||
| "zarr_format": 2 | ||
| } |
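
The `.zgroup` and `.zattrs` files above are what mark `example_field_0.zarr2` as a Zarr v2 group, whereas the plain-text `example_field_0.zarr17` sample has neither. A rough sketch of how that difference can be detected with the standard library alone; `detect_zarr_format` is a hypothetical helper (not part of ESMValCore or the zarr package), and it assumes the Zarr v3 convention of a `zarr.json` metadata file at the group root:

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def detect_zarr_format(store: Path) -> Optional[int]:
    """Return the `zarr_format` of a group directory, or None.

    Hypothetical helper: looks for `.zgroup` (Zarr v2) or `zarr.json`
    (Zarr v3) at the top level and reads its `zarr_format` field.
    """
    if not store.is_dir():
        return None
    for marker in (".zgroup", "zarr.json"):
        marker_path = store / marker
        if marker_path.is_file():
            try:
                metadata = json.loads(marker_path.read_text())
            except json.JSONDecodeError:
                return None
            if isinstance(metadata, dict):
                return metadata.get("zarr_format")
            return None
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Recreate the two metadata files added in this PR.
    store_v2 = root / "example_field_0.zarr2"
    store_v2.mkdir()
    (store_v2 / ".zgroup").write_text(json.dumps({"zarr_format": 2}))
    (store_v2 / ".zattrs").write_text(json.dumps({"Conventions": "CF-1.12"}))

    # Recreate the plain-text file that only has a Zarr-like extension.
    not_zarr = root / "example_field_0.zarr17"
    not_zarr.write_text("This is not a Zarr file. Go grab lunch!")

    formats = {p.name: detect_zarr_format(p) for p in (store_v2, not_zarr)}
```

This only inspects metadata files and says nothing about whether the arrays inside the store are readable, so it is a pre-check at best, not a substitute for the exception handling tested in `test_zarr.py`.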