-
Notifications
You must be signed in to change notification settings - Fork 49
Zarr data types refactor compat round 2 #677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zarr data types refactor compat round 2 #677
Conversation
for more information, see https://pre-commit.ci
Note that I've found at least two upstream issues contributing to problems here (as well of plenty of other changes needed):
|
This PR now depends on #680 as a (permanent) workaround for zarr-developers/zarr-python#3254, so #680 should be merged first. |
for more information, see https://pre-commit.ci
raise NotImplementedError("Unsure how to handle broadcasting like this") | ||
|
||
if not utils.metadata_identical(self.metadata, other.metadata): | ||
if not self.metadata.to_dict() == other.metadata.to_dict(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment explains why were are now able to simplify this: zarr-developers/zarr-python#2930 (comment)
if roundtrip_func == roundtrip_as_in_memory_icechunk: | ||
pytest.xfail(reason="xarray can't decode the ns datetime fill_value") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As zarr-developers/zarr-python#2616 was closed, these xfails became xpasses!
Except for this one case - roundtripping via Icechunk, which fails with an error inside xarray's zarr decoders... I don't really understand why this matters, but it should be treated as a separable issue from this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raised #686 to track this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 🎉
virtualizarr/utils.py
Outdated
native_dtype = v3_metadata.data_type.to_native_dtype() | ||
v2_compatible_data_type = parse_data_type(native_dtype, zarr_format=2) | ||
|
||
v2_metadata = ArrayV2Metadata( | ||
shape=v3_metadata.shape, | ||
dtype=v3_metadata.data_type.to_numpy(), | ||
dtype=v2_compatible_data_type, | ||
chunks=v3_metadata.chunks, | ||
fill_value=fill_value or v3_metadata.fill_value, | ||
compressor=compressor_config, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@d-v-b here is where I'm potentially unecessarily converting back and forth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed in 925ffec
Kerchunk CI has been failing too, I assume doe to the same reasons. Do you have recommendations of how to fix on our end? Or should zarr do something? |
Zarr 3.1.0 introduced significant changes in order to accommodate a more flexible data types system. I'm not surprised that Kerchunk breaks too - both Xarray and VirtualiZarr needed to be updated to adapt. But the specific errors in Kerchunk look different to the ones I had to deal with here. (That's presumably because unlike Kerchunk VirtualiZarr mostly deals with data types via importing the semi-private internal dataclass @martindurant you should probably raise an issue on Kerchunk and tag @d-v-b to discuss. |
Follow-up to #618, aiming to close #676
develop
#676Tests addeddocs/releases.rst
New functions/methods are listed inapi.rst
New functionality has documentation