-
-
Notifications
You must be signed in to change notification settings - Fork 364
Closed
Labels
bugPotential issues with the zarr-python libraryPotential issues with the zarr-python library
Milestone
Description
Zarr version
v3
Numcodecs version
n/a
Python Version
n/a
Operating System
n/a
Installation
v3
Description
Discovered while looking at the xarray unit tests for Zarr.
Steps to reproduce
import zarr
import numpy as np
from zarr.api import asynchronous
store = zarr.store.MemoryStore({}, mode="w")
arr = zarr.open_array(store=store, path="a", shape=(), dtype=np.dtype("S6"))
arr.update_attributes({"key": "value"})
The .put
call raises with
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[1], line 7
5 store = zarr.store.MemoryStore({}, mode="w")
6 arr = zarr.open_array(store=store, path="a", shape=(), dtype=np.dtype("S6"))
----> 7 arr.update_attributes({"k": "v"})
File ~/gh/zarr-developers/zarr-python/src/zarr/core/array.py:2055, in Array.update_attributes(self, new_attributes)
2053 def update_attributes(self, new_attributes: dict[str, JSON]) -> Array:
2054 return type(self)(
-> 2055 sync(
2056 self._async_array.update_attributes(new_attributes),
2057 )
2058 )
File ~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:91, in sync(coro, loop, timeout)
88 return_result = next(iter(finished)).result()
90 if isinstance(return_result, BaseException):
---> 91 raise return_result
92 else:
93 return return_result
File ~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:50, in _runner(coro)
45 """
46 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
47 exception, the exception will be returned.
48 """
49 try:
---> 50 return await coro
51 except Exception as ex:
52 return ex
File ~/gh/zarr-developers/zarr-python/src/zarr/core/array.py:601, in AsyncArray.update_attributes(self, new_attributes)
600 async def update_attributes(self, new_attributes: dict[str, JSON]) -> AsyncArray:
--> 601 new_metadata = self.metadata.update_attributes(new_attributes)
603 # Write new metadata
604 await self._save_metadata(new_metadata)
File ~/gh/zarr-developers/zarr-python/src/zarr/core/metadata.py:323, in ArrayV3Metadata.update_attributes(self, attributes)
322 def update_attributes(self, attributes: dict[str, JSON]) -> Self:
--> 323 return replace(self, attributes=attributes)
File ~/mambaforge/lib/python3.10/dataclasses.py:1454, in replace(obj, **changes)
1447 changes[f.name] = getattr(obj, f.name)
1449 # Create the new object, which calls __init__() and
1450 # __post_init__() (if defined), using all of the init fields we've
1451 # added and/or left in 'changes'. If there are values supplied in
1452 # changes that aren't fields, this will correctly raise a
1453 # TypeError.
-> 1454 return obj.__class__(**changes)
File ~/gh/zarr-developers/zarr-python/src/zarr/core/metadata.py:189, in ArrayV3Metadata.__init__(self, shape, data_type, chunk_grid, chunk_key_encoding, fill_value, codecs, attributes, dimension_names)
187 chunk_key_encoding_parsed = ChunkKeyEncoding.from_dict(chunk_key_encoding)
188 dimension_names_parsed = parse_dimension_names(dimension_names)
--> 189 fill_value_parsed = parse_fill_value_v3(fill_value, dtype=data_type_parsed)
190 attributes_parsed = parse_attributes(attributes)
191 codecs_parsed_partial = parse_codecs(codecs)
File ~/gh/zarr-developers/zarr-python/src/zarr/core/metadata.py:656, in parse_fill_value_v3(fill_value, dtype)
654 raise ValueError(msg)
655 msg = f"Cannot parse non-string sequence {fill_value} as a scalar with type {dtype}."
--> 656 raise TypeError(msg)
657 return dtype.type(fill_value)
TypeError: Cannot parse non-string sequence b'' as a scalar with type |S6.
Additional output
The Array.update_attributes
call is a bit of a red herring. That just happens to go through Metadata.__init__
, which validates the fill value for a given dtype. Only this type we use fill_value=np.bytes(b"")
, which is incompatible with the dtype np.dtype("S6")
.
So I think the shorter reproducer is something like
arr = zarr.open_array(store=store, path="a", shape=(), dtype=np.dtype("S6"))
assert arr.metadata.fill_value != np.bytes_("b"))
in other words, maybe(?) our inferred fill value is incorrect for this specific dtype.
Metadata
Metadata
Assignees
Labels
bugPotential issues with the zarr-python libraryPotential issues with the zarr-python library
Type
Projects
Status
Done