Closed
Labels
bug: Potential issues with the zarr-python library
Description
Zarr version
2.13.3
Numcodecs version
n/a
Python Version
3.10
Operating System
Mac
Installation
poetry
Description
I have discovered that certain zarr methods double-count groups in V3. The methods Group.group_keys() and Group.groups() return a group twice when that group contains children.
Instead, each group should be returned only once.
Steps to reproduce
The following test should pass:
import zarr

def test_double_counting_group_v3():
    store = zarr.MemoryStoreV3()
    root_group = zarr.group(store=store, zarr_version=3)
    sub_group = root_group.create_group("foo")
    array = sub_group.create("bar", shape=10, dtype="i4")
    # this works and is in spec (although the spec says it should be "foo/")
    assert store.listdir("meta/root") == ['foo', 'foo.group.json']
    # these should work but don't: instead we get foo twice
    assert list(root_group.group_keys()) == ['foo']
    assert list(root_group.groups()) == [("foo", sub_group)]

What is happening is that the listdir call in the code below returns both foo and foo.group.json. The loop strips the .group.json suffix from the second entry, so it visits the key foo twice. Because contains_group('foo') is true both times, the group is returned twice.
Lines 513 to 521 in ce129a5

    for key in sorted(listdir(self._store, dir_name)):
        if key.endswith(group_sfx):
            key = key[:-len(group_sfx)]
        path = self._key_prefix + key
        if path.endswith(".array" + self._metadata_key_suffix):
            # skip array keys
            continue
        if contains_group(self._store, path, explicit_only=False):
            yield key
If others agree this is a bug, I will submit a PR to fix it.
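One possible shape for such a fix, sketched on plain lists rather than on the real Group class, is to remember which keys have already been yielded. This is only an illustration under stated assumptions, not necessarily the change that would go into zarr-python; unique_group_keys and is_group are hypothetical names, with is_group standing in for contains_group.

```python
def unique_group_keys(listing, is_group,
                      group_sfx=".group.json", array_sfx=".array.json"):
    """Yield each group key at most once.

    `listing` stands in for the store's directory listing and `is_group`
    for contains_group; both are assumptions for illustration.
    """
    seen = set()
    for key in sorted(listing):
        if key.endswith(array_sfx):
            # skip array keys
            continue
        if key.endswith(group_sfx):
            key = key[:-len(group_sfx)]
        if key in seen:
            # already yielded via the directory entry or the metadata document
            continue
        if is_group(key):
            seen.add(key)
            yield key


fixed = list(unique_group_keys(["foo", "foo.group.json"],
                               is_group=lambda k: k == "foo"))
print(fixed)  # ['foo']
```

Tracking seen keys keeps the sorted order and handles both explicit groups (with a .group.json document) and implicit ones (directory entry only).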
Additional output
No response